Blind Submission by November • UCTopic: Unsupervised Contrastive Learning for Phrase Representations and Topic Mining
Meta Review of Paper620 by Area Chair WTjX
The paper proposes a method to pre-train a phrase encoder based on the intuition that phrases of the same entity should have similar encodings and phrases of different entities should have different encodings. Furthermore, a clustering-based approach, cluster-assisted contrastive learning, is proposed to reduce noisy in-batch negatives and thus improve the negative samples. Evaluation of the encoder on entity clustering and topical phrase mining shows the effectiveness of the proposed approach compared to other methods.
The paper includes extensive experiments on the downstream tasks, where the proposed model significantly outperforms other methods. In addition, a detailed discussion is provided for different datasets to further analyse the effectiveness of the proposed method with respect to informativeness, diversity of the constructed phrases, and the source of phrase semantics. The method for topic identification based on clustering phrase encodings works well, especially when the encoder is further fine-tuned on the task (turning the original fine-tuning process into a topic-specific fine-tuning process).
The explanation of the assumptions illustrated in one of the figures, that phrase semantics are determined by their context and that phrases with the same mention have the same semantics, is not very clear. For the examples in Figure 1, the semantics of the phrase “United States” are fixed and not influenced by the context. The authors could provide a more detailed description of how the pre-training phrases are constructed, how many pre-training phrases are present, and how these phrases overlap with the downstream tasks.
Official Review of Paper620 by Reviewer Km9g
The paper proposes a contrastive learning-based method for phrase representations and topic mining. Results show that the method achieves the best performance on topic mining and phrase representations. The proposed model can extract more diverse phrases.
- The paper applies unsupervised contrastive learning to topic modeling, which is suitable for the unsupervised task. Results show that the proposed model can extract more diverse phrases.
- The authors also find that, in the fine-tuning process, in-batch negative samples hurt performance. They therefore propose a cluster-assisted contrastive learning method to reduce noise and turn the original fine-tuning process into a topic-specific fine-tuning process.
- Experimental results show that the proposed model achieves good performance on several datasets.
- It seems that the biggest contribution of the paper is to apply contrastive learning to topic modeling, which is of limited novelty.
- In-batch negative sampling is itself a sampling method, so what is the major difference between it and the proposed one? Does this difference really limit performance significantly?
Suggestions:
- The assumptions in Section 1 are a little confusing. First of all, “The phrase semantics are determined by their context.” For the examples in Figure 1, the semantics of the phrase “United States” are fixed and not influenced by the context. I think the authors want to express: if we mask “United States”, we can still infer the masked phrase from its context.
- Writing should be strengthened.
Official Review of Paper620 by Reviewer kjKn
This paper proposes a contrastive learning framework to learn phrase representations in an unsupervised way. Cluster-assisted contrastive learning is proposed to reduce noisy in-batch negatives by selecting negatives from clusters. Extensive experiments on entity clustering and topical phrase mining show the effectiveness of the proposed methods. Case studies are also provided that demonstrate coherent and diverse topical phrases can be found by UCTopic without supervision.
- The paper is well-written and clearly presented.
- The proposed cluster-assisted contrastive learning objective is well-motivated and effective when further fine-tuning the encoder on the target task. Extensive experimental results are provided to show the significance of the proposed UCTopic over baseline methods.
- Detailed discussion is also provided for different datasets to further analyze the effectiveness of the proposed method with respect to informativeness, diversity of the constructed phrases, and the source of phrase semantics.
- The details of how K-Means is applied to obtain the pseudo labels used in CCL, and of how the number of clusters affects the final performance, are missing (a minimal sketch of the setup I am assuming is given after these questions).
- Given that sentence length might affect the results in Table 1, additional statistics of the pre-training sentence lengths versus those of the evaluation datasets might be good to provide.
- Could the authors provide a more detailed description of how the pre-training phrases are constructed, how many pre-training phrases are present, and how these phrases overlap with the downstream tasks?
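To make the first question concrete, here is a minimal sketch of the setup I am assuming (the use of scikit-learn's KMeans, the function names, and the cluster count are my guesses, not details taken from the paper): phrase embeddings are clustered, the cluster assignments act as pseudo labels, and negatives for an anchor phrase are drawn only from other clusters rather than from the whole batch.

```python
# Hypothetical sketch of cluster-assisted negative selection; this is my
# reading of CCL, not the authors' implementation.
import numpy as np
from sklearn.cluster import KMeans

def cluster_pseudo_labels(phrase_embeddings: np.ndarray, n_clusters: int = 10) -> np.ndarray:
    """Assign a pseudo topic label to every phrase embedding via K-Means."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return kmeans.fit_predict(phrase_embeddings)

def sample_cluster_negatives(anchor_idx: int, labels: np.ndarray, k: int = 5,
                             seed: int = 0) -> np.ndarray:
    """Sample k negatives from clusters other than the anchor's cluster."""
    rng = np.random.default_rng(seed)
    candidates = np.where(labels != labels[anchor_idx])[0]
    return rng.choice(candidates, size=min(k, len(candidates)), replace=False)

# Toy usage: 100 random 768-d "phrase embeddings".
embeddings = np.random.randn(100, 768)
labels = cluster_pseudo_labels(embeddings, n_clusters=10)
negatives = sample_cluster_negatives(anchor_idx=0, labels=labels, k=5)
```

If this reading is roughly right, an ablation over the number of clusters would directly answer the second part of the question.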
Official Review of Paper620 by Reviewer Hsyr
Problem as proposed by the paper: there is a need for good-quality phrase representations for topic mining problems, while existing techniques simply combine unigrams into n-grams or rely on extensive annotations. Solution proposed: the paper proposes a contrastive learning based approach to learn phrase representations, with techniques for obtaining positive and negative samples without annotated data, plus a clustering-based approach for improving the negative samples; taken together, these constitute cluster-assisted contrastive learning. The paper shows improvements over existing methods for entity clustering and topical phrase mining.
- The problem of learning phrase representations is very relevant to the entire field of NLP, not just to topic mining.
- The proposed solution is novel and has the potential to be expanded beyond the scope of topic mining considered in the paper.
- Experiments are extensive (for the problems under consideration) and results show improvement over existing methods.
- The assumptions made for picking positive instances (that the contexts of the same mention will be the same) could have been explored more thoroughly.
- The paper in general, and the experiments in particular, limit themselves to a few specific topic mining problems (entity clustering and topic clustering), which leaves the general applicability of the technique unclear.
The description of what is fine-tuning versus pre-training is confusing. Why is UCTopic w/o CCL considered pre-training while CCL is fine-tuning? This may be unimportant in itself, but it tends to make any further fine-tuning needed for entity clustering confusing.
Official Review of Paper620 by Reviewer jJXS
The paper proposes a method to pre-train a phrase encoder using an intuition that phrases of the same entity should have similar encodings and phrases of different entities should have different encodings. Moreover, the paper evaluates the encoder on entity clustering and topical phrase mining showing superior results compared to other methods. Finally, the paper shows a way to identify topics without supervision by clustering phrase encodings in the corpus. Furthermore, it fine-tunes the encoder to separate encodings of phrases from different clusters further from each other.
The paper includes extensive experiments on the downstream tasks, where the proposed model significantly outperforms other methods. Moreover, the method to identify topics by clustering phrase encodings seems to work pretty well, especially if the encoder is further fine-tuned on the task.
I'm a little concerned about the novelty of the pre-training method for the phrase encoder.
In particular, FitzGerald et al. (2021) used the same intuition to train phrase encodings for the entity linking task. In another paper, Soares et al. (2019) trained a relation extraction model with the intuition that sentences containing the same pair of entities express a similar relation between those entities. There might be some differences between the methods, but I wonder how significant they are. Therefore, it is essential to discuss the similarities and differences with these prior works in the paper.
Nicholas FitzGerald, Daniel M. Bikel, Jan A. Botha, Daniel Gillick, Tom Kwiatkowski, and Andrew McCallum. MOLEMAN: mention-only linking of entities with a mention annotation network. ACL-IJCNLP 2021. https://aclanthology.org/2021.acl-short.37.pdf
Baldini Soares, L., FitzGerald, N., Ling, J., and Kwiatkowski, T. Matching the blanks: Distributional similarity for relation learning. ACL 2019. https://aclanthology.org/P19-1279.pdf
I'm a bit confused about what in-batch negatives are for the topic modeling task. Do you mean phrases from different documents?
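For reference, here is the reading of “in-batch negatives” that my question assumes (an illustration of the standard contrastive setup, not the authors' code): every other phrase that happens to be in the same mini-batch serves as a negative for a given anchor, regardless of which document it comes from, which is why such negatives can be noisy when two batch members actually belong to the same topic.

```python
# Minimal InfoNCE-style loss with in-batch negatives (my illustration of the
# standard formulation, under the assumption that embeddings are L2-normalized).
import numpy as np

def in_batch_info_nce(anchors: np.ndarray, positives: np.ndarray, tau: float = 0.05) -> float:
    """anchors, positives: (batch, dim) L2-normalized phrase embeddings."""
    sims = anchors @ positives.T / tau               # (batch, batch) similarity matrix
    sims -= sims.max(axis=1, keepdims=True)          # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    # Diagonal entries are the true (anchor, positive) pairs;
    # every off-diagonal entry in a row is an in-batch negative.
    return float(-np.mean(np.diag(log_probs)))

# Toy usage with a batch of 8 phrase embeddings.
rng = np.random.default_rng(0)
anchors = rng.standard_normal((8, 768))
anchors /= np.linalg.norm(anchors, axis=1, keepdims=True)
positives = anchors + 0.01 * rng.standard_normal((8, 768))
positives /= np.linalg.norm(positives, axis=1, keepdims=True)
loss = in_batch_info_nce(anchors, positives)
```

Under this reading the negatives need not come from different documents; they are simply whatever other phrases landed in the batch, which is exactly how two phrases about the same topic can end up as a false negative pair.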
Supplementary Materials by Program Chairs