Weighted Contrastive Learning With False Negative Control to Help Long-tailed Product Classification
Abstract
Item categorization (IC) aims to classify product descriptions into leaf nodes of a categorical taxonomy and is a key technology in a wide range of applications. Because most datasets have a long-tailed distribution, classification performance on tail labels tends to be poor due to scarce supervision, which causes many issues in real-life applications. K-positive contrastive loss (KCL) was proposed to address the long-tail issue in image classification and can be applied to the IC task with text-based contrastive learning, e.g., SimCSE. However, one shortcoming of KCL has been neglected in previous research: false negative (FN) instances may harm KCL's representation learning. To address the FN issue in KCL, we propose to re-weight the positive pairs in the KCL loss, with a regularization that constrains the sum of the weights to be as close to K+1 as possible. After controlling FN instances with the proposed method, IC performance improves further and is superior to that of other methods that address the long-tail (LT) issue.
- Anthology ID:
- 2023.acl-industry.55
- Volume:
- Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 574–580
- URL:
- https://aclanthology.org/2023.acl-industry.55
- Cite (ACL):
- Tianqi Wang, Lei Chen, Xiaodan Zhu, Younghun Lee, and Jing Gao. 2023. Weighted Contrastive Learning With False Negative Control to Help Long-tailed Product Classification. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pages 574–580, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Weighted Contrastive Learning With False Negative Control to Help Long-tailed Product Classification (Wang et al., ACL 2023)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2023.acl-industry.55.pdf
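
The re-weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not spell out: the function name `weighted_kcl_loss`, the squared penalty, the `reg_strength` and `temperature` parameters, and the choice to treat the K+1 positives as one augmented view plus K same-class instances are all assumptions made for illustration. The sketch scores one anchor against its positives and negatives, weights each positive's log-probability, and penalizes weight sums that drift from K+1.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_kcl_loss(anchor, positives, negatives, weights,
                      temperature=0.1, reg_strength=1.0):
    """Illustrative weighted K-positive contrastive loss for one anchor.

    positives: K+1 embeddings treated as positives (assumed here to be
               one augmented view plus K same-class instances).
    negatives: embeddings treated as negatives.
    weights:   one non-negative weight per positive.
    All names and the penalty form are hypothetical.
    """
    if len(weights) != len(positives):
        raise ValueError("need exactly one weight per positive")

    pos_logits = [dot(anchor, p) / temperature for p in positives]
    neg_logits = [dot(anchor, n) / temperature for n in negatives]
    all_logits = pos_logits + neg_logits

    # Numerically stable log-sum-exp over every candidate.
    top = max(all_logits)
    log_denom = top + math.log(sum(math.exp(x - top) for x in all_logits))

    # Weighted average of negative log-probabilities of the positives.
    contrastive = -sum(
        w * (logit - log_denom) for w, logit in zip(weights, pos_logits)
    ) / len(positives)

    # Soft constraint: the weight sum should stay close to K+1.
    penalty = reg_strength * (sum(weights) - len(positives)) ** 2
    return contrastive + penalty


anchor = [1.0, 0.0]
positives = [[1.0, 0.0], [0.8, 0.6]]   # K+1 = 2 positives
negatives = [[0.0, 1.0], [-1.0, 0.0]]

uniform = weighted_kcl_loss(anchor, positives, negatives, [1.0, 1.0])
collapsed = weighted_kcl_loss(anchor, positives, negatives, [0.0, 0.0])
```

With uniform weights of 1 the penalty vanishes and the sketch reduces to an unweighted K-positive loss. Setting every weight to zero removes the contrastive term entirely, which is the degenerate solution the sum constraint is meant to discourage.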