Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages
Xu Han, Yuqi Luo, Weize Chen, Zhiyuan Liu, Maosong Sun, Zhou Botong, Hao Fei, Suncong Zheng
Abstract
Fine-grained entity typing (FGET) aims to classify named entity mentions into fine-grained entity types, which is meaningful for entity-related NLP tasks. For FGET, a key challenge is the low-resource problem — the complex entity type hierarchy makes it difficult to manually label data. Especially for those languages other than English, human-labeled data is extremely scarce. In this paper, we propose a cross-lingual contrastive learning framework to learn FGET models for low-resource languages. Specifically, we use multi-lingual pre-trained language models (PLMs) as the backbone to transfer the typing knowledge from high-resource languages (such as English) to low-resource languages (such as Chinese). Furthermore, we introduce entity-pair-oriented heuristic rules as well as machine translation to obtain cross-lingual distantly-supervised data, and apply cross-lingual contrastive learning on the distantly-supervised data to enhance the backbone PLMs. Experimental results show that by applying our framework, we can easily learn effective FGET models for low-resource languages, even without any language-specific human-labeled data. Our code is also available at https://github.com/thunlp/CrossET.- Anthology ID:
- 2022.acl-long.159
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Venue:
- ACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 2241–2250
- Language:
- URL:
- https://aclanthology.org/2022.acl-long.159
- DOI:
- 10.18653/v1/2022.acl-long.159
- Cite (ACL):
- Xu Han, Yuqi Luo, Weize Chen, Zhiyuan Liu, Maosong Sun, Zhou Botong, Hao Fei, and Suncong Zheng. 2022. Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2241–2250, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Cross-Lingual Contrastive Learning for Fine-Grained Entity Typing for Low-Resource Languages (Han et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/2022.acl-long.159.pdf
- Code
- thunlp/crosset
- Data
- Few-NERD, Open Entity