Joint Representation Learning of Cross-lingual Words and Entities via Attentive Distant Supervision

Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu, Chengjiang Li, Xu Chen, Tiansi Dong


Abstract
Joint representation learning of words and entities benefits many NLP tasks, but has not been well explored in cross-lingual settings. In this paper, we propose a novel method for joint representation learning of cross-lingual words and entities. It captures mutually complementary knowledge, and enables cross-lingual inferences among knowledge bases and texts. Our method does not require a parallel corpus, and automatically generates comparable data via distant supervision using multi-lingual knowledge bases. We utilize two types of regularizers to align cross-lingual words and entities, and design knowledge attention and cross-lingual attention to further reduce noise. We conducted a series of experiments on three tasks: word translation, entity relatedness, and cross-lingual entity linking. The results, both qualitative and quantitative, demonstrate the significance of our method.
Anthology ID:
D18-1021
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
227–237
URL:
https://aclanthology.org/D18-1021
DOI:
10.18653/v1/D18-1021
Cite (ACL):
Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu, Chengjiang Li, Xu Chen, and Tiansi Dong. 2018. Joint Representation Learning of Cross-lingual Words and Entities via Attentive Distant Supervision. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 227–237, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Joint Representation Learning of Cross-lingual Words and Entities via Attentive Distant Supervision (Cao et al., EMNLP 2018)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/D18-1021.pdf