Cross-lingual Short-text Entity Linking: Generating Features for Neuro-Symbolic Methods

Qiuhao Lu, Sairam Gurajada, Prithviraj Sen, Lucian Popa, Dejing Dou, Thien Nguyen


Abstract
Entity linking (EL) on short text is crucial for a variety of industrial applications. Compared with general long-text EL, short-text EL poses particular challenges because the limited context restricts the clues available for disambiguating textual mentions. In addition, existing studies mostly focus on black-box neural methods and thus lack interpretability, which is critical for industrial applications in certain areas. In this study, we extend LNN-EL, a monolingual short-text EL method based on interpretable first-order logic, by incorporating three sets of multilingual features that enable disambiguating mentions written in languages other than English. More specifically, we use multilingual autoencoding language models (i.e., mBERT) to capture the similarity between a mention, together with its context, and the candidate entity, and we use multilingual sequence-to-sequence language models (i.e., mBART and mT5) to represent the likelihood of the text given the candidate entity. We also propose a word-level context feature to capture the semantic evidence of the co-occurring mentions. We evaluate the proposed xLNN-EL approach on the QALD-9-multilingual dataset and demonstrate the cross-linguality of the model and the effectiveness of the features.
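
To make the three feature families in the abstract concrete, the following is a minimal sketch of how such scores might be computed and combined to rank candidate entities. It is not the xLNN-EL implementation: the toy vectors stand in for mBERT embeddings, `avg_logprob` stands in for a per-token log-likelihood from a sequence-to-sequence model such as mBART or mT5, the Jaccard overlap is an assumed form of the word-level context feature, and the fixed weighted sum is a simple placeholder for the learned first-order logic rules. All function names, field names, and numbers are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(a, b):
    """Word-set overlap, used here as an assumed word-level context feature."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def score_candidate(mention_vec, context_words, cand, weights=(0.5, 0.3, 0.2)):
    """Combine three features in [0, 1] with a fixed weighted sum (placeholder)."""
    sim = cosine(mention_vec, cand["vec"])        # stand-in for mBERT similarity
    lik = math.exp(cand["avg_logprob"])           # stand-in for seq2seq likelihood
    ctx = jaccard(context_words, cand["words"])   # stand-in for context evidence
    w_sim, w_lik, w_ctx = weights
    return w_sim * sim + w_lik * lik + w_ctx * ctx


def rank_candidates(mention_vec, context_words, candidates):
    """Return (name, score) pairs sorted from best to worst."""
    scored = [(c["name"], score_candidate(mention_vec, context_words, c))
              for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data for the mention "Paris" in a short French question.
mention_vec = [1.0, 0.0]
context_words = ["population"]
candidates = [
    {"name": "Paris_Hilton", "vec": [0.1, 0.9], "avg_logprob": -1.5,
     "words": ["actress", "hotel"]},
    {"name": "Paris", "vec": [0.9, 0.1], "avg_logprob": -0.2,
     "words": ["population", "france", "city"]},
]

ranked = rank_candidates(mention_vec, context_words, candidates)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy setup the city candidate ranks first because it scores higher on all three features. In an interpretable system the hand-set weights would instead be learned, and the contribution of each feature to the final decision could be inspected.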
Anthology ID: 2022.dash-1.2
Volume: Proceedings of the Fourth Workshop on Data Science with Human-in-the-Loop (Language Advances)
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates (Hybrid)
Venue: DaSH
Publisher: Association for Computational Linguistics
Pages: 8–14
URL: https://aclanthology.org/2022.dash-1.2
Cite (ACL): Qiuhao Lu, Sairam Gurajada, Prithviraj Sen, Lucian Popa, Dejing Dou, and Thien Nguyen. 2022. Cross-lingual Short-text Entity Linking: Generating Features for Neuro-Symbolic Methods. In Proceedings of the Fourth Workshop on Data Science with Human-in-the-Loop (Language Advances), pages 8–14, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal): Cross-lingual Short-text Entity Linking: Generating Features for Neuro-Symbolic Methods (Lu et al., DaSH 2022)
PDF: https://preview.aclanthology.org/ingestion-script-update/2022.dash-1.2.pdf