Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories
Wenlin Yao, Xiaoman Pan, Lifeng Jin, Jianshu Chen, Dian Yu, Dong Yu
Abstract
Word Sense Disambiguation (WSD) aims to automatically identify the exact meaning of a word according to its context. Existing supervised models struggle to make correct predictions on rare word senses due to limited training data and can only select the best definition sentence from one predefined word sense inventory (e.g., WordNet). To address the data sparsity problem and generalize the model to be independent of one predefined inventory, we propose a gloss alignment algorithm that can align definition sentences (glosses) with the same meaning from different sense inventories to collect rich lexical knowledge. We then train a model to identify semantic equivalence between a target word in context and one of its glosses using these aligned inventories, which exhibits strong transfer capability to many WSD tasks. Experiments on benchmark datasets show that the proposed method improves predictions on both frequent and rare word senses, outperforming prior work by 1.2% on the All-Words WSD Task and 4.3% on the Low-Shot WSD Task. Evaluation on the WiC Task also indicates that our method can better capture word meanings in context.
- Anthology ID:
- 2021.emnlp-main.610
- Volume:
- Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2021
- Address:
- Online and Punta Cana, Dominican Republic
- Editors:
- Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 7741–7751
- URL:
- https://aclanthology.org/2021.emnlp-main.610
- DOI:
- 10.18653/v1/2021.emnlp-main.610
- Cite (ACL):
- Wenlin Yao, Xiaoman Pan, Lifeng Jin, Jianshu Chen, Dian Yu, and Dong Yu. 2021. Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7741–7751, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories (Yao et al., EMNLP 2021)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/2021.emnlp-main.610.pdf
- Code
- tencent-ailab/EMNLP21_SemEq
- Data
- WiC, Word Sense Disambiguation: a Unified Evaluation Framework and Empirical Comparison
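The abstract describes aligning glosses with the same meaning across different sense inventories. As a rough illustration of that idea only, the sketch below pairs the glosses of one word from two inventories. It is not the authors' algorithm: the similarity measure (Jaccard overlap of content words), the greedy one-to-one matching, the threshold of 0.3, the stopword list, and the example glosses are all assumptions chosen to keep the example self-contained.

```python
from itertools import product

# Hypothetical stopword list, used only to make token overlap more meaningful.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "for", "that", "to", "in"}


def tokens(gloss):
    """Return the set of lowercased content words in a gloss."""
    words = (w.strip(".,;()").lower() for w in gloss.split())
    return {w for w in words if w and w not in STOPWORDS}


def jaccard(gloss_a, gloss_b):
    """Jaccard overlap of content words; a stand-in for a learned similarity."""
    ta, tb = tokens(gloss_a), tokens(gloss_b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def align_glosses(inv_a, inv_b, threshold=0.3):
    """Greedily pair glosses of one word from two inventories, one-to-one.

    Returns sorted (index_in_a, index_in_b) pairs whose similarity is at
    least `threshold` (an assumed cutoff, not a value from the paper).
    """
    scored = sorted(
        (
            (jaccard(ga, gb), i, j)
            for (i, ga), (j, gb) in product(enumerate(inv_a), enumerate(inv_b))
        ),
        reverse=True,
    )
    used_a, used_b, pairs = set(), set(), []
    for score, i, j in scored:
        if score < threshold:
            break  # remaining candidates are all below the cutoff
        if i in used_a or j in used_b:
            continue  # each gloss may be aligned at most once
        used_a.add(i)
        used_b.add(j)
        pairs.append((i, j))
    return sorted(pairs)


# Made-up glosses for "bank" in two hypothetical inventories.
inventory_a = [
    "a financial institution that accepts deposits and lends money",
    "sloping land beside a river or lake",
]
inventory_b = [
    "the land alongside a river or lake",
    "an institution for receiving deposits and lending money",
]
print(align_glosses(inventory_a, inventory_b))
```

Aligned pairs of this kind could then serve as extra examples of semantic equivalence between definitions. A practical system would likely replace the token-overlap score with a stronger sentence similarity model.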