Multitasking Framework for Unsupervised Simple Definition Generation
Cunliang Kong, Yun Chen, Hengyuan Zhang, Liner Yang, Erhong Yang
Abstract
The definition generation task can help language learners by providing explanations for unfamiliar words. This task has attracted much attention in recent years. We propose a novel task of Simple Definition Generation (SDG) to help language learners and low literacy readers. A significant challenge of this task is the lack of learner’s dictionaries in many languages, and therefore the lack of data for supervised training. We explore this task and propose a multitasking framework SimpDefiner that only requires a standard dictionary with complex definitions and a corpus containing arbitrary simple texts. We disentangle the complexity factors from the text by carefully designing a parameter sharing scheme between two decoders. By jointly training these components, the framework can generate both complex and simple definitions simultaneously. We demonstrate that the framework can generate relevant, simple definitions for the target words through automatic and manual evaluations on English and Chinese datasets. Our method outperforms the baseline model by a 1.77 SARI score on the English dataset, and raises the proportion of the low level (HSK level 1-3) words in Chinese definitions by 3.87%.
- Anthology ID:
- 2022.acl-long.409
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5934–5943
- URL:
- https://aclanthology.org/2022.acl-long.409
- DOI:
- 10.18653/v1/2022.acl-long.409
- Cite (ACL):
- Cunliang Kong, Yun Chen, Hengyuan Zhang, Liner Yang, and Erhong Yang. 2022. Multitasking Framework for Unsupervised Simple Definition Generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5934–5943, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Multitasking Framework for Unsupervised Simple Definition Generation (Kong et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2022.acl-long.409.pdf
- Code:
- blcuicall/simpdefiner