Multitasking Framework for Unsupervised Simple Definition Generation

Cunliang Kong, Yun Chen, Hengyuan Zhang, Liner Yang, Erhong Yang


Abstract
The definition generation task can help language learners by providing explanations for unfamiliar words. This task has attracted much attention in recent years. We propose a novel task of Simple Definition Generation (SDG) to help language learners and low literacy readers. A significant challenge of this task is the lack of learner’s dictionaries in many languages, and therefore the lack of data for supervised training. We explore this task and propose a multitasking framework SimpDefiner that only requires a standard dictionary with complex definitions and a corpus containing arbitrary simple texts. We disentangle the complexity factors from the text by carefully designing a parameter sharing scheme between two decoders. By jointly training these components, the framework can generate both complex and simple definitions simultaneously. We demonstrate that the framework can generate relevant, simple definitions for the target words through automatic and manual evaluations on English and Chinese datasets. Our method outperforms the baseline model by a 1.77 SARI score on the English dataset, and raises the proportion of the low level (HSK level 1-3) words in Chinese definitions by 3.87%.
Anthology ID:
2022.acl-long.409
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
5934–5943
Language:
URL:
https://aclanthology.org/2022.acl-long.409
DOI:
10.18653/v1/2022.acl-long.409
Bibkey:
Cite (ACL):
Cunliang Kong, Yun Chen, Hengyuan Zhang, Liner Yang, and Erhong Yang. 2022. Multitasking Framework for Unsupervised Simple Definition Generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5934–5943, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Multitasking Framework for Unsupervised Simple Definition Generation (Kong et al., ACL 2022)
Copy Citation:
PDF:
https://preview.aclanthology.org/naacl24-info/2022.acl-long.409.pdf
Software:
 2022.acl-long.409.software.zip
Code
 blcuicall/simpdefiner +  additional community code