Abstract
We introduce three language resources for Japanese lexical simplification: 1) a large-scale word complexity lexicon, 2) the first synonym lexicon for converting complex words to simpler ones, and 3) the first toolkit for developing and benchmarking Japanese lexical simplification systems. Our word complexity lexicon is expanded to a broader vocabulary using a classifier trained on a small, high-quality word complexity lexicon created by Japanese language teachers. Using this word complexity estimator, we extracted simplified word pairs from a large-scale synonym lexicon and constructed a simplified synonym lexicon for lexical simplification. In addition, we developed a Python library that implements automatic evaluation and key methods for each subtask to ease the construction of a lexical simplification pipeline. Experimental results show that the proposed method based on our lexicon achieves the highest performance on Japanese lexical simplification. Lexical simplification is currently studied mainly in English, which is rich in language resources such as lexicons and toolkits. The language resources constructed in this study will help advance lexical simplification systems for Japanese.
- Anthology ID:
- 2020.lrec-1.381
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 3114–3120
- Language:
- English
- URL:
- https://aclanthology.org/2020.lrec-1.381
- Cite (ACL):
- Daiki Nishihara and Tomoyuki Kajiwara. 2020. Word Complexity Estimation for Japanese Lexical Simplification. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3114–3120, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Word Complexity Estimation for Japanese Lexical Simplification (Nishihara & Kajiwara, LREC 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2020.lrec-1.381.pdf
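
The abstract describes extracting simplified word pairs from a synonym lexicon with the help of a word complexity estimator. The following is a minimal sketch of that idea only, not the authors' implementation or their library's API. The function name `extract_simplification_pairs`, the integer complexity levels, and the toy words are all assumptions made for illustration; in the paper the complexity values would come from the trained classifier.

```python
from typing import Dict, Iterable, List, Tuple


def extract_simplification_pairs(
    synonym_pairs: Iterable[Tuple[str, str]],
    complexity: Dict[str, int],
) -> List[Tuple[str, str]]:
    """Return (complex_word, simple_word) pairs.

    Hypothetical helper: a synonym pair is kept only if both words have
    a known complexity level and one is strictly simpler than the other.
    Each kept pair is oriented from the more complex word to the simpler one.
    """
    pairs = []
    for a, b in synonym_pairs:
        # Skip pairs containing a word the estimator has not scored.
        if a not in complexity or b not in complexity:
            continue
        if complexity[a] > complexity[b]:
            pairs.append((a, b))
        elif complexity[b] > complexity[a]:
            pairs.append((b, a))
        # Equal complexity: no simplification is possible, so drop the pair.
    return pairs


# Toy complexity levels (higher = harder); illustrative values, not from the paper.
toy_complexity = {"購入": 3, "買う": 1, "開始": 2, "始める": 1, "美しい": 1, "綺麗": 1}
toy_synonyms = [("買う", "購入"), ("開始", "始める"), ("美しい", "綺麗"), ("購入", "未知語")]

simplified = extract_simplification_pairs(toy_synonyms, toy_complexity)
print(simplified)
```

In this sketch, pairs with equal complexity or with an unscored word are discarded, so the resulting lexicon only contains substitutions that move toward a simpler word.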