Multi-Word Lexical Simplification

Piotr Przybyła, Matthew Shardlow


Abstract
In this work we propose the task of multi-word lexical simplification, in which a sentence in natural language is made easier to understand by replacing its fragment with a simpler alternative, both of which can consist of many words. In order to explore this new direction, we contribute a corpus (MWLS1), including 1462 sentences in English from various sources with 7059 simplifications provided by human annotators. We also propose an automatic solution (Plainifier) based on a purpose-trained neural language model and evaluate its performance, comparing to human and resource-based baselines.
Anthology ID:
2020.coling-main.123
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
1435–1446
URL:
https://aclanthology.org/2020.coling-main.123
DOI:
10.18653/v1/2020.coling-main.123
Cite (ACL):
Piotr Przybyła and Matthew Shardlow. 2020. Multi-Word Lexical Simplification. In Proceedings of the 28th International Conference on Computational Linguistics, pages 1435–1446, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Multi-Word Lexical Simplification (Przybyła & Shardlow, COLING 2020)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2020.coling-main.123.pdf
Code:
piotrmp/mwls1