A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification

Mounica Maddela, Wei Xu


Abstract
Current lexical simplification approaches rely heavily on heuristics and corpus-level features that do not always align with human judgment. We create a human-rated word-complexity lexicon of 15,000 English words and propose a novel neural readability ranking model with a Gaussian-based feature vectorization layer that uses these human ratings to measure the complexity of any given word or phrase. Our model outperforms state-of-the-art systems on different lexical simplification tasks and evaluation datasets. We also produce SimplePPDB++, a lexical resource of over 10 million simplifying paraphrase rules, by applying our model to the Paraphrase Database (PPDB).
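As a rough illustration of what a Gaussian-based feature vectorization layer could look like, the sketch below maps one scalar feature (for example, a lexicon complexity score) to a soft one-hot vector over evenly spaced bins. It is not taken from the paper or its code release: the function name `gaussian_vectorize`, the parameters `num_bins` and `gamma`, the choice of bin centers, and the normalization are all assumptions made for this example.

```python
import math


def gaussian_vectorize(value, low, high, num_bins=10, gamma=0.2):
    """Map a scalar feature to a soft one-hot vector of length num_bins.

    Hypothetical sketch: one Gaussian is centred on each bin, and its
    standard deviation is gamma times the bin width (an assumption).
    """
    if high <= low or num_bins < 1 or gamma <= 0:
        raise ValueError("need high > low, num_bins >= 1 and gamma > 0")

    width = (high - low) / num_bins
    sigma = gamma * width

    # Clamp out-of-range values so they fall into the first or last bin.
    value = min(max(value, low), high)

    # Bin centres are evenly spaced across [low, high].
    centers = [low + (j + 0.5) * width for j in range(num_bins)]

    # Unnormalised Gaussian response of every bin to the feature value.
    weights = [
        math.exp(-((value - c) ** 2) / (2.0 * sigma ** 2)) for c in centers
    ]

    total = sum(weights)
    if total == 0.0:
        # Very small gamma can underflow every weight; fall back to a
        # hard one-hot vector on the nearest bin.
        nearest = min(range(num_bins), key=lambda j: abs(value - centers[j]))
        return [1.0 if j == nearest else 0.0 for j in range(num_bins)]

    # Normalise so the vector sums to 1.
    return [w / total for w in weights]


if __name__ == "__main__":
    vector = gaussian_vectorize(0.5, low=0.0, high=1.0)
    print([round(v, 3) for v in vector])
```

Compared with feeding the raw number into a network, a vector of this kind lets nearby feature values share weight across adjacent bins while still allowing the downstream layers to treat different value ranges differently. A larger `gamma` spreads the weight over more bins.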
Anthology ID:
D18-1410
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
3749–3760
URL:
https://aclanthology.org/D18-1410
DOI:
10.18653/v1/D18-1410
Cite (ACL):
Mounica Maddela and Wei Xu. 2018. A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3749–3760, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification (Maddela & Xu, EMNLP 2018)
PDF:
https://preview.aclanthology.org/starsem-semeval-split/D18-1410.pdf
Video:
 https://vimeo.com/306116474
Code:
 mounicam/lexical_simplification