Abstract
Current lexical simplification approaches rely heavily on heuristics and corpus-level features that do not always align with human judgment. We create a human-rated word-complexity lexicon of 15,000 English words and propose a novel neural readability ranking model with a Gaussian-based feature vectorization layer that utilizes these human ratings to measure the complexity of any given word or phrase. Our model performs better than the state-of-the-art systems for different lexical simplification tasks and evaluation datasets. We also produce SimplePPDB++, a lexical resource of over 10 million simplifying paraphrase rules, by applying our model to the Paraphrase Database (PPDB).
- Anthology ID:
- D18-1410
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3749–3760
- URL:
- https://aclanthology.org/D18-1410
- DOI:
- 10.18653/v1/D18-1410
- Cite (ACL):
- Mounica Maddela and Wei Xu. 2018. A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3749–3760, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification (Maddela & Xu, EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/D18-1410.pdf
- Code:
- mounicam/lexical_simplification