Controllable Sentence Simplification
Louis Martin, Éric de la Clergerie, Benoît Sagot, Antoine Bordes
Abstract
Text simplification aims at making a text easier to read and understand by simplifying grammar and structure while keeping the underlying information identical. It is often considered an all-purpose generic task where the same simplification is suitable for all; however, multiple audiences can benefit from simplified text in different ways. We adapt a discrete parametrization mechanism that provides explicit control on simplification systems based on Sequence-to-Sequence models. As a result, users can condition the simplifications returned by a model on attributes such as length, amount of paraphrasing, lexical complexity and syntactic complexity. We also show that carefully chosen values of these attributes allow out-of-the-box Sequence-to-Sequence models to outperform their standard counterparts on simplification benchmarks. Our model, which we call ACCESS (as shorthand for AudienCe-CEntric Sentence Simplification), establishes the state of the art at 41.87 SARI on the WikiLarge test set, a +1.42 improvement over the best previously reported score.
- Anthology ID:
- 2020.lrec-1.577
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 4689–4698
- Language:
- English
- URL:
- https://preview.aclanthology.org/Author-page-Marten-During-lu/2020.lrec-1.577/
- Cite (ACL):
- Louis Martin, Éric de la Clergerie, Benoît Sagot, and Antoine Bordes. 2020. Controllable Sentence Simplification. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 4689–4698, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Controllable Sentence Simplification (Martin et al., LREC 2020)
- PDF:
- https://preview.aclanthology.org/Author-page-Marten-During-lu/2020.lrec-1.577.pdf
- Code:
- facebookresearch/access
- Data:
- ASSET, Newsela, TurkCorpus, WikiLarge
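
The abstract describes conditioning a Sequence-to-Sequence model on discrete attribute values such as length or amount of paraphrasing. The sketch below illustrates one way such conditioning could look, assuming that control is exercised by prepending special tokens to the source sentence and that each attribute is a target-to-source ratio snapped to a fixed grid. The token format, the attribute names, the 0.05 step, and the helper functions are all hypothetical and are not taken from the abstract or from the released code.

```python
from typing import Dict


def bucket(value: float, step: float = 0.05) -> float:
    """Snap a continuous ratio to the nearest multiple of `step`.

    The step size is an assumption made for this illustration.
    """
    return round(round(value / step) * step, 2)


def length_ratio(source: str, target: str) -> float:
    """Character-length ratio of a simplification to its source sentence."""
    return len(target) / len(source)


def add_control_tokens(sentence: str, attributes: Dict[str, float]) -> str:
    """Prepend one hypothetical control token per attribute to the sentence.

    A token such as <LENGTH_0.85> asks for an output about 85% as long
    as the input. Token names are illustrative only.
    """
    tokens = [
        f"<{name.upper()}_{bucket(ratio):.2f}>"
        for name, ratio in attributes.items()
    ]
    return " ".join(tokens + [sentence])


if __name__ == "__main__":
    source = "The cat sat."
    # At inference time a user would choose the attribute values directly.
    print(add_control_tokens(source, {"length": 0.83, "paraphrase": 0.6}))
```

Under these assumptions, training pairs would be tagged with ratios computed from the reference simplification (for example with `length_ratio`), while at inference time the user picks the values to match the intended audience.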