Data-Driven Sentence Simplification: Survey and Benchmark

Fernando Alva-Manchego, Carolina Scarton, Lucia Specia


Abstract
Sentence Simplification (SS) aims to modify a sentence to make it easier to read and understand. To do so, several rewriting transformations can be performed, such as replacement, reordering, and splitting. Executing these transformations while keeping sentences grammatical, preserving their main idea, and generating simpler output is a challenging and still far from solved problem. In this article, we survey research on SS, focusing on approaches that attempt to learn how to simplify using corpora of aligned original-simplified sentence pairs in English, which is currently the dominant paradigm. We also include a benchmark of different approaches on common data sets so as to compare them and highlight their strengths and limitations. We expect that this survey will serve as a starting point for researchers interested in the task and help spark new ideas for future developments.
Anthology ID:
2020.cl-1.4
Volume:
Computational Linguistics, Volume 46, Issue 1 - March 2020
Month:
Year:
2020
Address:
Cambridge, MA
Venue:
CL
SIG:
Publisher:
MIT Press
Note:
Pages:
135–187
Language:
URL:
https://aclanthology.org/2020.cl-1.4
DOI:
10.1162/coli_a_00370
Bibkey:
Cite (ACL):
Fernando Alva-Manchego, Carolina Scarton, and Lucia Specia. 2020. Data-Driven Sentence Simplification: Survey and Benchmark. Computational Linguistics, 46(1):135–187.
Cite (Informal):
Data-Driven Sentence Simplification: Survey and Benchmark (Alva-Manchego et al., CL 2020)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2020.cl-1.4.pdf