Learning Simplifications for Specific Target Audiences

Carolina Scarton, Lucia Specia


Abstract
Text simplification (TS) is a monolingual text-to-text transformation task in which an original (complex) text is transformed into a target (simpler) text. Most recent work is based on sequence-to-sequence neural models similar to those used for machine translation (MT). Unlike MT data, TS data comprises more elaborate transformations, such as sentence splitting. It can also contain multiple simplifications of the same original text targeting different audiences, such as school grade levels. We exploit these two features of TS data to build models tailored to specific grade levels. Our approach uses a standard sequence-to-sequence architecture in which the original sequence is annotated with information about the target audience and/or the (predicted) type of simplification operation. We show that it outperforms state-of-the-art TS approaches (by up to 3 BLEU points and 12 SARI points), including when training data for the specific complex-simple combination of grade levels is not available, i.e., zero-shot learning.
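The source-side annotation described in the abstract can be illustrated with a minimal sketch. It assumes the annotation is done by prepending artificial tokens to the original sentence before it is passed to a sequence-to-sequence model. The function name `annotate_source`, the token formats `<to-grade-N>` and `<op-NAME>`, and the operation label `split` are hypothetical choices made for illustration; they are not taken from the paper.

```python
from typing import Optional


def annotate_source(
    sentence: str,
    target_grade: Optional[int] = None,
    operation: Optional[str] = None,
) -> str:
    """Prepend hypothetical control tokens to a source sentence.

    target_grade: grade level of the intended audience (assumed to be an integer).
    operation: label of the (predicted) simplification operation (assumed string).
    """
    tags = []
    if target_grade is not None:
        # Hypothetical token format for the target audience.
        tags.append(f"<to-grade-{target_grade}>")
    if operation is not None:
        # Hypothetical token format for the simplification operation.
        tags.append(f"<op-{operation}>")
    # The annotated sentence is what the encoder would receive as input.
    return " ".join(tags + [sentence.strip()])


if __name__ == "__main__":
    print(annotate_source("The committee deliberated at length.", target_grade=4, operation="split"))
```

Because the tokens are part of the input rather than the architecture, one model could in principle be queried for a grade-level combination it never saw during training, which is the zero-shot setting the abstract mentions.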
Anthology ID:
P18-2113
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
712–718
URL:
https://aclanthology.org/P18-2113
DOI:
10.18653/v1/P18-2113
Cite (ACL):
Carolina Scarton and Lucia Specia. 2018. Learning Simplifications for Specific Target Audiences. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 712–718, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Learning Simplifications for Specific Target Audiences (Scarton & Specia, ACL 2018)
PDF:
https://preview.aclanthology.org/update-css-js/P18-2113.pdf
Note:
 P18-2113.Notes.pdf
Presentation:
 P18-2113.Presentation.pdf
Video:
 https://vimeo.com/285806034
Data
Newsela