A Syntax-Aware Edit-based System for Text Simplification

Oscar M. Cumbicus-Pineda, Itziar Gonzalez-Dios, Aitor Soroa


Abstract
Edit-based text simplification systems have attracted much attention in recent years because they produce interpretable simplifications and require fewer training examples than traditional seq2seq systems. Edit-based systems learn edit operations at the word level, but it is well known that many of the operations performed when simplifying text are syntactic in nature. In this paper we propose adding syntactic information to a well-known edit-based system. We extend the system with a graph convolutional network module that mimics the dependency structure of the sentence, thus giving the model an explicit representation of syntax. We perform a series of experiments in English, Spanish and Italian, and report improvements over the state of the art in four out of five datasets. Further analysis shows that syntactic information is always beneficial, and suggests that syntax is more helpful in complex sentences.
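The abstract does not spell out how the graph convolutional module is built, so the following is only a minimal sketch of the general idea, assuming the standard normalized propagation rule ReLU(D^-1/2 A D^-1/2 X W) applied to an undirected dependency graph with self-loops. The function names, the head-index encoding and the toy weights are hypothetical and are not taken from the paper.

```python
import math


def dependency_adjacency(heads):
    """Build a symmetric adjacency matrix with self-loops.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    (Hypothetical encoding, chosen for this illustration.)
    """
    n = len(heads)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child][head] = 1.0
            adj[head][child] = 1.0
    return adj


def gcn_layer(adj, feats, weight):
    """Apply one graph convolution: ReLU(D^-1/2 A D^-1/2 X W)."""
    n = len(adj)
    in_dim = len(feats[0])
    out_dim = len(weight[0])
    deg = [sum(row) for row in adj]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Each token aggregates the features of its syntactic neighbours.
    agg = [[sum(norm[i][k] * feats[k][d] for k in range(n))
            for d in range(in_dim)] for i in range(n)]
    # Linear projection followed by ReLU.
    return [[max(0.0, sum(agg[i][d] * weight[d][o] for d in range(in_dim)))
             for o in range(out_dim)] for i in range(n)]


# Toy sentence "The cat sleeps": "The" -> "cat" -> "sleeps" (root).
heads = [1, 2, -1]
feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # one-hot inputs
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity weights
out = gcn_layer(dependency_adjacency(heads), feats, weight)
```

With one-hot inputs and identity weights, each output row shows how much a token draws from itself and from its direct dependency neighbours; "The" receives nothing from "sleeps" after a single layer because the two tokens are not directly connected in the tree.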
Anthology ID:
2021.ranlp-1.38
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month:
September
Year:
2021
Address:
Held Online
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
324–334
URL:
https://aclanthology.org/2021.ranlp-1.38
Cite (ACL):
Oscar M. Cumbicus-Pineda, Itziar Gonzalez-Dios, and Aitor Soroa. 2021. A Syntax-Aware Edit-based System for Text Simplification. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 324–334, Held Online. INCOMA Ltd.
Cite (Informal):
A Syntax-Aware Edit-based System for Text Simplification (Cumbicus-Pineda et al., RANLP 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.ranlp-1.38.pdf
Data
Newsela, TurkCorpus, WikiLarge