Text Simplification Tools for Spanish

Stefan Bott, Horacio Saggion, Simon Mille


Abstract
In this paper we describe the development of a text simplification system for Spanish. Text simplification is the adaptation of a text to the special needs of certain groups of readers, such as language learners, people with cognitive difficulties and elderly people, among others. There is a clear need for simplified texts, but manual production and adaptation of existing texts are labour-intensive and costly. Automatic simplification is a field that is attracting growing attention in Natural Language Processing, but, to the best of our knowledge, there are no simplification tools for Spanish. We present a prototype for automatic simplification, which shows that the most important structural simplification operations can be successfully treated with a rule-based approach that can potentially be improved by statistical methods. For the development of this prototype we carried out a corpus study aimed at identifying the operations a text simplification system needs to carry out in order to produce output similar to what human editors produce when they simplify texts.
Anthology ID:
L12-1446
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1665–1671
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/762_Paper.pdf
Cite (ACL):
Stefan Bott, Horacio Saggion, and Simon Mille. 2012. Text Simplification Tools for Spanish. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1665–1671, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Text Simplification Tools for Spanish (Bott et al., LREC 2012)