ALEXSIS: A Dataset for Lexical Simplification in Spanish

Daniel Ferrés, Horacio Saggion


Abstract
Lexical Simplification is the process of reducing the lexical complexity of a text by replacing difficult words with easier-to-read (or easier-to-understand) expressions while preserving the original information and meaning. In this paper, we introduce ALEXSIS, a new dataset for this task, and use it to benchmark Lexical Simplification systems in Spanish. The paper describes the evaluation of three kinds of approaches to Lexical Simplification: a thesaurus-based approach, an approach based on a single transformer, and a combination of transformers. We also report state-of-the-art results on a previous Lexical Simplification dataset for Spanish.
Anthology ID:
2022.lrec-1.383
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3582–3594
URL:
https://aclanthology.org/2022.lrec-1.383
Cite (ACL):
Daniel Ferrés and Horacio Saggion. 2022. ALEXSIS: A Dataset for Lexical Simplification in Spanish. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3582–3594, Marseille, France. European Language Resources Association.
Cite (Informal):
ALEXSIS: A Dataset for Lexical Simplification in Spanish (Ferrés & Saggion, LREC 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.lrec-1.383.pdf
Code:
lastus-taln-upf/alexsis