HECTOR: A Hybrid TExt SimplifiCation TOol for Raw Texts in French
Amalia Todirascu, Rodrigo Wilkens, Eva Rolin, Thomas François, Delphine Bernhard, Núria Gala
Abstract
Reducing the complexity of texts by applying an Automatic Text Simplification (ATS) system has been sparking interest in the area of Natural Language Processing (NLP) for several years and a number of methods and evaluation campaigns have emerged targeting lexical and syntactic transformations. In recent years, several studies exploit deep learning techniques based on very large comparable corpora. Yet the lack of large amounts of corpora (original-simplified) for French has been hindering the development of an ATS tool for this language. In this paper, we present our system, which is based on a combination of methods relying on word embeddings for lexical simplification and rule-based strategies for syntax and discourse adaptations. We present an evaluation of the lexical, syntactic and discourse-level simplifications according to automatic and human evaluations. We discuss the performances of our system at the lexical, syntactic, and discourse levels.
- Anthology ID:
- 2022.lrec-1.493
- Volume:
- Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 4620–4630
- URL:
- https://aclanthology.org/2022.lrec-1.493
- Cite (ACL):
- Amalia Todirascu, Rodrigo Wilkens, Eva Rolin, Thomas François, Delphine Bernhard, and Núria Gala. 2022. HECTOR: A Hybrid TExt SimplifiCation TOol for Raw Texts in French. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 4620–4630, Marseille, France. European Language Resources Association.
- Cite (Informal):
- HECTOR: A Hybrid TExt SimplifiCation TOol for Raw Texts in French (Todirascu et al., LREC 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.lrec-1.493.pdf