RETUYT-INCO at MLSP 2024: Experiments on Language Simplification using Embeddings, Classifiers and Large Language Models
Ignacio Sastre, Leandro Alfonso, Facundo Fleitas, Federico Gil, Andrés Lucas, Tomás Spoturno, Santiago Góngora, Aiala Rosá, Luis Chiruzzo
Abstract
In this paper we present the participation of the RETUYT-INCO team at the BEA-MLSP 2024 shared task. We followed different approaches, from Multilayer Perceptron models with word embeddings to Large Language Models fine-tuned on different datasets: already existing, crowd-annotated, and synthetic. Our best models are based on fine-tuning Mistral-7B, either with a manually annotated dataset or with synthetic data.
- Anthology ID:
- 2024.bea-1.56
- Volume:
- Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Ekaterina Kochmar, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anaïs Tack, Victoria Yaneva, Zheng Yuan
- Venue:
- BEA
- SIG:
- SIGEDU
- Publisher:
- Association for Computational Linguistics
- Pages:
- 618–626
- URL:
- https://aclanthology.org/2024.bea-1.56
- Cite (ACL):
- Ignacio Sastre, Leandro Alfonso, Facundo Fleitas, Federico Gil, Andrés Lucas, Tomás Spoturno, Santiago Góngora, Aiala Rosá, and Luis Chiruzzo. 2024. RETUYT-INCO at MLSP 2024: Experiments on Language Simplification using Embeddings, Classifiers and Large Language Models. In Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024), pages 618–626, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- RETUYT-INCO at MLSP 2024: Experiments on Language Simplification using Embeddings, Classifiers and Large Language Models (Sastre et al., BEA 2024)
- PDF:
- https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.bea-1.56.pdf