Xabier Arregi
Also published as: X. Arregi, X Arregi
2026
Automatic Essay Scoring and Feedback Generation in Basque Language Learning
Ekhi Azurmendi | Xabier Arregi | Oier Lopez de Lacalle
Proceedings of the Fifteenth Language Resources and Evaluation Conference
This paper introduces the first publicly available dataset for Automatic Essay Scoring (AES) and feedback generation in Basque, targeting the CEFR C1 proficiency level. The dataset comprises 3,200 essays from HABE, each annotated by expert evaluators with criterion-specific scores covering correctness, richness, coherence, cohesion, and task alignment, and enriched with detailed feedback and error examples. We fine-tune open-source models, including RoBERTa-EusCrawl and Latxa 8B/70B, for scoring. For explanation generation we focus on the correctness criterion, adapting Latxa to predict both scores and explanations. Our experiments show that encoder models remain highly reliable for AES, while supervised fine-tuning (SFT) of Latxa significantly enhances performance, surpassing state-of-the-art (SoTA) closed-source systems such as GPT-5 and Claude Sonnet 4.5 in scoring consistency and feedback quality. We also propose a novel evaluation methodology for assessing feedback generation, combining automatic consistency metrics with expert-based validation of extracted learner errors. Results demonstrate that the fine-tuned Latxa model produces criterion-aligned, pedagogically meaningful feedback and identifies a wider range of error types than proprietary models. This resource and benchmark establish a foundation for transparent, reproducible, and educationally grounded NLP research in low-resource languages such as Basque. The dataset, models, and manual evaluation results are available here: https://huggingface.co/collections/EkhiAzur/habe-hitz-c1
2017
Enriching Basque Coreference Resolution System using Semantic Knowledge sources
Ander Soraluze | Olatz Arregi | Xabier Arregi | Arantza Díaz de Ilarraza
Proceedings of the 2nd Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2017)
In this paper we present a Basque coreference resolution system enriched with semantic knowledge. An error analysis revealed the system's deficiencies in resolving coreference cases that require semantic or world knowledge. We attempt to address these deficiencies using two semantic knowledge sources, namely Wikipedia and WordNet.
2016
Coreference Resolution for the Basque Language with BART
Ander Soraluze | Olatz Arregi | Xabier Arregi | Arantza Díaz de Ilarraza | Mijail Kabadjov | Massimo Poesio
Proceedings of the Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2016)
2011
Recognition and Classification of Numerical Entities in Basque
Ander Soraluze | Iñaki Alegria | Olatz Ansa | Olatz Arregi | Xabier Arregi
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011
Query Expansion for IR using Knowledge-Based Relatedness
Arantxa Otegi | Xabier Arregi | Eneko Agirre
Proceedings of 5th International Joint Conference on Natural Language Processing
2010
Document Expansion Based on WordNet for Robust IR
Eneko Agirre | Xabier Arregi | Arantxa Otegi
Coling 2010: Posters
2008
Strategies for sustainable MT for Basque: incremental design, reusability, standardization and open-source
I. Alegria | X. Arregi | X. Artola | A. Diaz de Ilarraza | G. Labaka | M. Lersundi | A. Mayor | K. Sarasola
Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages
2000
A Word-level Morphosyntactic Analyzer for Basque
I. Aduriz | E. Agirre | I. Aldezabal | X. Arregi | J. M. Arriola | X. Artola | K. Gojenola | A. Maritxalar | K. Sarasola | M. Urkia
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)
A word-grammar based morphological analyzer for agglutinative languages
I. Aduriz | E. Agirre | I. Aldezabal | I. Alegria | X. Arregi | J. M. Arriola | X. Artola | K. Gojenola | A. Maritxalar | K. Sarasola | M. Urkia
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics
1994
Lexical Knowledge Representation in an Intelligent Dictionary Help System
E. Agirre | X. Arregi | X. Artola | A. Diaz de Ilarraza | K. Sarasola
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics
1993
A Morphological Analysis Based Method for Spelling Correction
I. Aduriz | E. Agirre | I. Alegria | X. Arregi | J. M. Arriola | X. Artola | A. Diaz de Ilarraza | N. Ezeiza | M. Maritxalar | K. Sarasola | M. Urkia
Sixth Conference of the European Chapter of the Association for Computational Linguistics
1992
Co-authors
- Eneko Agirre 7
- Xabier Artola 6
- Arantza Díaz de Ilarraza 6
- Kepa Sarasola 6
- Iñaki Alegría 5
- Miriam Urkia 4
- Itziar Aduriz 3
- Olatz Arregi 3
- Jose Mari Arriola 3
- Ander Soraluze 3
- Izaskun Aldezabal 2
- Koldo Gojenola 2
- Alberto Maritxalar 2
- Montse Maritxalar 2
- Arantxa Otegi 2
- Olatz Ansa 1
- Ekhi Azurmendi 1
- Nerea Ezeiza 1
- Mijail Kabadjov 1
- Gorka Labaka 1
- Mikel Lersundi 1
- Oier Lopez de Lacalle 1
- Aingeru Mayor 1
- Massimo Poesio 1