uogal at BEA 2026 Shared Task 1: Ensemble of Multilingual Encoders with NMT Augmentation for L1-Aware Vocabulary Difficulty Prediction

Bernardo Stearns, John P. McCrae, Thomas Gaillat, Jefkine Kafunah


Abstract
We submit a system for the closed track of the BEA 2026 shared task on L1-aware vocabulary difficulty prediction (Spanish, German, Mandarin Chinese). We compared three families of approaches: hand-crafted tabular features with tree-based regressors, fine-tuned multilingual encoders, and decoder-based artificial learner simulation using LoRA-tuned Pythia models, each evaluated with and without NMT-augmented English context. Our best system is an ensemble of four base and four NMT-augmented multilingual encoders combined through per-language stacking (Nelder-Mead and ElasticNet meta-learner), which placed 2nd in the closed track across all three languages. We also report a monotonic scaling study of the decoder-based artificial learner simulation pipeline.
Anthology ID:
2026.bea-1.75
Volume:
Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ekaterina Kochmar, Bashar Alhafni, Stefano Bannò, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anais Tack, Victoria Yaneva, Zheng Yuan
Venues:
BEA | WS
Publisher:
Association for Computational Linguistics
Pages:
1065–1076
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.75/
Cite (ACL):
Bernardo Stearns, John P. McCrae, Thomas Gaillat, and Jefkine Kafunah. 2026. uogal at BEA 2026 Shared Task 1: Ensemble of Multilingual Encoders with NMT Augmentation for L1-Aware Vocabulary Difficulty Prediction. In Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026), pages 1065–1076, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
uogal at BEA 2026 Shared Task 1: Ensemble of Multilingual Encoders with NMT Augmentation for L1-Aware Vocabulary Difficulty Prediction (Stearns et al., BEA 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.75.pdf