EduNLP at BEA 2026 Shared Task 1: Multi-Model Ensemble with Feature-Augmented Transformers for Vocabulary Difficulty Prediction

Avinash Kumar Sharma


Abstract
We describe our system submitted to the BEA 2026 Shared Task on Vocabulary Difficulty Prediction for English Learners. Our approach combines handcrafted linguistic features with fine-tuned XLM-RoBERTa transformers in a multi-model ensemble, and we participate in both the closed and open tracks. Our system outperforms the baselines on both tracks across all three learner first languages (L1s), with best RMSEs of 1.058 (closed, CN) and 0.992 (open, CN). Post-hoc error analysis reveals that polysemous words used in rare senses and nominalized -ing forms are the primary failure modes.
Anthology ID:
2026.bea-1.71
Volume:
Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ekaterina Kochmar, Bashar Alhafni, Stefano Bannò, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anais Tack, Victoria Yaneva, Zheng Yuan
Venues:
BEA | WS
Publisher:
Association for Computational Linguistics
Pages:
1024–1028
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.71/
Cite (ACL):
Avinash Kumar Sharma. 2026. EduNLP at BEA 2026 Shared Task 1: Multi-Model Ensemble with Feature-Augmented Transformers for Vocabulary Difficulty Prediction. In Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026), pages 1024–1028, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
EduNLP at BEA 2026 Shared Task 1: Multi-Model Ensemble with Feature-Augmented Transformers for Vocabulary Difficulty Prediction (Sharma, BEA 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.71.pdf