Jinnie Shin

2026

Jinnie’s Lab at BEA 2026 Shared Task 1: Precalibration of Vocabulary Item Difficulty with Multilingual Transformers and Multi-Task Learning
Zhe Li | Pauline Aguinalde | Jinnie Shin
Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)

This paper describes our submission to BEA 2026 Shared Task 1 on vocabulary item difficulty prediction in multilingual settings. We investigated whether transformer-based representations learned directly from item content can support the prediction of vocabulary item difficulty across different L1 groups. Our approach adopted a multilingual BERT-based architecture, specifically mmBERT, with representation augmentation at both the layer and token levels, followed by multi-task cascade learning that incorporates part-of-speech information as an auxiliary structural signal. Results showed that multi-task mmBERT consistently outperformed the shared-task XLM-RoBERTa baseline across languages, while gains from more complex aggregation were not uniform. These findings suggest that strong multilingual representations provide a competitive foundation for vocabulary item difficulty prediction, while the benefits of additional architectural complexity depend on the language and training setting.

