TOEBM at BEA 2026 Shared Task 1: Improving Lexical Difficulty Prediction with Context-Aligned Contrastive Learning and Ridge Ensembling
wicaksono M., Joanito Lopo, Tsamarah Nugraha, Ahmad Adi, Muhamad Nurfajri
Abstract
Lexical difficulty prediction is a fundamental problem in language learning and readability assessment, requiring models to estimate word difficulty across different first-language (L1) backgrounds. However, existing approaches rely on regression-only training with scalar supervision, which does not explicitly structure the representation space, limiting their ability to capture cross-lingual alignment and ordinal difficulty. To mitigate these issues, we propose Context-Aligned Contrastive Regression, which integrates Ridge regression ensemble with two complementary objectives, i.e., Cross-View Context and Ordinal Soft Contrastive Learning. Experiments on three L1 datasets show that (i) contrastive objectives improve cross-lingual representation alignment while preserving language-specific nuances, (ii) the learned representations capture the ordinal structure of lexical difficulty, and (iii) the ensemble effectively mitigates systematic biases of individual models, leading to more stable performance across difficulty levels.- Anthology ID:
- 2026.bea-1.81
- Volume:
- Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, USA
- Editors:
- Ekaterina Kochmar, Bashar Alhafni, Stefano Bannò, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anais Tack, Victoria Yaneva, Zheng Yuan
- Venues:
- BEA | WS
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 1124–1136
- Language:
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.81/
- DOI:
- Cite (ACL):
- wicaksono M., Joanito Lopo, Tsamarah Nugraha, Ahmad Adi, and Muhamad Nurfajri. 2026. TOEBM at BEA 2026 Shared Task 1: Improving Lexical Difficulty Prediction with Context-Aligned Contrastive Learning and Ridge Ensembling. In Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026), pages 1124–1136, San Diego, California, USA. Association for Computational Linguistics.
- Cite (Informal):
- TOEBM at BEA 2026 Shared Task 1: Improving Lexical Difficulty Prediction with Context-Aligned Contrastive Learning and Ridge Ensembling (M. et al., BEA 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.81.pdf