Khaleesiyali at SemEval-2026 Task 2: Lexicon-Augmented RoBERTa for Valence–Arousal Regression on Ecological Essays

Eleale Tee


Abstract
This paper presents a lexicon-augmented RoBERTa system for the SemEval-2026 Task 2 valence–arousal regression challenge. The model integrates deep contextual embeddings with a 6-dimensional feature vector derived from the NRC VAD lexicon, achieving a high token coverage rate of 72.05%. Under official user-aware evaluation, the system reached a competitive average composite correlation of 0.547, significantly outperforming the ridge regression baseline. The system demonstrated particular robustness in valence (r = 0.656) and achieved strong generalization to unseen users (r_arousal = 0.519). These findings indicate that lightweight lexicon-based statistics provide valuable complementary cues for longitudinal emotion modeling in modern transformer architectures.
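As a rough illustration of the lexicon statistics mentioned in the abstract, the sketch below computes a 6-dimensional feature vector and a token coverage rate from a valence/arousal/dominance lexicon. The abstract does not say what the six dimensions are, so the layout used here (mean and standard deviation of each of the three scores) is an assumption. The `TOY_VAD` table and the `lexicon_features` helper are hypothetical names with placeholder scores, not entries from the real NRC VAD lexicon and not the author's code.

```python
import math

# Toy stand-in for a VAD lexicon: word -> (valence, arousal, dominance).
# The scores are illustrative placeholders, not real NRC VAD entries.
TOY_VAD = {
    "happy": (0.9, 0.7, 0.8),
    "calm": (0.8, 0.1, 0.6),
    "tired": (0.2, 0.2, 0.3),
}


def lexicon_features(tokens, lexicon=TOY_VAD):
    """Return (6-dim feature vector, token coverage) for a list of tokens.

    Assumed layout: [mean V, mean A, mean D, std V, std A, std D],
    computed only over the tokens that appear in the lexicon.
    """
    hits = [lexicon[t] for t in tokens if t in lexicon]
    # Coverage: share of tokens that have a lexicon entry.
    coverage = len(hits) / len(tokens) if tokens else 0.0
    if not hits:
        # No lexicon match: fall back to an all-zero feature vector.
        return [0.0] * 6, coverage
    columns = list(zip(*hits))  # one tuple per dimension (V, A, D)
    means = [sum(col) / len(hits) for col in columns]
    # Population standard deviation per dimension.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(hits))
        for col, m in zip(columns, means)
    ]
    return means + stds, coverage


features, coverage = lexicon_features("i feel happy but tired today".split())
```

A vector like `features` could then be combined with the RoBERTa text embedding before the regression head, for example by concatenation. The abstract does not state how the two are combined, so that step is also an assumption.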
Anthology ID:
2026.semeval-1.75
Volume:
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
522–527
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.75/
Cite (ACL):
Eleale Tee. 2026. Khaleesiyali at SemEval-2026 Task 2: Lexicon-Augmented RoBERTa for Valence–Arousal Regression on Ecological Essays. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 522–527, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Khaleesiyali at SemEval-2026 Task 2: Lexicon-Augmented RoBERTa for Valence–Arousal Regression on Ecological Essays (Tee, SemEval 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.75.pdf
Supplementary material:
2026.semeval-1.75.SupplementaryMaterial.tex