Eleale Tee


2026

This paper presents a lexicon-augmented RoBERTa system for the SemEval-2026 Task 2 valence–arousal regression challenge. The model integrates deep contextual embeddings with a 6-dimensional feature vector derived from the NRC VAD lexicon, achieving a high token coverage rate of 72.05%. Under official user-aware evaluation, the system reached a competitive average composite correlation of 0.547, significantly outperforming the ridge regression baseline. The system demonstrated particular robustness in valence (r = 0.656) and achieved strong generalization to unseen users (r_arousal = 0.519). These findings indicate that lightweight lexicon-based statistics provide valuable complementary cues for longitudinal emotion modeling in modern transformer architectures.
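As a rough illustration of how a 6-dimensional lexicon feature vector and a token coverage rate might be computed, the sketch below uses a tiny invented lexicon in place of the NRC VAD lexicon. The abstract does not say what the six features are; taking the mean and standard deviation of each of the three VAD dimensions is an assumption made here, as is defining coverage as the fraction of tokens found in the lexicon. The names `TOY_VAD` and `lexicon_features` are hypothetical.

```python
from statistics import mean, pstdev

# Toy stand-in for the NRC VAD lexicon: word -> (valence, arousal, dominance).
# These entries and scores are invented for illustration only.
TOY_VAD = {
    "happy": (0.9, 0.6, 0.7),
    "calm": (0.8, 0.1, 0.6),
    "angry": (0.1, 0.9, 0.6),
}


def lexicon_features(tokens, lexicon=TOY_VAD):
    """Return (features, coverage) for a list of tokens.

    features: [mean_V, std_V, mean_A, std_A, mean_D, std_D]
              (an assumed layout, not taken from the paper)
    coverage: fraction of tokens that appear in the lexicon
    """
    hits = [lexicon[t.lower()] for t in tokens if t.lower() in lexicon]
    coverage = len(hits) / len(tokens) if tokens else 0.0
    if not hits:
        # No lexicon match: fall back to an all-zero feature vector.
        return [0.0] * 6, coverage

    features = []
    for dim in range(3):  # 0 = valence, 1 = arousal, 2 = dominance
        values = [h[dim] for h in hits]
        features.append(mean(values))
        features.append(pstdev(values))  # population standard deviation
    return features, coverage


features, coverage = lexicon_features(["I", "feel", "happy", "and", "calm"])
```

In a system of the kind described, a vector like `features` would be combined with the transformer's text representation before the regression head; how the combination is done is not stated in the abstract.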