Tatiana Khaidukova
2026
Narrative Team at SemEval-2026 Task 4: Two-Stage Contrastive Learning for Narrative Similarity Assessment
Tatiana Khaidukova | Ana Ciobanu | Daniela Gifu | Diana Trandabat
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
For SemEval-2026 Task 4, we introduce a unified two-stage framework based on a RoBERTa-large encoder. Stage 1 performs contrastive pre-training on synthetic triplets to learn general narrative similarity patterns. Stage 2 fine-tunes the model with a ranking-based objective tailored to Track A. The resulting encoder supports both binary similarity classification (Track A) and narrative embedding generation (Track B) without architectural changes. Our system achieves an accuracy of 0.64 on Track A and 0.69 on Track B, outperforming single-stage baselines and demonstrating that combining synthetic contrastive supervision with task-specific ranking yields stable and reusable narrative representations.
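The abstract above does not state the exact loss or the decision rule, so the following is only a minimal sketch of how triplet-based contrastive training and a Track A style decision could look. It assumes a cosine-similarity triplet margin loss, a margin of 0.2, and a Track A setup in which the system picks which of two candidate stories is closer to an anchor. The function names and the toy vectors are hypothetical and stand in for encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that is zero once the positive is closer to the anchor
    than the negative by at least `margin` (margin value is an assumption)."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))


def pick_more_similar(anchor, cand_a, cand_b):
    """Assumed Track A decision: return the label of the closer candidate."""
    return "a" if cosine(anchor, cand_a) >= cosine(anchor, cand_b) else "b"


# Toy embeddings standing in for encoder outputs (hypothetical values).
anchor = [1.0, 0.0]
similar_story = [1.0, 0.0]
different_story = [0.0, 1.0]

loss_good = triplet_margin_loss(anchor, similar_story, different_story)
loss_bad = triplet_margin_loss(anchor, different_story, similar_story)
choice = pick_more_similar(anchor, similar_story, different_story)
```

Under these assumptions the same embedding function serves both tracks: Track A compares similarities, while Track B would simply return the vectors.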
Narrative Team at SemEval-2026 Task 5: Rating Plausibility of Word Senses in Ambiguous Sentences through Narrative Understanding
Valentin Istrate | Mocanu Octavian | Tatiana Khaidukova
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper describes our system for SemEval-2026 Task 5, which focuses on predicting the plausibility of word senses in ambiguous narrative contexts. The task requires assigning a real-valued plausibility score to candidate word senses based on aggregated human judgments. Our approach compares two modeling paradigms: (i) a pretrained transformer-based regression model using DistilBERT fine-tuned on the task data, and (ii) a lightweight neural baseline based on a bidirectional LSTM trained either from scratch or initialized with GloVe embeddings. Input representations combine a candidate sense definition with the narrative context and target sentence, separated by a special token. On the official test set, the DistilBERT model achieves the strongest result among our submissions, with an Acc@SD score of 0.54 and Spearman correlation of 0.17, while the best BiLSTM submission reaches 0.52 Acc@SD and 0.02 Spearman correlation. Although DistilBERT performs best in our experiments, the recurrent baseline remains competitive under the tolerance-based metric. We discuss model variants, reproducibility details, and limitations of our analysis.
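As a rough illustration of the input format and the tolerance-based metric mentioned above, the sketch below joins the three text fields with a separator and scores predictions by whether they fall within one standard deviation of the mean human rating. Both choices are assumptions: the abstract does not specify the separator string or the field order beyond listing them, and the official Acc@SD definition may differ. The names `build_input` and `acc_within_sd` are hypothetical.

```python
def build_input(sense_definition, context, target_sentence, sep=" [SEP] "):
    """Join definition, narrative context, and target sentence.
    The separator string and single-string layout are assumptions."""
    return sep.join([sense_definition, context, target_sentence])


def acc_within_sd(predictions, mean_ratings, std_devs):
    """Fraction of predictions within one standard deviation of the mean
    human rating (an assumed reading of the Acc@SD metric)."""
    hits = sum(
        1
        for p, m, s in zip(predictions, mean_ratings, std_devs)
        if abs(p - m) <= s
    )
    return hits / len(predictions)


# Toy example with made-up ratings, not task data.
text = build_input(
    "a financial institution",
    "She walked along the river.",
    "Then she sat on the bank.",
)
score = acc_within_sd([3.0, 1.0], [3.5, 4.0], [1.0, 1.0])
```

A metric of this shape rewards predictions near the annotator mean regardless of ranking quality, which is consistent with the abstract's observation that the BiLSTM stays close on Acc@SD while trailing on Spearman correlation.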