Shreya Ashar
2026
MoodMetric at SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning
Samanvitha Bolisetty | Shreya Ashar | Nishchay Mittal | Pruthwik Mishra
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper presents our system for narrative similarity modeling in SemEval Task 4, focusing on transformer-based dense embedding approaches. Modeling similarity between long-form narratives is particularly challenging due to the need to capture event progression, causal structure, character dynamics, and thematic coherence beyond surface-level lexical overlap. We evaluate multiple pretrained encoder-only architectures, including DeBERTa-v3, BGE-Base, BGE-Large, and E5-Large, fine-tuned using triplet margin and contrastive objectives. In addition, we implement a hybrid lexical–semantic baseline combining TF-IDF and SBERT features. Our experiments analyze the impact of model scale, pooling strategies, layer freezing, training duration, and embedding-level ensembling under low-resource conditions (approximately 1,900 training triplets, with additional synthetic augmentation). Results show that larger contrastively pretrained embedding models consistently outperform smaller variants, with BGE-Large achieving the strongest standalone performance. However, performance saturates quickly, and moderate fine-tuning (4–5 epochs) yields optimal validation accuracy, while extended training leads to overfitting. Instruction-tuned embeddings do not demonstrate significant advantages over contrastively aligned alternatives for this task. Finally, arithmetic averaging of embeddings from diverse models produces the most robust representations, achieving approximately 65% validation accuracy.
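
The triplet margin objective, the embedding-level averaging, and the triplet accuracy metric mentioned in the abstract could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: cosine distance, a margin of 0.5, L2 normalization before averaging, and all function names are hypothetical choices made here. It also assumes the averaged models share one embedding dimensionality, and it uses plain Python lists in place of real encoder outputs so that it runs without external dependencies.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    return [x / norm for x in v]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors."""
    u_n, v_n = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u_n, v_n))


def average_embeddings(per_model: Sequence[Sequence[float]]) -> Vector:
    """Arithmetic mean of one story's embeddings from several models.

    Assumption: every model emits the same dimensionality; each vector is
    L2-normalized first so that no single model dominates the mean.
    """
    if len({len(v) for v in per_model}) != 1:
        raise ValueError("all embeddings must share one dimensionality")
    normalized = [l2_normalize(v) for v in per_model]
    count = len(normalized)
    return [sum(column) / count for column in zip(*normalized)]


def triplet_margin_loss(anchor: Sequence[float],
                        positive: Sequence[float],
                        negative: Sequence[float],
                        margin: float = 0.5) -> float:
    """Hinge loss on cosine distances: max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


Triplet = Tuple[Sequence[float], Sequence[float], Sequence[float]]


def triplet_accuracy(triplets: Sequence[Triplet]) -> float:
    """Fraction of (anchor, positive, negative) triplets ranked correctly."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)
```

In this sketch, a story would be embedded once per model, the per-model vectors combined with `average_embeddings`, and the combined vectors scored with `triplet_accuracy` on held-out triplets. The loss reaches zero once the positive story is closer to the anchor than the negative story by at least the margin.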