Ana Meštrović


2026

This paper presents the SemTechLab system submitted to SemEval-2026 Task 5: Rating Plausibility of Word Senses in Ambiguous Sentences through Narrative Understanding. The task involves predicting the plausibility of a specific word sense given a short story context. Our approach (HINTS) utilizes a hybrid Transformer architecture based on nli-mpnet-base-v2. Unlike standard Cross-Encoders that rely solely on the [CLS] token, HINTS extracts span-specific embeddings for the target homonym from both the narrative context and the sense definition. We compute interaction features (concatenation, difference, and element-wise product) between these spans to explicitly model the semantic alignment between the story and the proposed sense. The model is trained using Kullback-Leibler Divergence to predict the full distribution of human ratings. For the official submission phase, scores were rounded to integers (1–5). However, subsequent analysis and ablation studies detailed in this paper utilize continuous (float) scores derived from the expected value for improved metric sensitivity. On the test set, our best configuration, which relies exclusively on local homonym features, achieved a Spearman correlation of 0.603 and an accuracy of 75.8%.
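As a rough illustration of the components named in this abstract, the sketch below builds interaction features from two span embeddings, computes a Kullback-Leibler divergence against a distribution over the five rating bins, and derives a continuous score as the expected value. It is not the SemTechLab implementation: the function names are hypothetical, plain Python lists stand in for the pooled nli-mpnet-base-v2 span embeddings, and the signed (rather than absolute) difference and the direction KL(target || predicted) are assumptions.

```python
import math


def interaction_features(u, v):
    """Return [u, v, u - v, u * v] for two equal-length span embeddings.

    Assumption: the difference is signed; the abstract does not say whether
    an absolute difference is used instead.
    """
    if len(u) != len(v):
        raise ValueError("span embeddings must have the same dimension")
    diff = [a - b for a, b in zip(u, v)]
    prod = [a * b for a, b in zip(u, v)]
    return list(u) + list(v) + diff + prod


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) over the rating bins (direction is an assumption)."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


def expected_rating(probs, ratings=(1, 2, 3, 4, 5)):
    """Continuous plausibility score: expected value of the rating distribution."""
    return sum(r * p for r, p in zip(ratings, probs))


# Toy usage with made-up numbers (not data from the paper).
features = interaction_features([1.0, 2.0], [3.0, 4.0])
predicted = softmax([0.1, 0.2, 1.5, 1.0, 0.3])
human = [0.0, 0.2, 0.4, 0.4, 0.0]
loss = kl_divergence(human, predicted)
score = expected_rating(predicted)
```

A float score like `score` corresponds to the continuous setting used in the later analysis; rounding it to an integer in the 1–5 range would correspond to the format of the official submission.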

2024

The Choice of Plausible Alternatives (COPA) task requires selecting the more plausible of two alternatives based on an understanding of the causal relationships presented in a given text. This paper outlines several approaches and model adaptation strategies for the VarDial 2024 DIALECT-COPA shared task, focusing on causal commonsense reasoning in South-Slavic dialects. We utilize and evaluate the GPT-4 model in combination with various prompt engineering strategies and the Retrieval-Augmented Generation (RAG) technique. Initially, we test and compare the performance of GPT-4 with simple and advanced prompts on the COPA task across three dialects: Cerkno, Chakavian, and Torlak. Next, we enhance the prompts using the RAG technique specifically for the Chakavian and Cerkno dialects. This involves creating extended Chakavian-English and Cerkno-Slovene lexical dictionaries and integrating them into the prompts. Our findings indicate that the most complex approach, which combines an advanced prompt with an injected dictionary, yields the highest performance on the DIALECT-COPA task.
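The dictionary-injection idea can be sketched as follows. This is only a minimal illustration under stated assumptions, not the prompts or retrieval method used in the paper: `retrieve_entries` and `build_prompt` are hypothetical helpers, retrieval is reduced to exact token matching, the prompt wording is invented, and the lexicon entries are placeholders rather than real dialect vocabulary. No model call is made; the resulting string would be passed to the language model separately.

```python
import re


def retrieve_entries(text, lexicon):
    """Return the lexicon entries whose headword occurs as a token in the text.

    Assumption: exact, case-insensitive token matching stands in for retrieval.
    """
    tokens = set(re.findall(r"\w+", text.lower()))
    return {word: gloss for word, gloss in lexicon.items() if word.lower() in tokens}


def build_prompt(premise, choice1, choice2, question, lexicon):
    """Assemble a COPA prompt with the matching dictionary entries injected.

    `question` is "cause" or "effect", as in the COPA format.
    """
    entries = retrieve_entries(" ".join([premise, choice1, choice2]), lexicon)
    lines = ["You are given a premise and two alternatives written in a dialect."]
    if entries:
        lines.append("Dictionary entries that may help:")
        lines.extend(f"- {word}: {gloss}" for word, gloss in sorted(entries.items()))
    lines.append(f"Premise: {premise}")
    lines.append(f"Which alternative is the more plausible {question}?")
    lines.append(f"1: {choice1}")
    lines.append(f"2: {choice2}")
    lines.append("Answer with 1 or 2.")
    return "\n".join(lines)


# Placeholder lexicon and instance (hypothetical, not real dialect data).
lexicon = {"worda": "gloss A", "wordb": "gloss B", "wordc": "gloss C"}
prompt = build_prompt(
    premise="WordA appears in the premise.",
    choice1="First alternative with wordB.",
    choice2="Second alternative.",
    question="effect",
    lexicon=lexicon,
)
```

Injecting only the matching entries keeps the prompt short; a fuller system would likely need lemmatization or fuzzy matching, since dialect word forms vary.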