Roko Šimpraga


2026

We address the classification of evasive political answers in the context of SemEval-2026 Task 6 and compare three modeling strategies: a flat baseline, a hierarchical cascade, and a multitask learning approach. Our experiments demonstrate that a hierarchical RoBERTa-base model achieves the best performance, particularly by leveraging the distinctiveness of the Clear Non-Reply class. Conversely, we find that standard multitask learning produces structurally invalid label combinations in a significant fraction of predictions. We show that applying a constrained inference mask eliminates these errors entirely while improving F1, whereas a fully joint training approach underperforms due to data sparsity. Finally, we employ dataset cartography to compare training dynamics between the hierarchical and multitask approaches.
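
To illustrate what a constrained inference mask means for a two-headed multitask classifier, the following is a minimal sketch and not the implementation used in this work. It assumes the model emits separate logits for a coarse label and a fine-grained label, and that each coarse label permits only a subset of fine labels. The label names, the `ALLOWED` mapping, and the function names are hypothetical placeholders and do not reproduce the task's actual taxonomy.

```python
import numpy as np

# Hypothetical two-level taxonomy (placeholder names, NOT the task's label set).
COARSE = ["coarse_A", "coarse_B", "coarse_C"]
FINE = ["fine_0", "fine_1", "fine_2", "fine_3"]

# Assumed validity constraints: which fine labels each coarse label permits.
ALLOWED = {
    "coarse_A": ["fine_0", "fine_1"],
    "coarse_B": ["fine_2"],
    "coarse_C": ["fine_3"],
}


def build_mask():
    """Boolean matrix M[i, j] = True iff fine label j is valid under coarse label i."""
    mask = np.zeros((len(COARSE), len(FINE)), dtype=bool)
    for i, coarse in enumerate(COARSE):
        for fine in ALLOWED[coarse]:
            mask[i, FINE.index(fine)] = True
    return mask


def unconstrained_decode(coarse_logits, fine_logits):
    """Independent argmax per head; may yield an invalid (coarse, fine) pair."""
    return COARSE[int(np.argmax(coarse_logits))], FINE[int(np.argmax(fine_logits))]


def constrained_decode(coarse_logits, fine_logits, mask):
    """Pick the coarse label first, then argmax over only the fine labels it permits."""
    coarse_idx = int(np.argmax(coarse_logits))
    # Invalid fine labels are set to -inf so they can never win the argmax.
    masked_fine = np.where(mask[coarse_idx], fine_logits, -np.inf)
    return COARSE[coarse_idx], FINE[int(np.argmax(masked_fine))]


def is_valid(coarse, fine):
    return fine in ALLOWED[coarse]


if __name__ == "__main__":
    mask = build_mask()
    coarse_logits = np.array([2.0, 0.1, 0.3])     # coarse head prefers coarse_A
    fine_logits = np.array([0.2, 0.5, 3.0, 0.1])  # fine head prefers fine_2
    print(unconstrained_decode(coarse_logits, fine_logits))
    print(constrained_decode(coarse_logits, fine_logits, mask))
```

In this sketch the mask is applied only at inference time, so training is unchanged. Conditioning the fine head on the coarse prediction is one possible design; maximizing a joint score over all valid pairs would be an alternative.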