Ilinca Vandici
2026
Multi-Label Polarization Classification with twHIN-BERT and SCUT Threshold Optimization
Ilinca Vandici | Ådne Jøssing | Lukas Viestädt
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Tackling Task 2, we fine-tune a BERT-style encoder with classification heads added on top. We first compare several pre-trained encoder models before settling on the multilingual TwHIN-BERT model, since its pretraining corpus (mainly tweets) provides a suitable starting point for our task. To address diverging label annotation styles, we apply the S-Cut algorithm to calibrate the thresholds used for label selection, and we examine its impact. We also inspect the resulting hidden representations in a reduced-dimensional space and use linguistic probing to examine the linguistic information encoded by our model after fine-tuning.
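The abstract does not spell out which S-Cut variant the authors use, so the following is only a minimal sketch of one common formulation: for each label independently, choose the score threshold that maximizes F1 on held-out validation data, then apply those thresholds at prediction time. The function names (`f1_score`, `scut_thresholds`, `predict`) and the choice of F1 as the tuning criterion are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for one label; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def scut_thresholds(
    scores: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
) -> List[float]:
    """Pick one threshold per label by maximizing validation F1.

    scores[i][j] is the model score of example i for label j,
    labels[i][j] is the gold 0/1 annotation (hypothetical layout).
    """
    n_labels = len(scores[0])
    thresholds = []
    for j in range(n_labels):
        col_scores = [row[j] for row in scores]
        col_labels = [row[j] for row in labels]
        best_t, best_f1 = 0.5, -1.0  # 0.5 is only a fallback default
        # Candidate thresholds are the observed scores themselves;
        # ties in F1 keep the lowest candidate.
        for t in sorted(set(col_scores)):
            preds = [1 if s >= t else 0 for s in col_scores]
            f1 = f1_score(col_labels, preds)
            if f1 > best_f1:
                best_t, best_f1 = t, f1
        thresholds.append(best_t)
    return thresholds


def predict(
    scores: Sequence[Sequence[float]],
    thresholds: Sequence[float],
) -> List[List[int]]:
    """Assign label j to an example when its score reaches threshold j."""
    return [[1 if s >= t else 0 for s, t in zip(row, thresholds)] for row in scores]
```

In this sketch the thresholds would be fitted on validation scores and then reused unchanged on test scores, for example `predict(test_scores, scut_thresholds(val_scores, val_labels))`, where `test_scores`, `val_scores`, and `val_labels` are placeholder names.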
2024
Team Bolaca at SemEval-2024 Task 6: Sentence-transformers are all you need
Béla Rösener | Hong-bo Wei | Ilinca Vandici
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Our team tackled SemEval-2024 Task 6, which focuses on identifying fluent over-generation hallucinations in NLP outputs. We proposed a pragmatic solution using a logistic regression classifier and a feed-forward ANN, with SBERT embeddings used for feature extraction.