Veronika Bajt
2026
Thesis Proposal: Measuring Prejudice at Scale
Zoran Fijavž | Senja Pollak | Veronika Bajt
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
This thesis proposal addresses methodological gaps in applying NLP to social science by shifting from categorical classification to comparative scaling of grounded constructs. We first extend predictive capacity on existing specialized political datasets with prompt optimization and distillation approaches. We then develop an active learning framework for efficient comparative annotation to scale latent dimensions from large corpora. Finally, we apply this pipeline to measure benevolent sexism in Slovenian media and migration threat perception in parliamentary discourse. This work establishes a scalable workflow for moving NLP from ad-hoc classification to theoretically grounded comparative measurement.
2024
Comparing News Framing of Migration Crises using Zero-Shot Classification
Nikola Ivačič | Matthew Purver | Fabienne Lind | Senja Pollak | Hajo Boomgaarden | Veronika Bajt
Proceedings of the First Workshop on Reference, Framing, and Perspective @ LREC-COLING 2024
We present an experiment on classifying news frames in a language unseen by the learner, using zero-shot cross-lingual transfer learning. We use two pre-trained multilingual Transformer encoder models and test with four specific news frames, investigating two approaches to the resulting multi-label task: Binary Relevance (treating each frame independently) and Label Power-set (predicting each possible combination of frames). We train our classifiers on an available annotated multilingual migration news dataset and test on an unseen Slovene-language migration news corpus, first evaluating performance and then using the classifiers to analyse how media framed the news during the periods of migration related to the conflicts in Syria and Ukraine.
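The two multi-label formulations named in this abstract differ only in how the training targets are encoded. The sketch below illustrates that difference in plain Python. It is not the authors' code: the frame names, the toy documents, and the helper functions `to_binary_relevance` and `to_label_powerset` are all hypothetical, chosen only to make the encoding concrete.

```python
# Hypothetical frame names; the abstract does not list the four frames used.
FRAMES = ["economic", "security", "humanitarian", "political"]


def to_binary_relevance(label_sets, frames):
    """Binary Relevance: one independent 0/1 target list per frame."""
    return {f: [int(f in labels) for labels in label_sets] for f in frames}


def to_label_powerset(label_sets, frames):
    """Label Power-set: each distinct combination of frames becomes one class.

    Class ids are assigned in order of first appearance in the data.
    """
    class_ids = {}
    targets = []
    for labels in label_sets:
        # Canonical key: the frames present, in the fixed order of `frames`.
        key = tuple(f for f in frames if f in labels)
        if key not in class_ids:
            class_ids[key] = len(class_ids)
        targets.append(class_ids[key])
    return targets, class_ids


# Toy annotations (assumed for illustration): one set of frames per document.
docs = [
    {"economic"},
    {"security", "humanitarian"},
    set(),
    {"economic"},
    {"security", "humanitarian"},
]

br = to_binary_relevance(docs, FRAMES)
lp_targets, class_ids = to_label_powerset(docs, FRAMES)
```

Under Binary Relevance, four separate binary classifiers would be trained, one per entry of `br`. Under Label Power-set, a single multi-class classifier would be trained on `lp_targets`; with four frames there are at most 2^4 = 16 possible combinations, although only those observed in the training data receive a class id in this sketch.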