Alexander Fraser

Papers on this page may belong to several different people named Alexander Fraser.


2026

Effective counter-narratives (CNs) are essential for combating online hate speech, yet generic responses often fail to address the specific needs of targeted groups. This paper proposes a target-aware CN generation framework that incorporates demographic-specific tokens into transformer-based models. Our approach enhances contextual relevance by introducing target-group tokens into the model’s vocabulary. To assess CN quality, we employ a multifaceted evaluation framework, including automatic metrics and LLM-as-Judge evaluation (JudgeLM). Evaluation with a wide range of language models demonstrates that target-group tokens markedly improve the contextual relevance of generated CNs, particularly for small and medium models, with measurable gains in CN validity and contextual relevance. Even for large instruction-tuned models, such as LLaMA-3, incorporating target-specific information proves effective in enhancing the contextual relevance of generated responses. Warning: This paper contains offensive texts that are only used for combating online hate.
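The core conditioning idea can be sketched minimally: one special token per demographic group is prepended to the input so the generator can tailor the counter-narrative. The token format and helper names below are illustrative assumptions, not the paper's exact scheme (which registers the tokens in a transformer's vocabulary).

```python
# Hypothetical sketch of target-aware input conditioning; token format and
# group names are assumptions for illustration, not the paper's exact scheme.

def make_target_tokens(groups):
    """Map each target group to a dedicated special token, e.g. '<TGT:women>'."""
    return {g: f"<TGT:{g}>" for g in groups}

def prepend_target_token(text, group, token_map):
    """Condition the model input on the targeted group via a token prefix."""
    if group not in token_map:
        raise ValueError(f"unknown target group: {group}")
    return f"{token_map[group]} {text}"

token_map = make_target_tokens(["women", "migrants", "lgbt"])
model_input = prepend_target_token("<offensive text>", "migrants", token_map)
```

In a real pipeline these tokens would also be registered with the tokenizer and the model's embedding matrix resized, so that group-specific representations can be learned during fine-tuning.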
As parallel corpora for low-resource languages are scarce, and automatic approaches to mining sentence pairs can lead to noisy datasets, parallel sentence filtering aims to detect only actual translations. We study two language pairs, Upper Sorbian–German and Czech–German, to represent both low and high availability of data resources. To evaluate filtering performance, we generate synthetic datasets by combining existing parallel corpora with synthetic non-parallel pairs, notably with five types of local semantic changes on the German side, such as negation or modality transformations. We represent sentences using three multilingual language models, XLM-R, Glot500m, and LaBSE, and train classifiers for the task. All three model representations led to worse filtering quality when pairs were altered more subtly, such as by antonym replacement. We still observed that a language model pre-trained on the considered languages achieves more robust classification performance when sentence pairs are more ambiguous. We also evaluated a cross-lingual approach in which the classifier is trained on the Czech–German pair and then applied to the Upper Sorbian–German pair. Such language transfer paves the way for filtering other low-resource language pairs in the future.
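The filtering setup can be illustrated with a minimal sketch: score each candidate pair by the similarity of sentence embeddings and keep pairs above a threshold. The paper trains classifiers on XLM-R, Glot500m, and LaBSE representations; the character-frequency "embedding" and threshold here are toy stand-ins for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def filter_parallel(pairs, embed, threshold=0.8):
    """Keep only sentence pairs whose embeddings are similar enough."""
    return [(src, tgt) for src, tgt in pairs
            if cosine(embed(src), embed(tgt)) >= threshold]

def toy_embed(sentence):
    """Toy stand-in for a multilingual encoder: character-frequency vector."""
    vec = [0.0] * 26
    for ch in sentence.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec
```

A trained classifier over the concatenated pair representations, as in the paper, replaces the fixed threshold in practice; the threshold version is only the simplest baseline.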
Semantic Textual Similarity (STS) is a crucial component of many Natural Language Processing (NLP) applications. However, existing approaches typically reduce semantic nuances to a single score, limiting interpretability. To address this, we introduce the task of Dissimilar Span Detection (DSD), which aims to identify semantically differing spans between pairs of texts. This can help users understand which particular words or tokens negatively affect the similarity score, or be used to improve performance in STS-dependent downstream tasks. Furthermore, we release a new dataset suitable for the task, the Span Similarity Dataset (SSD), developed through a semi-automated pipeline combining large language models (LLMs) with human verification. We propose and evaluate several baseline methods for DSD, both unsupervised (based on LIME, SHAP, LLMs, and a method of our own) and supervised. While LLMs and supervised models achieve the highest performance, overall results remain low, highlighting the complexity of the task. Finally, we set up an additional experiment that shows how DSD can improve performance on the specific task of paraphrase detection.

2015