Maria Alexandra Roussopoulou


2026

This paper presents a methodology that uses LLMs to align multilingual offensive lexicons at the sense level. Lexicons of different structures and origins in Arabic, Bulgarian, Modern Greek, French, and Italian are aligned directly, without pivoting through English. The Modern Greek lexicon is LLM-generated; the other four are WordNet-compatible. Senses are aligned across languages with an LLM-as-a-judge rubric applied to lemma–definition–example triples. The LLM makes 2.87M pairwise comparisons, which yield 31 strict global-sense categories. The paper also discusses the challenges that sense alignment tasks involve. The resulting resource is available to support downstream applications such as machine translation and cross-lingual hate-speech detection.
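
As a rough illustration of direct, pivot-free pairwise alignment, the sketch below enumerates cross-lingual sense pairs and merges those that a judge accepts. It is a minimal sketch under stated assumptions, not the paper's implementation: the `Sense` record, `cross_lingual_pairs`, `group_senses`, and `toy_judge` are hypothetical names, the entries are neutral placeholders, and the toy judge (exact definition match) merely stands in for an LLM-as-a-judge call. Transitive merging via union-find is also an assumption and may be looser than the strict categories reported above.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Sense:
    """Hypothetical record for one lemma-definition-example triple."""
    lang: str
    lemma: str
    definition: str
    example: str


def cross_lingual_pairs(senses):
    """Yield every pair of senses drawn from two different languages."""
    for a, b in combinations(senses, 2):
        if a.lang != b.lang:
            yield a, b


def group_senses(senses, judge):
    """Merge senses the judge accepts as equivalent (union-find)."""
    parent = {s: s for s in senses}

    def find(s):
        # Follow parent links to the root, halving the path as we go.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in cross_lingual_pairs(senses):
        if judge(a, b):
            parent[find(a)] = find(b)

    groups = {}
    for s in senses:
        groups.setdefault(find(s), []).append(s)
    return list(groups.values())


def toy_judge(a, b):
    """Placeholder for an LLM judge: accepts only identical definitions."""
    return a.definition.lower() == b.definition.lower()


# Neutral placeholder entries; real lexicon entries are not reproduced here.
senses = [
    Sense("fr", "lemma_fr_1", "a foolish person", "example sentence 1"),
    Sense("it", "lemma_it_1", "a foolish person", "example sentence 2"),
    Sense("bg", "lemma_bg_1", "a dishonest person", "example sentence 3"),
]

pairs = list(cross_lingual_pairs(senses))
groups = group_senses(senses, toy_judge)
```

Because every cross-lingual pair is compared, the number of judge calls grows quadratically with the number of senses, which is consistent with a comparison count in the millions for lexicons of modest size.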