Northern European Journal of Language Technology (2025)

Volumes

Northern European Journal of Language Technology, Volume 11 (4 papers)

Northern European Journal of Language Technology, Volume 11
Marcel Bollmann

Controlling Language and Style of Multi-lingual Generative Language Models with Control Vectors
Julius Leino | Jussi Karlgren

Control vectors have recently gained popularity as a method for steering transformer-based generative language models. This paper contributes to this line of research by evaluating the robustness of control vectors in multi- and cross-lingual question-answering settings that mimic real-world deployment, where models are expected to generate answers to challenging questions. We present a set of experiments to demonstrate that a control vector approach can be used to shift the output of a generative language model from one language to another, and to exercise stylistic control of the output across languages. Overall, we find that the control vector approach offers a relatively lightweight and effective path for developing methods to control the output of multilingual language models, with multiple design choices affecting real-world control performance.

Hybrid Human-LLM Corpus Construction and LLM Evaluation for the Caused-Motion Construction
Leonie Weissweiler | Abdullatif Köksal | Hinrich Schütze

The caused-motion construction (CMC, “She sneezed the foam off her cappuccino”) is one of the most well-studied constructions in Construction Grammar (CxG). It is a prime example for describing how constructions must carry meaning, as otherwise the fact that “sneeze” in this context takes two arguments and causes motion cannot be explained. We form the hypothesis that this remains challenging even for state-of-the-art Large Language Models (LLMs), for which we devise a test based on substituting the verb with a prototypical motion verb. To be able to perform this test at a statistically significant scale, in the absence of adequate CxG corpora, we develop a novel pipeline of NLP-assisted collection of linguistically annotated text. We show how dependency parsing and LLMs can be used to significantly reduce annotation cost and thus enable the annotation of rare phenomena at scale. We then evaluate OpenAI, Gemma3, Llama3, OLMo2, Mistral and Aya models for their understanding of the CMC using the newly collected corpus. We find that most models struggle with understanding the motion component that the CMC adds to a sentence.

Implicit and Indirect: Detecting Face-threatening and Paired Actions in Asynchronous Online Conversations
Henna Paakki | Pihla Toivanen | Kaisla Kajava

This paper presents an approach to computationally detecting face-threatening and paired actions in asynchronous online conversations. Action detection has been widely studied for synchronous chats. However, there are fewer models or datasets for asynchronous conversations, and they have not included some of the face-threatening actions central to online conversations involving misbehavior like trolling. We examine asynchronous, crisis-news-related online conversations in Finnish, providing an annotation scheme for identifying central actions used in this conversational context. An important contribution is to include face-threatening actions in the scheme and to train computational classifiers for their detection, with improved performance compared to prior work. We illustrate that face-threatening actions are important for analyzing conversations related to crisis news. We show that for computational action detection, it is essential to be able to represent how multiple actions may be performed within one comment, and how ambiguity in the expression of actions often leads to multiple possible label interpretations. Annotating actions using scores helps to reflect these characteristics. We also find that an ensemble of models trained on individual annotators’ annotations can best represent multiple potential interpretations of action labels. These findings are especially relevant for face-threatening actions.