2024
GroningenAnnotatesGaza at the FIGNEWS 2024 Shared Task: Analyzing Bias in Conflict Narratives
Khalid Khatib
|
Sara Gemelli
|
Saskia Heisterborg
|
Pritha Majumdar
|
Gosse Minnema
|
Arianna Muti
|
Noa Solissa
Proceedings of the Second Arabic Natural Language Processing Conference
In this paper we report the development of our annotation methodology for the shared task FIGNEWS 2024. The objective of the shared task is to look into the layers of bias in how the war on Gaza is represented in media narrative. Our methodology follows the prescriptive paradigm, in which guidelines are detailed and refined through an iterative process in which edge cases are discussed and converged. Our IAA score (Krippendorff’s α) is 0.420, highlighting the challenging and subjective nature of the task. Our results show that 52% of posts were unbiased, 42% biased against Palestine, 5% biased against Israel, and 3% biased against both. 16% were unclear or not applicable.
Manosphrames: exploring an Italian incel community through the lens of NLP and Frame Semantics
Sara Gemelli
|
Gosse Minnema
Proceedings of the First Workshop on Reference, Framing, and Perspective @ LREC-COLING 2024
We introduce a large corpus of comments extracted from an Italian online incel (“involuntary celibate”) forum, a community of men who build a collective identity and anti-feminist ideology centered around their inability to find a sexual or romantic partner and who frequently use explicitly misogynistic language. Our corpus consists of 2.4K comments that have been manually collected, analyzed and annotated with topic labels, and a further 32K threads (300K comments) that have been automatically scraped and automatically annotated with FrameNet annotations. We show how large-scale frame semantic analysis can shed light on what is discussed in the community, and introduce incel topic classification as a new NLP task and benchmark.
2023
Responsibility Perspective Transfer for Italian Femicide News
Gosse Minnema
|
Huiyuan Lai
|
Benedetta Muscato
|
Malvina Nissim
Findings of the Association for Computational Linguistics: ACL 2023
Different ways of linguistically expressing the same real-world event can lead to different perceptions of what happened. Previous work has shown that different descriptions of gender-based violence (GBV) influence the reader’s perception of who is to blame for the violence, possibly reinforcing stereotypes which see the victim as partly responsible, too. As a contribution to raise awareness on perspective-based writing, and to facilitate access to alternative perspectives, we introduce the novel task of automatically rewriting GBV descriptions as a means to alter the perceived level of blame on the perpetrator. We present a quasi-parallel dataset of sentences with low and high perceived responsibility levels for the perpetrator, and experiment with unsupervised (mBART-based), zero-shot and few-shot (GPT3-based) methods for rewriting sentences. We evaluate our models using a questionnaire study and a suite of automatic metrics.
2022
Dead or Murdered? Predicting Responsibility Perception in Femicide News Reports
Gosse Minnema
|
Sara Gemelli
|
Chiara Zanchi
|
Tommaso Caselli
|
Malvina Nissim
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Different linguistic expressions can conceptualize the same event from different viewpoints by emphasizing certain participants over others. Here, we investigate a case where this has social consequences: how do linguistic expressions of gender-based violence (GBV) influence who we perceive as responsible? We build on previous psycholinguistic research in this area and conduct a large-scale perception survey of GBV descriptions automatically extracted from a corpus of Italian newspapers. We then train regression models that predict the salience of GBV participants with respect to different dimensions of perceived responsibility. Our best model (fine-tuned BERT) shows solid overall performance, with large differences between dimensions and participants: salient _focus_ is more predictable than salient _blame_, and perpetrators’ salience is more predictable than victims’ salience. Experiments with ridge regression models using different representations show that features based on linguistic theory perform similarly to word-based features. Overall, we show that different linguistic choices do trigger different perceptions of responsibility, and that such perceptions can be modelled automatically. This work can be a core instrument to raise awareness of the consequences of different perspectivizations in the general public and in news producers alike.
SocioFillmore: A Tool for Discovering Perspectives
Gosse Minnema
|
Sara Gemelli
|
Chiara Zanchi
|
Tommaso Caselli
|
Malvina Nissim
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
SOCIOFILLMORE is a multilingual tool which helps to bring to the fore the focus or the perspective that a text expresses in depicting an event. Our tool, whose rationale we also support through a large collection of human judgements, is theoretically grounded on frame semantics and cognitive linguistics, and implemented using the LOME frame semantic parser. We describe SOCIOFILLMORE’s development and functionalities, show how non-NLP researchers can easily interact with the tool, and present some example case studies which are already incorporated in the system, together with the kind of analysis that can be visualised.
2021
Improving DRS Parsing with Separately Predicted Semantic Roles
Tatiana Bladier
|
Gosse Minnema
|
Rik van Noord
|
Kilian Evang
Proceedings of the ESSLLI 2021 Workshop on Computing Semantics with Types, Frames and Related Structures
Breeding Fillmore’s Chickens and Hatching the Eggs: Recombining Frames and Roles in Frame-Semantic Parsing
Gosse Minnema
|
Malvina Nissim
Proceedings of the 14th International Conference on Computational Semantics (IWCS)
Frame-semantic parsers traditionally predict predicates, frames, and semantic roles in a fixed order. This paper explores the “chicken-or-egg” problem of interdependencies between these components theoretically and practically. We introduce a flexible BERT-based sequence labeling architecture that allows for predicting frames and roles independently from each other or combining them in several ways. Our results show that our setups can approximate more complex traditional models’ performance, while allowing for a clearer view of the interdependencies between the pipeline’s components, and of how frame and role prediction models make different use of BERT’s layers.
2020
Towards Reference-Aware FrameNet Annotation
Levi Remijnse
|
Gosse Minnema
Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet
In this paper, we introduce the task of using FrameNet to link structured information about real-world events to the conceptual frames used in texts describing these events. We show that frames made relevant by the knowledge of the real-world event can be captured by complementing standard lexicon-driven FrameNet annotations with frame annotations derived through pragmatic inference. We propose a two-layered annotation scheme with a “strict” FrameNet-compatible lexical layer and a “loose” layer capturing frames that are inferred from referential data.
Large-scale Cross-lingual Language Resources for Referencing and Framing
Piek Vossen
|
Filip Ilievski
|
Marten Postma
|
Antske Fokkens
|
Gosse Minnema
|
Levi Remijnse
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this article, we lay out the basic ideas and principles of the project Framing Situations in the Dutch Language. We provide our first results of data acquisition, together with the first data release. We introduce the notion of cross-lingual referential corpora. These corpora consist of texts that make reference to exactly the same incidents. The referential grounding allows us to analyze the framing of these incidents in different languages and across different texts. During the project, we will use the automatically generated data to study linguistic framing as a phenomenon and to build framing resources such as lexicons and corpora. We expect to capture larger variation in framing compared to traditional approaches for building such resources. Our first data release, which contains structured data about a large number of incidents and reference texts, can be found at http://dutchframenet.nl/data-releases/.
Machine Translation for English–Inuktitut with Segmentation, Data Acquisition and Pre-Training
Christian Roest
|
Lukas Edman
|
Gosse Minnema
|
Kevin Kelly
|
Jennifer Spenader
|
Antonio Toral
Proceedings of the Fifth Conference on Machine Translation
Translating to and from low-resource polysynthetic languages presents numerous challenges for NMT. We present the results of our systems for the English–Inuktitut language pair for the WMT 2020 translation tasks. We investigated the importance of correct morphological segmentation, whether or not adding data from a related language (Greenlandic) helps, and whether using contextual word embeddings improves translation. While each method showed some promise, the results are mixed.
2019
From Brain Space to Distributional Space: The Perilous Journeys of fMRI Decoding
Gosse Minnema
|
Aurélie Herbelot
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Recent work in cognitive neuroscience has introduced models for predicting distributional word meaning representations from brain imaging data. Such models have great potential, but the quality of their predictions has not yet been thoroughly evaluated from a computational linguistics point of view. Due to the limited size of available brain imaging datasets, standard quality metrics (e.g. similarity judgments and analogies) cannot be used. Instead, we investigate the use of several alternative measures for evaluating the predicted distributional space against a corpus-derived distributional space. We show that a state-of-the-art decoder, while performing impressively on metrics that are commonly used in cognitive neuroscience, performs unexpectedly poorly on our metrics. To address this, we propose strategies for improving the model’s performance. Despite returning promising results, our experiments also demonstrate that much work remains to be done before distributional representations can reliably be predicted from brain data.
Toward Dialogue Modeling: A Semantic Annotation Scheme for Questions and Answers
María Andrea Cruz Blandón
|
Gosse Minnema
|
Aria Nourbakhsh
|
Maria Boritchev
|
Maxime Amblard
Proceedings of the 13th Linguistic Annotation Workshop
The present study proposes an annotation scheme for classifying the content and discourse contribution of question-answer pairs. We propose detailed guidelines for using the scheme and apply them to dialogues in English, Spanish, and Dutch. Finally, we report on initial machine learning experiments for automatic annotation.