Diego Frassinelli


2021

KonTra at CMCL 2021 Shared Task: Predicting Eye Movements by Combining BERT with Surface, Linguistic and Behavioral Information
Qi Yu | Aikaterini-Lida Kalouli | Diego Frassinelli
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

This paper describes the submission of the team KonTra to the CMCL 2021 Shared Task on eye-tracking prediction. Our system combines the embeddings extracted from a fine-tuned BERT model with surface, linguistic and behavioral features, resulting in an average mean absolute error of 4.22 across all 5 eye-tracking measures. We show that word length and features representing the expectedness of a word are consistently the strongest predictors across all 5 eye-tracking measures.

Regression Analysis of Lexical and Morpho-Syntactic Properties of Kiezdeutsch
Diego Frassinelli | Gabriella Lapesa | Reem Alatrash | Dominik Schlechtweg | Sabine Schulte im Walde
Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects

Kiezdeutsch is a variety of German predominantly spoken by teenagers from multi-ethnic urban neighborhoods in casual conversations with their peers. In recent years, the popularity of Kiezdeutsch has increased among young people, independently of their socio-economic origin, and has spread in social media, too. While previous studies have extensively investigated this language variety from a linguistic and qualitative perspective, not much has been done from a quantitative point of view. We perform the first large-scale data-driven analysis of the lexical and morpho-syntactic properties of Kiezdeutsch in comparison with standard German. At the level of results, we confirm predictions of previous qualitative analyses and integrate them with further observations on specific linguistic phenomena such as slang and self-centered speaker attitude. At the methodological level, we provide logistic regression as a framework to perform bottom-up feature selection in order to quantify differences across language varieties.

2020

Interpreting Attention Models with Human Visual Attention in Machine Reading Comprehension
Ekta Sood | Simon Tannert | Diego Frassinelli | Andreas Bulling | Ngoc Thang Vu
Proceedings of the 24th Conference on Computational Natural Language Learning

While neural networks with attention mechanisms have achieved superior performance on many natural language processing tasks, it remains unclear to what extent learned attention resembles human visual attention. In this paper, we propose a new method that leverages eye-tracking data to investigate the relationship between human visual attention and neural attention in machine reading comprehension. To this end, we introduce MQA-RC, a novel 23-participant eye-tracking dataset in which participants read movie plots and answered pre-defined questions. We compare state-of-the-art networks based on long short-term memory (LSTM), convolutional neural models (CNN) and XLNet Transformer architectures. We find that, for the LSTM and CNN models, higher similarity to human attention correlates significantly with performance. However, we show that this relationship does not hold for the XLNet models, despite the fact that XLNet performs best on this challenging task. Our results suggest that different architectures learn rather different neural attention strategies and that similarity of neural to human attention does not guarantee the best performance.

2019

Distributional Interaction of Concreteness and Abstractness in Verb–Noun Subcategorisation
Diego Frassinelli | Sabine Schulte im Walde
Proceedings of the 13th International Conference on Computational Semantics - Short Papers

In recent years, both cognitive and computational research has provided empirical analyses of contextual co-occurrence of concrete and abstract words, partially resulting in inconsistent pictures. In this work we provide a more fine-grained description of the distributional nature in the corpus-based interaction of verbs and nouns within subcategorisation, by investigating the concreteness of verbs and nouns that are in a specific syntactic relationship with each other, i.e., subject, direct object, and prepositional object. Overall, our experiments show consistent patterns in the distributional representation of subcategorising and subcategorised concrete and abstract words. At the same time, the studies provide empirical evidence for why contextual abstractness represents a valuable indicator for automatic non-literal language identification.

2018

Quantitative Semantic Variation in the Contexts of Concrete and Abstract Words
Daniela Naumann | Diego Frassinelli | Sabine Schulte im Walde
Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics

Across disciplines, researchers are eager to gain insight into empirical features of abstract vs. concrete concepts. In this work, we provide a detailed characterisation of the distributional nature of abstract and concrete words across 16,620 English nouns, verbs and adjectives. Specifically, we investigate the following questions: (1) What is the distribution of concreteness in the contexts of concrete and abstract target words? (2) What are the differences between concrete and abstract words in terms of contextual semantic diversity? (3) How does the entropy of concrete and abstract word contexts differ? Overall, our studies show consistent differences in the distributional representation of concrete and abstract words, thus challenging existing theories of cognition and providing a more fine-grained description of their nature.

2017

Contextual Characteristics of Concrete and Abstract Words
Diego Frassinelli | Daniela Naumann | Jason Utt | Sabine Schulte im Walde
IWCS 2017 — 12th International Conference on Computational Semantics — Short papers

Exploring Multi-Modal Text+Image Models to Distinguish between Abstract and Concrete Nouns
Sai Abishek Bhaskar | Maximilian Köper | Sabine Schulte im Walde | Diego Frassinelli
Proceedings of the IWCS workshop on Foundations of Situated and Multimodal Communication