Ionut Sorodoc

Also published as: Ionut-Teodor Sorodoc


Challenges in including extra-linguistic context in pre-trained language models
Ionut Sorodoc | Laura Aina | Gemma Boleda
Proceedings of the Third Workshop on Insights from Negative Results in NLP

To successfully account for language, computational models need to take into account both the linguistic context (the content of the utterances) and the extra-linguistic context (for instance, the participants in a dialogue). We focus on a referential task that asks models to link entity mentions in a TV show to the corresponding characters, and design an architecture that attempts to account for both kinds of context. In particular, our architecture combines a previously proposed specialized module (an “entity library”) for character representation with transfer learning from a pre-trained language model. We find that, although the model does improve linguistic contextualization, it fails to successfully integrate extra-linguistic information about the participants in the dialogue. Our work shows that it is very challenging to incorporate extra-linguistic information into pre-trained language models.


Controlled tasks for model analysis: Retrieving discrete information from sequences
Ionut-Teodor Sorodoc | Gemma Boleda | Marco Baroni
Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

In recent years, the NLP community has shown increasing interest in analysing how deep learning models work. Given that large models trained on complex tasks are difficult to inspect, some of this work has focused on controlled tasks that emulate specific aspects of language. We propose a new set of such controlled tasks to explore a crucial aspect of natural language processing that has not received enough attention: the need to retrieve discrete information from sequences. We also study model behavior on the tasks with simple instantiations of Transformers and LSTMs. Our results highlight the beneficial role of decoder attention and its sometimes unexpected interaction with other components. Moreover, we show that, for most of the tasks, these simple models still show significant difficulties. We hope that the community will take up the analysis possibilities that our tasks afford, and that a clearer understanding of model behavior on the tasks will lead to better and more transparent models.

pdf bib
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
Ionut-Teodor Sorodoc | Madhumita Sushil | Ece Takmaz | Eneko Agirre
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop


Probing for Referential Information in Language Models
Ionut-Teodor Sorodoc | Kristina Gulordava | Gemma Boleda
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Language models keep track of complex information about the preceding context – including, e.g., syntactic relations in a sentence. We investigate whether they also capture information beneficial for resolving pronominal anaphora in English. We analyze two state of the art models with LSTM and Transformer architectures, via probe tasks and analysis on a coreference annotated corpus. The Transformer outperforms the LSTM in all analyses. Our results suggest that language models are more successful at learning grammatical constraints than they are at learning truly referential information, in the sense of capturing the fact that we use language to refer to entities in the world. However, we find traces of the latter aspect, too.


What do Entity-Centric Models Learn? Insights from Entity Linking in Multi-Party Dialogue
Laura Aina | Carina Silberer | Ionut-Teodor Sorodoc | Matthijs Westera | Gemma Boleda
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Humans use language to refer to entities in the external world. Motivated by this, in recent years several models that incorporate a bias towards learning entity representations have been proposed. Such entity-centric models have shown empirical success, but we still know little about why. In this paper we analyze the behavior of two recently proposed entity-centric models in a referential task, Entity Linking in Multi-party Dialogue (SemEval 2018 Task 4). We show that these models outperform the state of the art on this task, and that they do better on lower frequency entities than a counterpart model that is not entity-centric, with the same model size. We argue that making models entity-centric naturally fosters good architectural decisions. However, we also show that these models do not really build entity representations and that they make poor use of linguistic context. These negative results underscore the need for model analysis, to test whether the motivations for particular architectures are borne out in how models behave when deployed.


AMORE-UPF at SemEval-2018 Task 4: BiLSTM with Entity Library
Laura Aina | Carina Silberer | Ionut-Teodor Sorodoc | Matthijs Westera | Gemma Boleda
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper describes our winning contribution to SemEval 2018 Task 4: Character Identification on Multiparty Dialogues. It is a simple, standard model with one key innovation, an entity library. Our results show that this innovation greatly facilitates the identification of infrequent characters. Because of the generic nature of our model, this finding is potentially relevant to any task that requires the effective learning from sparse or imbalanced data.

Comparatives, Quantifiers, Proportions: a Multi-Task Model for the Learning of Quantities from Vision
Sandro Pezzelle | Ionut-Teodor Sorodoc | Raffaella Bernardi
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

The present work investigates whether different quantification mechanisms (set comparison, vague quantification, and proportional estimation) can be jointly learned from visual scenes by a multi-task computational model. The motivation is that, in humans, these processes underlie the same cognitive, non-symbolic ability, which allows an automatic estimation and comparison of set magnitudes. We show that when information about lower-complexity tasks is available, the higher-level proportional task becomes more accurate than when performed in isolation. Moreover, the multi-task model is able to generalize to unseen combinations of target/non-target objects. Consistently with behavioral evidence showing the interference of absolute number in the proportional task, the multi-task model no longer works when asked to provide the number of target objects in the scene.


Multimodal Topic Labelling
Ionut Sorodoc | Jey Han Lau | Nikolaos Aletras | Timothy Baldwin
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Topics generated by topic models are typically presented as a list of topic terms. Automatic topic labelling is the task of generating a succinct label that summarises the theme or subject of a topic, with the intention of reducing the cognitive load of end-users when interpreting these topics. Traditionally, topic label systems focus on a single label modality, e.g. textual labels. In this work we propose a multimodal approach to topic labelling using a simple feedforward neural network. Given a topic and a candidate image or textual label, our method automatically generates a rating for the label, relative to the topic. Experiments show that this multimodal approach outperforms single-modality topic labelling systems.


“Look, some Green Circles!”: Learning to Quantify from Images
Ionut Sorodoc | Angeliki Lazaridou | Gemma Boleda | Aurélie Herbelot | Sandro Pezzelle | Raffaella Bernardi
Proceedings of the 5th Workshop on Vision and Language


Aggregation methods for efficient collocation detection
Anca Dinu | Liviu Dinu | Ionut Sorodoc
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this article we propose a rank aggregation method for the task of collocations detection. It consists of applying some well-known methods (e.g. Dice method, chi-square test, z-test and likelihood ratio) and then aggregating the resulting collocations rankings by rank distance and Borda score. These two aggregation methods are especially well suited for the task, since the results of each individual method naturally forms a ranking of collocations. Combination methods are known to usually improve the results, and indeed, the proposed aggregation method performs better then each individual method taken in isolation.