Aurélie Herbelot

Also published as: Aurelie Herbelot


2021

Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Alexis Palmer | Nathan Schneider | Natalie Schluter | Guy Emerson | Aurelie Herbelot | Xiaodan Zhu
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

How to marry a star: Probabilistic constraints for meaning in context
Katrin Erk | Aurélie Herbelot
Proceedings of the Society for Computation in Linguistics 2021

2020

Proceedings of the Fourteenth Workshop on Semantic Evaluation
Aurelie Herbelot | Xiaodan Zhu | Alexis Palmer | Nathan Schneider | Jonathan May | Ekaterina Shutova
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Recurrent babbling: evaluating the acquisition of grammar from limited input data
Ludovica Pannitto | Aurélie Herbelot
Proceedings of the 24th Conference on Computational Natural Language Learning

Recurrent Neural Networks (RNNs) have been shown to capture various aspects of syntax from raw linguistic input. In most previous experiments, however, learning happens over unrealistic corpora, which do not reflect the type and amount of data a child would be exposed to. This paper remedies this state of affairs by training an LSTM over a realistically sized subset of child-directed input. The behaviour of the network is analysed over time using a novel methodology which consists in quantifying the level of grammatical abstraction in the model’s generated output (its ‘babbling’), compared to the language it has been exposed to. We show that the LSTM indeed abstracts new structures as learning proceeds.

Re-solve it: simulating the acquisition of core semantic competences from small data
Aurélie Herbelot
Proceedings of the 24th Conference on Computational Natural Language Learning

Many tasks are considered to be ‘solved’ in the computational linguistics literature, but the corresponding algorithms operate in ways which are radically different from human cognition. I illustrate this by coming back to the notion of semantic competence, which includes basic linguistic skills encompassing both referential phenomena and generic knowledge, in particular a) the ability to denote, b) the mastery of the lexicon, or c) the ability to model one’s language use on others. Even though each of those faculties has been extensively tested individually, there is still no computational model that would account for their joint acquisition under the conditions experienced by a human. In this paper, I focus on one particular aspect of this problem: the amount of linguistic data available to the child or machine. I show that given the first competence mentioned above (a denotation function), the other two can in fact be learned from very limited data (2.8M tokens), reaching state-of-the-art performance. I argue that both the nature of the data and the way it is presented to the system matter to acquisition.

2019

Proceedings of the 13th International Workshop on Semantic Evaluation
Jonathan May | Ekaterina Shutova | Aurelie Herbelot | Xiaodan Zhu | Marianna Apidianaki | Saif M. Mohammad
Proceedings of the 13th International Workshop on Semantic Evaluation

From Brain Space to Distributional Space: The Perilous Journeys of fMRI Decoding
Gosse Minnema | Aurélie Herbelot
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Recent work in cognitive neuroscience has introduced models for predicting distributional word meaning representations from brain imaging data. Such models have great potential, but the quality of their predictions has not yet been thoroughly evaluated from a computational linguistics point of view. Due to the limited size of available brain imaging datasets, standard quality metrics (e.g. similarity judgments and analogies) cannot be used. Instead, we investigate the use of several alternative measures for evaluating the predicted distributional space against a corpus-derived distributional space. We show that a state-of-the-art decoder, while performing impressively on metrics that are commonly used in cognitive neuroscience, performs unexpectedly poorly on our metrics. To address this, we propose strategies for improving the model’s performance. Despite returning promising results, our experiments also demonstrate that much work remains to be done before distributional representations can reliably be predicted from brain data.

Towards Incremental Learning of Word Embeddings Using Context Informativeness
Alexandre Kabbach | Kristina Gulordava | Aurélie Herbelot
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

In this paper, we investigate the task of learning word embeddings from very sparse data in an incremental, cognitively-plausible way. We focus on the notion of ‘informativeness’, that is, the idea that some content is more valuable to the learning process than other content. We further highlight the challenges of online learning and argue that previous systems fall short of implementing incrementality. Concretely, we incorporate informativeness in a previously proposed model of nonce learning, using it for context selection and learning rate modulation. We test our system on the task of learning new words from definitions, as well as on the task of learning new words from potentially uninformative contexts. We demonstrate that informativeness is crucial to obtaining state-of-the-art performance in a truly incremental setup.

Evaluating the Consistency of Word Embeddings from Small Data
Jelke Bloem | Antske Fokkens | Aurélie Herbelot
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

In this work, we address the evaluation of distributional semantic models trained on smaller, domain-specific texts, in this case philosophical text. Specifically, we inspect the behaviour of models using a pre-trained background space in learning. We propose a measure of consistency which can be used as an evaluation metric when no in-domain gold-standard data is available. This measure simply computes the ability of a model to learn similar embeddings from different parts of some homogeneous data. We show that in spite of being a simple evaluation, consistency actually depends on various combinations of factors, including the nature of the data itself, the model used to train the semantic space, and the frequency of the learnt terms, both in the background space and in the in-domain data of interest.

Distributional Semantics in the Real World: Building Word Vector Representations from a Truth-Theoretic Model
Elizaveta Kuzmenko | Aurélie Herbelot
Proceedings of the 13th International Conference on Computational Semantics - Short Papers

Distributional semantics models (DSMs) are known to produce excellent representations of word meaning, which correlate with a range of behavioural data. As lexical representations, they have been said to be fundamentally different from truth-theoretic models of semantics, where meaning is defined as a correspondence relation to the world. There are two main aspects to this difference: a) DSMs are built over corpus data which may or may not reflect ‘what is in the world’; b) they are built from word co-occurrences, that is, from lexical types rather than entities and sets. In this paper, we inspect the properties of a distributional model built over a set-theoretic approximation of ‘the real world’. To achieve this, we take the annotations of a large database of images marked with objects, attributes and relations, convert the data into a representation akin to first-order logic and build several distributional models using various combinations of features. We evaluate those models over both relatedness and similarity datasets, demonstrating their effectiveness in standard evaluations. This allows us to conclude that, despite prior claims, truth-theoretic models are good candidates for building graded lexical representations of meaning.

2018

Butterfly Effects in Frame Semantic Parsing: impact of data processing on model ranking
Alexandre Kabbach | Corentin Ribeyre | Aurélie Herbelot
Proceedings of the 27th International Conference on Computational Linguistics

Knowing the state-of-the-art for a particular task is an essential component of any computational linguistics investigation. But can we be truly confident that the current state-of-the-art is indeed the best performing model? In this paper, we study the case of frame semantic parsing, a well-established task with multiple shared datasets. We show that in spite of all the care taken to provide a standard evaluation resource, small variations in data processing can have dramatic consequences for ranking parser performance. This leads us to propose an open-source standardized processing pipeline, which can be shared and reused for robust model comparison.

2017

Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)
Nancy Ide | Aurélie Herbelot | Lluís Màrquez
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)

Projection Aléatoire Non-Négative pour le Calcul de Word Embedding / Non-Negative Randomized Word Embedding
Behrang Qasemizadeh | Laura Kallmeyer | Aurelie Herbelot
Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 - Articles longs

We propose a word embedding method which is based on a novel random projection technique. We show that weighting methods such as positive pointwise mutual information (PPMI) can be applied to our models after their construction and at a reduced dimensionality. Hence, the proposed technique can efficiently transfer words onto semantically discriminative spaces while demonstrating high computational performance, besides benefits such as ease of update and a simple mechanism for interoperability. We report the performance of our method on several tasks and show that it yields competitive results compared to neural embedding methods in monolingual corpus-based setups.

FOIL it! Find One mismatch between Image and Language caption
Ravi Shekhar | Sandro Pezzelle | Yauhen Klimovich | Aurélie Herbelot | Moin Nabi | Enver Sangineto | Raffaella Bernardi
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this paper, we aim to understand whether current language and vision (LaVi) models truly grasp the interaction between the two modalities. To this end, we propose an extension of the MS-COCO dataset, FOIL-COCO, which associates images with both correct and ‘foil’ captions, that is, descriptions of the image that are highly similar to the original ones, but contain one single mistake (‘foil word’). We show that current LaVi models fall into the traps of this data and perform badly on three tasks: a) caption classification (correct vs. foil); b) foil word detection; c) foil word correction. Humans, in contrast, have near-perfect performance on those tasks. We demonstrate that merely utilising language cues is not enough to model FOIL-COCO and that it challenges the state-of-the-art by requiring a fine-grained understanding of the relation between text and image.

High-risk learning: acquiring new word vectors from tiny data
Aurélie Herbelot | Marco Baroni
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Distributional semantics models are known to struggle with small data. It is generally accepted that in order to learn ‘a good vector’ for a word, a model must have sufficient examples of its usage. This contradicts the fact that humans can guess the meaning of a word from a few occurrences only. In this paper, we show that a neural language model such as Word2Vec only necessitates minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space. We test our model on word definitions and on a nonce task involving 2-6 sentences’ worth of context, showing a large increase in performance over state-of-the-art models on the definitional task.

Vision and Language Integration: Moving beyond Objects
Ravi Shekhar | Sandro Pezzelle | Aurélie Herbelot | Moin Nabi | Enver Sangineto | Raffaella Bernardi
IWCS 2017 — 12th International Conference on Computational Semantics — Short papers

2016

“Look, some Green Circles!”: Learning to Quantify from Images
Ionut Sorodoc | Angeliki Lazaridou | Gemma Boleda | Aurélie Herbelot | Sandro Pezzelle | Raffaella Bernardi
Proceedings of the 5th Workshop on Vision and Language

You and me... in a vector space: modelling individual speakers with distributional semantics
Aurélie Herbelot | Behrang QasemiZadeh
Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics

Many speakers, many worlds: Interannotator variations in the quantification of feature norms
Aurélie Herbelot | Eva Maria Vecchi
Linguistic Issues in Language Technology, Volume 13, 2016

Quantification (see e.g. Peters and Westerståhl, 2006) is probably one of the most extensively studied phenomena in formal semantics. But because of the specific representation of meaning assumed by model-theoretic semantics (one where a true model of the world is a priori available), research in the area has primarily focused on one question: what is the relation of a quantifier to the truth value of a sentence? In contrast, relatively little has been said about the way the underlying model comes about, and its relation to individual speakers’ conceptual knowledge. In this paper, we make a first step in investigating how native speakers of English model relations between non-grounded sets, by observing how they quantify simple statements. We first give some motivation for our task, from both a theoretical linguistic and computational semantic point of view (§2). We then describe our annotation setup (§3) and follow on with an analysis of the produced dataset, conducting a quantitative evaluation which includes inter-annotator agreement for different classes of predicates (§4). We observe that there is significant agreement between speakers but also noticeable variations. We posit that in set-theoretic terms, there are as many worlds as there are speakers (§5), but the overwhelming use of underspecified quantification in ordinary language covers up the individual differences that might otherwise be observed.

‘Calling on the classical phone’: a distributional model of adjective-noun errors in learners’ English
Aurélie Herbelot | Ekaterina Kochmar
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In this paper we discuss three key points related to error detection (ED) in learners’ English. We focus on content word ED as one of the most challenging tasks in this area, illustrating our claims on adjective–noun (AN) combinations. In particular, we (1) investigate the role of context in accurately capturing semantic anomalies and implement a system based on distributional topic coherence, which achieves state-of-the-art accuracy on a standard test set; (2) thoroughly investigate our system’s performance across individual adjective classes, concluding that a class-dependent approach is beneficial to the task; (3) discuss the data size bottleneck in this area, and highlight the challenges of automatic error generation for content words.

Predictability of Distributional Semantics in Derivational Word Formation
Sebastian Padó | Aurélie Herbelot | Max Kisselew | Jan Šnajder
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Compositional distributional semantic models (CDSMs) have successfully been applied to the task of predicting the meaning of a range of linguistic constructions. Their performance on the semi-compositional word-formation process of (morphological) derivation, however, has been extremely variable, with no large-scale empirical investigation to date. This paper fills that gap, performing an analysis of CDSM predictions on a large dataset (over 30,000 German derivationally related word pairs). We use linear regression models to analyze CDSM performance and obtain insights into the linguistic factors that influence how predictable the distributional context of a derived word is going to be. We identify various such factors, notably part of speech, argument structure, and semantic regularity.

Formal Distributional Semantics: Introduction to the Special Issue
Gemma Boleda | Aurélie Herbelot
Computational Linguistics, Volume 42, Issue 4 - December 2016

2015

Building a shared world: mapping distributional to model-theoretic semantic spaces
Aurélie Herbelot | Eva Maria Vecchi
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Mr Darcy and Mr Toad, gentlemen: distributional names and their kinds
Aurélie Herbelot
Proceedings of the 11th International Conference on Computational Semantics

2013

What is in a text, what isn’t, and what this has to do with lexical semantics
Aurelie Herbelot
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Short Papers


Proceedings of the IWCS 2013 Workshop Towards a Formal Distributional Semantics
Aurelie Herbelot | Roberto Zamparelli | Gemma Boleda
Proceedings of the IWCS 2013 Workshop Towards a Formal Distributional Semantics

Measuring semantic content in distributional vectors
Aurélie Herbelot | Mohan Ganesalingam
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Distributional techniques for philosophical enquiry
Aurélie Herbelot | Eva von Redecker | Johanna Müller
Proceedings of the 6th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities

2011

Formalising and specifying underquantification
Aurelie Herbelot | Ann Copestake
Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011)

Exciting and interesting: issues in the generation of binomials
Ann Copestake | Aurélie Herbelot
Proceedings of the UCNLG+Eval: Language Generation and Evaluation Workshop

2010

Annotating Underquantification
Aurelie Herbelot | Ann Copestake
Proceedings of the Fourth Linguistic Annotation Workshop

2009

Finding Word Substitutions Using a Distributional Similarity Baseline and Immediate Context Overlap
Aurelie Herbelot
Proceedings of the Student Research Workshop at EACL 2009