Lucie Barque

2020

French, like many languages, lacks semantically annotated corpus data. Our aim is to provide the linguistic and NLP research communities with a gold standard sense-annotated corpus of French, using WordNet Unique Beginners as semantic tags, thus allowing for interoperability. In this paper, we report on the first phase of the project, which focused on the annotation of common nouns. The resulting dataset consists of more than 12,000 French noun occurrences, which were annotated double-blind and adjudicated according to a carefully redefined set of supersenses. The resource is released online under a Creative Commons Licence.

SLICE: Supersense-based Lightweight Interpretable Contextual Embeddings
Cindy Aloui | Carlos Ramisch | Alexis Nasr | Lucie Barque
Proceedings of the 28th International Conference on Computational Linguistics

Contextualised embeddings such as BERT have become de facto state-of-the-art references in many NLP applications, thanks to their impressive performance. However, their opaqueness makes it hard to interpret their behaviour. SLICE is a hybrid model that combines supersense labels with contextual embeddings. We introduce a weakly supervised method to learn interpretable embeddings from raw corpora and small lists of seed words. Our model is able to represent both a word and its context as embeddings in the same compact space, whose dimensions correspond to interpretable supersenses. We assess the model on a supersense tagging task for French nouns. The small amount of supervision required makes it particularly well suited to low-resource scenarios. Its interpretability lets us perform linguistic analyses of the predicted supersenses in terms of input word and context representations.

2019

Demonette2 - Une base de données dérivationnelle du français à grande échelle : premiers résultats (Demonette2 – A large scale derivational database for French: first results)
Fiammetta Namer | Lucie Barque | Olivier Bonami | Pauline Haas | Nabil Hathout | Delphine Tribout
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume II : Articles courts

This paper presents the design and development of Demonette2, a large-scale derivational database for French, developed within the ANR project Démonext (ANR-17-CE23-0005). It describes the objectives of the project and the structure of the database, and reports the first results of the project, focusing on a crucial issue: the semantic encoding of entries and relations.

2016

Improvement of VerbNet-like resources by frame typing
Laurence Danlos | Matthieu Constant | Lucie Barque
Proceedings of the Workshop on Grammar and Lexicon: interactions and interfaces (GramLex)

Verbenet is a French lexicon developed by “translation” of its English counterpart, VerbNet (Kipper-Schuler, 2005), and treatment of the specificities of French syntax (Pradet et al., 2014; Danlos et al., 2016). One difficulty encountered in its development springs from the fact that the list of (potentially numerous) frames has no internal organization. This paper proposes a type system for frames that shows whether two frames are variants of a given alternation. Frame typing facilitates coherence checking of the resource in a “virtuous circle”. We present the principles underlying a program we developed and used to automatically type frames in Verbenet. We also show that our system is portable to other languages.

2014

The Asfalda project aims to develop a French corpus with frame-based semantic annotations and automatic tools for shallow semantic analysis. We present the first part of the project: focusing on a set of notional domains, we delimited a subset of English frames, adapted them to French data when necessary, and developed the corresponding French lexicon. We believe that working domain by domain helped us to enforce the coherence of the resulting resource, and also has the advantage that, though the number of frames is limited (around a hundred), we obtain full coverage within a given domain.

2012

Dictionary-ontology cross-enrichment
Emmanuel Eckard | Lucie Barque | Alexis Nasr | Benoît Sagot
Proceedings of the 3rd Workshop on Cognitive Aspects of the Lexicon

Extracting a Semantic Lexicon of French Adjectives from a Large Lexicographic Dictionary
Selja Seppälä | Lucie Barque | Alexis Nasr
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

2010

Building a Lexicon of French Deverbal Nouns from a Semantically Annotated Corpus
Antonio Balvet | Lucie Barque | Rafael Marín
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper presents the Nomage project, which aims at describing the aspectual properties of deverbal nouns in an empirical way. It is centered on the development of two resources: a semantically annotated corpus of deverbal nouns, and an electronic lexicon. Both are presented in this paper, with emphasis on how the semantic annotations of the corpus allow the lexicographic description of deverbal nouns, in particular their polysemy, to be validated. Nominalizations have occupied a central place in grammatical analysis, with a focus on morphological and syntactic aspects. More recently, researchers have begun to address a specific issue often neglected before, i.e. the semantics of nominalizations, and its implications for Natural Language Processing applications such as electronic ontologies or Information Retrieval. We focus on precisely this issue in the Nomage research project, funded by the French National Research Agency (ANR-07-JCJC-0085-01). In this paper, we present the Nomage corpus and the annotations we make on deverbal nouns (section 2). We then show how we build our lexicon with the semantically annotated corpus and illustrate the kind of generalizations we can make from such data (section 3).

2008

La polysémie régulière dans WordNet (Regular polysemy in WordNet)
Lucie Barque | François-Régis Chaumartin
Actes de la 15ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

This study proposes an analysis and a model of polysemy relations in the English electronic lexicon WordNet. To this end, it exploits the hierarchy of concepts (represented by synsets) and the definition associated with each of these concepts. The result is a set of rules that allowed us to identify, in a largely automated way and with a precision of around 91%, more than 2,100 pairs of synsets linked by a regular polysemy relation. Our method also allows partial word sense disambiguation of the words in the definitions associated with these synsets.

2005

Application du métalangage de la BDéf au traitement formel de la polysémie (Applying the BDéf metalanguage to the formal treatment of polysemy)
Lucie Barque | Alain Polguère
Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

This paper deals with the definitional metalanguage of the BDéf lexical database, and more specifically with the use of this metalanguage in modelling the polysemous structures of French. The BDéf encodes, in the form of lexicographic definitions, the lexical senses of a representative subset of the French lexicon, including about 500 polysemous units belonging to the main parts of speech. The paper has two sections. The first presents the BDéf metalanguage and situates it with respect to the various types of lexical definitions, whether or not they are formal and whether or not they are intended for computational use. The second presents an application of the BDéf that ultimately aims to account for regular polysemy in French. Using a specific case, it introduces the notion of a polysemy pattern.

2004

De la lexie au vocable : la représentation formelle des liens de polysémie (From lexical unit to vocable: the formal representation of polysemy links)
Lucie Barque
Actes de la 11ème conférence sur le Traitement Automatique des Langues Naturelles. REncontres jeunes Chercheurs en Informatique pour le Traitement Automatique des Langues (Posters)

This paper focuses on the formalised definitions of the BDéf database and shows how the formal structure of these definitions can offer an original representation of lexical polysemy.
