Olivier Hamon

Also published as: O. Hamon

2022

Combination of Contextualized and Non-Contextualized Layers for Lexical Substitution in French
Kévin Espasa | Emmanuel Morin | Olivier Hamon
Proceedings of the Thirteenth Language Resources and Evaluation Conference

The lexical substitution task requires substituting a target word with candidates in a given context. Candidates must preserve the meaning and grammaticality of the sentence. The task, introduced at SemEval 2007, has two objectives. The first objective is to find a list of substitutes for a target word. This list can be obtained from lexical resources such as WordNet or generated with a pre-trained language model. The second objective is to rank these substitutes using the context of the sentence. Most methods use vector space models or, more recently, embeddings to rank substitutes. Embedding methods use highly contextualized representations. Such a representation can be over-contextualized and thereby overlook good substitute candidates that are more similar on non-contextualized layers. SemDis 2014 introduced the lexical substitution task in French. We propose an application of the state-of-the-art BERT-based method to French and a novel method that uses contextualized and non-contextualized layers to increase the suggestion of words that have a lower probability in a given context but are more semantically similar. Experiments show that our method improves on the BERT-based system on the OOT measure but scores lower on the BEST measure on the SemDis 2014 benchmark.
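The idea of ranking substitutes with both contextualized and non-contextualized representations can be illustrated with a minimal sketch. The abstract does not say how the two layers are combined, so the linear interpolation, the weight `alpha`, the function names and the toy vectors below are all assumptions made for illustration, not the authors' method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_substitutes(target, candidates, alpha=0.5):
    """Rank candidate words by a weighted mix of two similarities.

    `target` and every value in `candidates` are dicts holding a
    contextualized vector under "ctx" and a non-contextualized one
    under "static". `alpha` (hypothetical) weights the contextualized
    similarity; 1 - alpha weights the non-contextualized one.
    """
    scored = []
    for word, vecs in candidates.items():
        ctx_sim = cosine(target["ctx"], vecs["ctx"])
        static_sim = cosine(target["static"], vecs["static"])
        scored.append((alpha * ctx_sim + (1 - alpha) * static_sim, word))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored]

# Toy vectors: "a" is close to the target only in context,
# "b" is close only on the non-contextualized layer.
target = {"ctx": [1.0, 0.0], "static": [1.0, 0.0]}
candidates = {
    "a": {"ctx": [1.0, 0.0], "static": [0.0, 1.0]},
    "b": {"ctx": [0.0, 1.0], "static": [1.0, 0.0]},
}
context_heavy = rank_substitutes(target, candidates, alpha=0.8)
static_heavy = rank_substitutes(target, candidates, alpha=0.2)
```

Lowering `alpha` moves candidates that are similar on the non-contextualized layer up the ranking, which is the effect the abstract describes for words with a lower in-context probability.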

2019

SylNews, un agréfilter multilingue (SylNews, a multilingual aggrefilter)
Olivier Hamon | Kévin Espasa | Sara Quispe
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume IV : Démonstrations

For several years, Syllabs has been integrating numerous components into an aggrefilter, using information extraction technologies developed in-house and in a multilingual context. Originally designed to aggregate content from the press, SylNews can be used for monitoring purposes, to explore content, or to identify more globally the hot topics across all or part of the stored content.

2018

Construction de patrons lexico-syntaxiques d’extraction pour l’acquisition de connaissances à partir du web (Relation pattern extraction and information extraction from the web)
Chloé Monnin | Olivier Hamon
Actes de la Conférence TALN. Volume 2 - Démonstrations, articles des Rencontres Jeunes Chercheurs, ateliers DeFT

This article presents a method for collecting from the web information that complements a predefined piece of information, in order to fill a knowledge base. Our method uses lexico-syntactic patterns that serve both as search queries and as extraction patterns for analysing unstructured documents. To do so, we first had to define the relevant criteria derived from the analyses, with the aim of facilitating the discovery of new values.
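A minimal sketch of how one pattern can serve both as a search query and as an extraction pattern is given below. The template syntax (`{entity}`, `{value}`), the function names and the example sentence about a made-up company are assumptions for illustration; the abstract does not describe the actual pattern format.

```python
import re

def to_query(pattern, entity):
    """Turn the part of the pattern before the open slot into a quoted search phrase."""
    prefix = pattern.split("{value}")[0].format(entity=entity).strip()
    return f'"{prefix}"'

def to_regex(pattern, entity):
    """Compile the same pattern into a regex that captures the open slot."""
    before, after = pattern.split("{value}")
    return re.compile(
        re.escape(before.format(entity=entity))
        + r"(?P<value>[^.,;]+)"  # slot value stops at basic punctuation
        + re.escape(after)
    )

def extract(pattern, entity, text):
    """Return every slot value found in an unstructured text."""
    return [m.group("value").strip() for m in to_regex(pattern, entity).finditer(text)]

# Hypothetical pattern and document.
pattern = "{entity} was founded in {value}"
query = to_query(pattern, "Acme")
values = extract(
    pattern,
    "Acme",
    "Acme was founded in 1999. Some say Acme was founded in Paris, but sources differ.",
)
```

Here `query` could be sent to a search engine to retrieve candidate documents, and `extract` applied to each retrieved document to collect new values for the knowledge base.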

Syllabs@DEFT2018 : combinaison de méthodes de classification supervisées (Syllabs@DEFT2018: Combination of Supervised Classification Methods)
Chloé Monnin | Olivier Querné | Olivier Hamon
Actes de la Conférence TALN. Volume 2 - Démonstrations, articles des Rencontres Jeunes Chercheurs, ateliers DeFT

We present Syllabs' participation in the tweet classification task in the transport domain at DEFT 2018. For this first participation in a DEFT campaign, we chose to test several state-of-the-art classification algorithms. After a preprocessing step common to all the algorithms, we train on the content of the tweets alone. As the results were fairly close overall, we apply a majority vote over the three algorithms that obtained the best results.
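The majority vote over three classifiers can be sketched as follows. The labels, the example predictions and the tie-breaking rule (fall back to the classifier listed first) are assumptions; the abstract does not say how ties were resolved.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one list of voted labels.

    `predictions` is a list of label lists, one per classifier, all of the
    same length. On a tie, the label from the earliest-listed classifier
    among the tied labels wins (an assumed rule).
    """
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        voted.append(next(label for label in labels if counts[label] == top))
    return voted

# Hypothetical outputs of three classifiers on three tweets.
clf_a = ["pos", "neg", "neu"]
clf_b = ["pos", "pos", "neg"]
clf_c = ["neg", "neg", "pos"]
combined = majority_vote([clf_a, clf_b, clf_c])
```

Listing the classifiers from strongest to weakest makes the tie-break favour the best individual system.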

2014

Rediscovering 15 Years of Discoveries in Language Resources and Evaluation: The LREC Anthology Analysis
Joseph Mariani | Patrick Paroubek | Gil Francopoulo | Olivier Hamon
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper aims at analyzing the content of the LREC conferences contained in the ELRA Anthology over the past 15 years (1998-2013). It follows similar exercises that have been conducted, such as the survey on the IEEE ICASSP conference series from 1976 to 1990, which served in the launching of the ESCA Eurospeech conference, a survey of the Association of Computational Linguistics (ACL) over 50 years of existence, which was presented at the ACL conference in 2012, or a survey over the 25 years (1987-2012) of the conferences contained in the ISCA Archive, presented at Interspeech 2013. It first contains an analysis of the evolution of the number of papers and authors over time, including the study of their gender, nationality and affiliation, and of the collaboration among authors. It then studies the funding sources of the research investigations that are reported in the papers. It conducts an analysis of the evolution of the research topics within the community over time. It finally looks at reuse and plagiarism in the papers. The survey shows the present trends in the conference series and in the Language Resources and Evaluation scientific community. Conducting this survey also demonstrated the importance of a clear and unique identification of authors, papers and other sources to facilitate the analysis. This survey is preliminary, as many other aspects also deserve attention, but we hope it will help in better understanding and forging our community in the global village.

This paper presents META-SHARE (www.meta-share.eu), an open language resource infrastructure, and its usage since its Europe-wide deployment in early 2013. META-SHARE is a network of repositories that store language resources (data, tools and processing services) documented with high-quality metadata, aggregated in central inventories allowing for uniform search and access. META-SHARE was developed by META-NET (www.meta-net.eu) and aims to serve as an important component of a language technology marketplace for researchers, developers, professionals and industrial players, catering for the full development cycle of language technology, from research through to innovative products and services. The observed usage in its initial steps, the steadily increasing number of network nodes, resources, users, queries, views and downloads are all encouraging and considered as supportive of the choices made so far. In tandem, take-up activities like direct linking and processing of datasets by language processing services as well as metadata transformation to RDF are expected to open new avenues for data and resources linking and boost the organic growth of the infrastructure while facilitating language technology deployment by much wider research communities and industrial sectors.

2012

Towards a User-Friendly Platform for Building Language Resources based on Web Services
Marc Poch | Antonio Toral | Olivier Hamon | Valeria Quochi | Núria Bel
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents the platform developed in the PANACEA project, a distributed factory that automates the stages involved in the acquisition, production, updating and maintenance of Language Resources required by Machine Translation and other Language Technologies. We adopt a set of tools that have been successfully used in the Bioinformatics field; they are adapted to the needs of our field and used to deploy web services, which can be combined to build more complex processing chains (workflows). This paper describes the platform and its different components (web services, registry, workflows, social network and interoperability). We demonstrate the scalability of the platform by carrying out a set of massive data experiments. Finally, a validation of the platform against a set of required criteria shows its usability for different types of users (non-technical users and providers).

META-SHARE v2: An Open Network of Repositories for Language Resources including Data and Tools
Christian Federmann | Ioanna Giannopoulou | Christian Girardi | Olivier Hamon | Dimitris Mavroeidis | Salvatore Minutoli | Marc Schröder
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We describe META-SHARE, which aims at providing an open, distributed, secure, and interoperable infrastructure for the exchange of language resources, including both data and tools. The application has been designed and is being developed as part of the T4ME Network of Excellence. We explain the underlying motivation for such a distributed repository for metadata storage and give a detailed overview of the META-SHARE application and its various components. This includes a discussion of the technical architecture of the system as well as a description of the component-based metadata schema format, which has been developed in parallel. Development of the META-SHARE infrastructure adopts state-of-the-art technology and follows an open-source approach, allowing the general community to participate in the development process. The META-SHARE software package, including full source code, was released to the public in March 2012. We look forward to presenting an up-to-date version of the META-SHARE software at the conference.

Using the International Standard Language Resource Number: Practical and Technical Aspects
Khalid Choukri | Victoria Arranz | Olivier Hamon | Jungyeul Park
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper describes the International Standard Language Resource Number (ISLRN), a new identification schema for Language Resources in which a Language Resource is provided with a unique and universal name using a standardized nomenclature. This will ensure that Language Resources are identified, accessed and disseminated in a unique manner, thus allowing them to be recognized with proper references in all activities concerning Human Language Technologies as well as in all documents and scientific papers. This would allow, for instance, the formal identification of potentially repeated resources across different repositories, the formal referencing of language resources and their correct use when different versions are processed by tools.

On the Way to a Legal Sharing of Web Applications in NLP
Victoria Arranz | Olivier Hamon
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

For some years now, web services have been employed in Natural Language Processing (NLP) for a number of uses and within a number of sub-areas. Web services allow users to gain access to distant applications without needing to install them on their local machines. A wide range of advantages can be obtained from a practical and development point of view. However, the legal aspects behind this sharing should not be neglected and should be openly discussed so as to understand the implications behind such data exchanges and tool uses. In the framework of PANACEA, this paper highlights the different points involved and describes the work done in order to handle all the legal aspects behind those points.

2011

Proposal for the International Standard Language Resource Number
Khalid Choukri | Jungyeul Park | Olivier Hamon | Victoria Arranz
Proceedings of the Workshop on Language Resources, Technology and Services in the Sharing Paradigm

Evaluation Methodology and Results for English-to-Arabic MT
Olivier Hamon | Khalid Choukri
Proceedings of Machine Translation Summit XIII: Papers

Protocol and lessons learnt from the production of parallel corpora for the evaluation of speech translation systems
Victoria Arranz | Olivier Hamon | Karim Boudahmane | Martine Garnier-Rizet
Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign

Machine translation evaluation campaigns require the production of reference corpora to automatically measure system output. This paper describes recent efforts to create such data with the objective of measuring the quality of the systems participating in the Quaero evaluations. In particular, we focus on the protocols behind such production as well as all the issues raised by the complexity of the transcription data handled.

2010

Question Answering (QA) technology aims at providing relevant answers to natural language questions. Most Question Answering research has focused on mining document collections containing written texts to answer written questions. In addition to written sources, a large (and growing) amount of potentially interesting information appears in spoken documents, such as broadcast news, speeches, seminars, meetings or telephone conversations. The QAST track (Question-Answering on Speech Transcripts) was introduced in CLEF to investigate the problem of question answering in such audio documents. This paper describes in detail the evaluation protocol and tools designed and developed for the CLEF-QAST evaluation campaigns that took place between 2007 and 2009. We first recall the data, question sets, and submission procedures that were produced or set up during these three campaigns. As for the evaluation procedure, the interface that was developed to ease the assessors' work is described. In addition, this paper introduces a methodology for a semi-automatic evaluation of QAST systems based on time slot comparisons. Finally, the QAST Evaluation Package 2007-2009 resulting from these evaluation campaigns is also introduced.
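One way a time slot comparison could work is sketched below: a system answer, given as a (start, end) interval in seconds, is accepted as a candidate match when it covers enough of the reference interval. The overlap criterion, the `min_ratio` threshold and the function names are assumptions; the abstract does not specify the comparison rule used in QAST.

```python
def overlap(a, b):
    """Length in seconds of the intersection of two (start, end) time slots."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def matches(answer, reference, min_ratio=0.5):
    """True when the answer slot covers at least `min_ratio` of the reference slot.

    `min_ratio` is a hypothetical threshold; borderline cases would still
    go to a human assessor in a semi-automatic setting.
    """
    ref_len = reference[1] - reference[0]
    return ref_len > 0 and overlap(answer, reference) / ref_len >= min_ratio

# Hypothetical slots, in seconds from the start of the recording.
shared = overlap((1.0, 3.0), (2.0, 5.0))
covered = matches((10.0, 12.0), (10.5, 11.5))
disjoint = matches((0.0, 1.0), (2.0, 3.0))
```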

Is my Judge a good One?
Olivier Hamon
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper aims at measuring the reliability of judges in MT evaluation. The scope is two evaluation campaigns from the CESTA project, during which human evaluations were carried out on fluency and adequacy criteria for English-to-French documents. Our objectives were threefold: to observe inter-judge agreement, to observe intra-judge agreement, and then to study the influence of the evaluation design specifically implemented for the needs of the campaigns. Indeed, a web interface was developed to help with the human judgments and store the results, but some design changes were made between the first and the second campaign. Considering the low agreement observed, the judges' behaviour has been analysed in that specific context. We also asked several judges to repeat their own evaluations a few times after the first judgments made during the official evaluation campaigns. Even if judges did not seem to agree fully at first sight, a less strict comparison led to strong agreement. Furthermore, the evolution of the design during the project seems to have been a source of the difficulties that judges encountered in keeping the same interpretation of quality.
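The contrast between strict and less strict agreement can be illustrated with a short sketch. Cohen's kappa is used here as one common chance-corrected agreement measure, and the relaxed comparison counts two scores as agreeing when they differ by at most one point; both choices, and the toy scores on a 5-point scale, are assumptions, since the abstract does not name the measures used.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Chance-corrected exact agreement between two judges' score lists."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return 1.0 if expected == 1 else (observed - expected) / (1 - expected)

def relaxed_agreement(a, b, tolerance=1):
    """Share of items on which the two scores differ by at most `tolerance`."""
    return sum(abs(x - y) <= tolerance for x, y in zip(a, b)) / len(a)

# Hypothetical fluency scores from two judges on five segments.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [1, 3, 3, 5, 5]
strict = cohen_kappa(judge_1, judge_2)
relaxed = relaxed_agreement(judge_1, judge_2)
```

On these toy scores the strict measure is moderate while the relaxed one is perfect, which mirrors the pattern the abstract reports.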

The Second Evaluation Campaign of PASSAGE on Parsing of French
Patrick Paroubek | Olivier Hamon | Eric de La Clergerie | Cyril Grouin | Anne Vilnat
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Cooperation for Arabic Language Resources and Tools — The MEDAR Project
Bente Maegaard | Mohamed Attia | Khalid Choukri | Olivier Hamon | Steven Krauwer | Mustafa Yaseen
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The paper describes some of the work carried out within the European-funded project MEDAR. The project has three streams of activity: the technical stream, the cooperation stream and the dissemination stream. MEDAR first updated the existing surveys and BLARK for Arabic, and then the technical stream focused on machine translation. The consortium identified a number of freely available MT systems and then customized two versions of the well-known MOSES package. The consortium addressed the need to package MOSES for English to Arabic (while the main MT stream is on Arabic to English). For performance assessment purposes, the partners produced test data that made it possible to carry out an evaluation campaign with 5 different systems (including some from outside the consortium) and two online ones. Both the MT baselines and the collected data will be made available via the ELRA catalogue. The cooperation stream focuses mostly on the cooperation roadmap for Human Language Technologies for Arabic, a roadmap for the region directed towards Arabic HLT in general. The purpose of the roadmap is to outline areas and priorities for collaboration, both between EU countries and Arabic-speaking countries and in general: between countries, between universities, and last but not least between universities and industry.

2009

Évaluation des outils terminologiques : enjeux, difficultés et propositions [Evaluation of terminological tools : challenges, problems and propositions]
Adeline Nazarenko | Haïfa Zargayouna | Olivier Hamon | Jonathan van Puymbrouck
Traitement Automatique des Langues, Volume 50, Numéro 1 : Varia [Varia]

2008

Large Scale Production of Syntactic Annotations to Move Forward
Anne Vilnat | Gil Francopoulo | Olivier Hamon | Sylvain Loiseau | Patrick Paroubek | Eric Villemonte de la Clergerie
Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation

SEWS : un serveur d’évaluation orienté Web pour la syntaxe [SEWS : a web-based server for evaluating syntactic annotation tools]
Olivier Hamon | Patrick Paroubek | Djamel Mostefa
Traitement Automatique des Langues, Volume 49, Numéro 2 : Plate-formes pour le traitement automatique des langues [Platforms for Natural Language Processing]

The Impact of Reference Quality on Automatic MT Evaluation
Olivier Hamon | Djamel Mostefa
Coling 2008: Companion volume: Posters

PASSAGE: from French Parser Evaluation to Large Sized Treebank
Éric Villemonte de la Clergerie | Olivier Hamon | Djamel Mostefa | Christelle Ayache | Patrick Paroubek | Anne Vilnat
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we present the PASSAGE project, which aims at automatically building a large French treebank by combining the output of several parsers, using the EASY annotation scheme. We also present the results of the first evaluation campaign of the project and the preliminary results we have obtained with our ROVER procedure for combining parsers automatically.

An Experimental Methodology for an End-to-End Evaluation in Speech-to-Speech Translation
Olivier Hamon | Djamel Mostefa
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper describes the evaluation methodology used to evaluate the TC-STAR speech-to-speech translation (SST) system and the results from the third year of the project. It follows the results presented in Hamon (2007), dealing with the first end-to-end evaluation of the project. In this paper, we experiment with the methodology and the protocol during a second end-to-end evaluation, by comparing outputs from the TC-STAR system with those of interpreters from the European Parliament. For this purpose, we test different evaluation criteria and types of questions within a comprehension test. The results show that interpreters do not translate all the information (as opposed to the automatic system), but the quality of SST is still far from that of human translation. The experimental comprehension test used provides new information to study the quality of automatic systems, but without settling the issue of which protocol is the best. This depends on what the evaluator wants to know about the SST: either to have a subjective end-user evaluation or a more objective one.

2007

How much data is needed for reliable MT evaluation? Using bootstrapping to study human and automatic metrics
Paula Estrella | Olivier Hamon | Andrei Popescu-Belis
Proceedings of Machine Translation Summit XI: Papers

End-to-end evaluation of a speech-to-speech translation system in TC-STAR
Olivier Hamon | Djamel Mostefa | Khalid Choukri
Proceedings of Machine Translation Summit XI: Papers

Assessing human and automated quality judgments in the French MT evaluation campaign CESTA
Olivier Hamon | Anthony Hartley | Andrei Popescu-Belis | Khalid Choukri
Proceedings of Machine Translation Summit XI: Papers

Experiences and conclusions from the CESTA evaluation project
Olivier Hamon
Proceedings of the Workshop on Automatic procedures in MT evaluation

MT evaluation & TC-STAR
Khalid Choukri | Olivier Hamon | Djamel Mostefa
Proceedings of the Workshop on Automatic procedures in MT evaluation

2006

Terminological Resources Acquisition Tools: Toward a User-oriented Evaluation Model
Widad Mustafa El Hadi | Ismail Timimi | Marianne Dabbadie | Khalid Choukri | Olivier Hamon | Yun-Chuang Chiao
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes the CESART project, which deals with the evaluation of terminological resources acquisition tools. The objective of the project is to propose and validate an evaluation protocol allowing one to objectively evaluate and compare different systems for terminology applications such as terminological resource creation and semantic relation extraction. The project also aims to create quality-controlled resources such as domain-specific corpora, automatic scoring tools, etc.

This article outlines the evaluation protocol and provides the main results of the French Evaluation Campaign for Machine Translation Systems, CESTA. Following the initial objectives and evaluation plans, the evaluation metrics are briefly described: along with fluency and adequacy assessed by human judges, a number of recently proposed automated metrics are used. Two evaluation campaigns were organized, the first one in the general domain, and the second one in the medical domain. Up to six systems translating from English into French, and two systems translating from Arabic into French, took part in the campaign. The numerical results illustrate the differences between classes of systems, and provide interesting indications about the reliability of the automated metrics for French as a target language, both by comparison to human judges and using correlations between metrics. The corpora that were produced, as well as the information about the reliability of metrics, constitute reusable resources for MT evaluation.
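Checking the reliability of an automated metric against human judges usually comes down to a correlation between the two sets of system-level scores. The sketch below uses Pearson's correlation coefficient as one possible choice; the coefficient, the function name and the toy scores are assumptions and are not CESTA results.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

# Hypothetical system-level scores for three MT systems.
human_adequacy = [1.0, 2.0, 3.0]
metric_aligned = [2.0, 4.0, 6.0]   # ranks systems like the judges do
metric_inverted = [3.0, 2.0, 1.0]  # ranks systems in the opposite order
agrees = pearson(human_adequacy, metric_aligned)
disagrees = pearson(human_adequacy, metric_inverted)
```

A coefficient close to 1 suggests the metric orders systems as the judges do; the same function applied to two automated metrics gives the metric-to-metric correlations mentioned above.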

X-Score: Automatic Evaluation of Machine Translation Grammaticality
O. Hamon | M. Rajman
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we report an experiment with an automated metric used to analyse the grammaticality of machine translation output. The approach (Rajman, Hartley, 2001) is based on the distribution of the linguistic information within a translated text, which is assumed to be similar between a learning corpus and the translation. This method is quite inexpensive, since it does not need any reference translation. First we describe the experimental method and the different tests we used. Then we show the promising results we obtained on the CESTA data, and how well they correlate with human judgments.

Evaluation of Automatic Speech Recognition and Speech Language Translation within TC-STAR: Results from the first evaluation campaign
Djamel Mostefa | Olivier Hamon | Khalid Choukri
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper reports on the evaluation activities conducted in the first year of the TC-STAR project. The TC-STAR project, financed by the European Commission within the Sixth Framework Program, is envisaged as a long-term effort to advance research in the core technologies of Speech-to-Speech Translation (SST). SST technology is a combination of Automatic Speech Recognition (ASR), Spoken Language Translation (SLT) and Text To Speech (TTS).

2005

In this paper, we report on the results of a full-size evaluation campaign of various MT systems. This campaign is novel compared to the classical DARPA/NIST MT evaluation campaigns in the sense that French is the target language, and that it includes an experiment of meta-evaluation of various metrics claiming to better predict different attributes of translation quality. We first describe the campaign, its context, its protocol and the data we used. Then we summarise the results obtained by the participating systems and discuss the meta-evaluation of the metrics used.