2006
Terminological Resources Acquisition Tools: Toward a User-oriented Evaluation Model
Widad Mustafa El Hadi | Ismail Timimi | Marianne Dabbadie | Khalid Choukri | Olivier Hamon | Yun-Chuang Chiao
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
This paper describes the CESART project, which deals with the evaluation of terminological resources acquisition tools. The objective of the project is to propose and validate an evaluation protocol that allows different systems to be evaluated and compared objectively on terminology applications such as terminological resource creation and semantic relation extraction. The project also aims to create quality-controlled resources such as domain-specific corpora and an automatic scoring tool.
CESTA: First Conclusions of the Technolangue MT Evaluation Campaign
O. Hamon | A. Popescu-Belis | K. Choukri | M. Dabbadie | A. Hartley | W. Mustafa El Hadi | M. Rajman | I. Timimi
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
This article outlines the evaluation protocol and provides the main results of the French Evaluation Campaign for Machine Translation Systems, CESTA. Following the initial objectives and evaluation plans, the evaluation metrics are briefly described: along with fluency and adequacy assessed by human judges, a number of recently proposed automated metrics are used. Two evaluation campaigns were organized, the first one in the general domain, and the second one in the medical domain. Up to six systems translating from English into French, and two systems translating from Arabic into French, took part in the campaign. The numerical results illustrate the differences between classes of systems, and provide interesting indications about the reliability of the automated metrics for French as a target language, both by comparison to human judges and using correlations between metrics. The corpora that were produced, as well as the information about the reliability of metrics, constitute reusable resources for MT evaluation.
2005
Evaluation of Machine Translation with Predictive Metrics beyond BLEU/NIST: CESTA Evaluation Campaign # 1
Sylvain Surcin | Olivier Hamon | Antony Hartley | Martin Rajman | Andrei Popescu-Belis | Widad Mustafa El Hadi | Ismaïl Timimi | Marianne Dabbadie | Khalid Choukri
Proceedings of Machine Translation Summit X: Papers
In this paper, we report on the results of a full-size evaluation campaign of various MT systems. This campaign is novel compared to the classical DARPA/NIST MT evaluation campaigns in that French is the target language and that it includes a meta-evaluation experiment on various metrics claimed to better predict different attributes of translation quality. We first describe the campaign, its context, its protocol and the data we used. Then we summarise the results obtained by the participating systems and discuss the meta-evaluation of the metrics used.
2004
EVALDA-CESART Project: Terminological Resources Acquisition Tools Evaluation Campaign
Widad Mustafa El Hadi | Ismail Timimi | Marianne Dabbadie
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
CESTA: Machine Translation Evaluation Campaign [Work-in-Progress Project Report]
Widad Mustafa El Hadi | Marianne Dabbadie | Ismaïl Timimi | Martin Rajman | Philippe Langlais | Antony Hartley | Andrei Popescu Belis
Proceedings of the Second International Workshop on Language Resources for Translation Work, Research and Training
2002
Terminological Enrichment for non-Interactive MT Evaluation
Marianne Dabbadie | Widad Mustafa El Hadi | Ismaïl Timimi
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)
2001
Setting a methodology for machine translation evaluation
Widad Mustafa El Hadi | Ismail Timimi | Marianne Dabbadie
Workshop on MT Evaluation
In this paper, some of the problems encountered in designing an evaluation for an MT system will be examined. The source text, in French, provided by INRA (Institut National de la Recherche Agronomique, i.e. the National Institute for Agronomic Research), deals with biotechnology and animal reproduction. It has been translated into English. The output of the system (i.e. the result of assembling several components), as opposed to its individual modules or specific components (i.e. analysis, generation, grammar, lexicon, core, etc.), will be evaluated. Moreover, the evaluation will concentrate on translation quality and its fidelity to the source text. The evaluation is not comparative, which means that we tested a specific MT system, not necessarily representative of other MT systems that can be found on the market.