2009
pdf
Translation Memory Technology Assessment
Carol Van Ess-Dykema
|
Dennis Perzanowsky
|
Susan Converse
|
Rachel Richardson
|
John S. White
|
Tucker Maney
Proceedings of Machine Translation Summit XII: Government MT User Program
2007
pdf
bib
The Chinese Room Experiment: The Self-Organizing Feng Shui of MT
John S. White
|
Florence Reeder
Proceedings of the Workshop on the Chinese room experiment
2006
pdf
First strategies for integrating hybrid approaches into established systems
Jean Senellart
|
John S. White
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Panel on hybrid machine translation: why and how?
2003
pdf
abs
Granularity in MT evaluation
Florence Reeder
|
John White
Workshop on Systemizing MT Evaluation
This paper looks at granularity issues in machine translation evaluation. We start with work by White (2001), who examined the correlation between intelligibility and fidelity at the document level. His work showed that intelligibility and fidelity do not correlate well at the document level. These dissimilarities lead to our investigation of evaluation granularity. In particular, we revisit the intelligibility and fidelity relationship at the corpus level. We expect these to support certain assumptions in both evaluations as well as indicate issues germane to future evaluations.
2001
pdf
abs
The naming of things and the confusion of tongues: an MT metric
Florence Reeder
|
Keith Miller
|
Jennifer Doyon
|
John White
Workshop on MT Evaluation
This paper reports the results of an experiment in machine translation (MT) evaluation, designed to determine whether easily/rapidly collected metrics can predict the human-generated quality parameters of MT output. In this experiment we evaluated a system’s ability to translate named entities, and compared this measure with previous evaluation scores of fidelity and intelligibility. There are two significant benefits potentially associated with a correlation between traditional MT measures and named entity scores: the ability to automate named entity scoring and thus MT scoring; and insights into the linguistic aspects of task-based uses of MT, as captured in previous studies.
pdf
abs
Predicting intelligibility from fidelity in MT evaluation
John White
Workshop on MT Evaluation
Attempts to formulate methods of automatically evaluating machine translation (MT) have generally looked at some attribute of translation and then tried, explicitly or implicitly, to extrapolate the measurement to cover a broader class of attributes. In particular, some studies have focused on measuring fidelity of translation, and inferring intelligibility from that, and others have taken the opposite approach. In this paper we examine the more fundamental question of whether, and to what extent, the one attribute can be predicted by the other. As a starting point we use the 1994 DARPA MT corpus, which has measures for both attributes, and perform a simple comparison of the behavior of each. Two hypotheses about a predictable inference between fidelity and intelligibility are compared with the comparative behavior across all language pairs and all documents in the corpus.
pdf
abs
Predicting MT fidelity from noun-compound handling
John White
|
Monika Forner
Workshop on MT Evaluation
Approaches to the automation of machine translation (MT) evaluation have attempted, or presumed, to connect some rapidly measurable phenomenon with general attributes of the MT output and/or system. In particular, measurements of the fluency of output are often asserted to be predictive of the usefulness of MT output in information-intensive, downstream tasks. The connections between the fluency (“intelligibility”) of translation and its informational adequacy (“fidelity”) are not actually straightforward. This paper discusses a small experiment in isolating a particular contrastive linguistic phenomenon common to both French-English and Spanish-English pairs, and attempts to associate that behavior in machine and human translations with known fidelity properties of those translations. Our results show a definite correlative trend.
2000
pdf
abs
Contemplating automatic MT evaluation
John S. White
Proceedings of the Fourth Conference of the Association for Machine Translation in the Americas: Technical Papers
Researchers, developers, translators and information consumers all share the problem that there is no accepted standard for machine translation. The problem is much further confounded by the fact that MT evaluations properly done require a considerable commitment of time and resources, an anachronism in this day of cross-lingual information processing when new MT systems may be developed in weeks instead of years. This paper surveys the needs addressed by several of the classic “types” of MT, and speculates on ways that each of these types might be automated to create relevant, near-instantaneous evaluation of approaches and systems.
pdf
Book Reviews: Breadth and Depth of Semantic Lexicons
John S. White
Computational Linguistics, Volume 26, Number 4, December 2000
pdf
Determining the Tolerance of Text-handling Tasks for MT Output
John White
|
Jennifer Doyon
|
Susan Talbott
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)
pdf
bib
Task Tolerance of MT Output in Integrated Text Processes
John S. White
|
Jennifer B. Doyon
|
Susan W. Talbott
ANLP-NAACL 2000 Workshop: Embedded Machine Translation Systems
1999
pdf
abs
MT evaluation
Margaret King
|
Eduard Hovy
|
Benjamin K. Tsou
|
John White
|
Yusoff Zaharin
Proceedings of Machine Translation Summit VII
This panel deals with the general topic of evaluation of machine translation systems. The first contribution sets out some recent work on creating standards for the design of evaluations. The second, by Eduard Hovy, takes up the particular issue of how metrics can be differentiated and systematized. Benjamin K. T'sou suggests that whilst men may evaluate machines, machines may also evaluate men. John S. White focuses on the question of the role of the user in evaluation design, and Yusoff Zaharin points out that circumstances and settings may have a major influence on evaluation design.
pdf
abs
Task-based evaluation for machine translation
Jennifer B. Doyon
|
Kathryn B. Taylor
|
John S. White
Proceedings of Machine Translation Summit VII
In an effort to reduce the subjectivity, cost, and complexity of evaluation methods for machine translation (MT) and other language technologies, task-based assessment is examined as an alternative to metrics based on human judgments about MT, i.e., the previously applied adequacy, fluency, and informativeness measures. For task-based evaluation strategies to be employed effectively to evaluate language processing technologies in general, certain key elements must be known. Most importantly, the objectives the technology’s use is expected to accomplish must be known, the objectives must be expressed as tasks that accomplish the objectives, and then successful outcomes defined for the tasks. For MT, task-based evaluation is correlated to a scale of tasks, and has as its premise that certain tasks are more forgiving of errors than others. In other words, a poor translation may suffice to determine the general topic of a text, but may not permit accurate identification of participants or the specific event. The ordering of tasks according to their tolerance for errors, as determined by actual task outcomes provided in this paper, is the basis of a scale and repeatable process by which to measure MT systems that has advantages over previous methods.
1998
bib
MT evaluation
John S. White
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Tutorial Descriptions
pdf
abs
Predicting what MT is good for: user judgments and task performance
Kathryn Taylor
|
John White
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers
As part of the Machine Translation (MT) Proficiency Scale project at the US Federal Intelligent Document Understanding Laboratory (FIDUL), Litton PRC is developing a method to measure MT systems in terms of the tasks for which their output may be successfully used. This paper describes the development of a task inventory, i.e., a comprehensive list of the tasks analysts perform with translated material, and details the capture of subjective user judgments and insights about MT samples. Also described are the user exercises conducted using machine and human translation samples and the assessment of task performance. By analyzing translation errors, user judgments about errors that interfere with task performance, and user task performance results, we isolate source language patterns which produce output problems. These patterns can then be captured in a single diagnostic test set, to be easily applied to any new Japanese-English system to predict the utility of its output.
1997
bib
abs
MT evaluation: old, new, and recycled
John White
Proceedings of Machine Translation Summit VI: Tutorials
The tutorial addresses the issues peculiar to machine translation evaluation, namely the difficulty in determining what constitutes correct translation, and which types of evaluation are the most meaningful for evaluation "consumers." The tutorial is structured around evaluation methods designed for particular purposes: types of MT design, stages in the development lifecycle, and intended end-use of a system that includes MT. It will provide an overview of the issues and classic approaches to MT evaluation. The traditional processes, such as those outlined in the ALPAC report, will be examined for their value historically and in terms of today's environments. The tutorial also provides an insight into the latest evaluation techniques, designed to capture the value of MT systems in the context of current and future automated text handling processes.
1996
pdf
Adaptation of the DARPA machine translation evaluation paradigm to end-to-end systems
John S. White
|
Theresa A. O’Connell
Conference of the Association for Machine Translation in the Americas
pdf
The primacy of core technology MT evaluation
John S. White
Conference of the Association for Machine Translation in the Americas
1995
pdf
Approaches to black box MT evaluation
John S. White
Proceedings of Machine Translation Summit V
1994
pdf
The ARPA MT Evaluation Methodologies: Evolution, Lessons, and Future Approaches
John S. White
|
Theresa A. O’Connell
|
Francis E. O’Mara
Proceedings of the First Conference of the Association for Machine Translation in the Americas
pdf
The Role of MT Evaluation
Scott Bennett
|
George Doddington
|
Mary Flanagan
|
Laurie Gerber
|
Maghi King
|
Marjorie León
|
John White
Proceedings of the First Conference of the Association for Machine Translation in the Americas
pdf
Evaluation in the ARPA Machine Translation Program: 1993 Methodology
John S. White
|
Theresa A. O’Connell
Human Language Technology: Proceedings of a Workshop held at Plainsboro, New Jersey, March 8-11, 1994
1993
pdf
Book Reviews: Questions and Information Systems
John S. White
Computational Linguistics, Volume 19, Number 2, June 1993, Special Issue on Using Large Corpora: II
pdf
Evaluation of Machine Translation
John S. White
|
Theresa A. O’Connell
Human Language Technology: Proceedings of a Workshop Held at Plainsboro, New Jersey, March 21-24, 1993
1991
pdf
Lexical and World Knowledge: Theoretical and Applied Viewpoints
John S. White
Lexical Semantics and Knowledge Representation
1989
pdf
Book Reviews: Machine Translation Systems
John S. White
Computational Linguistics, Volume 15, Number 3, September 1989
1988
pdf
Application of natural language interface to a machine translation problem
Heidi M. Johnson
|
Yukiko Sekine
|
John S. White
|
Gil C. Kim
Proceedings of the Second Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages
1986
pdf
What Should Machine Translation Be?
John S. White
24th Annual Meeting of the Association for Computational Linguistics
1985
Characteristics of the METAL Machine Translation System at Production Stage
John S. White
Proceedings of the First Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages