Jennifer Doyon

Also published as: Jennifer B. Doyon


2022

Speech-to-Text and Evaluation of Multiple Machine Translation Systems
Evelyne Tzoukermann | Steven Van Guilder | Jennifer Doyon | Ekaterina Harke
Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track)

The National Virtual Translation Center (NVTC) and the larger Federal Bureau of Investigation (FBI) seek to acquire tools that will facilitate their mission to provide English translations of non-English language audio and video files. In the text domain, NVTC has been using translation memory (TM) for some time and has reported on the incorporation of machine translation (MT) into that workflow. While we have explored the use of speech-to-text (STT) and speech translation (ST) in the past, we have now invested in the creation of a substantial human-created corpus to thoroughly evaluate alternatives in three languages: French, Russian, and Persian. We report on the results of multiple STT systems combined with four MT systems for these languages. We evaluated and scored the different systems in combination and analyzed the results. This points the way to the most successful tool combination to deploy in this workflow.

2021

Corpus Creation and Evaluation for Speech-to-Text and Speech Translation
Corey Miller | Evelyne Tzoukermann | Jennifer Doyon | Elizabeth Mallard
Proceedings of Machine Translation Summit XVIII: Users and Providers Track

The National Virtual Translation Center (NVTC) seeks to acquire human language technology (HLT) tools that will facilitate its mission to provide verbatim English translations of foreign language audio and video files. In the text domain, NVTC has been using translation memory (TM) for some time and has reported on the incorporation of machine translation (MT) into that workflow (Miller et al., 2020). While we have explored the use of speech-to-text (STT) and speech translation (ST) in the past (Tzoukermann and Miller, 2018), we have now invested in the creation of a substantial human-made corpus to thoroughly evaluate alternatives. Results from our analysis of this corpus and the performance of HLT tools point the way to the most promising ones to deploy in our workflow.

2018

Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 2: User Track)
Janice Campbell | Alex Yanishevsky | Jennifer Doyon | Doug Jones
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 2: User Track)

2008

Automated Machine Translation Improvement Through Post-Editing Techniques: Analyst and Translator Experiments
Jennifer Doyon | Christine Doran | C. Donald Means | Domenique Parr
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Government and Commercial Uses of MT

From the Automatic Language Processing Advisory Committee (ALPAC) (Pierce et al., 1966) machine translation (MT) evaluations of the ‘60s to the Defense Advanced Research Projects Agency (DARPA) Global Autonomous Language Exploitation (GALE) (Olive, 2008) and National Institute of Standards and Technology (NIST) (NIST, 2008) MT evaluations of today, the U.S. Government has been instrumental in establishing measurements and baselines for the state of the art in MT engines. In the same vein, the Automated Machine Translation Improvement Through Post-Editing Techniques (PEMT) project sought to establish a baseline of MT engines based on the perceptions of potential users. In contrast to these previous evaluations, the PEMT project’s experiments also determined the minimal quality level that MT output needed to achieve before users found it acceptable. Based on these findings, the PEMT team investigated using post-editing techniques to reach this level. This paper presents experiments in which analysts and translators were asked to evaluate MT output processed with varying post-editing techniques. The results show at what level the analysts and translators find MT useful and are willing to work with it. We also establish a ranking of the types of post-edits necessary to elevate MT output to the minimal acceptance level.

2001

The naming of things and the confusion of tongues: an MT metric
Florence Reeder | Keith Miller | Jennifer Doyon | John White
Workshop on MT Evaluation

This paper reports the results of an experiment in machine translation (MT) evaluation, designed to determine whether easily and rapidly collected metrics can predict the human-generated quality parameters of MT output. In this experiment we evaluated a system’s ability to translate named entities, and compared this measure with previous evaluation scores of fidelity and intelligibility. There are two significant benefits potentially associated with a correlation between traditional MT measures and named entity scores: the ability to automate named entity scoring and thus MT scoring; and insights into the linguistic aspects of task-based uses of MT, as captured in previous studies.

2000

Determining the Tolerance of Text-handling Tasks for MT Output
John White | Jennifer Doyon | Susan Talbott
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

Task Tolerance of MT Output in Integrated Text Processes
John S. White | Jennifer B. Doyon | Susan W. Talbott
ANLP-NAACL 2000 Workshop: Embedded Machine Translation Systems

1999

Task-based evaluation for machine translation
Jennifer B. Doyon | Kathryn B. Taylor | John S. White
Proceedings of Machine Translation Summit VII

In an effort to reduce the subjectivity, cost, and complexity of evaluation methods for machine translation (MT) and other language technologies, task-based assessment is examined as an alternative to metrics based on human judgments about MT, i.e., the previously applied adequacy, fluency, and informativeness measures. For task-based evaluation strategies to be employed effectively to evaluate language-processing technologies in general, certain key elements must be known. Most importantly, the objectives the technology’s use is expected to accomplish must be known, those objectives must be expressed as tasks that accomplish them, and successful outcomes must then be defined for those tasks. For MT, task-based evaluation is correlated to a scale of tasks, and has as its premise that certain tasks are more forgiving of errors than others. In other words, a poor translation may suffice to determine the general topic of a text, but may not permit accurate identification of participants or the specific event. The ordering of tasks according to their tolerance for errors, as determined by the actual task outcomes provided in this paper, is the basis of a scale and a repeatable process by which to measure MT systems that has advantages over previous methods.