Predicting what MT is good for: user judgments and task performance

Kathryn Taylor, John White


Abstract
As part of the Machine Translation (MT) Proficiency Scale project at the US Federal Intelligent Document Understanding Laboratory (FIDUL), Litton PRC is developing a method to measure MT systems in terms of the tasks for which their output may be successfully used. This paper describes the development of a task inventory, i.e., a comprehensive list of the tasks analysts perform with translated material, and details the capture of subjective user judgments and insights about MT samples. Also described are the user exercises conducted using machine and human translation samples and the assessment of task performance. By analyzing translation errors, user judgments about errors that interfere with task performance, and user task performance results, we isolate source-language patterns that produce output problems. These patterns can then be captured in a single diagnostic test set that can be easily applied to any new Japanese-English system to predict the utility of its output.
Anthology ID:
1998.amta-papers.32
Volume:
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers
Month:
October 28-31
Year:
1998
Address:
Langhorne, PA, USA
Venue:
AMTA
Publisher:
Springer
Pages:
364–373
URL:
https://link.springer.com/chapter/10.1007/3-540-49478-2_33
Cite (ACL):
Kathryn Taylor and John White. 1998. Predicting what MT is good for: user judgments and task performance. In Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers, pages 364–373, Langhorne, PA, USA. Springer.
Cite (Informal):
Predicting what MT is good for: user judgments and task performance (Taylor & White, AMTA 1998)