Kathryn Taylor


Predicting what MT is good for: user judgments and task performance
Kathryn Taylor | John White
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers

As part of the Machine Translation (MT) Proficiency Scale project at the US Federal Intelligent Document Understanding Laboratory (FIDUL), Litton PRC is developing a method to measure MT systems in terms of the tasks for which their output may be successfully used. This paper describes the development of a task inventory, i.e., a comprehensive list of the tasks analysts perform with translated material and details the capture of subjective user judgments and insights about MT samples. Also described are the user exercises conducted using machine and human translation samples and the assessment of task performance. By analyzing translation errors, user judgments about errors that interfere with task performance, and user task performance results, we isolate source language patterns which produce output problems. These patterns can then be captured in a single diagnostic test set, to be easily applied to any new Japanese-English system to predict the utility of its output.