Task-based evaluation for machine translation

Jennifer B. Doyon, Kathryn B. Taylor, John S. White


Abstract
In an effort to reduce the subjectivity, cost, and complexity of evaluation methods for machine translation (MT) and other language technologies, task-based assessment is examined as an alternative to metrics based on human judgments about MT, i.e., the previously applied adequacy, fluency, and informativeness measures. For task-based evaluation strategies to be employed effectively to evaluate language-processing technologies in general, certain key elements must be known. Most importantly, the objectives that the technology’s use is expected to accomplish must be known, those objectives must be expressed as tasks that accomplish them, and successful outcomes must then be defined for the tasks. For MT, task-based evaluation is correlated with a scale of tasks, and has as its premise that certain tasks are more forgiving of errors than others. In other words, a poor translation may suffice to determine the general topic of a text, but may not permit accurate identification of participants or the specific event. The ordering of tasks according to their tolerance for errors, as determined by actual task outcomes provided in this paper, is the basis of a scale and a repeatable process for measuring MT systems that has advantages over previous methods.
Anthology ID:
1999.mtsummit-1.85
Volume:
Proceedings of Machine Translation Summit VII
Month:
September 13-17
Year:
1999
Address:
Singapore, Singapore
Venue:
MTSummit
Pages:
574–578
URL:
https://aclanthology.org/1999.mtsummit-1.85
Cite (ACL):
Jennifer B. Doyon, Kathryn B. Taylor, and John S. White. 1999. Task-based evaluation for machine translation. In Proceedings of Machine Translation Summit VII, pages 574–578, Singapore, Singapore.
Cite (Informal):
Task-based evaluation for machine translation (Doyon et al., MTSummit 1999)
PDF:
https://preview.aclanthology.org/update-css-js/1999.mtsummit-1.85.pdf