Christine Doran

Also published as: C Doran

2012

Navigating Large Comment Threads with CoFi
Christine Doran | Guido Zarrella | John C. Henderson
Proceedings of the Demonstration Session at the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2008


Automated Machine Translation Improvement Through Post-Editing Techniques: Analyst and Translator Experiments
Jennifer Doyon | Christine Doran | C. Donald Means | Domenique Parr
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Government and Commercial Uses of MT

From the Automatic Language Processing Advisory Committee (ALPAC) (Pierce et al., 1966) machine translation (MT) evaluations of the 1960s to the Defense Advanced Research Projects Agency (DARPA) Global Autonomous Language Exploitation (GALE) (Olive, 2008) and National Institute of Standards and Technology (NIST) (NIST, 2008) MT evaluations of today, the U.S. Government has been instrumental in establishing measurements and baselines for the state of the art in MT engines. In the same vein, the Automated Machine Translation Improvement Through Post-Editing Techniques (PEMT) project sought to establish a baseline of MT engines based on the perceptions of potential users. In contrast to these previous evaluations, the PEMT project’s experiments also determined the minimal quality level that MT output needed to reach before users found it acceptable. Based on these findings, the PEMT team investigated using post-editing techniques to achieve this level. This paper presents experiments in which analysts and translators were asked to evaluate MT output processed with varying post-editing techniques. The results show at what level the analysts and translators find MT useful and are willing to work with it. We also establish a ranking of the types of post-edits necessary to elevate MT output to the minimal acceptance level.

There are currently two philosophies for building grammars and parsers: statistically induced grammars and wide-coverage grammars. One way to combine the strengths of both approaches is to have a wide-coverage grammar with a heuristic component that is domain-independent but whose contribution is tuned to particular domains. In this paper, we discuss a three-stage approach to disambiguation in the context of a lexicalized grammar, using a variety of domain-independent heuristic techniques. We present a training algorithm that uses hand-bracketed treebank parses to set the weights of these heuristics. We compare the performance of our grammar against the performance of the IBM statistical grammar, using both untrained and trained weights for the heuristics.
