2012
Lost & Found in Translation: Impact of Machine Translated Results on Translingual Information Retrieval
Kristen Parton | Nizar Habash | Kathleen McKeown
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers
In an ideal cross-lingual information retrieval (CLIR) system, a user query would generate a search over documents in a different language and the relevant results would be presented in the user’s language. In practice, CLIR systems are typically evaluated by judging result relevance in the document language, to factor out the effects of translating the results using machine translation (MT). In this paper, we investigate the influence of four different approaches for integrating MT and CLIR on both retrieval accuracy and user judgment of relevancy. We create a corpus with relevance judgments for both human and machine translated results, and use it to quantify the effect that MT quality has on end-to-end relevance. We find that MT errors result in a 16-39% decrease in mean average precision over the ground truth system that uses human translations. MT errors also caused relevant sentences to appear irrelevant – 5-19% of sentences were relevant in human translation, but were judged irrelevant in MT. To counter this degradation, we present two hybrid retrieval models and two automatic MT post-editing techniques and show that these approaches substantially mitigate the errors and improve the end-to-end relevance.
Learning to Automatically Post-Edit Dropped Words in MT
Jacob Mundt | Kristen Parton | Kathleen McKeown
Workshop on Post-Editing Technology and Practice
Automatic post-editors (APEs) can improve adequacy of MT output by detecting and reinserting dropped content words, but the location where these words are inserted is critical. In this paper, we describe a probabilistic approach for learning reinsertion rules for specific languages and MT systems, as well as a method for synthesizing training data from reference translations. We test the insertion logic on MT systems for Chinese to English and Arabic to English. Our adaptive APE is able to insert within 3 words of the best location 73% of the time (32% in the exact location) in Arabic-English MT output, and 67% of the time in Chinese-English output (30% in the exact location), and delivers improved performance on automated adequacy metrics over a previous rule-based approach to insertion. We consider how particular aspects of the insertion problem make it particularly amenable to machine learning solutions.
Can Automatic Post-Editing Make MT More Meaningful
Kristen Parton | Nizar Habash | Kathleen McKeown | Gonzalo Iglesias | Adrià de Gispert
Proceedings of the 16th Annual Conference of the European Association for Machine Translation
2011
E-rating Machine Translation
Kristen Parton | Joel Tetreault | Nitin Madnani | Martin Chodorow
Proceedings of the Sixth Workshop on Statistical Machine Translation
2010
MT Error Detection for Cross-Lingual Question Answering
Kristen Parton | Kathleen McKeown
Coling 2010: Posters
2009
Who, What, When, Where, Why? Comparing Multiple Approaches to the Cross-Lingual 5W Task
Kristen Parton | Kathleen R. McKeown | Bob Coyne | Mona T. Diab | Ralph Grishman | Dilek Hakkani-Tür | Mary Harper | Heng Ji | Wei Yun Ma | Adam Meyers | Sara Stolbach | Ang Sun | Gokhan Tur | Wei Xu | Sibel Yaman
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP