Marianna Martindale

Also published as: Marianna J. Martindale


Machine Translation Believability
Marianna Martindale | Kevin Duh | Marine Carpuat
Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing

Successful Machine Translation (MT) deployment requires understanding not only the intrinsic qualities of MT output, such as fluency and adequacy, but also user perceptions. Users who do not understand the source language respond to MT output based on their perception of the likelihood that the meaning of the MT output matches the meaning of the source text. We refer to this as believability. Output that is not believable may be off-putting to users, but believable MT output with incorrect meaning may mislead them. In this work, we study the relationship of believability to fluency and adequacy by applying traditional MT direct assessment protocols to annotate all three features on the output of neural MT systems. Quantitative analysis of these annotations shows that believability is closely related to but distinct from fluency, and initial qualitative analysis suggests that semantic features may account for the difference.


Responsible ‘Gist’ MT Use in the Age of Neural MT
Marianna J. Martindale
Workshop on the Impact of Machine Translation (iMpacT 2020)


An Exploration of Placeholding in Neural Machine Translation
Matt Post | Shuoyang Ding | Marianna Martindale | Winston Wu
Proceedings of Machine Translation Summit XVII: Research Track

Identifying Fluently Inadequate Output in Neural and Statistical Machine Translation
Marianna Martindale | Marine Carpuat | Kevin Duh | Paul McNamee
Proceedings of Machine Translation Summit XVII: Research Track


Fluency Over Adequacy: A Pilot Study in Measuring User Trust in Imperfect MT
Marianna Martindale | Marine Carpuat
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)


A Study of Style in Machine Translation: Controlling the Formality of Machine Translation Output
Xing Niu | Marianna Martindale | Marine Carpuat
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Stylistic variations of language, such as formality, carry speakers’ intention beyond literal meaning and should be conveyed adequately in translation. We propose to use lexical formality models to control the formality level of machine translation output. We demonstrate the effectiveness of our approach in empirical evaluations, as measured by automatic metrics and human assessments.


MoJo: Bringing Hybrid MT to the Center for Applied Machine Translation
Marianna Martindale
Conferences of the Association for Machine Translation in the Americas: MT Users' Track


Class-based N-gram language difference models for data selection
Amittai Axelrod | Yogarshi Vyas | Marianna Martindale | Marine Carpuat
Proceedings of the 12th International Workshop on Spoken Language Translation: Papers


Can Statistical Post-Editing with a Small Parallel Corpus Save a Weak MT Engine?
Marianna J. Martindale
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Statistical post-editing has been shown in several studies to increase BLEU score for rule-based MT systems. However, previous studies have relied solely on BLEU and have not conducted further study to determine whether those gains indicated an increase in quality or in score alone. In this work we conduct a human evaluation of statistical post-edited output from a weak rule-based MT system, comparing the results with the output of the original rule-based system and a phrase-based statistical MT system trained on the same data. We show that for this weak rule-based system, despite significant BLEU score increases, human evaluators prefer the output of the original system. While this is not a generally conclusive condemnation of statistical post-editing, this result does cast doubt on the efficacy of statistical post-editing for weak MT systems and on the reliability of BLEU score for comparison between weak rule-based and hybrid systems built from them.