Sometimes We Want Ungrammatical Translations
Prasanna Parthasarathi, Koustuv Sinha, Joelle Pineau, Adina Williams
Abstract
Rapid progress in Neural Machine Translation (NMT) systems over the last few years has focused primarily on improving translation quality and, as a secondary focus, on improving robustness to perturbations (e.g., spelling). While performance and robustness are important objectives, by over-focusing on these, we risk overlooking other important properties. In this paper, we draw attention to the fact that for some applications, faithfulness to the original (input) text is important to preserve, even if it means introducing unusual language patterns in the (output) translation. We propose a simple, novel way to quantify whether an NMT system exhibits robustness or faithfulness, by focusing on the case of word-order perturbations. We explore a suite of functions to perturb the word order of source sentences without deleting or injecting tokens, and measure their effects on the target side. Across several experimental conditions, we observe a strong tendency towards robustness rather than faithfulness. These results allow us to better understand the trade-off between faithfulness and robustness in NMT, and open up the possibility of developing systems where users have more autonomy and control in selecting which property is best suited for their use case.
- Anthology ID:
- 2021.findings-emnlp.275
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2021
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Venue:
- Findings
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3205–3227
- URL:
- https://aclanthology.org/2021.findings-emnlp.275
- DOI:
- 10.18653/v1/2021.findings-emnlp.275
- Cite (ACL):
- Prasanna Parthasarathi, Koustuv Sinha, Joelle Pineau, and Adina Williams. 2021. Sometimes We Want Ungrammatical Translations. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 3205–3227, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Sometimes We Want Ungrammatical Translations (Parthasarathi et al., Findings 2021)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2021.findings-emnlp.275.pdf
- Code:
- ppartha03/umt
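
The abstract mentions functions that perturb the word order of a source sentence without deleting or injecting tokens. As a rough illustration of that constraint only, here is a minimal sketch in Python. The function names (`shuffle_words`, `reverse_words`, `swap_adjacent`, `preserves_tokens`) and the whitespace tokenization are assumptions made for this example; they are not taken from the paper or from the ppartha03/umt repository, and the authors' actual perturbation functions may differ.

```python
import random
from collections import Counter


def shuffle_words(sentence, seed=0):
    """Return the sentence with its whitespace-separated tokens in random order.

    Hypothetical helper: a seeded generator keeps the perturbation reproducible.
    """
    tokens = sentence.split()
    rng = random.Random(seed)
    shuffled = tokens[:]
    rng.shuffle(shuffled)
    return " ".join(shuffled)


def reverse_words(sentence):
    """Return the sentence with its tokens in reverse order."""
    return " ".join(reversed(sentence.split()))


def swap_adjacent(sentence):
    """Swap each non-overlapping pair of neighbouring tokens (1st with 2nd, 3rd with 4th, ...)."""
    tokens = sentence.split()
    for i in range(0, len(tokens) - 1, 2):
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    return " ".join(tokens)


def preserves_tokens(original, perturbed):
    """Check that the perturbation neither deleted nor injected tokens.

    Comparing token multisets ignores order but catches any added or missing token.
    """
    return Counter(original.split()) == Counter(perturbed.split())


if __name__ == "__main__":
    source = "sometimes we want ungrammatical translations"
    for perturb in (shuffle_words, reverse_words, swap_adjacent):
        perturbed = perturb(source)
        print(perturb.__name__, "->", perturbed, "| tokens preserved:", preserves_tokens(source, perturbed))
```

Under these assumptions, each perturbed sentence could be passed to an NMT system and its output compared with the translation of the unperturbed source to see whether the system tends towards robustness or faithfulness. The comparison itself is not shown here, since the abstract does not specify how it is computed.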