“You Sound Just Like Your Father” Commercial Machine Translation Systems Include Stylistic Biases

Dirk Hovy, Federico Bianchi, Tommaso Fornaciari


Abstract
The main goal of machine translation has been to convey the correct content. Stylistic considerations have been at best secondary. We show that, as a consequence, the output of three commercial machine translation systems (Bing, DeepL, Google) makes demographically diverse samples from five languages “sound” older and more male than the original. Our findings suggest that translation models reflect demographic bias in the training data. This opens up interesting new research avenues in machine translation to take stylistic considerations into account.
Anthology ID:
2020.acl-main.154
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1686–1690
URL:
https://aclanthology.org/2020.acl-main.154
DOI:
10.18653/v1/2020.acl-main.154
Cite (ACL):
Dirk Hovy, Federico Bianchi, and Tommaso Fornaciari. 2020. “You Sound Just Like Your Father” Commercial Machine Translation Systems Include Stylistic Biases. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1686–1690, Online. Association for Computational Linguistics.
Cite (Informal):
“You Sound Just Like Your Father” Commercial Machine Translation Systems Include Stylistic Biases (Hovy et al., ACL 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2020.acl-main.154.pdf
Dataset:
 2020.acl-main.154.Dataset.zip
Video:
 http://slideslive.com/38929070