Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES
Beatrice Savoldi, Marco Gaido, Matteo Negri, Luisa Bentivogli
Abstract
As part of the WMT-2023 “Test suites” shared task, this paper summarizes the results of two test suite evaluations: MuST-SHEWMT23 and INES. Focusing on the en-de and de-en language pairs, we rely on these newly created test suites to investigate systems’ ability to translate feminine and masculine gender and to produce gender-inclusive translations. Furthermore, we discuss the metrics associated with our test suites and validate them by means of human evaluations. Our results indicate that systems achieve reasonable and comparable performance in correctly translating both feminine and masculine gender forms for naturalistic gender phenomena. By contrast, the generation of inclusive language forms in translation emerges as a challenging task for all the evaluated MT models, indicating room for future improvements and research on the topic. We make MuST-SHEWMT23 and INES freely available.
- Anthology ID:
- 2023.wmt-1.25
- Volume:
- Proceedings of the Eighth Conference on Machine Translation
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 252–262
- URL:
- https://aclanthology.org/2023.wmt-1.25
- DOI:
- 10.18653/v1/2023.wmt-1.25
- Cite (ACL):
- Beatrice Savoldi, Marco Gaido, Matteo Negri, and Luisa Bentivogli. 2023. Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES. In Proceedings of the Eighth Conference on Machine Translation, pages 252–262, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES (Savoldi et al., WMT 2023)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2023.wmt-1.25.pdf