Abstract
Machine translation models sometimes lead to added toxicity: translated outputs may contain more toxic content than the original input. In this paper, we introduce MinTox, a novel pipeline to automatically identify and mitigate added toxicity at inference time, without further model training. MinTox leverages a multimodal (speech and text) toxicity classifier that can scale across languages. We demonstrate the capabilities of MinTox when applied to SEAMLESSM4T, a multimodal and massively multilingual machine translation system. MinTox significantly reduces added toxicity: across all domains, modalities and language directions, 25% to 95% of added toxicity is successfully filtered out, while preserving translation quality.
- Anthology ID:
- 2024.eamt-1.31
- Volume:
- Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)
- Month:
- June
- Year:
- 2024
- Address:
- Sheffield, UK
- Editors:
- Carolina Scarton, Charlotte Prescott, Chris Bayliss, Chris Oakley, Joanna Wright, Stuart Wrigley, Xingyi Song, Edward Gow-Smith, Rachel Bawden, Víctor M Sánchez-Cartagena, Patrick Cadwell, Ekaterina Lapshinova-Koltunski, Vera Cabarrão, Konstantinos Chatzitheodorou, Mary Nurminen, Diptesh Kanojia, Helena Moniz
- Venue:
- EAMT
- SIG:
- Publisher:
- European Association for Machine Translation (EAMT)
- Note:
- Pages:
- 360–372
- Language:
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2024.eamt-1.31/
- DOI:
- Cite (ACL):
- Marta Costa-jussà, David Dale, Maha Elbayad, and Bokai Yu. 2024. Added Toxicity Mitigation at Inference Time for Multimodal and Massively Multilingual Translation. In Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1), pages 360–372, Sheffield, UK. European Association for Machine Translation (EAMT).
- Cite (Informal):
- Added Toxicity Mitigation at Inference Time for Multimodal and Massively Multilingual Translation (Costa-jussà et al., EAMT 2024)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2024.eamt-1.31.pdf