Exploring Cross-lingual Text Detoxification with Large Multilingual Language Models.

Daniil Moskovskiy, Daryna Dementieva, Alexander Panchenko


Abstract
Detoxification is the task of rewriting toxic text in a polite style while preserving the meaning and fluency of the original. Existing detoxification methods are monolingual, i.e., designed to work in one specific language. This work investigates multilingual and cross-lingual detoxification and the behavior of large multilingual models in this setting. Unlike previous work, we aim to enable large language models to perform detoxification without direct fine-tuning in a given language. Experiments show that multilingual models are capable of multilingual style transfer. However, the tested state-of-the-art models are not able to perform cross-lingual detoxification: direct fine-tuning on the target language is currently unavoidable, which motivates further research in this direction.
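As a rough illustration of the supervised setup the abstract describes, the sketch below fine-tunes a multilingual seq2seq model on parallel toxic/detoxified sentence pairs with Hugging Face Transformers, then probes it on a language unseen during fine-tuning. The checkpoint (google/mt5-small), the toy sentence pair, and the single training step are illustrative assumptions, not the paper's exact configuration.

    # Sketch: fine-tune a multilingual seq2seq model for detoxification.
    # Assumption: google/mt5-small stands in for the larger multilingual
    # models studied in the paper; the sentence pair below is toy data.
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_name = "google/mt5-small"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    toxic = ["this is a damn mess"]   # source: toxic sentence
    neutral = ["this is a mess"]      # target: polite rewrite

    batch = tokenizer(toxic, return_tensors="pt", padding=True, truncation=True)
    labels = tokenizer(text_target=neutral, return_tensors="pt",
                       padding=True, truncation=True).input_ids

    # One supervised step; a real run loops over a full parallel corpus.
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()

    # Cross-lingual probe: feed a sentence in a language not used for
    # fine-tuning and inspect whether toxicity is removed.
    test = ["<toxic sentence in a language unseen during fine-tuning>"]
    out = model.generate(**tokenizer(test, return_tensors="pt"),
                         max_new_tokens=32)
    print(tokenizer.decode(out[0], skip_special_tokens=True))

Per the abstract's finding, this last cross-lingual step is exactly where the tested models fall short without fine-tuning in the target language.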
Anthology ID:
2022.acl-srw.26
Original:
2022.acl-srw.26v1
Version 2:
2022.acl-srw.26v2
Version 3:
2022.acl-srw.26v3
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Samuel Louvan, Andrea Madotto, Brielen Madureira
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
346–354
URL:
https://aclanthology.org/2022.acl-srw.26
DOI:
10.18653/v1/2022.acl-srw.26
Cite (ACL):
Daniil Moskovskiy, Daryna Dementieva, and Alexander Panchenko. 2022. Exploring Cross-lingual Text Detoxification with Large Multilingual Language Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 346–354, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Exploring Cross-lingual Text Detoxification with Large Multilingual Language Models. (Moskovskiy et al., ACL 2022)
PDF:
https://preview.aclanthology.org/landing_page/2022.acl-srw.26.pdf
Code:
skoltech-nlp/multilingual_detox