Towards Formality-Aware Neural Machine Translation by Leveraging Context Information

Dohee Kim, Yujin Baek, Soyoung Yang, Jaegul Choo


Abstract
Formality is one of the most important linguistic properties determining the naturalness of a translation. Although the target-side context contains formality-related tokens, their sparsity within the context makes it difficult for context-aware neural machine translation (NMT) models to discern them properly. In this paper, we introduce a novel training method that explicitly informs the NMT model by pinpointing key informative tokens using a formality classifier. Given a target context, the formality classifier guides the model to concentrate on the formality-related tokens within that context. Additionally, we modify the standard cross-entropy loss to place greater emphasis on the formality-related tokens identified by the classifier. Experimental results show that our approaches not only improve overall translation quality but also reflect the appropriate formality from the target context.
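The abstract mentions modifying the standard cross-entropy loss toward formality-related tokens. A minimal sketch of one such token-weighted loss is shown below; the function name, the `alpha` up-weighting factor, and the boolean mask are illustrative assumptions, not the paper's actual formulation.

```python
import math

def weighted_token_nll(token_probs, formality_mask, alpha=2.0):
    """Token-level negative log-likelihood where tokens flagged as
    formality-related (hypothetical mask from a classifier) get a
    larger weight `alpha`; all other tokens keep weight 1.0.
    Note: this is an illustrative sketch, not the paper's exact loss."""
    total, weight_sum = 0.0, 0.0
    for prob, is_formality_token in zip(token_probs, formality_mask):
        weight = alpha if is_formality_token else 1.0
        total += -weight * math.log(prob)  # weighted NLL for this token
        weight_sum += weight
    return total / weight_sum  # normalize by total weight

# Probabilities the model assigns to each reference token, with the
# second token flagged as formality-related by the classifier.
loss = weighted_token_nll([0.9, 0.5, 0.8], [False, True, False], alpha=2.0)
```

With `alpha > 1`, errors on flagged tokens dominate the average, so the gradient pushes the model harder toward producing the correct formality markers.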
Anthology ID:
2023.findings-emnlp.494
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
7384–7392
URL:
https://aclanthology.org/2023.findings-emnlp.494
DOI:
10.18653/v1/2023.findings-emnlp.494
Cite (ACL):
Dohee Kim, Yujin Baek, Soyoung Yang, and Jaegul Choo. 2023. Towards Formality-Aware Neural Machine Translation by Leveraging Context Information. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 7384–7392, Singapore. Association for Computational Linguistics.
Cite (Informal):
Towards Formality-Aware Neural Machine Translation by Leveraging Context Information (Kim et al., Findings 2023)
PDF:
https://preview.aclanthology.org/naacl24-info/2023.findings-emnlp.494.pdf