Mitigating Domain Mismatch in Machine Translation via Paraphrasing

Hyuga Koretaka, Tomoyuki Kajiwara, Atsushi Fujita, Takashi Ninomiya


Abstract
The quality of machine translation (MT) deteriorates significantly when the input text has characteristics, such as content domain, that differ from those of the training data. Although previous studies have focused on adapting MT models using a bilingual parallel corpus in the target domain, this approach is not applicable when no parallel data are available for the target domain or when a black-box MT system is used. To mitigate the problems caused by such domain mismatch without relying on any corpus in the target domain, this study proposes a method that searches for better translations by paraphrasing the input texts of MT. To obtain better translations even for input texts from domains that are not known in advance, we generate multiple paraphrases of each input, translate each paraphrase, and rerank the resulting translations to select the most likely one. Experimental results on Japanese-to-English translation reveal that the proposed method improves translation quality in terms of BLEU score for input texts from specific domains.
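The paraphrase, translate, and rerank steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `translate_via_paraphrases`, the three callables (`paraphrase`, `translate`, `score`), and the toy stand-ins are all hypothetical, and the abstract does not specify which paraphraser, MT system, or reranking criterion the paper uses. Keeping the original input among the candidates is likewise an assumption of this sketch.

```python
from typing import Callable, List, Tuple


def translate_via_paraphrases(
    source: str,
    paraphrase: Callable[[str], List[str]],
    translate: Callable[[str], str],
    score: Callable[[str, str], float],
) -> Tuple[str, str]:
    """Return the (input text, translation) pair with the highest score.

    All three callables are hypothetical placeholders:
      paraphrase(text)        -> list of paraphrases of the input
      translate(text)         -> translation from a (possibly black-box) MT system
      score(source, hypothesis) -> higher means a more likely translation
    """
    # Assumption: the original input stays in the candidate pool,
    # so the search can fall back to the unparaphrased translation.
    candidates = [source] + [p for p in paraphrase(source) if p != source]

    # Translate every candidate input independently.
    hypotheses = [(text, translate(text)) for text in candidates]

    # Rerank: max() keeps the first best-scoring pair, so ties favor
    # the original input, which is listed first.
    return max(hypotheses, key=lambda pair: score(source, pair[1]))


# Toy stand-ins, for illustration only. A real setup would plug in a
# paraphrase generator, an MT system, and a translation-scoring model.
def toy_paraphrase(text: str) -> List[str]:
    return [text.lower(), text + " please"]


def toy_translate(text: str) -> str:
    return text.upper()


def toy_length_match_score(source: str, hypothesis: str) -> float:
    # Prefers hypotheses whose length is close to the source length.
    return -abs(len(hypothesis) - len(source))


chosen_input, chosen_translation = translate_via_paraphrases(
    "Hello", toy_paraphrase, toy_translate, toy_length_match_score
)
```

Because the MT system is only ever called through `translate`, the sketch needs no access to model parameters or in-domain data, which matches the black-box setting the abstract targets.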
Anthology ID:
2023.wat-1.2
Volume:
Proceedings of the 10th Workshop on Asian Translation
Month:
September
Year:
2023
Address:
Macau SAR, China
Editors:
Toshiaki Nakazawa, Kazutaka Kinugawa, Hideya Mino, Isao Goto, Raj Dabre, Shohei Higashiyama, Shantipriya Parida, Makoto Morishita, Ondrej Bojar, Akiko Eriguchi, Yusuke Oda, Chenhui Chu, Sadao Kurohashi
Venue:
WAT
Publisher:
Asia-Pacific Association for Machine Translation
Pages:
29–40
URL:
https://aclanthology.org/2023.wat-1.2
Cite (ACL):
Hyuga Koretaka, Tomoyuki Kajiwara, Atsushi Fujita, and Takashi Ninomiya. 2023. Mitigating Domain Mismatch in Machine Translation via Paraphrasing. In Proceedings of the 10th Workshop on Asian Translation, pages 29–40, Macau SAR, China. Asia-Pacific Association for Machine Translation.
Cite (Informal):
Mitigating Domain Mismatch in Machine Translation via Paraphrasing (Koretaka et al., WAT 2023)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.wat-1.2.pdf