The MAKE-NMTVIZ System Description for the WMT23 Literary Task

Fabien Lopez; Gabriela González; Damien Hansen; Mariam Nakhlé; Behnoosh Namdarzadeh; Nicolas Ballier; Marco Dinarelli; Emmanuelle Esperança-Rodier; Sui He; Sadaf Mohseni; Caroline Rossi; Didier Schwab; Jun Yang; Jean-Baptiste Yunès; Lichao Zhu

doi:10.18653/v1/2023.wmt-1.30

Abstract

This paper describes the MAKE-NMTVIZ systems trained for the WMT 2023 Literary task. For our primary submission, we used the Train, Valid1, and test1 portions of the GuoFeng corpus (Wang et al., 2023) to fine-tune the mBART50 model on Chinese-English data. We followed training parameters very similar to those of Lee et al. (2022) when fine-tuning mBART50: we trained for 3 epochs, using GELU as the activation function, with a learning rate of 0.05, a dropout of 0.1, and a batch size of 16. We decoded using a beam search of size 5. For our contrastive1 submission, we implemented a fine-tuned concatenation transformer (Lupo et al., 2023). Training proceeded in two steps: (i) a sentence-level transformer was trained for 10 epochs on general, test1, and valid1 data (see the contrastive2 system for details); (ii) we then fine-tuned at document level using 3-sentence concatenation for 4 epochs on train, test2, and valid2 data. During fine-tuning, we used ReLU as the activation function, with an inverse square root learning rate schedule, a dropout of 0.1, and a batch size of 64. We decoded using a beam search of size. For our contrastive2 and last submission, we implemented a sentence-level transformer model (Vaswani et al., 2017). The model was trained for 10 epochs on general-purpose, test1, and valid1 data. The training parameters were an inverse square root learning rate schedule, a dropout of 0.1, and a batch size of 64. We decoded using a beam search of size 4. We then compared the three translation outputs from an interdisciplinary perspective, investigating some of the effects of sentence-level vs. document-level training. Computer scientists, translators, and corpus linguists discussed the remaining linguistic issues for this discourse-level literary translation.
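
The two ingredients of the contrastive systems named in the abstract, 3-sentence concatenation and an inverse square root learning rate schedule, can be illustrated with a minimal Python sketch. This is not the authors' code. The separator token `<sep>`, the sliding-window policy (each sentence preceded by up to two previous sentences), the linear warmup, and the values of `peak_lr` and `warmup_steps` are all assumptions introduced for illustration; the abstract does not specify them.

```python
from typing import List

# Hypothetical separator token; the abstract does not name one.
SEP = " <sep> "


def concat_windows(sentences: List[str], size: int = 3, sep: str = SEP) -> List[str]:
    """Build one input per sentence: the sentence preceded by up to
    `size - 1` previous sentences of the same document (assumed policy)."""
    if size < 1:
        raise ValueError("size must be >= 1")
    windows = []
    for i in range(len(sentences)):
        start = max(0, i - size + 1)  # fewer context sentences at document start
        windows.append(sep.join(sentences[start:i + 1]))
    return windows


def inverse_sqrt_lr(step: int, peak_lr: float, warmup_steps: int) -> float:
    """Inverse square root schedule with linear warmup (assumed form):
    the rate rises linearly to `peak_lr`, then decays as 1 / sqrt(step)."""
    if step < 1 or warmup_steps < 1:
        raise ValueError("step and warmup_steps must be >= 1")
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (warmup_steps / step) ** 0.5


if __name__ == "__main__":
    doc = ["S1", "S2", "S3", "S4"]
    for window in concat_windows(doc):
        print(window)
    # Illustrative values only; not taken from the paper.
    print(inverse_sqrt_lr(step=16000, peak_lr=1e-3, warmup_steps=4000))
```

With `size=3`, the first two sentences of a document receive shorter contexts, and every later sentence is paired with its two predecessors. Under this form of the schedule, the learning rate halves each time the step count quadruples after warmup.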

Anthology ID: 2023.wmt-1.30
Volume: Proceedings of the Eighth Conference on Machine Translation
Month: December
Year: 2023
Address: Singapore
Editors: Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue: WMT
SIG: SIGMT
Publisher: Association for Computational Linguistics
Pages: 287–295
URL: https://aclanthology.org/2023.wmt-1.30
DOI: 10.18653/v1/2023.wmt-1.30
Cite (ACL): Fabien Lopez, Gabriela González, Damien Hansen, Mariam Nakhle, Behnoosh Namdarzadeh, Nicolas Ballier, Marco Dinarelli, Emmanuelle Esperança-Rodier, Sui He, Sadaf Mohseni, Caroline Rossi, Didier Schwab, Jun Yang, Jean-Baptiste Yunès, and Lichao Zhu. 2023. The MAKE-NMTVIZ System Description for the WMT23 Literary Task. In Proceedings of the Eighth Conference on Machine Translation, pages 287–295, Singapore. Association for Computational Linguistics.
Cite (Informal): The MAKE-NMTVIZ System Description for the WMT23 Literary Task (Lopez et al., WMT 2023)
PDF: https://preview.aclanthology.org/nschneid-patch-3/2023.wmt-1.30.pdf
