MAX-ISI System at WMT23 Discourse-Level Literary Translation Task

Li An, Linghao Jin, Xuezhe Ma


Abstract
This paper describes our translation systems for the WMT23 shared task. We participated in the discourse-level literary translation task (constrained track). In our methodology, we compare the conventional Transformer model with the recently introduced MEGA model, which models long-range sequences more effectively than the standard Transformer. To explore whether language models can better exploit document-level context when given paragraph-level data, we aggregated sentences into paragraphs from the original literary dataset provided by the organizers and used this paragraph-level data with both the Transformer and MEGA models. To ensure a fair comparison across all systems, we employed a sentence-alignment strategy to map our paragraph-level translations back to sentence-level alignment. Finally, we evaluated all systems with the sentence-level metric BLEU as well as two document-level metrics, d-BLEU and BlonDe.
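For illustration, the sketch below shows the two evaluation granularities the abstract mentions: sentence-level BLEU and document-level d-BLEU, where d-BLEU is computed after concatenating each document's sentences so n-grams can cross sentence boundaries. It assumes sacrebleu-style scoring; the helper names and the per-document sentence-count format are hypothetical, not the authors' released code.

```python
# Minimal, hypothetical sketch of sentence-level BLEU vs. document-level d-BLEU
# using sacrebleu. Helper names and data layout are illustrative assumptions.
from typing import List
import sacrebleu


def sentences_to_documents(sentences: List[str], doc_lens: List[int]) -> List[str]:
    """Concatenate consecutive sentences into documents, given per-document sentence counts."""
    docs, start = [], 0
    for n in doc_lens:
        docs.append(" ".join(sentences[start:start + n]))
        start += n
    return docs


def evaluate(hyps: List[str], refs: List[str], doc_lens: List[int]) -> None:
    # Sentence-level BLEU over aligned hypothesis/reference sentences.
    bleu = sacrebleu.corpus_bleu(hyps, [refs])
    # d-BLEU: the same metric computed over concatenated documents.
    d_bleu = sacrebleu.corpus_bleu(
        sentences_to_documents(hyps, doc_lens),
        [sentences_to_documents(refs, doc_lens)],
    )
    print(f"BLEU  : {bleu.score:.2f}")
    print(f"d-BLEU: {d_bleu.score:.2f}")


if __name__ == "__main__":
    hyps = ["He opened the door .", "The rain had stopped ."]
    refs = ["He opened the door .", "The rain had already stopped ."]
    evaluate(hyps, refs, doc_lens=[2])
```

The same document grouping would also cover the paragraph-level setup described above: paragraph translations are split and realigned to sentences before sentence-level scoring, while document-level metrics score the concatenated text directly.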
Anthology ID:
2023.wmt-1.29
Volume:
Proceedings of the Eighth Conference on Machine Translation
Month:
December
Year:
2023
Address:
Singapore
Editors:
Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
282–286
URL:
https://aclanthology.org/2023.wmt-1.29
DOI:
10.18653/v1/2023.wmt-1.29
Cite (ACL):
Li An, Linghao Jin, and Xuezhe Ma. 2023. MAX-ISI System at WMT23 Discourse-Level Literary Translation Task. In Proceedings of the Eighth Conference on Machine Translation, pages 282–286, Singapore. Association for Computational Linguistics.
Cite (Informal):
MAX-ISI System at WMT23 Discourse-Level Literary Translation Task (An et al., WMT 2023)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2023.wmt-1.29.pdf