Abstract
In this paper, we re-examine the Markov property in the context of neural machine translation. We design a Markov Autoregressive Transformer (MAT) and undertake a comprehensive assessment of its performance across four WMT benchmarks. Our findings indicate that MAT with an order larger than 4 can generate translations with quality on par with that of conventional autoregressive transformers. In addition, counter-intuitively, we also find that the advantages of utilizing a higher-order MAT do not specifically contribute to the translation of longer sentences.
- Anthology ID:
- 2024.findings-eacl.40
- Volume:
- Findings of the Association for Computational Linguistics: EACL 2024
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian’s, Malta
- Editors:
- Yvette Graham, Matthew Purver
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 582–588
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2024.findings-eacl.40/
- Cite (ACL):
- Cunxiao Du, Hao Zhou, Zhaopeng Tu, and Jing Jiang. 2024. Revisiting the Markov Property for Machine Translation. In Findings of the Association for Computational Linguistics: EACL 2024, pages 582–588, St. Julian’s, Malta. Association for Computational Linguistics.
- Cite (Informal):
- Revisiting the Markov Property for Machine Translation (Du et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2024.findings-eacl.40.pdf