Abstract
We describe our two NMT systems submitted to the WMT2021 shared task in English-Czech news translation: CUNI-DocTransformer (document-level CUBBITT) and CUNI-Marian-Baselines. We improve the former with better sentence-segmentation pre-processing and with post-processing that fixes errors in numbers and units. We use the latter for experiments with various backtranslation techniques.
- Anthology ID:
- 2021.wmt-1.7
- Volume:
- Proceedings of the Sixth Conference on Machine Translation
- Month:
- November
- Year:
- 2021
- Address:
- Online
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 123–129
- URL:
- https://aclanthology.org/2021.wmt-1.7
- Cite (ACL):
- Petr Gebauer, Ondřej Bojar, Vojtěch Švandelík, and Martin Popel. 2021. CUNI Systems in WMT21: Revisiting Backtranslation Techniques for English-Czech NMT. In Proceedings of the Sixth Conference on Machine Translation, pages 123–129, Online. Association for Computational Linguistics.
- Cite (Informal):
- CUNI Systems in WMT21: Revisiting Backtranslation Techniques for English-Czech NMT (Gebauer et al., WMT 2021)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2021.wmt-1.7.pdf