Robust parfda Statistical Machine Translation Results

Ergun Biçici


Abstract
We build parallel feature decay algorithms (parfda) Moses statistical machine translation (SMT) models for language pairs in the translation task. parfda obtains results close to the top constrained phrase-based SMT, with an average difference of 2.252 BLEU points on WMT 2017 datasets, using significantly less computation for building SMT systems than would be spent using all available corpora. We obtain BLEU upper bounds based on target coverage to identify which systems used additional data. We use PRO for tuning to decrease fluctuations in the results and postprocess translation outputs to decrease translation errors due to the casing of words. F1 scores on the key phrases of the English to Turkish test suite that we prepared reveal that parfda achieves the second-best results. Truecasing translations before scoring obtained the best results overall.
Anthology ID:
W18-6405
Volume:
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
Month:
October
Year:
2018
Address:
Belgium, Brussels
Editors:
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
345–354
URL:
https://aclanthology.org/W18-6405
DOI:
10.18653/v1/W18-6405
Cite (ACL):
Ergun Biçici. 2018. Robust parfda Statistical Machine Translation Results. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pages 345–354, Belgium, Brussels. Association for Computational Linguistics.
Cite (Informal):
Robust parfda Statistical Machine Translation Results (Biçici, WMT 2018)
PDF:
https://preview.aclanthology.org/emnlp22-frontmatter/W18-6405.pdf