Syntax-Based Statistical Machine Translation

Philip Williams, Philipp Koehn


Abstract
The tutorial explains in detail syntax-based statistical machine translation with synchronous context-free grammars (SCFG). It is aimed at researchers who have little background in this area, and gives a comprehensive overview of the main models and methods. While syntax-based models in statistical machine translation have a long history, going back almost 20 years, they have only recently shown superior translation quality over the more commonly used phrase-based models, and are now considered state of the art for some language pairs, such as Chinese-English (since ISI's submission to NIST 2006) and English-German (since Edinburgh's submission to WMT 2012). While the field is very dynamic, a core set of methods has become dominant. Such SCFG models are implemented in the open source machine translation toolkit Moses, and the tutors draw on practical experience from its development. The tutorial focuses on explaining core established concepts in SCFG-based approaches, which are the most popular in this area. The main goal of the tutorial is for the audience to understand how these systems work end-to-end. We review as much relevant literature as necessary, but the tutorial is not primarily a research survey. The tutorial is rounded off with open problems and advanced topics, such as computational challenges, different formalisms for syntax-based models, and the inclusion of semantics.
Anthology ID:
D14-2005
Volume:
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts
Month:
October
Year:
2014
Address:
Doha, Qatar
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/D14-2005
Cite (ACL):
Philip Williams and Philipp Koehn. 2014. Syntax-Based Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts, Doha, Qatar. Association for Computational Linguistics.
Cite (Informal):
Syntax-Based Statistical Machine Translation (Williams & Koehn, EMNLP 2014)