Abstract
Incorporating syntactic information in Neural Machine Translation (NMT) can lead to better reorderings, which is particularly useful when the language pairs are syntactically highly divergent or when the training bitext is not large. Previous work on using syntactic information, provided by top-1 parse trees generated by (inevitably error-prone) parsers, has been promising. In this paper, we propose a forest-to-sequence NMT model that makes use of exponentially many parse trees of the source sentence to compensate for parser errors. Our method represents the collection of parse trees as a packed forest and learns a neural transducer to translate from the input forest to the target sentence. Experiments on English-to-German, English-to-Chinese, and English-to-Farsi translation tasks show the superiority of our approach over sequence-to-sequence and tree-to-sequence neural translation models.
- Anthology ID:
- C18-1120
- Volume:
- Proceedings of the 27th International Conference on Computational Linguistics
- Month:
- August
- Year:
- 2018
- Address:
- Santa Fe, New Mexico, USA
- Editors:
- Emily M. Bender, Leon Derczynski, Pierre Isabelle
- Venue:
- COLING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1421–1429
- URL:
- https://aclanthology.org/C18-1120
- Cite (ACL):
- Poorya Zaremoodi and Gholamreza Haffari. 2018. Incorporating Syntactic Uncertainty in Neural Machine Translation with a Forest-to-Sequence Model. In Proceedings of the 27th International Conference on Computational Linguistics, pages 1421–1429, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal):
- Incorporating Syntactic Uncertainty in Neural Machine Translation with a Forest-to-Sequence Model (Zaremoodi & Haffari, COLING 2018)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/C18-1120.pdf