Maryam Siahbani

2018

Simultaneous Translation using Optimized Segmentation
Maryam Siahbani | Hassan Shavarani | Ashkan Alinejad | Anoop Sarkar
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

Prediction Improves Simultaneous Neural Machine Translation
Ashkan Alinejad | Maryam Siahbani | Anoop Sarkar
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Simultaneous speech translation aims to maintain translation quality while minimizing the delay between reading input and incrementally producing the output. We propose a new general-purpose prediction action which predicts future words in the input to improve quality and minimize delay in simultaneous translation. We train this agent using reinforcement learning with a novel reward function. Our agent with prediction has better translation quality and less delay compared to an agent-based simultaneous translation system without prediction.

2017

Lexicalized Reordering for Left-to-Right Hierarchical Phrase-based Translation
Maryam Siahbani | Anoop Sarkar
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Phrase-based and hierarchical phrase-based (Hiero) translation models differ radically in the way reordering is modeled. Lexicalized reordering models play an important role in phrase-based MT, and such models have been added to CKY-based decoders for Hiero. Watanabe et al. (2006) proposed a promising decoding algorithm for Hiero (LR-Hiero) that visits input spans in arbitrary order and produces the translation in left-to-right (LR) order, which leads to far fewer language model calls and a considerable speedup in decoding. We introduce a novel shift-reduce algorithm to LR-Hiero to decode with our lexicalized reordering model (LRM) and show that it improves translation quality for Czech-English, Chinese-English and German-English.

2015

Learning segmentations that balance latency versus quality in spoken language translation
Hassan Shavarani | Maryam Siahbani | Ramtin Mehdizadeh Seraj | Anoop Sarkar
Proceedings of the 12th International Workshop on Spoken Language Translation: Papers

Improving Statistical Machine Translation with a Multilingual Paraphrase Database
Ramtin Mehdizadeh Seraj | Maryam Siahbani | Anoop Sarkar
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

Expressive hierarchical rule extraction for left-to-right translation
Maryam Siahbani | Anoop Sarkar
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track

Left-to-right (LR) decoding (Watanabe et al., 2006) is a promising decoding algorithm for hierarchical phrase-based translation (Hiero) that visits input spans in arbitrary order while producing the output translation in left-to-right order. This leads to far fewer language model calls. However, the constrained SCFG used in LR-Hiero (GNF), with at most two non-terminals, is unable to account for some complex phrasal reordering. Allowing more non-terminals in the rules results in a more expressive grammar. LR-decoding can be used to decode with SCFGs with more than two non-terminals, but the CKY decoders used for Hiero systems cannot deal with such expressive grammars due to a blowup in computational complexity. In this paper we present a dynamic programming algorithm for GNF rule extraction which efficiently extracts sentence-level SCFG rule sets with an arbitrary number of non-terminals. We analyze the performance of the obtained grammar for statistical machine translation on three language pairs.

Two Improvements to Left-to-Right Decoding for Hierarchical Phrase-based Machine Translation
Maryam Siahbani | Anoop Sarkar
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)