An exploration of segmentation strategies in stream decoding

Andrew Finch, Xiaolin Wang, Eiichiro Sumita


Abstract
In this paper we explore segmentation strategies for the stream decoder, a method for decoding from a continuous stream of input tokens rather than the traditional method of decoding from sentence-segmented text. The behavior of the decoder is analyzed, and modifications to the decoding algorithm are proposed to improve its performance. The experimental results show our proposed decoding strategies to be effective, and add support to the original findings that this approach is capable of approaching the performance of the underlying phrase-based machine translation decoder at useful levels of latency. Our experiments evaluated the stream decoder on a broader set of language pairs than in previous work. We found most European language pairs were similar in character, and report results on English-Chinese and English-German pairs, which are of interest due to the reordering required.
Anthology ID:
2014.iwslt-papers.8
Volume:
Proceedings of the 11th International Workshop on Spoken Language Translation: Papers
Month:
December 4-5
Year:
2014
Address:
Lake Tahoe, California
Venue:
IWSLT
Pages:
206–213
URL:
https://aclanthology.org/2014.iwslt-papers.8
Cite (ACL):
Andrew Finch, Xiaolin Wang, and Eiichiro Sumita. 2014. An exploration of segmentation strategies in stream decoding. In Proceedings of the 11th International Workshop on Spoken Language Translation: Papers, pages 206–213, Lake Tahoe, California.
Cite (Informal):
An exploration of segmentation strategies in stream decoding (Finch et al., IWSLT 2014)
PDF:
https://preview.aclanthology.org/update-css-js/2014.iwslt-papers.8.pdf