Hendra Setiawan


2023

Joint Speech Transcription and Translation: Pseudo-Labeling with Out-of-Distribution Data
Mozhdeh Gheini | Tatiana Likhomanenko | Matthias Sperber | Hendra Setiawan
Findings of the Association for Computational Linguistics: ACL 2023

Self-training has been shown to be helpful in addressing data scarcity for many domains, including vision, speech, and language. Specifically, self-training, or pseudo-labeling, labels unsupervised data and adds that to the training pool. In this work, we investigate and use pseudo-labeling for a recently proposed novel setup: joint transcription and translation of speech, which suffers from an absence of sufficient parallel data resources. We show that under such data-deficient circumstances, the unlabeled data can significantly vary in domain from the supervised data, which results in pseudo-label quality degradation. We investigate two categories of remedies that require no additional supervision and target the domain mismatch: pseudo-label filtering and data augmentation. We show that pseudo-label analysis and processing in this way results in additional gains on top of the vanilla pseudo-labeling setup providing a total improvement of up to 0.4% absolute WER and 2.1 BLEU points for En–De and 0.6% absolute WER and 2.2 BLEU points for En–Zh.

2022

End-to-End Speech Translation for Code Switched Speech
Orion Weller | Matthias Sperber | Telmo Pires | Hendra Setiawan | Christian Gollan | Dominic Telaar | Matthias Paulik
Findings of the Association for Computational Linguistics: ACL 2022

Code switching (CS) refers to the phenomenon of interchangeably using words and phrases from different languages. CS can pose significant accuracy challenges to NLP, due to the often monolingual nature of the underlying systems. In this work, we focus on CS in the context of English/Spanish conversations for the task of speech translation (ST), generating and evaluating both transcript and translation. To evaluate model performance on this task, we create a novel ST corpus derived from existing public data sets. We explore various ST architectures across two dimensions: cascaded (transcribe then translate) vs end-to-end (jointly transcribe and translate) and unidirectional (source -> target) vs bidirectional (source <-> target). We show that our ST architectures, and especially our bidirectional end-to-end architecture, perform well on CS speech, even when no CS training data is used.

2020

Variational Neural Machine Translation with Normalizing Flows
Hendra Setiawan | Matthias Sperber | Udhyakumar Nallasamy | Matthias Paulik
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Variational Neural Machine Translation (VNMT) is an attractive framework for modeling the generation of target translations, conditioned not only on the source sentence but also on some latent random variables. The latent variable modeling may introduce useful statistical dependencies that can improve translation accuracy. Unfortunately, learning informative latent variables is non-trivial, as the latent space can be prohibitively large, and the latent codes are prone to be ignored by many translation models at training time. Previous works impose strong assumptions on the distribution of the latent code and limit the choice of the NMT architecture. In this paper, we propose to apply the VNMT framework to the state-of-the-art Transformer and introduce a more flexible approximate posterior based on normalizing flows. We demonstrate the efficacy of our proposal under both in-domain and out-of-domain conditions, significantly outperforming strong baselines.

Consistent Transcription and Translation of Speech
Matthias Sperber | Hendra Setiawan | Christian Gollan | Udhyakumar Nallasamy | Matthias Paulik
Transactions of the Association for Computational Linguistics, Volume 8

The conventional paradigm in speech translation starts with a speech recognition step to generate transcripts, followed by a translation step with the automatic transcripts as input. To address various shortcomings of this paradigm, recent work explores end-to-end trainable direct models that translate without transcribing. However, transcripts can be an indispensable output in practical applications, which often display transcripts alongside the translations to users. We make this common requirement explicit and explore the task of jointly transcribing and translating speech. Although high accuracy of transcript and translation are crucial, even highly accurate systems can suffer from inconsistencies between both outputs that degrade the user experience. We introduce a methodology to evaluate consistency and compare several modeling approaches, including the traditional cascaded approach and end-to-end models. We find that direct models are poorly suited to the joint transcription/translation task, but that end-to-end models that feature a coupled inference procedure are able to achieve strong consistency. We further introduce simple techniques for directly optimizing for consistency, and analyze the resulting trade-offs between consistency, transcription accuracy, and translation accuracy.

2015

Statistical Machine Translation Features with Multitask Tensor Networks
Hendra Setiawan | Zhongqiang Huang | Jacob Devlin | Thomas Lamar | Rabih Zbib | Richard Schwartz | John Makhoul
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2013

Anchor Graph: Global Reordering Contexts for Statistical Machine Translation
Hendra Setiawan | Bowen Zhou | Bing Xiang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Discriminative Training of 150 Million Translation Parameters and Its Application to Pruning
Hendra Setiawan | Bowen Zhou
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Two-Neighbor Orientation Model with Cross-Boundary Global Contexts
Hendra Setiawan | Bowen Zhou | Bing Xiang | Libin Shen
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2010

Discriminative Word Alignment with a Function Word Reordering Model
Hendra Setiawan | Chris Dyer | Philip Resnik
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Generalizing Hierarchical Phrase-based Translation using Rules with Adjacent Nonterminals
Hendra Setiawan | Philip Resnik
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models
Chris Dyer | Adam Lopez | Juri Ganitkevitch | Jonathan Weese | Ferhan Ture | Phil Blunsom | Hendra Setiawan | Vladimir Eidelman | Philip Resnik
Proceedings of the ACL 2010 System Demonstrations

2009

Topological Ordering of Function Words in Hierarchical Phrase-based Translation
Hendra Setiawan | Min-Yen Kan | Haizhou Li | Philip Resnik
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

The University of Maryland Statistical Machine Translation System for the Fourth Workshop on Machine Translation
Chris Dyer | Hendra Setiawan | Yuval Marton | Philip Resnik
Proceedings of the Fourth Workshop on Statistical Machine Translation

2007

Ordering Phrases with Function Words
Hendra Setiawan | Min-Yen Kan | Haizhou Li
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2005

Learning Phrase Translation using Level of Detail Approach
Hendra Setiawan | Haizhou Li | Min Zhang
Proceedings of Machine Translation Summit X: Papers

We propose a simplified Level Of Detail (LOD) algorithm to learn phrase translation for statistical machine translation. In particular, LOD learns unknown phrase translations from parallel texts without linguistic knowledge. LOD uses an agglomerative method to attack the combinatorial explosion that results when generating candidate phrase translations. Although LOD was previously proposed by (Setiawan et al., 2005), we improve the original algorithm in two ways: simplifying the algorithm and using a simpler translation model. Experimental results show that our algorithm provides comparable performance while demonstrating a significant reduction in computation time.

Phrase-Based Statistical Machine Translation: A Level of Detail Approach
Hendra Setiawan | Haizhou Li | Min Zhang | Beng Chin Ooi
Second International Joint Conference on Natural Language Processing: Full Papers

A Phrase-Based Context-Dependent Joint Probability Model for Named Entity Translation
Min Zhang | Haizhou Li | Jian Su | Hendra Setiawan
Second International Joint Conference on Natural Language Processing: Full Papers