Mei Yang


LITE: Intent-based Task Representation Learning Using Weak Supervision
Naoki Otani | Michael Gamon | Sujay Kumar Jauhar | Mei Yang | Sri Raghu Malireddi | Oriana Riva
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Users write to-dos as personal notes to themselves, about things they need to complete, remember or organize. To-do texts are usually short and under-specified, which poses a challenge for current text representation models. Yet, understanding and representing their meaning is the first step towards providing intelligent assistance for to-do management. We address this problem by proposing a neural multi-task learning framework, LITE, which extracts representations of English to-do tasks with a multi-head attention mechanism on top of a pre-trained text encoder. To adapt representation models to to-do texts, we collect weak-supervision labels from semantically rich external resources (e.g., dynamic commonsense knowledge bases), following the principle that to-do tasks with similar intents have similar labels. We then train the model on multiple generative/predictive training objectives jointly. We evaluate our representation model on four downstream tasks and show that our approach consistently improves performance over baseline models, achieving error reduction of up to 38.7%.


Unsupervised Translation Disambiguation for Cross-Domain Statistical Machine Translation
Mei Yang | Katrin Kirchhoff
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers

Most attempts at integrating word sense disambiguation with statistical machine translation have focused on supervised disambiguation approaches. These approaches are of limited use when the distribution of the test data differs strongly from that of the training data; however, word sense errors tend to be especially common under these conditions. In this paper we present different approaches to unsupervised word translation disambiguation and apply them to the problem of translating conversational speech under resource-poor training conditions. Both human and automatic evaluation metrics demonstrate significant improvements resulting from our technique.


Contextual Modeling for Meeting Translation Using Unsupervised Word Sense Disambiguation
Mei Yang | Katrin Kirchhoff
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)


Toward Smaller, Faster, and Better Hierarchical Phrase-based SMT
Mei Yang | Jing Zheng
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

The University of Washington machine translation system for IWSLT 2009
Mei Yang | Amittai Axelrod | Kevin Duh | Katrin Kirchhoff
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the University of Washington’s system for the 2009 International Workshop on Spoken Language Translation (IWSLT) evaluation campaign. Two systems were developed, one each for the BTEC Chinese-to-English and Arabic-to-English tracks. We describe experiments with different preprocessing and alignment combination schemes. Our main focus this year was on exploring a novel semi-supervised approach to N-best list reranking; however, this method yielded inconclusive results.


The University of Washington Machine Translation System for ACL WMT 2008
Amittai Axelrod | Mei Yang | Kevin Duh | Katrin Kirchhoff
Proceedings of the Third Workshop on Statistical Machine Translation

Indirect-HMM-based Hypothesis Alignment for Combining Outputs from Machine Translation Systems
Xiaodong He | Mei Yang | Jianfeng Gao | Patrick Nguyen | Robert Moore
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing


The University of Washington machine translation system for the IWSLT 2007 competition
Katrin Kirchhoff | Mei Yang
Proceedings of the Fourth International Workshop on Spoken Language Translation

This paper presents the University of Washington’s submission to the 2007 IWSLT benchmark evaluation. The UW system participated in two data tracks, Italian-to-English and Arabic-to-English. Our main focus was on incorporating out-of-domain data, which contributed to improvements for both language pairs in both the clean text and ASR output conditions. In addition, we compared supervised and semi-supervised preprocessing schemes for the Arabic-to-English task and found that the semi-supervised scheme performs competitively with the supervised algorithm while using a fraction of the run-time.


Phrase-Based Backoff Models for Machine Translation of Highly Inflected Languages
Mei Yang | Katrin Kirchhoff
11th Conference of the European Chapter of the Association for Computational Linguistics


Improved Language Modeling for Statistical Machine Translation
Katrin Kirchhoff | Mei Yang
Proceedings of the ACL Workshop on Building and Using Parallel Texts