Yuntian Deng

2023

pdf abs
Tree Prompting: Efficient Task Adaptation without Fine-Tuning
Chandan Singh | John Morris | Alexander Rush | Jianfeng Gao | Yuntian Deng
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Prompting language models (LMs) is the main interface for applying them to new tasks. However, for smaller LMs, prompting provides low accuracy compared to gradient-based fine-tuning. Tree Prompting is an approach to prompting which builds a decision tree of prompts, linking multiple prompt-LM calls together to solve a task. At inference time, each call to the LM is determined by efficiently routing the outcome of the previous call using the tree. Experiments on classification datasets show that Tree Prompting improves accuracy over competing methods and is competitive with fine-tuning. We also show that variants of Tree Prompting allow inspection of a model’s decision-making process.

2022

pdf abs
Model Criticism for Long-Form Text Generation
Yuntian Deng | Volodymyr Kuleshov | Alexander Rush
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Language models have demonstrated the ability to generate highly fluent text; however, it remains unclear whether their output retains coherent high-level structure (e.g., story progression). Here, we propose to apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of the generated text. Model criticism compares the distributions between real and generated data in a latent space obtained according to an assumptive generative process. Different generative processes identify specific failure modes of the underlying model. We perform experiments on three representative aspects of high-level discourse—coherence, coreference, and topicality—and find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.

2021

pdf abs
Sequence-to-Lattice Models for Fast Translation
Yuntian Deng | Alexander Rush
Findings of the Association for Computational Linguistics: EMNLP 2021

Non-autoregressive machine translation (NAT) approaches enable fast generation by utilizing parallelizable generative processes. The remaining bottleneck in these models is their decoder layers; unfortunately unlike in autoregressive models (Kasai et al., 2020), removing decoder layers from NAT models significantly degrades accuracy. This work proposes a sequence-to-lattice model that replaces the decoder with a search lattice. Our approach first constructs a candidate lattice using efficient lookup operations, generates lattice scores from a deep encoder, and finally finds the best path using dynamic programming. Experiments on three machine translation datasets show that our method is faster than past non-autoregressive generation approaches, and more accurate than naively reducing the number of decoder layers.

pdf abs
Rationales for Sequential Predictions
Keyon Vafa | Yuntian Deng | David Blei | Alexander Rush
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain. We consider model explanations though rationales, subsets of context that can explain individual model predictions. We find sequential rationales by solving a combinatorial optimization: the best rationale is the smallest subset of input tokens that would predict the same output as the full sequence. Enumerating all subsets is intractable, so we propose an efficient greedy algorithm to approximate this objective. The algorithm, which is called greedy rationalization, applies to any model. For this approach to be effective, the model should form compatible conditional distributions when making predictions on incomplete subsets of the context. This condition can be enforced with a short fine-tuning step. We study greedy rationalization on language modeling and machine translation. Compared to existing baselines, greedy rationalization is best at optimizing the sequential objective and provides the most faithful rationales. On a new dataset of annotated sequential rationales, greedy rationales are most similar to human rationales.

2019

pdf abs
Neural Linguistic Steganography
Zachary Ziegler | Yuntian Deng | Alexander Rush
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Whereas traditional cryptography encrypts a secret message into an unintelligible form, steganography conceals that communication is taking place by encoding a secret message into a cover signal. Language is a particularly pragmatic cover signal due to its benign occurrence and independence from any one medium. Traditionally, linguistic steganography systems encode secret messages in existing text via synonym substitution or word order rearrangements. Advances in neural language models enable previously impractical generation-based techniques. We propose a steganography technique based on arithmetic coding with large-scale neural language models. We find that our approach can generate realistic looking cover sentences as evaluated by humans, while at the same time preserving security by matching the cover message distribution with the language model distribution.

2018

pdf
OpenNMT: Neural Machine Translation Toolkit
Guillaume Klein | Yoon Kim | Yuntian Deng | Vincent Nguyen | Jean Senellart | Alexander Rush
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

pdf abs
Bottom-Up Abstractive Summarization
Sebastian Gehrmann | Yuntian Deng | Alexander Rush
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Neural summarization produces outputs that are fluent and readable, but which can be poor at content selection, for instance often copying full sentences from the source document. This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary. We use this selector as a bottom-up attention step to constrain the model to likely phrases. We show that this approach improves the ability to compress text, while still generating fluent summaries. This two-step process is both simpler and higher performing than other end-to-end content selection models, leading to significant improvements on ROUGE for both the CNN-DM and NYT corpus. Furthermore, the content selector can be trained with as little as 1,000 sentences making it easy to transfer a trained summarizer to a new domain.

2017

pdf abs
Neural Machine Translation with Recurrent Attention Modeling
Zichao Yang | Zhiting Hu | Yuntian Deng | Chris Dyer | Alex Smola
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future. We improve upon the attention model of Bahdanau et al. (2014) by explicitly modeling the relationship between previous and subsequent attention levels for each word using one recurrent network per input word. This architecture easily captures informative features, such as fertility and regularities in relative distortion. In experiments, we show our parameterization of attention improves translation quality.

pdf
OpenNMT: Open-Source Toolkit for Neural Machine Translation
Guillaume Klein | Yoon Kim | Yuntian Deng | Jean Senellart | Alexander Rush
Proceedings of ACL 2017, System Demonstrations

2016