Li Deng


2018

Tensor Product Generation Networks for Deep NLP Modeling
Qiuyuan Huang | Paul Smolensky | Xiaodong He | Li Deng | Dapeng Wu
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We present a new approach to the design of deep networks for natural language processing (NLP), based on the general technique of Tensor Product Representations (TPRs) for encoding and processing symbol structures in distributed neural networks. A network architecture — the Tensor Product Generation Network (TPGN) — is proposed which is capable in principle of carrying out TPR computation, but which uses unconstrained deep learning to design its internal representations. Instantiated in a model for image-caption generation, TPGN outperforms LSTM baselines when evaluated on the COCO dataset. The TPR-capable structure enables interpretation of internal representations and operations, which prove to contain considerable grammatical content. Our caption-generation model can be interpreted as generating sequences of grammatical categories and retrieving words by their categories from a plan encoded as a distributed representation.

2017

Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension
David Golub | Po-Sen Huang | Xiaodong He | Li Deng
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We develop a technique for transfer learning in machine comprehension (MC) using a novel two-stage synthesis network. Given a high-performing MC model in one domain, our technique aims to answer questions about documents in another domain, where we use no labeled data of question-answer pairs. Using the proposed synthesis network with a model pretrained on the SQuAD dataset, we achieve an F1 measure of 46.6% on the challenging NewsQA dataset, approaching the performance of in-domain models (F1 measure of 50.0%) and outperforming the out-of-domain baseline by 7.6%, without use of provided annotations.

Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access
Bhuwan Dhingra | Lihong Li | Xiujun Li | Jianfeng Gao | Yun-Nung Chen | Faisal Ahmed | Li Deng
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper proposes KB-InfoBot, a multi-turn dialogue agent that helps users search Knowledge Bases (KBs) without composing complicated queries. Such goal-oriented dialogue agents typically need to interact with an external database to access real-world knowledge. Previous systems achieved this by issuing a symbolic query to the KB to retrieve entries based on their attributes. However, such symbolic operations break the differentiability of the system and prevent end-to-end training of neural dialogue agents. In this paper, we address this limitation by replacing symbolic queries with an induced “soft” posterior distribution over the KB that indicates which entities the user is interested in. Integrating the soft retrieval process with a reinforcement learner leads to a higher task success rate and reward, both in simulations and against real users. We also present a fully neural end-to-end agent, trained entirely from user feedback, and discuss its application towards personalized dialogue agents.

2016

Deep Reinforcement Learning with a Natural Language Action Space
Ji He | Jianshu Chen | Xiaodong He | Jianfeng Gao | Lihong Li | Li Deng | Mari Ostendorf
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads
Ji He | Mari Ostendorf | Xiaodong He | Jianshu Chen | Jianfeng Gao | Lihong Li | Li Deng
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Bi-directional Attention with Agreement for Dependency Parsing
Hao Cheng | Hao Fang | Xiaodong He | Jianfeng Gao | Li Deng
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2015

Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval
Xiaodong Liu | Jianfeng Gao | Xiaodong He | Li Deng | Kevin Duh | Ye-yi Wang
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Language Models for Image Captioning: The Quirks and What Works
Jacob Devlin | Hao Cheng | Hao Fang | Saurabh Gupta | Li Deng | Xiaodong He | Geoffrey Zweig | Margaret Mitchell
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

Modeling Interestingness with Deep Neural Networks
Jianfeng Gao | Patrick Pantel | Michael Gamon | Xiaodong He | Li Deng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Learning Continuous Phrase Representations for Translation Modeling
Jianfeng Gao | Xiaodong He | Wen-tau Yih | Li Deng
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

Maximum Expected BLEU Training of Phrase and Lexicon Translation Models
Xiaodong He | Li Deng
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2011

The MSR system for IWSLT 2011 evaluation
Xiaodong He | Amittai Axelrod | Li Deng | Alex Acero | Mei-Yuh Hwang | Alisa Nguyen | Andrew Wang | Xiahui Huang
Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the Microsoft Research (MSR) system for the evaluation campaign of the 2011 International Workshop on Spoken Language Translation (IWSLT). The evaluation task is to translate TED talks (www.ted.com). This task presents two unique challenges. First, the underlying topic switches sharply from talk to talk, so the translation system needs to adapt to the current topic quickly and dynamically. Second, only a very small amount of relevant parallel data (transcripts of TED talks) is available, so it is necessary to perform accurate translation model estimation with limited data. In preparation for the evaluation, we developed two new methods to attack these problems. Specifically, we developed an unsupervised, topic-modeling-based adaptation method for machine translation models. We also developed a discriminative training method to estimate parameters in the generative components of the translation models with limited data. Experimental results show that both methods improve the translation quality. Among all the submissions, ours achieves the best BLEU score in the Chinese-to-English machine translation track (MT_CE) of the IWSLT 2011 evaluation, the track in which we participated.

2010

The MSRA machine translation system for IWSLT 2010
Chi-Ho Li | Nan Duan | Yinggong Zhao | Shujie Liu | Lei Cui | Mei-yuh Hwang | Amittai Axelrod | Jianfeng Gao | Yaodong Zhang | Li Deng
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign