Zhiheng Huang


2022

Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner
Danilo Neves Ribeiro | Shen Wang | Xiaofei Ma | Rui Dong | Xiaokai Wei | Henghui Zhu | Xinchi Chen | Peng Xu | Zhiheng Huang | Andrew Arnold | Dan Roth
Findings of the Association for Computational Linguistics: NAACL 2022

Large language models have achieved high performance on various question answering (QA) benchmarks, but the explainability of their output remains elusive. Structured explanations, called entailment trees, were recently suggested as a way to explain the reasoning behind a QA system’s answer. To better generate such entailment trees, we propose an architecture called Iterative Retrieval-Generation Reasoner (IRGR). Our model is able to explain a given hypothesis by systematically generating a step-by-step explanation from textual premises. The IRGR model iteratively searches for suitable premises, constructing a single entailment step at a time. In contrast to previous approaches, our method combines generation steps and retrieval of premises, allowing the model to leverage intermediate conclusions and mitigating the input size limit of baseline encoder-decoder models. We conduct experiments using the EntailmentBank dataset, where we outperform existing benchmarks on premise retrieval and entailment tree generation, with around 300% gain in overall correctness.

2021

Contrastive Document Representation Learning with Graph Attention Networks
Peng Xu | Xinchi Chen | Xiaofei Ma | Zhiheng Huang | Bing Xiang
Findings of the Association for Computational Linguistics: EMNLP 2021

Recent progress in pretrained Transformer-based language models has shown great success in learning contextual representations of text. However, due to the quadratic complexity of self-attention, most pretrained Transformer models can only handle relatively short text. Modeling very long documents remains a challenge. In this work, we propose to use a graph attention network on top of an available pretrained Transformer model to learn document embeddings. This graph attention network allows us to leverage the high-level semantic structure of the document. In addition, based on our graph document model, we design a simple contrastive learning strategy to pretrain our models on a large unlabeled corpus. Empirically, we demonstrate the effectiveness of our approaches in document classification and document retrieval tasks.

2020

Improve Transformer Models with Better Relative Position Embeddings
Zhiheng Huang | Davis Liang | Peng Xu | Bing Xiang
Findings of the Association for Computational Linguistics: EMNLP 2020

The transformer model has demonstrated superior results on NLP tasks including machine translation and question answering. In this paper, we argue that position information is not fully utilized in existing work. For example, the initial proposal of a sinusoid embedding is fixed and not learnable. We first review absolute position embeddings and existing relative position embedding methods. We then propose new methods to encourage increased interaction between query, key and relative position embeddings in the self-attention mechanism. Our most promising approach is a generalization of the absolute position embedding. Our method results in increased accuracy compared to previous approaches in absolute and relative position embeddings on the SQuAD1.1 dataset. In addition, we address the inductive property of whether a position embedding can be robust enough to handle long sequences. We demonstrate empirically that our relative embedding method generalizes reasonably well and is robust from the inductive perspective. Finally, we show that our proposed method can be effectively and efficiently adopted as a near drop-in replacement for improving the accuracy of large models with little computational overhead.

Beyond [CLS] through Ranking by Generation
Cicero Nogueira dos Santos | Xiaofei Ma | Ramesh Nallapati | Zhiheng Huang | Bing Xiang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Generative models for Information Retrieval, where ranking of documents is viewed as the task of generating a query from a document’s language model, were very successful in various IR tasks in the past. However, with the advent of modern deep neural networks, attention has shifted to discriminative ranking functions that model the semantic similarity of documents and queries instead. Recently, deep generative models such as GPT2 and BART have been shown to be excellent text generators, but their effectiveness as rankers has not yet been demonstrated. In this work, we revisit the generative framework for information retrieval and show that our generative approaches are as effective as state-of-the-art semantic similarity-based discriminative models for the answer selection task. Additionally, we demonstrate the effectiveness of unlikelihood losses for IR.

2012

Iterative Viterbi A* Algorithm for K-Best Sequential Decoding
Zhiheng Huang | Yi Chang | Bo Long | Jean-Francois Crespo | Anlei Dong | Sathiya Keerthi | Su-Lin Wu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2009

Investigation of Question Classifier in Question Answering
Zhiheng Huang | Marcus Thint | Asli Celikyilmaz
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Accurate Semantic Class Classifier for Coreference Resolution
Zhiheng Huang | Guangping Zeng | Weiqun Xu | Asli Celikyilmaz
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

A Graph-based Semi-Supervised Learning for Question-Answering
Asli Celikyilmaz | Marcus Thint | Zhiheng Huang
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

Question Classification using Head Words and their Hypernyms
Zhiheng Huang | Marcus Thint | Zengchang Qin
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing