Meng Fang


2021

Generalization in Text-based Games via Hierarchical Reinforcement Learning
Yunqiu Xu | Meng Fang | Ling Chen | Yali Du | Chengqi Zhang
Findings of the Association for Computational Linguistics: EMNLP 2021

Deep reinforcement learning provides a promising approach for text-based games in studying natural language communication between humans and artificial agents. However, generalization remains a major challenge, as the agents depend critically on the complexity and variety of training tasks. In this paper, we address this problem by introducing a hierarchical framework built upon a knowledge graph (KG)-based RL agent. At the high level, a meta-policy is executed to decompose the whole game into a set of subtasks specified by textual goals and to select one of them based on the KG. Then, at the low level, a sub-policy is executed to conduct goal-conditioned reinforcement learning. We carry out experiments on games with various difficulty levels and show that the proposed method generalizes favorably.

ProtoInfoMax: Prototypical Networks with Mutual Information Maximization for Out-of-Domain Detection
Iftitahu Nimah | Meng Fang | Vlado Menkovski | Mykola Pechenizkiy
Findings of the Association for Computational Linguistics: EMNLP 2021

The ability to detect Out-of-Domain (OOD) inputs has been a critical requirement in many real-world NLP applications, such as intent classification in dialogue systems. The reason is that the inclusion of unsupported OOD inputs may lead to catastrophic failure of systems. However, it remains an empirical question whether current methods can tackle such problems reliably in a realistic scenario where zero OOD training data is available. In this study, we propose ProtoInfoMax, a new architecture that extends Prototypical Networks to simultaneously process in-domain and OOD sentences via a Mutual Information Maximization (InfoMax) objective. Experimental results show that our proposed method can improve performance by up to 20% for OOD detection in low-resource settings of text classification. We also show that ProtoInfoMax is less prone to the typical overconfidence errors of neural networks, leading to more reliable prediction results.

DAGN: Discourse-Aware Graph Network for Logical Reasoning
Yinya Huang | Meng Fang | Yu Cao | Liwei Wang | Xiaodan Liang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Recent QA with logical reasoning questions requires passage-level relations among the sentences. However, current approaches still focus on sentence-level relations interacting among tokens. In this work, we explore aggregating passage-level clues for solving logical reasoning QA by using discourse-based information. We propose a discourse-aware graph network (DAGN) that reasons relying on the discourse structure of the texts. The model encodes discourse information as a graph with elementary discourse units (EDUs) and discourse relations, and learns the discourse-aware features via a graph network for downstream QA tasks. Experiments are conducted on two logical reasoning QA datasets, ReClor and LogiQA, and our proposed DAGN achieves competitive results. The source code is available at https://github.com/Eleanor-H/DAGN.

2020

Pretrained Language Models for Dialogue Generation with Multiple Input Sources
Yu Cao | Wei Bi | Meng Fang | Dacheng Tao
Findings of the Association for Computational Linguistics: EMNLP 2020

Large-scale pretrained language models have achieved outstanding performance on natural language understanding tasks. However, how to apply them to dialogue generation tasks is still under investigation, especially for tasks whose responses are conditioned on multiple sources. Previous work simply concatenates all input sources or averages information from different input sources. In this work, we study dialogue models with multiple input sources adapted from the pretrained language model GPT2. We explore various methods to fuse the separate attention information corresponding to different sources. Our experimental results show that proper fusion methods deliver higher relevance to the dialogue history than simple fusion baselines.

2019

Dual Adversarial Neural Transfer for Low-Resource Named Entity Recognition
Joey Tianyi Zhou | Hao Zhang | Di Jin | Hongyuan Zhu | Meng Fang | Rick Siow Mong Goh | Kenneth Kwok
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We propose a new neural transfer method termed Dual Adversarial Transfer Network (DATNet) for addressing low-resource Named Entity Recognition (NER). Specifically, two variants of DATNet, i.e., DATNet-F and DATNet-P, are investigated to explore effective feature fusion between high and low resources. To address noisy and imbalanced training data, we propose a novel Generalized Resource-Adversarial Discriminator (GRAD). Additionally, adversarial training is adopted to boost model generalization. In experiments, we examine the effects of different components in DATNet across domains and languages and show that significant improvement can be obtained, especially for low-resource data, without adding any hand-crafted features or pre-trained language models.

Bridging the Gap: Improve Part-of-speech Tagging for Chinese Social Media Texts with Foreign Words
Dingmin Wang | Meng Fang | Yan Song | Juntao Li
Proceedings of the 5th Workshop on Semantic Deep Learning (SemDeep-5)

BAG: Bi-directional Attention Entity Graph Convolutional Network for Multi-hop Reasoning Question Answering
Yu Cao | Meng Fang | Dacheng Tao
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Multi-hop reasoning question answering requires deep comprehension of relationships between various documents and queries. We propose a Bi-directional Attention Entity Graph Convolutional Network (BAG), leveraging relationships between nodes in an entity graph and attention information between a query and the entity graph, to solve this task. Graph convolutional networks are used to obtain a relation-aware representation of nodes for entity graphs built from documents with multi-level features. Bidirectional attention is then applied to graphs and queries to generate a query-aware node representation, which is used for the final prediction. Experimental evaluation shows that BAG achieves state-of-the-art accuracy on the QAngaroo WIKIHOP dataset.

2017

Learning how to Active Learn: A Deep Reinforcement Learning Approach
Meng Fang | Yuan Li | Trevor Cohn
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Active learning aims to select a small subset of data for annotation such that a classifier learned on the data is highly accurate. This is usually done using heuristic selection methods; however, the effectiveness of such methods is limited and, moreover, the performance of heuristics varies between datasets. To address these shortcomings, we introduce a novel formulation by reframing active learning as a reinforcement learning problem and explicitly learning a data selection policy, where the policy takes the role of the active learning heuristic. Importantly, our method allows the selection policy learned using simulation on one language to be transferred to other languages. We demonstrate our method using cross-lingual named entity recognition, observing uniform improvements over traditional active learning algorithms.

Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary
Meng Fang | Trevor Cohn
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Cross-lingual model transfer is a compelling and popular method for predicting annotations in a low-resource language, whereby parallel corpora provide a bridge to a high-resource language and its associated annotated corpora. However, parallel data is not readily available for many languages, limiting the applicability of these approaches. We address these drawbacks with a framework that takes advantage of cross-lingual word embeddings trained solely on a high-coverage dictionary. We propose a novel neural network model for joint training from both sources of data based on cross-lingual word embeddings, and show substantial empirical improvements over baseline techniques. We also propose several active learning heuristics, which result in improvements over competitive benchmark methods.

2016

Learning when to trust distant supervision: An application to low-resource POS tagging using cross-lingual projection
Meng Fang | Trevor Cohn
Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning