Wei Gao


2023

pdf
Prompt to be Consistent is Better than Self-Consistent? Few-Shot and Zero-Shot Fact Verification with Pre-trained Language Models
Fengzhu Zeng | Wei Gao
Findings of the Association for Computational Linguistics: ACL 2023

Few-shot or zero-shot fact verification only relies on a few or no labeled training examples. In this paper, we propose a novel method called ProToCo, to Prompt pre-trained language models (PLMs) To be Consistent, for improving the factuality assessment capability of PLMs in the few-shot and zero-shot settings. Given a claim-evidence pair, ProToCo generates multiple variants of the claim with different relations and frames a simple consistency mechanism as constraints for making compatible predictions across these variants. We update PLMs by using parameter-efficient fine-tuning (PEFT), leading to more accurate predictions in few-shot and zero-shot fact verification tasks.Our experiments on three public verification datasets show that ProToCo significantly outperforms state-of-the-art few-shot fact verification baselines. With a small number of unlabeled instances, ProToCo also outperforms the strong zero-shot learner T0 on zero-shot verification. Compared to large PLMs using in-context learning (ICL) method, ProToCo outperforms OPT-30B and the Self-Consistency-enabled OPT-6.7B model in both few- and zero-shot settings.

2022

pdf
DialMed: A Dataset for Dialogue-based Medication Recommendation
Zhenfeng He | Yuqiang Han | Zhenqiu Ouyang | Wei Gao | Hongxu Chen | Guandong Xu | Jian Wu
Proceedings of the 29th International Conference on Computational Linguistics

Medication recommendation is a crucial task for intelligent healthcare systems. Previous studies mainly recommend medications with electronic health records (EHRs). However, some details of interactions between doctors and patients may be ignored or omitted in EHRs, which are essential for automatic medication recommendation. Therefore, we make the first attempt to recommend medications with the conversations between doctors and patients. In this work, we construct DIALMED, the first high-quality dataset for medical dialogue-based medication recommendation task. It contains 11, 996 medical dialogues related to 16 common diseases from 3 departments and 70 corresponding common medications. Furthermore, we propose a Dialogue structure and Disease knowledge aware Network (DDN), where a QA Dialogue Graph mechanism is designed to model the dialogue structure and the knowledge graph is used to introduce external disease knowledge. The extensive experimental results demonstrate that the proposed method is a promising solution to recommend medications with medical dialogues. The dataset and code are available at https://github.com/f-window/DialMed.

pdf
Early Rumor Detection Using Neural Hawkes Process with a New Benchmark Dataset
Fengzhu Zeng | Wei Gao
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Little attention has been paid on EArly Rumor Detection (EARD), and EARD performance was evaluated inappropriately on a few datasets where the actual early-stage information is largely missing. To reverse such situation, we construct BEARD, a new Benchmark dataset for EARD, based on claims from fact-checking websites by trying to gather as many early relevant posts as possible. We also propose HEARD, a novel model based on neural Hawkes process for EARD, which can guide a generic rumor detection model to make timely, accurate and stable predictions. Experiments show that HEARD achieves effective EARD performance on two commonly used general rumor detection datasets and our BEARD dataset.

pdf
DialogConv: A Lightweight Fully Convolutional Network for Multi-view Response Selection
Yongkang Liu | Shi Feng | Wei Gao | Daling Wang | Yifei Zhang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Current end-to-end retrieval-based dialogue systems are mainly based on Recurrent Neural Networks or Transformers with attention mechanisms. Although promising results have been achieved, these models often suffer from slow inference or huge number of parameters. In this paper, we propose a novel lightweight fully convolutional architecture, called DialogConv, for response selection. DialogConv is exclusively built on top of convolution to extract matching features of context and response. Dialogues are modeled in 3D views, where DialogConv performs convolution operations on embedding view, word view and utterance view to capture richer semantic information from multiple contextual views. On the four benchmark datasets, compared with state-of-the-art baselines, DialogConv is on average about 8.5x smaller in size, and 79.39x and 10.64x faster on CPU and GPU devices, respectively. At the same time, DialogConv achieves the competitive effectiveness of response selection.

2021

pdf
Boundary Detection with BERT for Span-level Emotion Cause Analysis
Xiangju Li | Wei Gao | Shi Feng | Yifei Zhang | Daling Wang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf
Debunking Rumors on Twitter with Tree Transformer
Jing Ma | Wei Gao
Proceedings of the 28th International Conference on Computational Linguistics

Rumors are manufactured with no respect for accuracy, but can circulate quickly and widely by “word-of-post” through social media conversations. Conversation tree encodes important information indicative of the credibility of rumor. Existing conversation-based techniques for rumor detection either just strictly follow tree edges or treat all the posts fully-connected during feature learning. In this paper, we propose a novel detection model based on tree transformer to better utilize user interactions in the dialogue where post-level self-attention plays the key role for aggregating the intra-/inter-subtree stances. Experimental results on the TWITTER and PHEME datasets show that the proposed approach consistently improves rumor detection performance.

2019

pdf
Using Customer Service Dialogues for Satisfaction Analysis with Context-Assisted Multiple Instance Learning
Kaisong Song | Lidong Bing | Wei Gao | Jun Lin | Lujun Zhao | Jiancheng Wang | Changlong Sun | Xiaozhong Liu | Qiong Zhang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Customers ask questions and customer service staffs answer their questions, which is the basic service model via multi-turn customer service (CS) dialogues on E-commerce platforms. Existing studies fail to provide comprehensive service satisfaction analysis, namely satisfaction polarity classification (e.g., well satisfied, met and unsatisfied) and sentimental utterance identification (e.g., positive, neutral and negative). In this paper, we conduct a pilot study on the task of service satisfaction analysis (SSA) based on multi-turn CS dialogues. We propose an extensible Context-Assisted Multiple Instance Learning (CAMIL) model to predict the sentiments of all the customer utterances and then aggregate those sentiments into service satisfaction polarity. After that, we propose a novel Context Clue Matching Mechanism (CCMM) to enhance the representations of all customer utterances with their matched context clues, i.e., sentiment and reasoning clues. We construct two CS dialogue datasets from a top E-commerce platform. Extensive experimental results are presented and contrasted against a few previous models to demonstrate the efficacy of our model.

pdf
Sentence-Level Evidence Embedding for Claim Verification with Hierarchical Attention Networks
Jing Ma | Wei Gao | Shafiq Joty | Kam-Fai Wong
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Claim verification is generally a task of verifying the veracity of a given claim, which is critical to many downstream applications. It is cumbersome and inefficient for human fact-checkers to find consistent pieces of evidence, from which solid verdict could be inferred against the claim. In this paper, we propose a novel end-to-end hierarchical attention network focusing on learning to represent coherent evidence as well as their semantic relatedness with the claim. Our model consists of three main components: 1) A coherence-based attention layer embeds coherent evidence considering the claim and sentences from relevant articles; 2) An entailment-based attention layer attends on sentences that can semantically infer the claim on top of the first attention; and 3) An output layer predicts the verdict based on the embedded evidence. Experimental results on three public benchmark datasets show that our proposed model outperforms a set of state-of-the-art baselines.

2018

pdf
Rumor Detection on Twitter with Tree-structured Recursive Neural Networks
Jing Ma | Wei Gao | Kam-Fai Wong
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Automatic rumor detection is technically very challenging. In this work, we try to learn discriminative features from tweets content by following their non-sequential propagation structure and generate more powerful representations for identifying different type of rumors. We propose two recursive neural models based on a bottom-up and a top-down tree-structured neural networks for rumor representation learning and classification, which naturally conform to the propagation layout of tweets. Results on two public Twitter datasets demonstrate that our recursive neural models 1) achieve much better performance than state-of-the-art approaches; 2) demonstrate superior capacity on detecting rumors at very early stage.

pdf
Personalized Microblog Sentiment Classification via Adversarial Cross-lingual Multi-task Learning
Weichao Wang | Shi Feng | Wei Gao | Daling Wang | Yifei Zhang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Sentiment expression in microblog posts can be affected by user’s personal character, opinion bias, political stance and so on. Most of existing personalized microblog sentiment classification methods suffer from the insufficiency of discriminative tweets for personalization learning. We observed that microblog users have consistent individuality and opinion bias in different languages. Based on this observation, in this paper we propose a novel user-attention-based Convolutional Neural Network (CNN) model with adversarial cross-lingual learning framework. The user attention mechanism is leveraged in CNN model to capture user’s language-specific individuality from the posts. Then the attention-based CNN model is incorporated into a novel adversarial cross-lingual learning framework, in which with the help of user properties as bridge between languages, we can extract the language-specific features and language-independent features to enrich the user post representation so as to alleviate the data insufficiency problem. Results on English and Chinese microblog datasets confirm that our method outperforms state-of-the-art baseline algorithms with large margins.

2017

pdf
Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning
Jing Ma | Wei Gao | Kam-Fai Wong
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

How fake news goes viral via social media? How does its propagation pattern differ from real stories? In this paper, we attempt to address the problem of identifying rumors, i.e., fake information, out of microblog posts based on their propagation structure. We firstly model microblog posts diffusion with propagation trees, which provide valuable clues on how an original message is transmitted and developed over time. We then propose a kernel-based method called Propagation Tree Kernel, which captures high-order patterns differentiating different types of rumors by evaluating the similarities between their propagation tree structures. Experimental results on two real-world datasets demonstrate that the proposed kernel-based approach can detect rumors more quickly and accurately than state-of-the-art rumor detection models.

2016

pdf
QCRI at SemEval-2016 Task 4: Probabilistic Methods for Binary and Ordinal Quantification
Giovanni Da San Martino | Wei Gao | Fabrizio Sebastiani
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

pdf
Topic Extraction from Microblog Posts Using Conversation Structures
Jing Li | Ming Liao | Wei Gao | Yulan He | Kam-Fai Wong
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf
QCRI: Answer Selection for Community Question Answering - Experiments for Arabic and English
Massimo Nicosia | Simone Filice | Alberto Barrón-Cedeño | Iman Saleh | Hamdy Mubarak | Wei Gao | Preslav Nakov | Giovanni Da San Martino | Alessandro Moschitti | Kareem Darwish | Lluís Màrquez | Shafiq Joty | Walid Magdy
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

pdf
Using Tweets to Help Sentence Compression for News Highlights Generation
Zhongyu Wei | Yang Liu | Chen Li | Wei Gao
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

pdf
Using Content-level Structures for Summarizing Microblog Repost Trees
Jing Li | Wei Gao | Zhongyu Wei | Baolin Peng | Kam-Fai Wong
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

pdf
Simple Effective Microblog Named Entity Recognition: Arabic as an Example
Kareem Darwish | Wei Gao
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Despite many recent papers on Arabic Named Entity Recognition (NER) in the news domain, little work has been done on microblog NER. NER on microblogs presents many complications such as informality of language, shortened named entities, brevity of expressions, and inconsistent capitalization (for cased languages). We introduce simple effective language-independent approaches for improving NER on microblogs, based on using large gazetteers, domain adaptation, and a two-pass semi-supervised method. We use Arabic as an example language to compare the relative effectiveness of the approaches and when best to use them. We also present a new dataset for the task. Results of combining the proposed approaches show an improvement of 35.3 F-measure points over a baseline system trained on news data and an improvement of 19.9 F-measure points over the same system but trained on microblog data.

pdf
Utilizing Microblogs for Automatic News Highlights Extraction
Zhongyu Wei | Wei Gao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

pdf
An Empirical Study on Uncertainty Identification in Social Media Context
Zhongyu Wei | Junwen Chen | Wei Gao | Binyang Li | Lanjun Zhou | Yulan He | Kam-Fai Wong
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

pdf
Information-theoretic Multi-view Domain Adaptation
Pei Yang | Wei Gao | Qi Tan | Kam-Fai Wong
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf
Cross-Lingual Identification of Ambiguous Discourse Connectives for Resource-Poor Language
Lanjun Zhou | Wei Gao | Binyang Li | Zhongyu Wei | Kam-Fai Wong
Proceedings of COLING 2012: Posters

2011

pdf
Unsupervised Discovery of Discourse Relations for Eliminating Intra-sentence Polarity Ambiguities
Lanjun Zhou | Binyang Li | Wei Gao | Zhongyu Wei | Kam-Fai Wong
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf
Generating Aspect-oriented Multi-Document Summarization with Event-aspect model
Peng Li | Yinglin Wang | Wei Gao | Jing Jiang
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf
Query Weighting for Ranking Model Adaptation
Peng Cai | Wei Gao | Aoying Zhou | Kam-Fai Wong
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2009

pdf
Exploiting Bilingual Information to Improve Web Search
Wei Gao | John Blitzer | Ming Zhou | Kam-Fai Wong
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2005

pdf
NIL Is Not Nothing: Recognition of Chinese Network Informal Language Expressions
Yunqing Xia | Kam-Fai Wong | Wei Gao
Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing