2025
pdf
bib
abs
PQR: Improving Dense Retrieval via Potential Query Modeling
Junfeng Kang
|
Rui Li
|
Qi Liu
|
Yanjiang Chen
|
Zheng Zhang
|
Junzhe Jiang
|
Heng Yu
|
Yu Su
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Dense retrieval has now become the mainstream paradigm in information retrieval. The core idea of dense retrieval is to align document embeddings with their corresponding query embeddings by maximizing their dot product. The current training data is quite sparse, with each document typically associated with only one or a few labeled queries. However, a single document can be retrieved by multiple different queries. Aligning a document with just one or a limited number of labeled queries results in a loss of its semantic information. In this paper, we propose a training-free Potential Query Retrieval (PQR) framework to address this issue. Specifically, we use a Gaussian mixture distribution to model all potential queries for a document, aiming to capture its comprehensive semantic information. To obtain this distribution, we introduce three sampling strategies to sample a large number of potential queries for each document and encode them into a semantic space. Using these sampled queries, we employ the Expectation-Maximization algorithm to estimate parameters of the distribution. Finally, we also propose a method to calculate similarity scores between user queries and documents under the PQR framework. Extensive experiments demonstrate the effectiveness of the proposed method.
pdf
bib
abs
UniRAG: Unified Query Understanding Method for Retrieval Augmented Generation
Rui Li
|
Liyang He
|
Qi Liu
|
Zheng Zhang
|
Heng Yu
|
Yuyang Ye
|
Linbo Zhu
|
Yu Su
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Retrieval-Augmented Generation (RAG) technology effectively addresses the issues of knowledge update lag and hallucinations in large language models (LLMs) by integrating internal and external knowledge. Existing query augmentation methods improve RAG’s performance in handling complex queries but face two key challenges: (1) the separation of query augmentation and encoding tasks, which hinders information sharing and introduces cumulative errors, and (2) the difficulty of selecting the optimal augmentation strategy for different scenarios. In this work, we propose UniRAG, a unified framework for query understanding in RAG. UniRAG employs a decoder-only LLM to jointly perform query augmentation and encoding, eliminating task separation. To facilitate adaptive query augmentation, we categorize existing techniques into query paraphrasing, query expansion, and query abstraction. Our model learns to select the optimal augmentation strategy based on user queries, leveraging retrieval and generation outputs as feedback. Experimental results show that UniRAG significantly outperforms traditional query augmentation methods in five knowledge-intensive benchmark tasks in both closed and open domain question answering.
pdf
bib
abs
Towards Harmonized Uncertainty Estimation for Large Language Models
Rui Li
|
Jing Long
|
Muge Qi
|
Heming Xia
|
Lei Sha
|
Peiyi Wang
|
Zhifang Sui
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
To facilitate robust and trustworthy deployment of large language models (LLMs), it is essential to quantify the reliability of their generations through uncertainty estimation. While recent efforts have made significant advancements by leveraging the internal logic and linguistic features of LLMs to estimate uncertainty scores, our empirical analysis highlights the pitfalls of these methods to strike a harmonized estimation between indication, balance, and calibration, which hinders their broader capability for accurate uncertainty estimation. To address this challenge, we propose CUE (Corrector for Uncertainty Estimation): A straightforward yet effective method that employs a lightweight model trained on data aligned with the target LLM’s performance to adjust uncertainty scores. Comprehensive experiments across diverse models and tasks demonstrate its effectiveness, which achieves consistent improvements of up to 60% over existing methods.
pdf
bib
abs
Multi-perspective Preference Alignment of LLMs for Programming-Community Question Answering
Hongyu Yang
|
Jiahui Hou
|
Liyang He
|
Rui Li
Proceedings of the 31st International Conference on Computational Linguistics
Programming-Community Question Answering (PCQA) aims to tackle issues through generating functional code and guiding descriptions. It involves multiple candidates, with different users having varying preferences for them. Additionally, one may contain outdated APIs. These undoubtedly present a challenge for responsing that meet user preferences. Recently, Reinforcement Learning from Human Feedback demonstrates its ability to precisely control the behavior of large language models (LLMs) to yield human-like responses. However, applying it to LLMs in domain-specific PCQA remains unexplored. In this work, we propose Multi-perspective Preference Alignment for Programming-Community Question Answering to generate user-centric responses, called MupPCQA. It includes three stages: Preference Standardization to control content quality, Preference Integration to consider diverse user tendencies, Preference Timeliness Mitigation to alleviate outdated answers. Extensive experiments on a high-quality, real-world PCQA dataset validate its accuracy and preference. Compared to its base model, MupPCQA shows an improvement of nearly 11% in BLEU, with increases of 20% and 17.5% in BERTScore and CodeBERTScore.
pdf
bib
abs
Automated Clinical Data Extraction with Knowledge Conditioned LLMs
Diya Li
|
Asim Kadav
|
Aijing Gao
|
Rui Li
|
Richard Bourgon
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
The extraction of lung lesion information from clinical and medical imaging reports is crucial for research on and clinical care of lung-related diseases. Large language models (LLMs) can be effective at interpreting unstructured text in reports, but they often hallucinate due to a lack of domain-specific knowledge, leading to reduced accuracy and posing challenges for use in clinical settings. To address this, we propose a novel framework that aligns generated internal knowledge with external knowledge through in-context learning (ICL). Our framework employs a retriever to identify relevant units of internal or external knowledge and a grader to evaluate the truthfulness and helpfulness of the retrieved internal-knowledge rules, to align and update the knowledge bases. Experiments with expert-curated test datasets demonstrate that this ICL approach can increase the F1 score for key fields (lesion size, margin and solidity) by an average of 12.9% over existing ICL methods.
pdf
bib
abs
TrendSim: Simulating Trending Topics in Social Media Under Poisoning Attacks with LLM-based Multi-agent System
Zeyu Zhang
|
Jianxun Lian
|
Chen Ma
|
Yaning Qu
|
Ye Luo
|
Lei Wang
|
Rui Li
|
Xu Chen
|
Yankai Lin
|
Le Wu
|
Xing Xie
|
Ji-Rong Wen
Findings of the Association for Computational Linguistics: NAACL 2025
Trending topics have become a significant part of modern social media, attracting users to participate in discussions of breaking events. However, they also bring in a new channel for poisoning attacks, resulting in negative impacts on society. Therefore, it is urgent to study this critical problem and develop effective strategies for defense. In this paper, we propose TrendSim, an LLM-based multi-agent system to simulate trending topics in social media under poisoning attacks. Specifically, we create a simulation environment for trending topics that incorporates a time-aware interaction mechanism, centralized message dissemination, and an interactive system. Moreover, we develop LLM-based humanoid agents to simulate users in social media, and propose prototype-based attackers to replicate poisoning attacks. Besides, we evaluate TrendSim from multiple aspects to validate its effectiveness. Based on TrendSim, we conduct simulation experiments to study four critical problems about poisoning attacks on trending topics.
pdf
bib
abs
CA-GAR: Context-Aware Alignment of LLM Generation for Document Retrieval
Heng Yu
|
Junfeng Kang
|
Rui Li
|
Qi Liu
|
Liyang He
|
Zhenya Huang
|
Shuanghong Shen
|
Junyu Lu
Findings of the Association for Computational Linguistics: ACL 2025
Information retrieval has evolved from traditional sparse and dense retrieval methods to approaches driven by large language models (LLMs). Recent techniques, such as Generation-Augmented Retrieval (GAR) and Generative Document Retrieval (GDR), leverage LLMs to enhance retrieval but face key challenges: GAR’s generated content may not always align with the target document corpus, while GDR limits the generative capacity of LLMs by constraining outputs to predefined document identifiers. To address these issues, we propose Context-Aware Generation-Augmented Retrieval (CA-GAR), which enhances LLMs by integrating corpus information into their generation process. CA-GAR optimizes token selection by incorporating relevant document information and leverages a Distribution Alignment Strategy to extract corpus information using a lexicon-based approach. Experimental evaluations on seven tasks from the BEIR benchmark and four non-English languages from Mr.TyDi demonstrate that CA-GAR outperforms existing methods.
pdf
bib
abs
Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models
Liyang He
|
Chenglong Liu
|
Rui Li
|
Zhenya Huang
|
Shulan Ruan
|
Jun Zhou
|
Enhong Chen
Findings of the Association for Computational Linguistics: ACL 2025
Sentence embedding is essential for many NLP tasks, with contrastive learning methods achieving strong performance using annotated datasets like NLI. Yet, the reliance on manual labels limits scalability. Recent studies leverage large language models (LLMs) to generate sentence pairs, reducing annotation dependency. However, they overlook ranking information crucial for fine-grained semantic distinctions. To tackle this challenge, we propose a method for controlling the generation direction of LLMs in the latent space. Unlike unconstrained generation, the controlled approach ensures meaningful semantic divergence. Then, we refine exist sentence embedding model by integrating ranking information and semantic information. Experiments on multiple benchmarks demonstrate that our method achieves new SOTA performance with a modest cost in ranking sentence synthesis.
pdf
bib
abs
How Far are LLMs from Being Our Digital Twins? A Benchmark for Persona-Based Behavior Chain Simulation
Rui Li
|
Heming Xia
|
Xinfeng Yuan
|
Qingxiu Dong
|
Lei Sha
|
Wenjie Li
|
Zhifang Sui
Findings of the Association for Computational Linguistics: ACL 2025
Recently, LLMs have garnered increasing attention across academic disciplines for their potential as human digital twins, virtual proxies designed to replicate individuals and autonomously perform tasks such as decision-making, problem-solving, and reasoning on their behalf.However, current evaluations of LLMs primarily emphasize dialogue simulation while overlooking human behavior simulation, which is crucial for digital twins.To address this gap, we introduce BehaviorChain, the first benchmark for evaluating LLMs’ ability to simulate continuous human behavior.BehaviorChain comprises diverse, high-quality, persona-based behavior chains, totaling 15,846 distinct behaviors across 1,001 unique personas, each with detailed history and profile metadata.For evaluation, we integrate persona metadata into LLMs and employ them to iteratively infer contextually appropriate behaviors within dynamic scenarios provided by BehaviorChain. Comprehensive evaluation results demonstrated that even state-of-the-art models struggle with accurately simulating continuous human behavior.
2024
pdf
bib
abs
Wonder at Chemotimelines 2024: MedTimeline: An End-to-End NLP System for Timeline Extraction from Clinical Narratives
Liwei Wang
|
Qiuhao Lu
|
Rui Li
|
Sunyang Fu
|
Hongfang Liu
Proceedings of the 6th Clinical Natural Language Processing Workshop
Extracting timeline information from clinical narratives is critical for cancer research and practice using electronic health records (EHRs). In this study, we apply MedTimeline, our end-to-end hybrid NLP system combining large language model, deep learning with knowledge engineering, to the ChemoTimeLine challenge subtasks. Our experiment results in 0.83, 0.90, 0.84, and 0.53, 0.63, 0.39, respectively, for subtask1 and subtask2 in breast, melanoma and ovarian cancer.
pdf
bib
abs
A Survey on In-context Learning
Qingxiu Dong
|
Lei Li
|
Damai Dai
|
Ce Zheng
|
Jingyuan Ma
|
Rui Li
|
Heming Xia
|
Jingjing Xu
|
Zhiyong Wu
|
Baobao Chang
|
Xu Sun
|
Lei Li
|
Zhifang Sui
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
With the increasing capabilities of large language models (LLMs), in-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP), where LLMs make predictions based on contexts augmented with a few examples. It has been a significant trend to explore ICL to evaluate and extrapolate the ability of LLMs. In this paper, we aim to survey and summarize the progress and challenges of ICL. We first present a formal definition of ICL and clarify its correlation to related studies. Then, we organize and discuss advanced techniques, including training strategies, prompt designing strategies, and related analysis. Additionally, we explore various ICL application scenarios, such as data engineering and knowledge updating. Finally, we address the challenges of ICL and suggest potential directions for further research. We hope that our work can encourage more research on uncovering how ICL works and improving ICL.
pdf
bib
abs
Optimizing Code Retrieval: High-Quality and Scalable Dataset Annotation through Large Language Models
Rui Li
|
Qi Liu
|
Liyang He
|
Zheng Zhang
|
Hao Zhang
|
Shengyu Ye
|
Junyu Lu
|
Zhenya Huang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Code retrieval aims to identify code from extensive codebases that semantically aligns with a given query code snippet. Collecting a broad and high-quality set of query and code pairs is crucial to the success of this task. However, existing data collection methods struggle to effectively balance scalability and annotation quality. In this paper, we first analyze the factors influencing the quality of function annotations generated by Large Language Models (LLMs). We find that the invocation of intra-repository functions and third-party APIs plays a significant role. Building on this insight, we propose a novel annotation method that enhances the annotation context by incorporating the content of functions called within the repository and information on third-party API functionalities. Additionally, we integrate LLMs with a novel sorting method to address the multi-level function call relationships within repositories. Furthermore, by applying our proposed method across a range of repositories, we have developed the Query4Code dataset. The quality of this synthesized dataset is validated through both model training and human evaluation, demonstrating high-quality annotations. Moreover, cost analysis confirms the scalability of our annotation method.
pdf
bib
abs
When Generative Adversarial Networks Meet Sequence Labeling Challenges
Yu Tong
|
Ge Chen
|
Guokai Zheng
|
Rui Li
|
Jiang Dazhi
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The current framework for sequence labeling encompasses a feature extractor and a sequence tagger. This study introduces a unified framework named SLGAN, which harnesses the capabilities of Generative Adversarial Networks to address the challenges associated with Sequence Labeling tasks. SLGAN not only mitigates the limitation of GANs in backpropagating loss to discrete data but also exhibits strong adaptability to various sequence labeling tasks. Unlike traditional GANs, the discriminator within SLGAN does not discriminate whether data originates from the discriminator or the generator; instead, it focuses on predicting the correctness of each tag within the tag sequence. We conducted evaluations on six different tasks spanning four languages, including Chinese, Japanese, and Korean Word Segmentation, Chinese and English Named Entity Recognition, and Chinese Part-of-Speech Tagging. Our experimental results illustrate that SLGAN represents a versatile and highly effective solution, consistently achieving state-of-the-art or competitive performance results, irrespective of the specific task or language under consideration.
pdf
bib
abs
RePair: Automated Program Repair with Process-based Feedback
Yuze Zhao
|
Zhenya Huang
|
Yixiao Ma
|
Rui Li
|
Kai Zhang
|
Hao Jiang
|
Qi Liu
|
Linbo Zhu
|
Yu Su
Findings of the Association for Computational Linguistics: ACL 2024
The gap between the trepidation of program reliability and the expense of repairs underscore the indispensability for Automated Program Repair (APR). APR is instrumental in transforming vulnerable programs into more robust ones, bolstering program reliability while simultaneously diminishing the financial burden of manual repairs. Commercial-scale language models (LM) have taken APR to unprecedented levels. However, due to the limitations of model capabilities by parameters, a one-step substantial modification may not achieve the desired effect for models with parameters less than 100B. Moreover, humans interact with the LLM through explicit prompts, which hinders the LLM from receiving feedback from compiler and test cases to automatically optimize its repair policies. Explicit prompts from humans not only increase additional manpower costs, but also pose potential misunderstandings between human’s intent and LMs.Based on the above considerations, we are exploring how to ensure small-scale LM still outperform through process supervision and feedback. We start by constructing a dataset named CodeNet4Repair, replete with multiple repair records, which supervises the fine-tuning of a foundational mode. Building upon the encouraging outcomes of reinforcement learning, we develop a reward model that serves as a critic, providing feedback for the fine-tuned LM’s action, progressively optimizing its policy. During inference, we require the LM to generate solutions iteratively until the repair effect no longer improves or hits the maximum step limit. The experimental results show that this process-based feedback not only outperforms larger outcome-based generation methods, but also nearly matches the performance of closed-source commercial large-scale LMs.
pdf
bib
abs
Be a Multitude to Itself: A Prompt Evolution Framework for Red Teaming
Rui Li
|
Peiyi Wang
|
Jingyuan Ma
|
Di Zhang
|
Lei Sha
|
Zhifang Sui
Findings of the Association for Computational Linguistics: EMNLP 2024
Large Language Models (LLMs) have gained increasing attention for their remarkable capacity, alongside concerns about safety arising from their potential to produce harmful content. Red teaming aims to find prompts that could elicit harmful responses from LLMs, and is essential to discover and mitigate safety risks before real-world deployment. However, manual red teaming is both time-consuming and expensive, rendering it unscalable. In this paper, we propose RTPE, a scalable evolution framework to evolve red teaming prompts across both breadth and depth dimensions, facilitating the automatic generation of numerous high-quality and diverse red teaming prompts. Specifically, in-breadth evolving employs a novel enhanced in-context learning method to create a multitude of quality prompts, whereas in-depth evolving applies customized transformation operations to enhance both content and form of prompts, thereby increasing diversity. Extensive experiments demonstrate that RTPE surpasses existing representative automatic red teaming methods on both attack success rate and diversity. In addition, based on 4,800 red teaming prompts created by RTPE, we further provide a systematic analysis of 8 representative LLMs across 8 sensitive topics.
pdf
bib
abs
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors
Zhexin Zhang
|
Yida Lu
|
Jingyuan Ma
|
Di Zhang
|
Rui Li
|
Pei Ke
|
Hao Sun
|
Lei Sha
|
Zhifang Sui
|
Hongning Wang
|
Minlie Huang
Findings of the Association for Computational Linguistics: EMNLP 2024
The safety of Large Language Models (LLMs) has gained increasing attention in recent years, but there still lacks a comprehensive approach for detecting safety issues within LLMs’ responses in an aligned, customizable and explainable manner. In this paper, we propose ShieldLM, an LLM-based safety detector, which aligns with common safety standards, supports customizable detection rules, and provides explanations for its decisions. To train ShieldLM, we compile a large bilingual dataset comprising 14,387 query-response pairs, annotating the safety of responses based on various safety standards. Through extensive experiments, we demonstrate that ShieldLM surpasses strong baselines across four test sets, showcasing remarkable customizability and explainability. Besides performing well on standard detection datasets, ShieldLM has also been shown to be effective as a safety evaluator for advanced LLMs. ShieldLM is released at
https://github.com/thu-coai/ShieldLM to support accurate and explainable safety detection under various safety standards.
pdf
bib
abs
Breakthrough from Nuance and Inconsistency: Enhancing Multimodal Sarcasm Detection with Context-Aware Self-Attention Fusion and Word Weight Calculation.
Hongfei Xue
|
Linyan Xu
|
Yu Tong
|
Rui Li
|
Jiali Lin
|
Dazhi Jiang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Multimodal sarcasm detection has received considerable attention due to its unique role in social networks. Existing methods often rely on feature concatenation to fuse different modalities or model the inconsistencies among modalities. However, sarcasm is often embodied in local and momentary nuances in a subtle way, which causes difficulty for sarcasm detection. To effectively incorporate these nuances, this paper presents Context-Aware Self-Attention Fusion (CAAF) to integrate local and momentary multimodal information into specific words. Furthermore, due to the instantaneous nature of sarcasm, the connotative meanings of words post-multimodal integration generally deviate from their denotative meanings. Therefore, Word Weight Calculation (WWC) is presented to compute the weight of specific words based on CAAF’s fusion nuances, illustrating the inconsistency between connotation and denotation. We evaluate our method on the MUStARD dataset, achieving an accuracy of 76.9 and an F1 score of 76.1, which surpasses the current state-of-the-art IWAN model by 1.7 and 1.6 respectively.
pdf
bib
abs
Feature Structure Matching for Multi-source Sentiment Analysis with Efficient Adaptive Tuning
Rui Li
|
Cheng Liu
|
Yu Tong
|
Jiang Dazhi
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Recently, fine-tuning the large pre-trained language models on the labeled sentiment dataset achieves appealing performance. However, the obtained model may not generalize well to the other domains due to the domain shift, and it is expensive to update the entire parameters within the large models. Although some existing domain matching methods are proposed to alleviate the above issues, there are multiple relevant source domains in practice which makes the whole training more costly and complicated. To this end, we focus on the efficient unsupervised multi-source sentiment adaptation task which is more challenging and beneficial for real-world applications. Specifically, we propose to extract multi-layer features from the large pre-trained model, and design a dynamic parameters fusion module to exploit these features for both efficient and adaptive tuning. Furthermore, we propose a novel feature structure matching constraint, which enforces similar feature-wise correlations across different domains. Compared with the traditional domain matching methods which tend to pull all feature instances close, we show that the proposed feature structure matching is more robust and generalizable in the multi-source scenario. Extensive experiments on several multi-source sentiment analysis benchmarks demonstrate the effectiveness and superiority of our proposed framework.
2023
pdf
bib
abs
To Copy Rather Than Memorize: A Vertical Learning Paradigm for Knowledge Graph Completion
Rui Li
|
Xu Chen
|
Chaozhuo Li
|
Yanming Shen
|
Jianan Zhao
|
Yujing Wang
|
Weihao Han
|
Hao Sun
|
Weiwei Deng
|
Qi Zhang
|
Xing Xie
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Embedding models have shown great power in knowledge graph completion (KGC) task. By learning structural constraints for each training triple, these methods implicitly memorize intrinsic relation rules to infer missing links. However, this paper points out that the multi-hop relation rules are hard to be reliably memorized due to the inherent deficiencies of such implicit memorization strategy, making embedding models underperform in predicting links between distant entity pairs. To alleviate this problem, we present Vertical Learning Paradigm (VLP), which extends embedding models by allowing to explicitly copy target information from related factual triples for more accurate prediction. Rather than solely relying on the implicit memory, VLP directly provides additional cues to improve the generalization ability of embedding models, especially making the distant link prediction significantly easier. Moreover, we also propose a novel relative distance based negative sampling technique (ReD) for more effective optimization. Experiments demonstrate the validity and generality of our proposals on two standard benchmarks. Our code is available at
https://github.com/rui9812/VLP.
pdf
bib
abs
Learning More from Mixed Emotions: A Label Refinement Method for Emotion Recognition in Conversations
Jintao Wen
|
Geng Tu
|
Rui Li
|
Dazhi Jiang
|
Wenhua Zhu
Transactions of the Association for Computational Linguistics, Volume 11
One-hot labels are commonly employed as ground truth in Emotion Recognition in Conversations (ERC). However, this approach may not fully encompass all the emotions conveyed in a single utterance, leading to suboptimal performance. Regrettably, current ERC datasets lack comprehensive emotionally distributed labels. To address this issue, we propose the Emotion Label Refinement (EmoLR) method, which utilizes context- and speaker-sensitive information to infer mixed emotional labels. EmoLR comprises an Emotion Predictor (EP) module and a Label Refinement (LR) module. The EP module recognizes emotions and provides context/speaker states for the LR module. Subsequently, the LR module calculates the similarity between these states and ground-truth labels, generating a refined label distribution (RLD). The RLD captures a more comprehensive range of emotions than the original one-hot labels. These refined labels are then used for model training in place of the one-hot labels. Experimental results on three public conversational datasets demonstrate that our EmoLR achieves state-of-the-art performance.
2022
pdf
bib
abs
Asymmetric Mutual Learning for Multi-source Unsupervised Sentiment Adaptation with Dynamic Feature Network
Rui Li
|
Cheng Liu
|
Dazhi Jiang
Proceedings of the 29th International Conference on Computational Linguistics
Recently, fine-tuning the pre-trained language model (PrLM) on labeled sentiment datasets demonstrates impressive performance. However, collecting labeled sentiment dataset is time-consuming, and fine-tuning the whole PrLM brings about much computation cost. To this end, we focus on multi-source unsupervised sentiment adaptation problem with the pre-trained features, which is more practical and challenging. We first design a dynamic feature network to fully exploit the extracted pre-trained features for efficient domain adaptation. Meanwhile, with the difference of the traditional source-target domain alignment methods, we propose a novel asymmetric mutual learning strategy, which can robustly estimate the pseudo-labels of the target domain with the knowledge from all the other source models. Experiments on multiple sentiment benchmarks show that our method outperforms the recent state-of-the-art approaches, and we also conduct extensive ablation studies to verify the effectiveness of each the proposed module.
2021
pdf
bib
abs
Treasures Outside Contexts: Improving Event Detection via Global Statistics
Rui Li
|
Wenlin Zhao
|
Cheng Yang
|
Sen Su
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Event detection (ED) aims at identifying event instances of specified types in given texts, which has been formalized as a sequence labeling task. As far as we know, existing neural-based ED models make decisions relying entirely on the contextual semantic features of each word in the inputted text, which we find is easy to be confused by the varied contexts in the test stage. To this end, we come up with the idea of introducing a set of statistical features from word-event co-occurrence frequencies in the entire training set to cooperate with contextual features. Specifically, we propose a Semantic and Statistic-Joint Discriminative Network (SS-JDN) consisting of a semantic feature extractor, a statistical feature extractor, and a joint event discriminator. In experiments, SS-JDN effectively exceeds ten recent strong baselines on ACE2005 and KBP2015 datasets. Further, we perform extensive experiments to comprehensively probe SS-JDN.
2016
pdf
bib
Multi-Granularity Chinese Word Embedding
Rongchao Yin
|
Quan Wang
|
Peng Li
|
Rui Li
|
Bin Wang
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
2012
pdf
bib
Annotation Schemes to Encode Domain Knowledge in Medical Narratives
Wilson McCoy
|
Cecilia Ovesdotter Alm
|
Cara Calvelli
|
Rui Li
|
Jeff B. Pelz
|
Pengcheng Shi
|
Anne Haake
Proceedings of the Sixth Linguistic Annotation Workshop