2025
pdf
bib
abs
Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models
Zhuojun Ding
|
Wei Wei
|
Chenghao Fan
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Supervised fine-tuning (SFT) is widely used to align large language models (LLMs) with information extraction (IE) tasks, such as named entity recognition (NER). However, annotating such fine-grained labels and training domain-specific models is costly. Existing works typically train a unified model across multiple domains, but such approaches lack adaptation and scalability since not all training data benefits target domains and scaling trained models remains challenging. We propose the SaM framework, which dynamically Selects and Merges expert models at inference time. Specifically, for a target domain, we select domain-specific experts pre-trained on existing domains based on (i) domain similarity to the target domain and (ii) performance on sampled instances, respectively. The experts are then merged to create task-specific models optimized for the target domain. By dynamically merging experts beneficial to target domains, we improve generalization across various domains without extra training. Additionally, experts can be added or removed conveniently, leading to great scalability. Extensive experiments on multiple benchmarks demonstrate our framework’s effectiveness, which outperforms the unified model by an average of 10%. We further provide insights into potential improvements, practical experience, and extensions of our framework.
pdf
bib
abs
Multi-level Association Refinement Network for Dialogue Aspect-based Sentiment Quadruple Analysis
Zeliang Tong
|
Wei Wei
|
Xiaoye Qu
|
Rikui Huang
|
Zhixin Chen
|
Xingyu Yan
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Dialogue Aspect-based Sentiment Quadruple (DiaASQ) analysis aims to identify all quadruples (i.e., target, aspect, opinion, sentiment) from the dialogue. This task is challenging as different elements within a quadruple may manifest in different utterances, requiring precise handling of associations at both the utterance and word levels. However, most existing methods tackling it predominantly leverage predefined dialogue structure (e.g., reply) and word semantics, resulting in a surficial understanding of the deep sentiment association between utterances and words. In this paper, we propose a novel Multi-level Association Refinement Network (MARN) designed to achieve more accurate and comprehensive sentiment associations between utterances and words. Specifically, for utterances, we dynamically capture their associations with enriched semantic features through a holistic understanding of the dialogue, aligning them more closely with sentiment associations within elements in quadruples. For words, we develop a novel cross-utterance syntax parser (CU-Parser) that fully exploits syntactic information to enhance the association between word pairs within and across utterances. Moreover, to address the scarcity of labeled data in DiaASQ, we further introduce a multi-view data augmentation strategy to enhance the performance of MARN under low-resource conditions. Experimental results demonstrate that MARN achieves state-of-the-art performance and maintains robustness even under low-resource conditions.
pdf
bib
abs
Cooperative or Competitive? Understanding the Interaction between Attention Heads From A Game Theory Perspective
Xiaoye Qu
|
Zengqi Yu
|
Dongrui Liu
|
Wei Wei
|
Daizong Liu
|
Jianfeng Dong
|
Yu Cheng
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Despite the remarkable success of attention-based large language models (LLMs), the precise interaction mechanisms between attention heads remain poorly understood. In contrast to prevalent methods that focus on individual head contributions, we rigorously analyze the intricate interplay among attention heads through a novel framework based on the Harsanyi dividend, a concept from cooperative game theory. Our analysis reveals that significant positive Harsanyi dividends are sparsely distributed across head combinations, indicating that most heads do not contribute cooperatively. Moreover, certain head combinations exhibit negative dividends, indicating implicit competitive relationships. To further optimize the interactions among attention heads, we propose a training-free Game-theoretic Attention Calibration (GAC) method. Specifically, GAC selectively retains heads demonstrating significant cooperative gains and applies fine-grained distributional adjustments to the remaining heads. Comprehensive experiments across 17 benchmarks demonstrate the effectiveness of our proposed GAC and its superior generalization capabilities across diverse model families, scales, and modalities. Crucially, the discovered interaction phenomena offer a path toward a deeper understanding of the behaviors of LLMs.
pdf
bib
abs
GRAT: Guiding Retrieval-Augmented Reasoning through Process Rewards Tree Search
Xianshu Peng
|
Wei Wei
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Enhancing large models for complex multi-hop question-answering has become a research focus in the Retrieval-augmented generation (RAG) area. Many existing approaches aim to mimic human thought processes by enabling large models to perform retrieval-augmented generation step by step. However, these methods can only perform single chain reasoning, which lacks the ability for multi-path exploration, strategic look-ahead, stepwise evaluation, and global selection. In addition, to effectively decompose complex problems, these methods can only rely on labor-intensive intermediate annotations for supervised fine-tuning. To address these issues, we propose GRAT, an algorithm guided by Monte Carlo Tree Search (MCTS) and process rewards. GRAT not only enables self-evaluation and self-correction but also assigns fine-grained rewards to each intermediate step in the search path. These fine-grained annotations can be used for model self-training, which enables GRAT to continuously self-update its problem analysis and reasoning capabilities. We conducted experiments on four multihop QA datasets: HotPotQA, 2WikiMultiHopQA, MuSiQue, and Bamboogle, demonstrating that GRAT outperforms various RAG-based methods. Additionally, incorporating self-training significantly enhances GRAT’s reasoning performance.
pdf
bib
abs
Zero-Shot Cross-Domain Aspect-Based Sentiment Analysis via Domain-Contextualized Chain-of-Thought Reasoning
Chuming Shen
|
Wei Wei
|
Dong Wang
|
Zhong-Hao Wang
Findings of the Association for Computational Linguistics: EMNLP 2025
Cross-domain aspect-based sentiment analysis (ABSA) aims at learning specific knowledge from a source domain to perform various ABSA tasks on a target domain. Recent works mainly focus on how to use domain adaptation techniques to transfer the domain-agnostic features from the labeled source domain to the unlabeled target domain. However, it would be unwise to manually collect a large number of unlabeled data from the target domain, where such data may not be available owing to the facts like data security concerns in banking or insurance. To alleviate this issue, we propose ZeroABSA, a unified zero-shot learning framework for cross-domain ABSA that effectively eliminates dependency on target-domain annotations. Specifically, ZeroABSA consists of two novel components, namely, (1) A hybrid data augmentation module leverages large language models (LLMs) to synthesize high-quality, domain-adaptive target-domain data, by evaluating the generated samples across vocabulary richness, semantic coherence and sentiment/domain consistency, followed by iterative refinement; (2) A domain-aware chain-of-thought (COT) prompting strategy trains models on augmented data while explicitly modeling domain-invariant reasoning to bridge the well-known cross-domain gap. Extensive evaluations across four diverse domains demonstrate that ZeroABSA surpasses the-state-of-the-arts, which effectively advances the practicality of cross-domain ABSA in real-world scenarios where labeled target-domain data is unavailable.
2024
pdf
bib
abs
Personalized Topic Selection Model for Topic-Grounded Dialogue
Shixuan Fan
|
Wei Wei
|
Xiaofei Wen
|
Xian-Ling Mao
|
Jixiong Chen
|
Dangyang Chen
Findings of the Association for Computational Linguistics: ACL 2024
Recently, the topic-grounded dialogue (TGD) system has become increasingly popular as its powerful capability to actively guide users to accomplish specific tasks through topic-guided conversations. Most existing works utilize side information (e.g. topics or personas) in isolation to enhance the topic selection ability. However, due to disregarding the noise within these auxiliary information sources and their mutual influence, current models tend to predict user-uninteresting and contextually irrelevant topics. To build user-engaging and coherent dialogue agent, we propose a personalized topic selection model for topic-grounded dialogue, named PETD, which takes account of the interaction of side information to selectively aggregate such information for more accurately predicting subsequent topics. Specifically, we evaluate the correlation between global topics and personas and selectively incorporate the global topics aligned with user personas. Furthermore, we propose a contrastive learning based persona selector to filter relevant personas under the constraint of lacking pertinent persona annotations. Throughout the selection and generation, diverse relevant side information is considered. Extensive experiments demonstrate that our proposed method can generate engaging and diverse responses, outperforming state-of-the-art baselines across various evaluation metrics.
pdf
bib
abs
Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models
Zhenyi Lu
|
Jie Tian
|
Wei Wei
|
Xiaoye Qu
|
Yu Cheng
|
Wenfeng Xie
|
Dangyang Chen
Findings of the Association for Computational Linguistics: ACL 2024
Text classification is a crucial task encountered frequently in practical scenarios, yet it is still under-explored in the era of large language models (LLMs). This study shows that LLMs are vulnerable to changes in the number and arrangement of options in text classification. Our extensive empirical analyses reveal that the key bottleneck arises from ambiguous decision boundaries and inherent biases towards specific tokens and positions.To mitigate these issues, we make the first attempt and propose a novel two-stage classification framework for LLMs. Our approach is grounded in the empirical observation that pairwise comparisons can effectively alleviate boundary ambiguity and inherent bias. Specifically, we begin with a self-reduction technique to efficiently narrow down numerous options, which contributes to reduced decision space and a faster comparison process. Subsequently, pairwise contrastive comparisons are employed in a chain-of-thought manner to draw out nuances and distinguish confusable options, thus refining the ambiguous decision boundary.Extensive experiments on four datasets (Banking77, HWU64, LIU54, and Clinic150) verify the effectiveness of our framework. Furthermore, benefitting from our framework, various LLMs can achieve consistent improvements. Our code and data are available in
https://github.com/Chuge0335/PC-CoT.
pdf
bib
Modeling Historical Relevant and Local Frequency Context for Representation-Based Temporal Knowledge Graph Forecasting
Shengzhe Zhang
|
Wei Wei
|
Rikui Huang
|
Wenfeng Xie
|
Dangyang Chen
Findings of the Association for Computational Linguistics: EMNLP 2024
pdf
bib
abs
CNEQ: Incorporating numbers into Knowledge Graph Reasoning
Xianshu Peng
|
Wei Wei
|
Kaihe Xu
|
Dangyang Chen
Findings of the Association for Computational Linguistics: EMNLP 2024
Complex logical reasoning over knowledge graphs lies at the heart of many semantic downstream applications and thus has been extensively explored in recent years. However, nearly all of them overlook the rich semantics of numerical entities (e.g., magnitude, unit, and distribution) and are simply treated as common entities, or even directly removed. It may severely hinder the performance of answering queries involving numerical comparison or numerical computation. To address this issue, we propose the Complex Number and Entity Query model (CNEQ), which comprises a Number-Entity Predictor and an Entity Filter. The Number-Entity Predictor can independently learn the structural and semantic features of entities and numerical values, thereby enabling better prediction of entities as well as numerical values. The Entity Filter can compare or calculate numerical values to filter out entities that meet certain numerical constraints. To evaluate our model, we generated a variety of multi-hop complex logical queries including numerical values on three widely-used Knowledge Graphs: FB15K, DB15K, and YAGO15K. Experimental results demonstrate that CNEQ achieves state-of-the-art results.
pdf
bib
abs
CCIIPLab at SIGHAN-2024 dimABSA Task: Contrastive Learning-Enhanced Span-based Framework for Chinese Dimensional Aspect-Based Sentiment Analysis
Zeliang Tong
|
Wei Wei
Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10)
This paper describes our system and findings for SIGHAN-2024 Shared Task Chinese Dimensional Aspect-Based Sentiment Analysis (dimABSA). Our team CCIIPLab proposes an Contrastive Learning-Enhanced Span-based (CL-Span) framework to boost the performance of extracting triplets/quadruples and predicting sentiment intensity. We first employ a span-based framework that integrates contextual representations and incorporates rotary position embedding. This approach fully considers the relational information of entire aspect and opinion terms, and enhancing the model’s understanding of the associations between tokens. Additionally, we utilize contrastive learning to predict sentiment intensities in the valence-arousal dimensions with greater precision. To improve the generalization ability of the model, additional datasets are used to assist training. Experiments have validated the effectiveness of our approach. In the official test results, our system ranked 2nd among the three subtasks.
2023
pdf
bib
abs
TREA: Tree-Structure Reasoning Schema for Conversational Recommendation
Wendi Li
|
Wei Wei
|
Xiaoye Qu
|
Xian-Ling Mao
|
Ye Yuan
|
Wenfeng Xie
|
Dangyang Chen
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Conversational recommender systems (CRS) aim to timely trace the dynamic interests of users through dialogues and generate relevant responses for item recommendations. Recently, various external knowledge bases (especially knowledge graphs) are incorporated into CRS to enhance the understanding of conversation contexts. However, recent reasoning-based models heavily rely on simplified structures such as linear structures or fixed-hierarchical structures for causality reasoning, hence they cannot fully figure out sophisticated relationships among utterances with external knowledge. To address this, we propose a novel Tree structure Reasoning schEmA named TREA. TREA constructs a multi-hierarchical scalable tree as the reasoning structure to clarify the causal relationships between mentioned entities, and fully utilizes historical conversations to generate more reasonable and suitable responses for recommended results. Extensive experiments on two public CRS datasets have demonstrated the effectiveness of our approach.
pdf
bib
abs
Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial Attack Framework
Lifan Yuan
|
YiChi Zhang
|
Yangyi Chen
|
Wei Wei
Findings of the Association for Computational Linguistics: ACL 2023
Despite recent success on various tasks, deep learning techniques still perform poorly on adversarial examples with small perturbations. While optimization-based methods for adversarial attacks are well-explored in the field of computer vision, it is impractical to directly apply them in natural language processing due to the discrete nature of the text. To address the problem, we propose a unified framework to extend the existing optimization-based adversarial attack methods in the vision domain to craft textual adversarial samples. In this framework, continuously optimized perturbations are added to the embedding layer and amplified in the forward propagation process. Then the final perturbed latent representations are decoded with a masked language model head to obtain potential adversarial samples. In this paper, we instantiate our framework with an attack algorithm named Textual Projected Gradient Descent (T-PGD). We find our algorithm effective even using proxy gradient information. Therefore, we perform the more challenging transfer black-box attack and conduct comprehensive experiments to evaluate our attack algorithm with several models on three benchmark datasets. Experimental results demonstrate that our method achieves overall better performance and produces more fluent and grammatical adversarial samples compared to strong baseline methods. The code and data are available at
https://github.com/Phantivia/T-PGD.
pdf
bib
abs
AttenWalker: Unsupervised Long-Document Question Answering via Attention-based Graph Walking
Yuxiang Nie
|
Heyan Huang
|
Wei Wei
|
Xian-Ling Mao
Findings of the Association for Computational Linguistics: ACL 2023
Annotating long-document question answering (long-document QA) pairs is time-consuming and expensive. To alleviate the problem, it might be possible to generate long-document QA pairs via unsupervised question answering (UQA) methods. However, existing UQA tasks are based on short documents, and can hardly incorporate long-range information. To tackle the problem, we propose a new task, named unsupervised long-document question answering (ULQA), aiming to generate high-quality long-document QA instances in an unsupervised manner. Besides, we propose AttenWalker, a novel unsupervised method to aggregate and generate answers with long-range dependency so as to construct long-document QA pairs. Specifically, AttenWalker is composed of three modules, i.e. span collector, span linker and answer aggregator. Firstly, the span collector takes advantage of constituent parsing and reconstruction loss to select informative candidate spans for constructing answers. Secondly, with the help of the attention graph of a pre-trained long-document model, potentially interrelated text spans (that might be far apart) could be linked together via an attention-walking algorithm. Thirdly, in the answer aggregator, linked spans are aggregated into the final answer via the mask-filling ability of a pre-trained model. Extensive experiments show that AttenWalker outperforms previous methods on NarrativeQA and Qasper. In addition, AttenWalker also shows strong performance in the few-shot learning setting.
2022
pdf
bib
abs
Cross-Lingual Phrase Retrieval
Heqi Zheng
|
Xiao Zhang
|
Zewen Chi
|
Heyan Huang
|
Yan Tan
|
Tian Lan
|
Wei Wei
|
Xian-Ling Mao
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Cross-lingual retrieval aims to retrieve relevant text across languages. Current methods typically achieve cross-lingual retrieval by learning language-agnostic text representations in word or sentence level. However, how to learn phrase representations for cross-lingual phrase retrieval is still an open problem. In this paper, we propose , a cross-lingual phrase retriever that extracts phrase representations from unlabeled example sentences. Moreover, we create a large-scale cross-lingual phrase retrieval dataset, which contains 65K bilingual phrase pairs and 4.2M example sentences in 8 English-centric language pairs. Experimental results show that outperforms state-of-the-art baselines which utilize word-level or sentence-level representations. also shows impressive zero-shot transferability that enables the model to perform retrieval in an unseen language pair during training. Our dataset, code, and trained models are publicly available at github.com/cwszz/XPR/.
pdf
bib
abs
BiSyn-GAT+: Bi-Syntax Aware Graph Attention Network for Aspect-based Sentiment Analysis
Shuo Liang
|
Wei Wei
|
Xian-Ling Mao
|
Fei Wang
|
Zhiyong He
Findings of the Association for Computational Linguistics: ACL 2022
Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task that aims to align aspects and corresponding sentiments for aspect-specific sentiment polarity inference. It is challenging because a sentence may contain multiple aspects or complicated (e.g., conditional, coordinating, or adversative) relations. Recently, exploiting dependency syntax information with graph neural networks has been the most popular trend. Despite its success, methods that heavily rely on the dependency tree pose challenges in accurately modeling the alignment of the aspects and their words indicative of sentiment, since the dependency tree may provide noisy signals of unrelated associations (e.g., the “conj” relation between “great” and “dreadful” in Figure 2). In this paper, to alleviate this problem, we propose a Bi-Syntax aware Graph Attention Network (BiSyn-GAT+). Specifically, BiSyn-GAT+ fully exploits the syntax information (e.g., phrase segmentation and hierarchical structure) of the constituent tree of a sentence to model the sentiment-aware context of every single aspect (called intra-context) and the sentiment relations across aspects (called inter-context) for learning. Experiments on four benchmark datasets demonstrate that BiSyn-GAT+ outperforms the state-of-the-art methods consistently.