Boyan Xu

2025

Dr.ECI: Infusing Large Language Models with Causal Knowledge for Decomposed Reasoning in Event Causality Identification
Ruichu Cai | Shengyin Yu | Jiahao Zhang | Wei Chen | Boyan Xu | Keli Zhang
Proceedings of the 31st International Conference on Computational Linguistics

Despite the demonstrated potential of Large Language Models (LLMs) in diverse NLP tasks, their causal reasoning capability appears inadequate when evaluated on the event causality identification (ECI) task. The ECI task poses significant complexity for LLMs and requires comprehensive causal priors for accurate identification. To improve the performance of LLMs in causal reasoning, we propose a multi-agent Decomposed reasoning framework for Event Causality Identification, designated as Dr.ECI. In the discovery stage, Dr.ECI incorporates specialized agents such as Causal Explorer and Mediator Detector, which capture implicit causality and indirect causality more effectively. In the reasoning stage, Dr.ECI introduces the agents Direct Reasoner and Indirect Reasoner, which leverage knowledge of the generalized causal structure specific to ECI. Extensive evaluations demonstrate the state-of-the-art performance of Dr.ECI compared with baselines based on LLMs and supervised training. Our implementation will be open-sourced at https://github.com/DMIRLAB-Group/Dr.ECI.

CACA: Context-Aware Cross-Attention Network for Extractive Aspect Sentiment Quad Prediction
Bingfeng Chen | Haoran Xu | Yongqi Luo | Boyan Xu | Ruichu Cai | Zhifeng Hao
Proceedings of the 31st International Conference on Computational Linguistics

Aspect Sentiment Quad Prediction (ASQP) enhances the scope of aspect-based sentiment analysis by introducing the necessity to predict both explicit and implicit aspect and opinion terms. Existing leading generative ASQP approaches do not model the contextual relationship of the review sentence to predict implicit terms. However, introducing contextual information into the pre-trained language model framework is non-trivial due to the inflexibility of the generative encoder-decoder architecture. To make good use of the contextual information, we propose an extractive ASQP framework, CACA, which features a Context-Aware Cross-Attention Network. When implicit terms are present, the Context-Aware Cross-Attention Network enhances the alignment of aspects and opinions through alternating updates of explicit and implicit representations. Additionally, contrastive learning is introduced in the implicit representation learning process. Experimental results on three benchmarks demonstrate the effectiveness of CACA. Our implementation will be open-sourced at https://github.com/DMIRLAB-Group/CACA.

𝒮²IT: Stepwise Syntax Integration Tuning for Large Language Models in Aspect Sentiment Quad Prediction
Bingfeng Chen | Chenjie Qiu | Yifeng Xie | Boyan Xu | Ruichu Cai | Zhifeng Hao
Findings of the Association for Computational Linguistics: NAACL 2025

Track-SQL: Enhancing Generative Language Models with Dual-Extractive Modules for Schema and Context Tracking in Multi-turn Text-to-SQL
Bingfeng Chen | Shaobin Shi | Yongqi Luo | Boyan Xu | Ruichu Cai | Zhifeng Hao
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Generative language models have shown significant potential in single-turn Text-to-SQL. However, their performance does not extend equivalently to multi-turn Text-to-SQL. This is primarily due to generative language models’ inadequacy in handling the complexities of context information and dynamic schema linking in multi-turn interactions. In this paper, we propose a framework named Track-SQL, which enhances generative language models with dual-extractive modules designed to track schema and contextual changes in multi-turn Text-to-SQL. Specifically, Track-SQL incorporates a Semantic-enhanced Schema Extractor and a Schema-aware Context Extractor. Experimental results demonstrate that Track-SQL achieves state-of-the-art performance on the SparC and CoSQL datasets. Furthermore, detailed ablation studies reveal that Track-SQL significantly improves execution accuracy in multi-turn interactions by 7.1% and 9.55% on these datasets, respectively. Our implementation will be open-sourced at https://github.com/DMIRLAB-Group/Track-SQL.

Handling Missing Entities in Zero-Shot Named Entity Recognition: Integrated Recall and Retrieval Augmentation
Ruichu Cai | Junhao Lu | Zhongjie Chen | Boyan Xu | Zhifeng Hao
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Zero-shot Named Entity Recognition (ZS-NER) aims to recognize entities in unseen domains without specific annotated data. A key challenge is handling missing entities while ensuring accurate type recognition, hindered by: 1) the pre-training assumption that each entity has a single type, overlooking diversity, and 2) insufficient contextual knowledge for type reasoning. To address this, we propose IRRA (Integrated Recall and Retrieval Augmentation), a novel two-stage framework leveraging large language model techniques. In the Recall Augmented Entity Extracting stage, we build a perturbed dataset to induce the model to produce missing or erroneous extracted entities. Based on this, we train an enhanced model to correct these errors. This approach can improve the recall of ZS-NER. In the Retrieval Augmented Type Correcting stage, we employ Retrieval-Augmented Generation techniques to locate entity-related unannotated contexts, with the additional contextual information significantly improving the accuracy of type correction. Extensive evaluations demonstrate the state-of-the-art performance of our IRRA, with significant improvements in zero-shot cross-domain settings validated through both auto-evaluated metrics and analysis. Our implementation will be open-sourced at https://github.com/DMIRLAB-Group/IRRA.

2024

S²GSL: Incorporating Segment to Syntactic Enhanced Graph Structure Learning for Aspect-based Sentiment Analysis
Bingfeng Chen | Qihan Ouyang | Yongqi Luo | Boyan Xu | Ruichu Cai | Zhifeng Hao
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Previous graph-based approaches in Aspect-based Sentiment Analysis (ABSA) have demonstrated impressive performance by utilizing graph neural networks and attention mechanisms to learn structures of static dependency trees and dynamic latent trees. However, incorporating both semantic and syntactic information simultaneously within complex global structures can introduce irrelevant contexts and syntactic dependencies during the process of graph structure learning, potentially resulting in inaccurate predictions. To address these issues, we propose S²GSL, incorporating Segment to Syntactic enhanced Graph Structure Learning for ABSA. Specifically, S²GSL features segment-aware semantic graph learning and syntax-based latent graph learning, which enable the removal of irrelevant contexts and dependencies, respectively. We further propose a self-adaptive aggregation network that facilitates the fusion of the two graph learning branches, thereby achieving complementarity across diverse structures. Experimental results on four benchmarks demonstrate the effectiveness of our framework.

2020

Existing leading code comment generation approaches with the structure-to-sequence framework ignore the type information of the interpretation of the code, e.g., operator, string, etc. However, introducing the type information into the existing framework is non-trivial due to the hierarchical dependence among the type information. To address these issues, we propose a Type Auxiliary Guiding encoder-decoder framework for the code comment generation task, which considers the source code as an N-ary tree with type information associated with each node. Specifically, our framework features a Type-associated Encoder and a Type-restricted Decoder, which enable adaptive summarization of the source code. We further propose a hierarchical reinforcement learning method to resolve the training difficulties of our proposed framework. Extensive evaluations demonstrate the state-of-the-art performance of our framework with both the auto-evaluated metrics and case studies.