2025
Disentangle to Decay: Linear Attention with Trainable Decay Factor
Haibo Tong | Chenyang Zhang | Jiayi Lin | Bingxuan Hou | Qingqing Hong | Junli Wang
Proceedings of the 31st International Conference on Computational Linguistics
Linear attention enhances the inference efficiency of Transformers and has attracted research interest as an efficient backbone for language models. Existing linear attention based models usually exploit decay-factor-based positional encoding (PE), where attention scores decay exponentially with increasing relative distance. However, most work manually designs a non-trainable decay factor for the exponential calculation, which limits further optimization. Our analysis reveals that directly training the decay factor is unstable because of large gradients. To address this, we propose a novel PE for linear attention named Disentangle to Decay (D2D). D2D disentangles the decay factor into two parts to achieve further optimization and stable training. Moreover, D2D can be transformed into a recurrent form for efficient inference. Experiments demonstrate that D2D achieves stable training of the decay factor, and enhances the performance of linear attention in both normal context length and length extrapolation scenarios.
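The abstract describes decay-factor-based PE, where scores fade exponentially with relative distance and the model admits a recurrent form. Below is a minimal PyTorch sketch of that baseline recurrence with a directly trainable per-head decay parameter; the function name, tensor shapes, and the sigmoid parameterization are illustrative assumptions, and the D2D disentanglement itself is not reproduced here.

import torch

def decayed_linear_attention(q, k, v, log_decay):
    """Recurrent linear attention with a trainable per-head decay factor.

    q, k, v: (batch, seq_len, heads, dim); log_decay: (heads,) trainable
    parameter. decay = sigmoid(log_decay) keeps the factor in (0, 1), so a
    key contributes decay**(t - s) to the score at step t, i.e. attention
    decays exponentially with relative distance.
    """
    decay = torch.sigmoid(log_decay)                 # (heads,)
    B, T, H, D = q.shape
    state = q.new_zeros(B, H, D, D)                  # running sum of k^T v
    outputs = []
    for t in range(T):
        kv = torch.einsum("bhd,bhe->bhde", k[:, t], v[:, t])
        state = decay.view(1, H, 1, 1) * state + kv  # decayed recurrent update
        outputs.append(torch.einsum("bhd,bhde->bhe", q[:, t], state))
    return torch.stack(outputs, dim=1)               # (batch, seq_len, heads, dim)

Training log_decay directly in such a recurrence is what the abstract reports as unstable due to large gradients; D2D's contribution is the disentangled, stably trainable parameterization.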
Bold Claims or Self-Doubt? Factuality Hallucination Type Detection via Belief State
Dongyu Zhang | Qingqing Hong | Bingxuan Hou | Jiayi Lin | Chenyang Zhang | Jialin Li | Junli Wang
Findings of the Association for Computational Linguistics: EMNLP 2025
Large language models are prone to generating hallucinations that deviate from factual information. Existing studies mainly focus on detecting the presence of hallucinations but lack a systematic classification approach, which hinders deeper exploration of their characteristics. To address this, we introduce the concept of belief state, which quantifies the model’s confidence in its own responses. We define the belief state of the model based on self-consistency, leveraging answer repetition rates to label confident and uncertain states. Based on this, we categorize factuality hallucinations into two types: Overconfident Hallucination and Unaware Hallucination. Furthermore, we propose BAFH, a factuality hallucination type detection method. By training a classifier on the model’s hidden states, we establish a link between hidden states and belief states, enabling efficient and automatic hallucination type detection. Experimental results demonstrate the effectiveness of BAFH and the differences between hallucination types.
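The belief-state labeling described in the abstract relies on self-consistency over repeated samples. A small sketch of that labeling step is shown below; the 0.7 repetition-rate threshold and the mapping of incorrect answers to the two hallucination types are assumptions for illustration, not values or rules taken from the paper.

from collections import Counter

def label_belief_state(sampled_answers, threshold=0.7):
    """Label the model's belief state from self-consistency.

    sampled_answers: answers sampled repeatedly for the same question.
    The repetition rate of the most frequent answer decides the state;
    the threshold here is an illustrative choice.
    """
    top_answer, top_count = Counter(sampled_answers).most_common(1)[0]
    repetition_rate = top_count / len(sampled_answers)
    state = "confident" if repetition_rate >= threshold else "uncertain"
    return top_answer, state

def hallucination_type(answer_is_correct, state):
    """Map an incorrect answer to a hallucination type (illustrative reading)."""
    if answer_is_correct:
        return "no hallucination"
    return "Overconfident Hallucination" if state == "confident" else "Unaware Hallucination"

BAFH itself then trains a classifier on the model's hidden states to predict these belief states without repeated sampling.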
REGULAR: A Framework for Relation-Guided Multi-Span Question Generation
Jiayi Lin | Chenyang Zhang | Bingxuan Hou | Dongyu Zhang | Qingqing Hong | Junli Wang
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
To alleviate the high cost of manually annotating Question Answering (QA) datasets, Question Generation (QG) asks a model to generate a question related to a given answer and passage. This work primarily focuses on Multi-Span Question Generation (MSQG), where the generated question corresponds to multiple candidate answers. Existing QG methods may not suit MSQG as they typically overlook the correlation between the candidate answers and generate trivial questions, which limits the quality of the synthetic datasets. Based on the observation that relevant entities typically share the same relationship with the same entity, we propose REGULAR, a framework of RElation-GUided MuLti-SpAn Question GeneRation. REGULAR first converts passages into relation graphs and extracts candidate answers from the relation graphs. Then, REGULAR utilizes a QG model to generate a set of candidate questions and a QA model to obtain the best question. We construct a dataset of over 100,000 questions from Wikipedia corpora, named REGULAR-WIKI, and conduct experiments to compare it with other synthetic QA datasets. The experimental results show that models trained with REGULAR-WIKI achieve the best performance. We also conduct ablation studies and statistical analysis to verify the quality of our synthetic dataset. Our code and data are available at https://github.com/PluseLin/REGULAR.
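The pipeline in the abstract (relation graph, candidate answers, QG candidates, QA-based selection) can be outlined in a few lines. The sketch below is a rough reading of that flow; relation_extractor, qg_model, and qa_model are placeholder callables whose interfaces are assumptions made only for illustration.

def regular_pipeline(passage, relation_extractor, qg_model, qa_model):
    """Illustrative outline of the flow described in the REGULAR abstract."""
    # 1. Convert the passage into (head, relation, tail) triples, i.e. a relation graph.
    triples = relation_extractor(passage)

    # 2. Entities sharing the same relation with the same entity form a
    #    multi-span candidate answer set (the observation in the abstract).
    grouped = {}
    for head, relation, tail in triples:
        grouped.setdefault((head, relation), []).append(tail)
    multi_span_answers = {k: v for k, v in grouped.items() if len(v) > 1}

    questions = []
    for (head, relation), answers in multi_span_answers.items():
        # 3. The QG model proposes several candidate questions for the answer set.
        candidate_questions = qg_model(passage, answers)
        # 4. The QA model scores each candidate; keep the question whose
        #    predicted answers best match the candidate answer set.
        best = max(candidate_questions, key=lambda q: qa_model(passage, q, answers))
        questions.append({"question": best, "answers": answers})
    return questions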
A Lightweight Multi Aspect Controlled Text Generation Solution For Large Language Models
Chenyang Zhang | Jiayi Lin | Haibo Tong | Bingxuan Hou | Dongyu Zhang | Jialin Li | Junli Wang
Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025)
Multi-Aspect Controllable Text Generation (MCTG) introduces fine-grained multiple constraints in natural language generation, i.e., control attributes for topics, sentiments, and detoxification. MCTG demonstrates application prospects for trustworthy generation with Large Language Models (LLMs) but is limited by generalization issues. Existing work exploits additional structures and strategies as solutions, requiring modifications to the LLMs. To activate LLMs’ MCTG ability, we propose a lightweight MCTG pipeline based on data augmentation and instruction tuning. We analyze aspect bias and correlations in traditional datasets and address these concerns with augmented control attributes and sentences. The augmented datasets are feasible for instruction tuning. We conduct experiments across various LLM backbones and parameter sizes, demonstrating general effectiveness on MCTG performance.
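One way to picture the data-augmentation step the abstract mentions is to turn single-aspect labeled sentences into multi-aspect instruction-tuning examples. The sketch below is only an illustration of that idea, assuming a simple sample format and instruction template that are not taken from the paper.

import itertools
import random

def build_mctg_instruction_data(samples, aspects=("topic", "sentiment", "detoxification")):
    """Build multi-aspect instruction-tuning examples (illustrative sketch).

    samples: list of dicts with a 'text' field plus whatever aspect labels
    are available, e.g. {"text": ..., "topic": "sports", "sentiment": "positive"}.
    Combining attributes from several aspects into one instruction stands in
    for the augmentation described in the abstract; the template is an assumption.
    """
    examples = []
    for sample in samples:
        present = [a for a in aspects if a in sample]
        # Pair attributes across aspects instead of conditioning on one aspect alone.
        for r in range(1, len(present) + 1):
            for combo in itertools.combinations(present, r):
                constraints = ", ".join(f"{a}: {sample[a]}" for a in combo)
                examples.append({
                    "instruction": f"Write a sentence satisfying these constraints ({constraints}).",
                    "output": sample["text"],
                })
    random.shuffle(examples)
    return examples

The resulting examples can then be fed to standard instruction tuning without modifying the LLM architecture, which is the lightweight aspect the abstract emphasizes.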
2024
Correct after Answer: Enhancing Multi-Span Question Answering with Post-Processing Method
Jiayi Lin | Chenyang Zhang | Haibo Tong | Dongyu Zhang | Qingqing Hong | Bingxuan Hou | Junli Wang
Findings of the Association for Computational Linguistics: EMNLP 2024