Qingqing Hong
2025
Disentangle to Decay: Linear Attention with Trainable Decay Factor
Haibo Tong | Chenyang Zhang | Jiayi Lin | Bingxuan Hou | Qingqing Hong | Junli Wang
Proceedings of the 31st International Conference on Computational Linguistics
Linear attention improves the inference efficiency of the Transformer and has attracted research interest as an efficient backbone for language models. Existing linear-attention-based models usually adopt decay-factor-based positional encoding (PE), where attention scores decay exponentially with increasing relative distance. However, most work manually designs a non-trainable decay factor for the exponential computation, which limits further optimization. Our analysis reveals that directly training the decay factor is unstable because of its large gradients. To address this, we propose a novel PE for linear attention named Disentangle to Decay (D2D). D2D disentangles the decay factor into two parts to enable further optimization with stable training. Moreover, D2D can be transformed into a recurrent form for efficient inference. Experiments demonstrate that D2D trains the decay factor stably and enhances the performance of linear attention in both normal-context-length and length-extrapolation scenarios.
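For context, the recurrent form the abstract refers to is the standard decay-based linear-attention recurrence; a minimal PyTorch sketch follows. The decay parameterization shown here, a fixed non-trainable base scaled by a small bounded trainable residual, is an illustrative assumption about how a trainable decay might be kept stable, not the paper's actual disentanglement.

```python
# Minimal sketch of decay-based linear attention in its recurrent form.
# The decay parameterization (fixed base times a small, bounded, trainable
# correction) is an illustrative guess at stabilizing a trainable decay;
# it is NOT the exact D2D disentanglement from the paper.
import torch
import torch.nn as nn


class DecayLinearAttention(nn.Module):
    def __init__(self, d_model: int, base_decay: float = 0.99):
        super().__init__()
        self.q = nn.Linear(d_model, d_model, bias=False)
        self.k = nn.Linear(d_model, d_model, bias=False)
        self.v = nn.Linear(d_model, d_model, bias=False)
        # Fixed, hand-designed part of the decay factor (non-trainable).
        self.register_buffer("base_decay", torch.tensor(base_decay))
        # Trainable part, initialized to zero so training starts exactly at
        # the hand-designed decay; tanh keeps it bounded, damping gradients.
        self.delta = nn.Parameter(torch.zeros(d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.q(x), self.k(x), self.v(x)
        decay = self.base_decay * (1.0 + 0.01 * torch.tanh(self.delta))  # (d,)
        state = x.new_zeros(x.size(0), x.size(-1), x.size(-1))  # (batch, d, d)
        outputs = []
        for t in range(x.size(1)):
            # Older key-value pairs are down-weighted by `decay` at every
            # step, so scores decay exponentially with relative distance.
            kv = k[:, t].unsqueeze(-1) * v[:, t].unsqueeze(-2)  # (batch, d, d)
            state = decay.unsqueeze(-1) * state + kv
            outputs.append(q[:, t].unsqueeze(-2) @ state)       # (batch, 1, d)
        return torch.cat(outputs, dim=1)
```

At initialization the decay equals the hand-designed constant, and gradients flow only through the bounded residual, which is one plausible way to sidestep the gradient instability the abstract describes.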
Rhetorical Device-Aware Sarcasm Detection with Counterfactual Data Augmentation
Qingqing Hong | Dongyu Zhang | Jiayi Lin | Dapeng Yin | Shuyue Zhu | Junli Wang
Findings of the Association for Computational Linguistics: ACL 2025
Sarcasm is a complex form of sentiment expression widely used in daily life. Previous work primarily defines sarcasm as a form of verbal irony, which covers only a subset of real-world sarcastic expressions. However, sarcasm serves multifaceted functions and manifests itself through various rhetorical devices, such as echoic mention, rhetorical questions, and hyperbole. To fully capture its complexity, this paper investigates fine-grained sarcasm classification through the lens of rhetorical devices and introduces RedSD, a RhEtorical Device-Aware Sarcasm Dataset with counterfactually augmented data. To construct the dataset, we extract sarcastic dialogues from situation comedies (i.e., sitcoms) and summarize nine rhetorical devices commonly employed in sarcasm. We then propose a rhetorical device-aware counterfactual data generation pipeline facilitated by both Large Language Models (LLMs) and human revision. Additionally, we propose duplex counterfactual augmentation, which generates counterfactuals for both sarcastic and non-sarcastic dialogues, to further enhance the scale and diversity of the dataset. Experimental results on the dataset demonstrate that fine-tuned models exhibit more balanced performance than zero-shot models, including GPT-3.5 and LLaMA 3.1, underscoring the importance of integrating various rhetorical devices in sarcasm detection. Our dataset is available at https://github.com/qqHong73/RedSD.
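A minimal sketch of what duplex counterfactual augmentation could look like: every dialogue, sarcastic or not, is paired with a label-flipped rewrite. The prompt wording and the `rewrite` callable, standing in for the LLM-plus-human-revision step, are hypothetical, and only a subset of the nine devices appears.

```python
# Hypothetical sketch of duplex counterfactual augmentation: counterfactuals
# are generated for BOTH sarcastic and non-sarcastic dialogues. The prompts
# and the `rewrite` callable (an LLM call plus human revision in the paper's
# pipeline) are illustrative assumptions, not the released code.
from typing import Callable, List, Tuple

Example = Tuple[str, str, str]  # (dialogue, label, rhetorical_device)


def duplex_augment(data: List[Example],
                   rewrite: Callable[[str], str]) -> List[Tuple[str, str]]:
    augmented = []
    for dialogue, label, device in data:
        augmented.append((dialogue, label))
        if label == "sarcastic":
            # Flip sarcastic -> non-sarcastic by removing the device cue.
            prompt = (f"Rewrite the final reply to be sincere, removing the "
                      f"{device} while preserving the context:\n{dialogue}")
            augmented.append((rewrite(prompt), "non-sarcastic"))
        else:
            # Flip non-sarcastic -> sarcastic using a target device.
            prompt = (f"Rewrite the final reply to be sarcastic via "
                      f"{device}, keeping the context unchanged:\n{dialogue}")
            augmented.append((rewrite(prompt), "sarcastic"))
    return augmented
```

Because both label directions are flipped, the augmented set roughly doubles in size while staying balanced, which matches the stated goal of enhancing scale and diversity.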
2024
Correct after Answer: Enhancing Multi-Span Question Answering with Post-Processing Method
Jiayi Lin | Chenyang Zhang | Haibo Tong | Dongyu Zhang | Qingqing Hong | Bingxuan Hou | Junli Wang
Findings of the Association for Computational Linguistics: EMNLP 2024