Dongyu Zhang
2025
Rhetorical Device-Aware Sarcasm Detection with Counterfactual Data Augmentation
Qingqing Hong | Dongyu Zhang | Jiayi Lin | Dapeng Yin | Shuyue Zhu | Junli Wang
Findings of the Association for Computational Linguistics: ACL 2025
Sarcasm is a complex form of sentiment expression widely used in daily human life. Previous work primarily defines sarcasm as a form of verbal irony, which covers only a subset of real-world sarcastic expressions. However, sarcasm serves multifaceted functions and manifests itself through various rhetorical devices, such as echoic mention, rhetorical questions, and hyperbole. To fully capture its complexity, this paper investigates fine-grained sarcasm classification through the lens of rhetorical devices, and introduces RedSD, a RhEtorical Device-Aware Sarcasm Dataset with counterfactually augmented data. To construct the dataset, we extract sarcastic dialogues from situation comedies (i.e., sitcoms) and summarize nine rhetorical devices commonly employed in sarcasm. We then propose a rhetorical device-aware counterfactual data generation pipeline facilitated by both Large Language Models (LLMs) and human revision. Additionally, we propose duplex counterfactual augmentation, which generates counterfactuals for both sarcastic and non-sarcastic dialogues, to further enhance the scale and diversity of the dataset. Experimental results on the dataset demonstrate that fine-tuned models exhibit more balanced performance compared to zero-shot models, including GPT-3.5 and LLaMA 3.1, underscoring the importance of integrating various rhetorical devices in sarcasm detection. Our dataset is available at https://github.com/qqHong73/RedSD.
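The duplex scheme described above can be sketched as a small data-processing loop. This is a minimal, hypothetical illustration: `rewrite_fn` stands in for the LLM-plus-human-revision step from the abstract, and the `(text, is_sarcastic)` tuple format is an assumption, not the RedSD schema.

```python
def duplex_augment(dialogues, rewrite_fn):
    """Sketch of duplex counterfactual augmentation: every dialogue,
    sarcastic or not, is paired with a counterfactual carrying the
    flipped label. `rewrite_fn` is a placeholder for the rhetorical
    device-aware generation pipeline (LLM generation + human revision).
    """
    augmented = []
    for text, is_sarcastic in dialogues:
        # keep the original example
        augmented.append((text, is_sarcastic))
        # generate a counterfactual with the opposite sarcasm label
        counterfactual = rewrite_fn(text, target_sarcastic=not is_sarcastic)
        augmented.append((counterfactual, not is_sarcastic))
    return augmented
```

Applied to a mixed set of dialogues, this doubles the dataset while keeping the label distribution balanced, since each original contributes one example of each class.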
Bold Claims or Self-Doubt? Factuality Hallucination Type Detection via Belief State
Dongyu Zhang | Qingqing Hong | Bingxuan Hou | Jiayi Lin | Chenyang Zhang | Jialin Li | Junli Wang
Findings of the Association for Computational Linguistics: EMNLP 2025
Large language models are prone to generating hallucinations that deviate from factual information. Existing studies mainly focus on detecting the presence of hallucinations but lack a systematic classification approach, which hinders deeper exploration of their characteristics. To address this, we introduce the concept of belief state, which quantifies the model’s confidence in its own responses. We define the belief state of the model based on self-consistency, leveraging answer repetition rates to label confident and uncertain states. Based on this, we categorize factuality hallucination into two types: Overconfident Hallucination and Unaware Hallucination. Furthermore, we propose BAFH, a factuality hallucination type detection method. By training a classifier on the model’s hidden states, we establish a link between hidden states and belief states, enabling efficient and automatic hallucination type detection. Experimental results demonstrate the effectiveness of BAFH and the differences between hallucination types.
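The labeling step described above can be sketched in a few lines. This is a hedged approximation, not the paper's procedure: the repetition-rate threshold of 0.7 and the exact mapping from belief state to hallucination type are assumptions made for illustration.

```python
from collections import Counter


def belief_state(answers, threshold=0.7):
    """Label a belief state from repeated sampled answers.

    Approximates the self-consistency labeling in the abstract: the
    repetition rate of the most common answer decides 'confident' vs
    'uncertain'. The threshold value is an assumed parameter.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    top_count = Counter(answers).most_common(1)[0][1]
    repetition_rate = top_count / len(answers)
    return "confident" if repetition_rate >= threshold else "uncertain"


def hallucination_type(belief, is_factual):
    """Map belief state plus factuality to the two types in the
    abstract; a factually correct answer is not a hallucination."""
    if is_factual:
        return None
    return "overconfident" if belief == "confident" else "unaware"
```

BAFH itself replaces the sampling step with a classifier over hidden states, so repeated generation is only needed to produce training labels, not at detection time.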
Co-authors
- Qingqing Hong 2
- Jiayi Lin 2
- Junli Wang 2
- Bingxuan Hou 1
- Jialin Li 1