Cunhang Fan
2026
RTCFake: Speech Deepfake Detection in Real-Time Communication
Jun Xue | Zhuolin Yi | Yihuan Huang | Yanzhen Ren | Yujie Chen | Cunhang Fan | Zicheng Su | Yongcheng Zhang | Bo Cai
Findings of the Association for Computational Linguistics: ACL 2026
With the rapid advancement of speech generation technologies, the threat posed by speech deepfakes in real-time communication (RTC) scenarios has intensified. However, existing detection studies mainly focus on offline simulations and struggle to cope with the complex distortions introduced during RTC transmission, including unknown speech enhancement processes (e.g., noise suppression) and codec compression. To address this challenge, we present the first large-scale speech deepfake dataset tailored for RTC scenarios, termed RTCFake, totaling approximately 600 hours. The dataset is constructed by transmitting speech through multiple mainstream social media and conferencing platforms (e.g., Zoom), enabling precise pairing between offline and online speech. In addition, we propose a phoneme-guided consistency learning (PCL) strategy that encourages models to learn platform-invariant semantic structural representations. The RTCFake dataset is divided into training, development, and evaluation sets. The evaluation set further includes both unseen RTC platforms and unseen complex noise conditions, thereby providing a more realistic and challenging evaluation benchmark for speech deepfake detection. Furthermore, the proposed PCL strategy achieves significant improvements in both cross-platform generalization and noise robustness, offering an effective and generalizable modeling paradigm.
ReFL: Reflective Feedback Learning for Hallucination Detection of Large Language Models
Cunhang Fan | Jun Zhang | Xue Zhang | Shuai Zhang | Zhao Lv | Jianhua Tao | Zhengqi Wen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) often generate factually incorrect content, known as “hallucinations”, which undermines the reliability and safety of their outputs. Existing hallucination detection methods either depend on external knowledge sources, incurring high computational costs and limiting real-time applicability, or extract the model’s internal states, leading to poor generalization. To address these issues, this paper proposes ReFL, a hallucination detection framework. ReFL leverages corrective in-context learning to dynamically guide LLMs to recognize their own prediction errors and adjust internal representations, critically without updating model weights. Specifically, it introduces a corrective in-context learning strategy in which triplets of input text, model prediction, and ground-truth label are embedded into the prompt to make the model explicitly aware of its own errors. The model reflects on prior outputs to adjust its internal states and generate semantically structured representations better aligned with factuality. This feedback mechanism encourages the model to shape a more coherent semantic space and enhances the LLM’s internal sensitivity to hallucinations. Experimental results on two benchmark datasets demonstrate that ReFL consistently outperforms existing methods, achieving state-of-the-art performance.
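As a rough illustration of the triplet-based prompting described in the abstract, the following sketch assembles (input text, model prediction, ground-truth label) triplets into a single prompt. The names `Triplet` and `build_corrective_prompt`, the field labels, and the prompt wording are all assumptions made for this example; the abstract does not specify the actual template used by ReFL.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Triplet:
    """One feedback example (hypothetical structure, not from the paper)."""
    text: str        # input text previously shown to the model
    prediction: str  # what the model predicted for that text
    label: str       # ground-truth label for that text


def build_corrective_prompt(demos: List[Triplet], query: str) -> str:
    """Embed feedback triplets in the prompt, then append the new query.

    No model weights are touched; the feedback lives only in the context.
    """
    blocks = []
    for d in demos:
        verdict = "correct" if d.prediction == d.label else "incorrect"
        blocks.append(
            f"Text: {d.text}\n"
            f"Model prediction: {d.prediction}\n"
            f"Ground truth: {d.label}\n"
            f"Feedback: the prediction was {verdict}."
        )
    # The query is left open so the model completes the prediction.
    blocks.append(f"Text: {query}\nModel prediction:")
    return "\n\n".join(blocks)


demos = [
    Triplet("The Eiffel Tower is in Berlin.", "factual", "hallucinated"),
    Triplet("Water boils at 100 C at sea level.", "factual", "factual"),
]
prompt = build_corrective_prompt(demos, "The Moon orbits Mars.")
```

The resulting string would be passed to the LLM, and a detector could then read out the model's internal states for the final query; that readout step is outside the scope of this sketch.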
2024
UNO Arena for Evaluating Sequential Decision-Making Capability of Large Language Models
Zhanyue Qin | Haochuan Wang | Deyuan Liu | Ziyang Song | Cunhang Fan | Zhao Lv | Jinlin Wu | Zhen Lei | Zhiying Tu | Dianhui Chu | Xiaoyan Yu | Dianbo Sui
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Sequential decision-making refers to algorithms that take into account the dynamics of the environment, where early decisions affect subsequent decisions. With large language models (LLMs) demonstrating powerful capabilities across tasks, we can’t help but ask: Can Current LLMs Effectively Make Sequential Decisions? To answer this question, we propose the UNO Arena, based on the card game UNO, to evaluate the sequential decision-making capability of LLMs, and we explain in detail why we choose UNO. In UNO Arena, we evaluate the sequential decision-making capability of LLMs dynamically with novel metrics based on Monte Carlo methods. We set up random players, DQN-based reinforcement learning players, and LLM players (e.g., GPT-4, Gemini-pro) for comparison testing. Furthermore, to improve the sequential decision-making capability of LLMs, we propose the TUTRI player, which has LLMs reflect on their own actions using a summary of the game history and the game strategy. Numerous experiments demonstrate that the TUTRI player achieves a notable breakthrough in sequential decision-making performance compared to the vanilla LLM player.
Bilateral Masking with prompt for Knowledge Graph Completion
Yonghui Kong | Cunhang Fan | Yujie Chen | Shuai Zhang | Zhao Lv | Jianhua Tao
Findings of the Association for Computational Linguistics: NAACL 2024
The pre-trained language model (PLM) has achieved significant success in the field of knowledge graph completion (KGC) by effectively modeling entity and relation descriptions. In recent studies, the research in this field has been categorized into methods based on word matching and sentence matching, with the former lagging significantly behind. A critical issue with word matching methods is that they fail to obtain satisfactory single embedding representations for entities. To address this issue and enhance entity representation, we propose the Bilateral Masking with prompt for Knowledge Graph Completion (BMKGC) approach. Our methodology employs prompts to narrow the distance between the predicted entity and the known entity. Additionally, the BMKGC model incorporates a bi-encoder architecture, enabling simultaneous predictions at both the head and tail. Furthermore, we propose a straightforward technique to augment positive samples, mitigating the problem of degree bias present in knowledge graphs and thereby improving the model’s robustness. Experimental results conclusively demonstrate that BMKGC achieves state-of-the-art performance on the WN18RR dataset.
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
Deyuan Liu | Zhanyue Qin | Hairu Wang | Zhao Yang | Zecheng Wang | Fangying Rong | Qingbin Liu | Yanchao Hao | Bo Li | Xi Chen | Cunhang Fan | Zhao Lv | Dianhui Chu | Zhiying Tu | Dianbo Sui
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
While large language models (LLMs) excel in many domains, their complexity and scale challenge deployment in resource-limited environments. Current compression techniques, such as parameter pruning, often fail to effectively utilize the knowledge from pruned parameters. To address these challenges, we propose Manifold-Based Knowledge Alignment and Layer Merging Compression (MKA), a novel approach that uses manifold learning and the Information Bottleneck (IB) measure to merge similar layers, reducing model size while preserving essential performance. We evaluate MKA on multiple benchmark datasets and various LLMs. Our findings show that MKA not only preserves model performance but also achieves substantial compression ratios, outperforming traditional pruning methods. Moreover, when coupled with quantization, MKA delivers even greater compression. Specifically, on the MMLU dataset using the Llama3-8B model, MKA achieves a compression ratio of 43.75% with a minimal performance decrease of only 2.82%. The proposed MKA method offers a resource-efficient and performance-preserving model compression technique for LLMs. We make our code available at https://github.com/SempraETY/Pruning-via-Merging
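For a sense of scale, the reported 43.75% figure can be checked with a small calculation. This sketch assumes the compression ratio counts removed layers over total layers, which the abstract does not state. Llama3-8B has 32 transformer layers, so under that assumption 43.75% would correspond to merging away 14 of them. The function name `layer_compression_ratio` is hypothetical.

```python
def layer_compression_ratio(total_layers: int, layers_removed: int) -> float:
    """Fraction of layers eliminated (assumed definition, not from the paper)."""
    if not 0 <= layers_removed <= total_layers:
        raise ValueError("layers_removed must be between 0 and total_layers")
    return layers_removed / total_layers


# Llama3-8B has 32 transformer layers; 14 removed is the assumed count
# that reproduces the percentage quoted in the abstract.
ratio = layer_compression_ratio(total_layers=32, layers_removed=14)
formatted = f"{ratio:.2%}"  # "43.75%"
```

If the paper instead measures the ratio in parameters or memory, the layer count behind the figure would differ, since embedding and output layers are not reduced by layer merging.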
Co-authors
- Zhao Lv 4
- Dianhui Chu 2
- Deyuan Liu 2
- Zhanyue Qin 2
- Dianbo Sui 2
- Jianhua Tao 2
- Zhiying Tu 2
- Bo Cai 1
- Yujie Chen 1
- Yujie Chen 1
- Xi Chen 1
- Yanchao Hao 1
- Yihuan Huang 1
- Yonghui Kong 1
- Zhen Lei 1
- Bo Li 1
- Qingbin Liu 1
- Yanzhen Ren 1
- Fangying Rong 1
- Ziyang Song 1
- Zicheng Su 1
- Haochuan Wang 1
- Hairu Wang 1
- Zecheng Wang 1
- Zhengqi Wen 1
- Jinlin Wu 1
- Jun Xue 1
- Zhao Yang 1
- Zhuolin Yi 1
- Xiaoyan Yu 1
- Yongcheng Zhang 1
- Shuai Zhang 1
- Jun Zhang 1
- Xue Zhang 1
- Shuai Zhang 1