Yixiang Qiu
2026
C-ReD: A Comprehensive Chinese Benchmark for AI-Generated Text Detection Derived from Real-World Prompts
Chenxi Qing | Junxi Wu | Zheng Liu | Yixiang Qiu | Hongyao Yu | Bin Chen | Hao Wu | Shu-Tao Xia
Findings of the Association for Computational Linguistics: ACL 2026
Large language models (LLMs) can now generate highly fluent text. While they offer significant convenience, they also introduce risks such as phishing and academic dishonesty. Numerous research efforts have been dedicated to developing algorithms for detecting AI-generated text and to constructing relevant datasets. However, for Chinese corpora, challenges remain, including limited model diversity and data homogeneity. To address these issues, we propose C-ReD, a comprehensive Chinese Real-prompt AI-generated text Detection benchmark. Experiments demonstrate that C-ReD not only enables reliable in-domain detection but also supports strong generalization to unseen LLMs and external Chinese datasets, addressing critical gaps in model diversity, domain coverage, and prompt realism that have limited prior Chinese detection benchmarks. We release our resources at https://github.com/HeraldofLight/C-ReD.
2025
Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors
Hao Fang | Jiawei Kong | Tianqu Zhuang | Yixiang Qiu | Kuofeng Gao | Bin Chen | Shu-Tao Xia | Yaowei Wang | Min Zhang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
The misuse of large language models (LLMs), for example in academic plagiarism, has driven the development of detectors that identify LLM-generated text. To bypass these detectors, paraphrase attacks have emerged that deliberately rewrite such text to evade detection. Despite their success, existing methods require substantial data and computational budgets to train a specialized paraphraser, and their attack efficacy drops sharply against advanced detection algorithms. To address this, we propose Contrastive Paraphrase Attack (CoPA), a training-free method that effectively deceives text detectors using off-the-shelf LLMs. The first step is to carefully craft instructions that encourage LLMs to produce more human-like text. Nonetheless, we observe that the inherent statistical biases of LLMs can still leave some generated text with machine-like attributes that detectors can capture. To overcome this, CoPA constructs an auxiliary machine-like word distribution as a contrast to the human-like distribution generated by the LLM. By subtracting the machine-like patterns from the human-like distribution during decoding, CoPA produces sentences that are harder for text detectors to identify. Our theoretical analysis suggests the superiority of the proposed attack. Extensive experiments validate the effectiveness of CoPA in fooling text detectors across various scenarios.
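The subtraction step described in the abstract can be illustrated with a generic contrastive-decoding sketch. This is not the authors' implementation: the combination rule `(1 + lam) * human - lam * machine`, the plausibility cutoff `alpha`, the function names, and the toy logits are all assumptions chosen for illustration, and the paper's exact formulation may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats (-inf entries map to 0)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_distribution(human_logits, machine_logits, lam=0.5, alpha=0.1):
    """Hypothetical helper: next-token distribution with machine-like patterns subtracted.

    human_logits   -- logits from the LLM under a "write like a human" prompt
    machine_logits -- logits from the same LLM under a "write like a machine" prompt
    lam            -- assumed contrast strength (lam = 0 recovers the human-like distribution)
    alpha          -- assumed plausibility cutoff in (0, 1], relative to the top human-like token
    """
    if len(human_logits) != len(machine_logits):
        raise ValueError("both logit vectors must cover the same vocabulary")
    p_human = softmax(human_logits)
    cutoff = alpha * max(p_human)
    # Tokens that are implausible under the human-like distribution are masked
    # so that the subtraction cannot promote them.
    scores = [
        (1 + lam) * h - lam * m if p >= cutoff else float("-inf")
        for h, m, p in zip(human_logits, machine_logits, p_human)
    ]
    return softmax(scores)


# Toy four-token vocabulary with made-up logits.
vocab = ["moreover", "also", "plus", "zzz"]
human_logits = [2.0, 1.5, 1.0, -5.0]
machine_logits = [3.0, 1.0, 0.5, -5.0]

probs = contrastive_distribution(human_logits, machine_logits)
best_token = vocab[probs.index(max(probs))]
```

In this toy example the human-like prompt alone would favor "moreover", but because the machine-like prompt favors it even more strongly, the contrast shifts the top choice to "also". The cutoff keeps the very unlikely token "zzz" at zero probability.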