Zhiyi Yin
2025
PRDetect: Perturbation-Robust LLM-generated Text Detection Based on Syntax Tree
Xiang Li | Zhiyi Yin | Hexiang Tan | Shaoling Jing | Du Su | Yi Cheng | Huawei Shen | Fei Sun
Findings of the Association for Computational Linguistics: NAACL 2025
As LLM-generated text, which often contains hallucinations or biases, becomes increasingly prevalent on the internet, detecting such content has emerged as a critical area of research. Recent methods have demonstrated impressive performance in detecting text generated entirely by LLMs. However, in real-world scenarios, users often introduce perturbations into LLM-generated text, and the robustness of existing detection methods against these perturbations has not been sufficiently explored. This paper empirically investigates this challenge and finds that even minor perturbations can severely degrade the performance of current detection methods. To address this issue, we observe that the syntax tree is minimally affected by such disturbances and exhibits distinct differences between human-written and LLM-generated text. We therefore propose a detection method based on syntax trees, which captures features invariant to perturbations. It demonstrates significantly improved robustness against perturbations on the HC3 and GPT-3.5-mixed datasets, while also incurring the lowest time cost. We provide the code and data at https://github.com/thulx18/PRDetect.
Related Knowledge Perturbation Matters: Rethinking Multiple Pieces of Knowledge Editing in Same-Subject
Zenghao Duan | Wenbin Duan | Zhiyi Yin | Yinghan Shen | Shaoling Jing | Jie Zhang | Huawei Shen | Xueqi Cheng
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)
Co-authors
- Shaoling Jing 2
- Huawei Shen 2
- Yi Cheng 1
- Xueqi Cheng 1
- Zenghao Duan 1