Robust AI-Generated Text Detection by Restricted Embeddings
Kristian Kuznetsov, Eduard Tulchinskii, Laida Kushnareva, German Magai, Serguei Barannikov, Sergey Nikolenko, Irina Piontkovskaya
Abstract
Growing amount and quality of AI-generated texts makes detecting such content more difficult. In most real-world scenarios, the domain (style and topic) of generated data and the generator model are not known in advance. In this work, we focus on the robustness of classifier-based detectors of AI-generated text, namely their ability to transfer to unseen generators or semantic domains. We investigate the geometry of the embedding space of Transformer-based text encoders and show that clearing out harmful linear subspaces helps to train a robust classifier, ignoring domain-specific spurious features. We investigate several subspace decomposition and feature selection strategies and achieve significant improvements over state of the art methods in cross-domain and cross-generator transfer. Our best approaches for head-wise and coordinate-based subspace removal increase the mean out-of-distribution (OOD) classification score by up to 9% and 14% in particular setups for RoBERTa and BERT embeddings respectively. We release our code and data: https://github.com/SilverSolver/RobustATD
- Anthology ID:
- 2024.findings-emnlp.992
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2024
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 17036–17055
- URL:
- https://preview.aclanthology.org/add-emnlp-2024-awards/2024.findings-emnlp.992/
- DOI:
- 10.18653/v1/2024.findings-emnlp.992
- Cite (ACL):
- Kristian Kuznetsov, Eduard Tulchinskii, Laida Kushnareva, German Magai, Serguei Barannikov, Sergey Nikolenko, and Irina Piontkovskaya. 2024. Robust AI-Generated Text Detection by Restricted Embeddings. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 17036–17055, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Robust AI-Generated Text Detection by Restricted Embeddings (Kuznetsov et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/add-emnlp-2024-awards/2024.findings-emnlp.992.pdf
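
The abstract mentions clearing linear subspaces out of encoder embeddings and coordinate-based removal. As a rough illustration of what those two operations mean geometrically, here is a minimal NumPy sketch. It is not the authors' implementation (see their repository for that): the function names `remove_subspace` and `drop_coordinates`, the random stand-in data, and the choice of which directions or coordinates count as "harmful" are all assumptions made for this example.

```python
import numpy as np


def remove_subspace(embeddings: np.ndarray, directions: np.ndarray) -> np.ndarray:
    """Project embeddings onto the orthogonal complement of span(directions).

    embeddings: (n_samples, dim) array of text embeddings.
    directions: (k, dim) array of linearly independent vectors spanning the
                subspace to remove (hypothetical "harmful" directions).
    """
    # Orthonormalize the directions: q has shape (dim, k) with orthonormal columns.
    q, _ = np.linalg.qr(directions.T)
    # Subtract each embedding's component that lies inside the subspace.
    return embeddings - embeddings @ q @ q.T


def drop_coordinates(embeddings: np.ndarray, coords) -> np.ndarray:
    """Coordinate-based variant: delete the listed embedding dimensions."""
    keep = np.setdiff1d(np.arange(embeddings.shape[1]), np.asarray(coords))
    return embeddings[:, keep]


# Random stand-ins for real encoder outputs (assumption: 8 texts, 6 dimensions).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 6))
harmful = rng.normal(size=(2, 6))   # hypothetical subspace basis

X_clean = remove_subspace(X, harmful)   # same shape, subspace component zeroed
X_small = drop_coordinates(X, [1, 4])   # two coordinates removed
```

A downstream classifier would then be trained on `X_clean` or `X_small` instead of `X`. How the directions or coordinates are chosen is the substance of the paper and is not reproduced here.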