Jiayi Gui
2025
AIDER: a Robust and Topic-Independent Framework for Detecting AI-Generated Text
Jiayi Gui | Baitong Cui | Xiaolian Guo | Ke Yu | Xiaofei Wu
Proceedings of the 31st International Conference on Computational Linguistics
The human-level fluency achieved by large language models in text generation has intensified the challenge of distinguishing between human-written and AI-generated texts. While current fine-tuned detectors exist, they often lack robustness against adversarial attacks and struggle with out-of-distribution topics, limiting their practical applicability. This study introduces AIDER, a robust and topic-independent AI-generated text detection framework. AIDER leverages the ALBERT model for topic content disentanglement, enhancing transferability to unseen topics. It incorporates an augmentor that generates robust adversarial data for training, coupled with contrastive learning techniques to boost resilience. Comprehensive experiments demonstrate AIDER’s significant superiority over state-of-the-art methods, exhibiting exceptional robustness against adversarial attacks with minimal performance degradation. AIDER consistently achieves high accuracy in non-augmented scenarios and demonstrates remarkable generalizability to unseen topics. These attributes establish AIDER as a powerful and versatile tool for LLM-generated text detection across diverse real-world applications, addressing critical challenges in the evolving landscape of AI-generated content.
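The abstract outlines three components: an ALBERT encoder, an augmentor that produces adversarial training data, and a contrastive objective. The sketch below shows one minimal way such a setup could be wired together; the projection heads, the InfoNCE loss, the toy character-level perturbations, and all names in the code are illustrative assumptions rather than the published AIDER architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AlbertModel, AutoTokenizer


class DetectorSketch(nn.Module):
    """Illustrative ALBERT-based detector with a contrastive branch.

    The head names and loss weighting are assumptions for this sketch,
    not the published AIDER design.
    """

    def __init__(self, backbone: str = "albert-base-v2", proj_dim: int = 128):
        super().__init__()
        self.encoder = AlbertModel.from_pretrained(backbone)
        hidden = self.encoder.config.hidden_size
        # One head intended for writing-style features used for detection;
        # a topic head could be added analogously for disentanglement.
        self.style_head = nn.Linear(hidden, proj_dim)
        self.classifier = nn.Linear(proj_dim, 2)  # human vs. AI

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]                  # [CLS] representation
        style = F.normalize(self.style_head(cls), dim=-1)  # unit-norm embedding
        return self.classifier(style), style


def info_nce(anchor, positive, temperature: float = 0.07):
    """InfoNCE loss: each text is pulled toward its perturbed counterpart
    and pushed away from the other texts in the batch."""
    sim = anchor @ positive.t() / temperature              # (B, B) similarities
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(sim, labels)


if __name__ == "__main__":
    tok = AutoTokenizer.from_pretrained("albert-base-v2")
    model = DetectorSketch()
    texts = ["An original passage.", "Another original passage."]
    perturbed = ["An 0riginal passage.", "Another origina1 passage."]  # toy adversarial edits
    a = tok(texts, return_tensors="pt", padding=True, truncation=True)
    b = tok(perturbed, return_tensors="pt", padding=True, truncation=True)
    logits, z_a = model(a["input_ids"], a["attention_mask"])
    _, z_b = model(b["input_ids"], b["attention_mask"])
    labels = torch.tensor([0, 1])  # toy labels: 0 = human, 1 = AI
    loss = F.cross_entropy(logits, labels) + info_nce(z_a, z_b)
    loss.backward()
```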
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
Jiayi Gui | Yiming Liu | Jiale Cheng | Xiaotao Gu | Xiao Liu | Hongning Wang | Yuxiao Dong | Jie Tang | Minlie Huang
Findings of the Association for Computational Linguistics: ACL 2025
Large Language Models (LLMs) have demonstrated notable capabilities across various tasks, showcasing complex problem-solving abilities. Understanding and executing complex rules, along with multi-step planning, are fundamental to logical reasoning and critical for practical LLM agents and decision-making systems. However, evaluating LLMs as effective rule-based executors and planners remains underexplored. In this paper, we introduce LogicGame, a novel benchmark designed to evaluate the comprehensive rule understanding, execution, and planning capabilities of LLMs. Unlike traditional benchmarks, LogicGame provides diverse games that contain a series of rules with an initial state, requiring models to comprehend and apply predefined regulations to solve problems. We create simulated scenarios in which models execute or plan operations to achieve specific outcomes. These game scenarios are specifically designed to distinguish logical reasoning from mere knowledge by relying exclusively on predefined rules. This separation allows for a pure assessment of rule-based reasoning capabilities. The evaluation considers not only final outcomes but also intermediate steps, providing a comprehensive assessment of model performance. Moreover, these intermediate steps are deterministic and can be automatically verified. LogicGame defines game scenarios with varying difficulty levels, from simple rule applications to complex reasoning chains, in order to offer a precise evaluation of model performance on rule understanding and multi-step execution. Utilizing LogicGame, we test various LLMs and identify notable shortcomings in their rule-based logical reasoning abilities.
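Because every game is driven by explicit rules and deterministic intermediate states, a scenario can be checked automatically by replaying a model's proposed operations step by step. The sketch below illustrates that verification idea on an invented toy game; the game, its rules, and the answer format are assumptions made for illustration, not the actual LogicGame scenarios.

```python
# Illustrative harness for rule-based, step-verifiable evaluation.
# The "rotate/swap" game and its rules are invented for this sketch.

from dataclasses import dataclass, field


@dataclass
class ToyGame:
    """Toy game: reach a sorted list using only the two allowed operations."""
    state: list = field(default_factory=lambda: [3, 1, 2])

    def apply(self, op: str) -> bool:
        """Apply one operation; return False if it violates the rules."""
        if op == "swap_first_two" and len(self.state) >= 2:
            self.state[0], self.state[1] = self.state[1], self.state[0]
            return True
        if op == "rotate_left" and self.state:
            self.state = self.state[1:] + self.state[:1]
            return True
        return False  # any other operation is illegal under the rules

    def solved(self) -> bool:
        return self.state == sorted(self.state)


def verify(plan: list[str]) -> dict:
    """Deterministically replay a model-proposed plan, recording every
    intermediate state so partial credit can be assigned automatically."""
    game = ToyGame()
    trace = []
    for op in plan:
        if not game.apply(op):
            return {"valid": False, "trace": trace, "solved": False}
        trace.append(list(game.state))
    return {"valid": True, "trace": trace, "solved": game.solved()}


if __name__ == "__main__":
    # A plan an LLM might output for the initial state [3, 1, 2].
    print(verify(["rotate_left"]))
    # -> {'valid': True, 'trace': [[1, 2, 3]], 'solved': True}
```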