DAMASHA: Detecting AI in Mixed Adversarial Texts via Segmentation with Human-interpretable Attribution

L. D. M. S. Sai Teja, N. Siva Gopala Krishna, Ufaq Khan, Muhammad Haris Khan, Atul Mishra


Abstract
In the age of advanced large language models (LLMs), the boundaries between human and AI-generated text are becoming increasingly blurred. We address the challenge of segmenting mixed-authorship text, that is, identifying transition points where authorship shifts from human to AI or vice versa, a problem with critical implications for authenticity, trust, and human oversight. We introduce a novel framework, Info-Mask, for mixed-authorship detection that integrates stylometric cues, perplexity-driven signals, and structured boundary modeling to accurately segment collaborative human-AI content. To evaluate the robustness of our system against adversarial perturbations, we construct and release an adversarial benchmark dataset, Mixed-text Adversarial setting for Segmentation (MAS), designed to probe the limits of existing detectors. Beyond segmentation accuracy, we introduce Human-Interpretable Attribution (HIA) overlays that highlight how stylometric features inform boundary predictions, and we conduct a small-scale human study assessing their usefulness. Across multiple architectures, Info-Mask significantly improves span-level robustness under adversarial conditions, establishing new baselines while revealing remaining challenges. Our findings highlight both the promise and limitations of adversarially robust, interpretable mixed-authorship detection, with implications for trust and oversight in human-AI co-authorship.
Anthology ID:
2026.findings-eacl.326
Volume:
Findings of the Association for Computational Linguistics: EACL 2026
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Marquez
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6189–6206
URL:
https://preview.aclanthology.org/check-for-anonymous-pdfs/2026.findings-eacl.326/
DOI:
10.18653/v1/2026.findings-eacl.326
Cite (ACL):
L. D. M. S. Sai Teja, N. Siva Gopala Krishna, Ufaq Khan, Muhammad Haris Khan, and Atul Mishra. 2026. DAMASHA: Detecting AI in Mixed Adversarial Texts via Segmentation with Human-interpretable Attribution. In Findings of the Association for Computational Linguistics: EACL 2026, pages 6189–6206, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
DAMASHA: Detecting AI in Mixed Adversarial Texts via Segmentation with Human-interpretable Attribution (Sai Teja et al., Findings 2026)
PDF:
https://preview.aclanthology.org/check-for-anonymous-pdfs/2026.findings-eacl.326.pdf
Checklist:
2026.findings-eacl.326.checklist.pdf