Jianing Zhu
2026
Copyright Detective: A Forensic System to Evidence LLMs Flickering Copyright Leakage Risks
Guangwei Zhang | Jianing Zhu | Cheng Qian | Neil Zhenqiang Gong | Rada Mihalcea | Zhaozhuo Xu | Jingrui He | Jiaqi W. Ma | Chaowei Xiao | Bo Li | Ahmed Abbasi | Dongwon Lee | Heng Ji | Denghui Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
We present **Copyright Detective**, the first interactive forensic system for detecting, analyzing, and visualizing potential copyright risks in LLM outputs. Because copyright law is complex, the system treats the question of infringement versus compliance as an **evidence discovery** process rather than a static classification task. It integrates multiple detection paradigms, including content recall testing, paraphrase-level similarity analysis, persuasive jailbreak probing, and unlearning verification, within a unified and extensible framework. Through interactive prompting, response collection, and iterative workflows, the system enables systematic auditing of verbatim memorization and paraphrase-level leakage, supporting responsible deployment and transparent evaluation of LLM copyright risks even with only black-box access. In our experiments with GPT-4o-mini, we demonstrate that the persuasive strategy "Pathos" shifts the leakage distribution from about 0.1 to 0.7 in ROUGE-L. Our live system is hosted on a [Streamlit server](https://copyright-detective.streamlit.app), with a [demonstration video](https://youtu.be/z9Lh4kNDHiM) included as supplementary material.
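Since the abstract reports leakage as a ROUGE-L score, the following minimal sketch shows how such a score between a reference passage and a model response could be computed from the longest common subsequence (LCS) of their tokens. It is an illustration only, not the system's implementation: the function names `lcs_length` and `rouge_l`, the lowercased whitespace tokenization, and the balanced F-measure (`beta=1.0`) are all assumptions, and the abstract does not state which ROUGE-L configuration the authors use.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Tokens match: extend the LCS ending at the previous pair.
                curr.append(prev[j - 1] + 1)
            else:
                # No match: carry over the best LCS seen so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-measure between two strings.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of the response found in the reference
    recall = lcs / len(ref)      # share of the reference reproduced in the response
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


if __name__ == "__main__":
    # Hypothetical strings, for illustration only.
    source = "it was the best of times it was the worst of times"
    response = "it was the best of times and also the worst of times"
    print(round(rouge_l(source, response), 3))
```

Under these assumptions, a score near 0 means the response shares almost no ordered token sequence with the reference, while a score near 1 indicates close to verbatim reproduction.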