TrapDoc: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents

Hyundong Jin; Sicheol Sung; Shinwoo Park; SeungYeop Baik; Yo-Sub Han

doi:10.18653/v1/2025.findings-emnlp.1027

TrapDoc: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents

Hyundong Jin, Sicheol Sung, Shinwoo Park, SeungYeop Baik, Yo-Sub Han

Abstract

The reasoning, writing, text-editing, and retrieval capabilities of proprietary large language models (LLMs) have advanced rapidly, providing users with an ever-expanding set of functionalities. However, this growing utility has also led to a serious societal concern: the over-reliance on LLMs. In particular, users increasingly delegate tasks such as homework, assignments, or the processing of sensitive documents to LLMs without meaningful engagement. This form of over-reliance and misuse is emerging as a significant social issue. In order to mitigate these issues, we propose a method injecting imperceptible phantom tokens into documents, which causes LLMs to generate outputs that appear plausible to users but are in fact incorrect. Based on this technique, we introduce TrapDoc, a framework designed to deceive over-reliant LLM users. Through empirical evaluation, we demonstrate the effectiveness of our framework on proprietary LLMs, comparing its impact against several baselines. TrapDoc serves as a strong foundation for promoting more responsible and thoughtful engagement with language models.

Anthology ID:: 2025.findings-emnlp.1027
Volume:: Findings of the Association for Computational Linguistics: EMNLP 2025
Month:: November
Year:: 2025
Address:: Suzhou, China
Editors:: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:: Findings
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 18881–18897
Language:
URL:: https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.1027/
DOI:: 10.18653/v1/2025.findings-emnlp.1027
Bibkey:
Cite (ACL):: Hyundong Jin, Sicheol Sung, Shinwoo Park, SeungYeop Baik, and Yo-Sub Han. 2025. TrapDoc: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 18881–18897, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):: TrapDoc: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents (Jin et al., Findings 2025)
Copy Citation:
PDF:: https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.1027.pdf
Checklist:: 2025.findings-emnlp.1027.checklist.pdf

PDF Cite Search Checklist Fix data