Dzung Pham


2026

As AI text detectors are increasingly used to flag LLM-generated writing, a natural question arises: are there forms of high-quality generated narrative that can evade such detection? We introduce Frankentexts, a long-form narrative generation paradigm that treats an LLM as a composer of existing texts rather than as an author. Given a writing prompt and thousands of randomly sampled human-written snippets, the model assembles a coherent narrative in which most tokens (e.g., 90%) are copied verbatim from the source passages. Despite the difficulty of this task, extensive automatic and human evaluation shows that Frankentexts improve over vanilla LLM generations on key writing-quality metrics such as diversity and novelty while remaining mostly coherent and relevant to the prompt. Furthermore, Frankentexts pose a fundamental challenge to current AI text detectors: 72% of Frankentexts produced by our best configuration (Gemini-2.5-Pro with 5K input snippets) are misclassified as human-written by Pangram, a state-of-the-art detector. Human annotators praise Frankentexts for their inventive premises, vivid descriptions, and dry humor; however, they also note abrupt tonal shifts and uneven grammar across segments. Overall, the emergence of high-quality yet low-detectability Frankentexts challenges established authorship norms while raising concerns about the publishing economy.
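
To make the "most tokens copied verbatim" constraint concrete, the following is a minimal sketch of how one might estimate the verbatim copy rate of a generated narrative against its source snippets. The abstract does not say how this rate is measured, so everything here is an assumption for illustration: the function names `tokenize` and `copy_rate` are hypothetical, tokens are split on whitespace, and a token counts as copied only if it lies inside a run of at least `n = 4` consecutive tokens that also appears in some snippet.

```python
from typing import Iterable, List


def tokenize(text: str) -> List[str]:
    # Assumption: plain whitespace tokenization, chosen only for simplicity.
    return text.split()


def copy_rate(generated: str, snippets: Iterable[str], n: int = 4) -> float:
    """Fraction of tokens in `generated` covered by an n-gram that
    appears verbatim in at least one snippet (hypothetical metric)."""
    # Index every n-gram that occurs in the human-written snippets.
    source_ngrams = set()
    for snippet in snippets:
        toks = tokenize(snippet)
        for i in range(len(toks) - n + 1):
            source_ngrams.add(tuple(toks[i:i + n]))

    gen = tokenize(generated)
    if not gen:
        return 0.0

    # Mark each generated token that falls inside a matching n-gram window.
    covered = [False] * len(gen)
    for i in range(len(gen) - n + 1):
        if tuple(gen[i:i + n]) in source_ngrams:
            for j in range(i, i + n):
                covered[j] = True

    return sum(covered) / len(gen)


if __name__ == "__main__":
    snippets = [
        "the rain fell on the old harbor town",
        "she never opened the letter again",
    ]
    story = (
        "the rain fell on the old harbor town and "
        "she never opened the letter again"
    )
    # Only the connective "and" is not copied from a snippet.
    print(f"copy rate: {copy_rate(story, snippets):.3f}")
```

In this toy case 14 of the 15 tokens come from the snippets, with only the connective written fresh. The n-gram threshold is a design choice: a larger `n` avoids crediting short, common phrases as copied, at the cost of missing very short borrowed fragments.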