Carlos Levya Capote


2025

pdf bib
Masks and Mimicry: Strategic Obfuscation and Impersonation Attacks on Authorship Verification
Kenneth Alperin | Rohan Leekha | Adaku Uchendu | Trang Nguyen | Srilakshmi Medarametla | Carlos Levya Capote | Seth Aycock | Charlie Dagli
Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities

The increasing use of Artificial Intelligence(AI) technologies, such as Large LanguageModels (LLMs) has led to nontrivial improvementsin various tasks, including accurate authorshipidentification of documents. However,while LLMs improve such defense techniques,they also simultaneously provide a vehicle formalicious actors to launch new attack vectors.To combat this security risk, we evaluate theadversarial robustness of authorship models(specifically an authorship verification model)to potent LLM-based attacks. These attacksinclude untargeted methods - authorship obfuscationand targeted methods - authorshipimpersonation. For both attacks, the objectiveis to mask or mimic the writing style of an authorwhile preserving the original texts’ semantics,respectively. Thus, we perturb an accurateauthorship verification model, and achievemaximum attack success rates of 92% and 78%for both obfuscation and impersonation attacks,respectively.