Hongyan Chang



2025

Context-Aware Membership Inference Attacks against Pre-trained Large Language Models
Hongyan Chang | Ali Shahin Shamsabadi | Kleomenis Katevas | Hamed Haddadi | Reza Shokri
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Membership Inference Attacks (MIAs) on pre-trained Large Language Models (LLMs) aim to determine whether a data point was part of the model’s training set. Prior MIAs built for classification models fail on LLMs because they ignore the generative nature of LLMs across token sequences. In this paper, we present a novel attack on pre-trained LLMs that adapts MIA statistical tests to the perplexity dynamics of subsequences within a data point. Our method significantly outperforms prior approaches, revealing context-dependent memorization patterns in pre-trained LLMs.
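
The paper's exact statistical test is not reproduced here, but the underlying signal is straightforward to illustrate. Below is a minimal sketch, assuming a Hugging Face causal LM (gpt2 is only a stand-in, and the window size, stride, and min/mean scores are illustrative choices, not the authors' test): it computes perplexity over sliding token subsequences of a candidate text, on the intuition that memorized training text tends to show uniformly low perplexity across its subsequences.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def subsequence_perplexities(text, model, tokenizer, window=32, stride=16):
    """Perplexity of sliding token windows of `text` under `model`."""
    ids = tokenizer(text, return_tensors="pt").input_ids[0]
    ppls = []
    for start in range(0, max(len(ids) - window + 1, 1), stride):
        chunk = ids[start:start + window].unsqueeze(0)
        with torch.no_grad():
            # labels=chunk makes the model return the mean next-token
            # cross-entropy over the window; exp() turns it into perplexity.
            loss = model(chunk, labels=chunk).loss
        ppls.append(torch.exp(loss).item())
    return ppls

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ppls = subsequence_perplexities(
    "A candidate text whose training-set membership we want to test.",
    model, tokenizer, window=8, stride=4)
# Naive membership scores: memorized text tends to have uniformly low
# window perplexity, so one could threshold the minimum or the mean.
print(min(ppls), sum(ppls) / len(ppls))
```

A real attack would calibrate these per-window scores against reference data rather than thresholding raw perplexity, since perplexity varies strongly with text domain and length.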

Watermark Smoothing Attacks against Language Models
Hongyan Chang | Hamed Hassani | Reza Shokri
Findings of the Association for Computational Linguistics: EMNLP 2025

Watermarking is a key technique for detecting AI-generated text. In this work, we study its vulnerabilities and introduce the Smoothing Attack, a novel watermark-removal method. By leveraging the relationship between the model’s confidence and watermark detectability, our attack selectively smooths the watermarked content, erasing watermark traces while preserving text quality. We validate our attack on open-source models ranging from 1.3B to 30B parameters and on 10 different watermarks, demonstrating its effectiveness. Our findings expose critical weaknesses in existing watermarking schemes and highlight the need for stronger defenses.
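
The attack itself is not reproduced here, but the confidence/detectability relationship it exploits can be sketched. Green-list-style watermarks can only bias token choices where the model's next-token distribution is flat (high entropy); where the model is confident, there is little room for a watermark signal. A minimal sketch of this intuition, assuming a Hugging Face causal LM as an unwatermarked proxy (the entropy threshold and token-by-token resampling loop are illustrative assumptions, not the paper's algorithm): it resamples only the high-entropy positions of a watermarked text, which is where most of the watermark evidence lives.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def smoothing_sketch(ids, proxy, entropy_threshold=2.5):
    """Resample high-entropy positions of watermarked token ids
    (shape [1, seq_len]) using an unwatermarked proxy LM."""
    ids = ids.clone()
    for t in range(1, ids.size(1)):
        with torch.no_grad():
            logits = proxy(ids[:, :t]).logits[:, -1, :]
        probs = torch.softmax(logits, dim=-1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
        if entropy.item() > entropy_threshold:
            # Low-confidence position: the watermark bias dominates here,
            # so replace the token with a sample from the proxy instead.
            ids[0, t] = torch.multinomial(probs[0], 1)
    return ids

tok = AutoTokenizer.from_pretrained("gpt2")  # proxy stand-in
proxy = AutoModelForCausalLM.from_pretrained("gpt2").eval()

watermarked = tok("Some watermarked model output to clean.",
                  return_tensors="pt").input_ids
print(tok.decode(smoothing_sketch(watermarked, proxy)[0]))
```

Resampling a position changes the context for every later position, so in practice the quality/removal trade-off hinges on how conservatively the threshold is chosen.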