Detecting Training Data of Large Language Models via Expectation Maximization

Gyuwan Kim, Yang Li, Evangelia Spiliopoulou, Jie Ma, William Yang Wang


Abstract
Membership inference attacks (MIAs) aim to determine whether a specific example was used to train a given language model. Prior prompt-based attacks such as ReCaLL rely on access to known non-member examples and on the assumption that conditioning on non-member prefixes reliably lowers the model's scores on other non-members. We propose EM-MIA, a new membership inference approach that iteratively refines prefix effectiveness and membership scores using an expectation-maximization strategy, without requiring labeled non-member examples. To support controlled evaluation, we introduce OLMoMIA, a benchmark that enables analysis of MIA robustness under systematically varied distributional overlap and difficulty. Experiments on WikiMIA and OLMoMIA show that EM-MIA outperforms existing baselines, particularly when members and non-members are distributionally separable. EM-MIA also succeeds in practical settings with partial distributional overlap, while its failure cases under near-identical member and non-member distributions expose fundamental limitations of current MIA methods. We release our code and evaluation pipeline to encourage reproducible and robust MIA research.
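The iterative alternation the abstract describes — scoring each candidate under every other candidate used as a prefix, then alternating between estimating prefix effectiveness and updating membership scores — can be sketched as follows. The specific update forms (correlation-based effectiveness, effectiveness-weighted averaging) and the simulated score matrix are illustrative assumptions for this toy example, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: n candidate texts; the attack does not observe true membership.
# score[i, j] plays the role of a ReCaLL-style score for candidate j when
# candidate i is used as the prefix (higher = more member-like). The
# structure below is simulated, not computed from a real model: non-member
# prefixes are made to separate members from non-members best.
n = 8
true_member = np.array([1, 1, 1, 1, 0, 0, 0, 0])
score = rng.normal(0.0, 0.3, size=(n, n))
score += np.outer(1 - true_member, true_member)

# Initialize membership scores from a prefix-free aggregate.
m = score.mean(axis=0)
m = (m - m.min()) / (m.max() - m.min() + 1e-12)

for _ in range(10):  # EM-style alternation
    # "E-step" (assumed form): a prefix is effective if its per-candidate
    # scores correlate with the current membership estimates.
    e = np.array([np.corrcoef(score[i], m)[0, 1] for i in range(n)])
    e = np.nan_to_num(e).clip(min=0.0)
    # "M-step" (assumed form): membership score = effectiveness-weighted
    # average of per-prefix scores, renormalized to [0, 1].
    if e.sum() > 0:
        m = e @ score / e.sum()
        m = (m - m.min()) / (m.max() - m.min() + 1e-12)

# m now ranks candidates by estimated membership.
```

In this simulation the alternation concentrates prefix weight on the (unlabeled) non-members, which is the behavior that lets the method avoid requiring a labeled non-member set up front.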
Anthology ID:
2026.eacl-long.49
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Màrquez
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1115–1129
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.49/
Cite (ACL):
Gyuwan Kim, Yang Li, Evangelia Spiliopoulou, Jie Ma, and William Yang Wang. 2026. Detecting Training Data of Large Language Models via Expectation Maximization. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1115–1129, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Detecting Training Data of Large Language Models via Expectation Maximization (Kim et al., EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.49.pdf