Ali Satvaty
2026
EL-MIA: Quantifying Membership Inference Risks of Sensitive Entities in LLMs
Ali Satvaty | Suzan Verberne | Fatih Turkmen
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Membership inference attacks (MIA) aim to infer whether a particular data point is part of the training dataset of a model. In this paper, we propose a new task in the context of LLM privacy: entity-level discovery of membership risk focused on sensitive information (PII, credit card numbers, etc.). Existing methods for MIA can detect the presence of entire prompts or documents in the LLM training data, but they fail to capture risks at a finer granularity. We propose the “EL-MIA” framework for auditing entity-level membership risks in LLMs. We construct a benchmark dataset for the evaluation of MIA methods on this task. Using this benchmark, we conduct a systematic comparison of existing MIA techniques as well as two newly proposed methods. We provide a comprehensive analysis of the results, aiming to explain the relation of entity-level MIA susceptibility to model scale, training epochs, and other surface-level factors. Our findings reveal that existing MIA methods are limited when it comes to entity-level membership inference of sensitive attributes, while this susceptibility can be exposed with relatively straightforward methods, highlighting the need for stronger adversaries to stress-test the provided threat model.
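The core idea of entity-level (rather than document-level) membership inference can be illustrated with a minimal sketch: instead of scoring a whole prompt, the adversary scores only the tokens of the sensitive entity within it. This is an assumption-laden illustration, not the paper's actual EL-MIA methods; the function name and scoring rule are hypothetical.

```python
def entity_span_loss(token_log_probs, span):
    """Average negative log-likelihood over just the entity's tokens.

    token_log_probs: per-token log-probabilities of the full prompt,
                     as produced by the target model (assumed given).
    span: (start, end) token indices of the sensitive entity.

    A lower loss on the entity span suggests the model is unusually
    confident about that entity, which an adversary could read as a
    membership signal (hypothetical scoring rule for illustration).
    """
    start, end = span
    span_lps = token_log_probs[start:end]
    return -sum(span_lps) / len(span_lps)
```

In use, the adversary would threshold this score, flagging entities whose span loss falls below a calibrated cutoff as likely members.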
2025
Memorization is Language-Sensitive: Analyzing Memorization and Inference Risks of LLMs in a Multilingual Setting
Ali Satvaty | Anna Visman | Daniel Seidel | Suzan Verberne | Fatih Turkmen
Proceedings of the First Workshop on Large Language Model Memorization (L2M2)
Large Language Models (LLMs) are known to memorize and reproduce parts of their training data during inference, raising significant privacy and safety concerns. While this phenomenon has been extensively studied to explain its contributing factors and countermeasures, its implications in multilingual contexts remain largely unexplored. In this work, we investigate cross-lingual differences in memorization behaviors of multilingual LLMs. Specifically, we examine both discoverable memorization and susceptibility to perplexity ratio attacks using Pythia models of varying sizes, evaluated on two parallel multilingual datasets. Our results reveal that lower-resource languages consistently exhibit higher vulnerability to perplexity ratio attacks, indicating greater privacy risks. In contrast, patterns of discoverable memorization appear to be influenced more strongly by the model’s pretraining or fine-tuning phases than by language resource level alone. These findings highlight the nuanced interplay between language resource availability and memorization in multilingual LLMs, providing insights toward developing safer and more privacy-preserving language models across diverse linguistic settings.
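The perplexity ratio attack mentioned above compares how surprising a text is to the target model versus a reference model; a ratio well below 1 indicates the target model finds the text unusually familiar. A minimal sketch of this standard scoring rule (the exact reference model and threshold used in the paper are not specified here and are assumptions):

```python
import math

def perplexity(log_probs):
    # Perplexity = exp(mean negative log-likelihood) over the tokens.
    return math.exp(-sum(log_probs) / len(log_probs))

def perplexity_ratio_score(target_log_probs, ref_log_probs):
    """Ratio of target-model perplexity to reference-model perplexity
    on the same text. Values below 1 suggest the target model assigns
    the text unusually high likelihood relative to the reference,
    which the attack reads as a membership signal."""
    return perplexity(target_log_probs) / perplexity(ref_log_probs)
```

Cross-lingually, the finding above says this score separates members from non-members more sharply for lower-resource languages, i.e. they are easier to attack.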
2022
Docalog: Multi-document Dialogue System using Transformer-based Span Retrieval
Sayed Hesam Alavian | Ali Satvaty | Sadra Sabouri | Ehsaneddin Asgari | Hossein Sameti
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering
Information-seeking dialogue systems, including knowledge identification and response generation, aim to respond to users with fluent, coherent, and informative answers based on users’ needs. This paper discusses our proposed approach, Docalog, for the DialDoc-22 (MultiDoc2Dial) shared task. Docalog identifies the most relevant knowledge in the associated document, in a multi-document setting. Docalog is a three-stage pipeline consisting of (1) a document retriever model (DR. TEIT), (2) an answer span prediction model, and (3) an ultimate span picker deciding on the most likely answer span out of all predicted spans. In the test phase of MultiDoc2Dial 2022, Docalog achieved F1-scores of 36.07% and 28.44% and SacreBLEU scores of 23.70% and 20.52%, respectively, on the MDD-SEEN and MDD-UNSEEN folds.
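The third stage of the pipeline, the span picker, can be sketched as a simple argmax over candidate spans. This is a hedged illustration only; the actual Docalog picker may combine retriever and reader scores in ways not described in this abstract, and the function name is hypothetical.

```python
def pick_best_span(candidates):
    """Final stage of a retrieve-then-read pipeline (sketch).

    candidates: list of (span_text, score) pairs produced by the
    answer-span prediction stage across retrieved documents.
    Returns the text of the highest-scoring span.
    """
    return max(candidates, key=lambda c: c[1])[0]
```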