Volkan Cevher


2025

SemEval-2025 Task 4: Unlearning sensitive content from Large Language Models
Anil Ramakrishna | Yixin Wan | Xiaomeng Jin | Kai-Wei Chang | Zhiqi Bu | Bhanukiran Vinzamuri | Volkan Cevher | Mingyi Hong | Rahul Gupta
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

We introduce SemEval-2025 Task 4: unlearning sensitive content from Large Language Models (LLMs). The task features 3 subtasks for LLM unlearning spanning different use cases: (1) unlearn long form synthetic creative documents spanning different genres; (2) unlearn short form synthetic biographies containing personally identifiable information (PII), including fake names, phone number, SSN, email and home addresses, and (3) unlearn real documents sampled from the target model’s training dataset. We received over 100 submissions from over 30 institutions and we summarize the key techniques and lessons in this paper.