SemEval-2025 Task 4: Unlearning sensitive content from Large Language Models
Anil Ramakrishna, Yixin Wan, Xiaomeng Jin, Kai-Wei Chang, Zhiqi Bu, Bhanukiran Vinzamuri, Volkan Cevher, Mingyi Hong, Rahul Gupta
Abstract
We introduce SemEval-2025 Task 4: unlearning sensitive content from Large Language Models (LLMs). The task features 3 subtasks for LLM unlearning spanning different use cases: (1) unlearn long form synthetic creative documents spanning different genres; (2) unlearn short form synthetic biographies containing personally identifiable information (PII), including fake names, phone numbers, SSNs, and email and home addresses; and (3) unlearn real documents sampled from the target model’s training dataset. We received over 100 submissions from over 30 institutions, and we summarize the key techniques and lessons in this paper.
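The abstract does not describe any particular unlearning algorithm. For readers new to the area, the sketch below illustrates "gradient difference", one common baseline from the broader LLM-unlearning literature: gradient ascent on a forget set paired with gradient descent on a retain set. This is an illustrative sketch only; the model name, datasets, and hyperparameters are placeholders, not the task's actual setup or the participants' methods.

```python
# Minimal sketch of "gradient difference" unlearning (a common baseline,
# not the method of this paper): ascend the LM loss on forget examples
# while descending on retain examples to preserve general utility.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; not the target model used in the task
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

forget_texts = ["<document to be unlearned>"]   # hypothetical forget set
retain_texts = ["<document to be preserved>"]   # hypothetical retain set

def lm_loss(text):
    # Standard causal-LM cross-entropy: labels are the input ids themselves.
    ids = tokenizer(text, return_tensors="pt").input_ids
    return model(input_ids=ids, labels=ids).loss

model.train()
for forget, retain in zip(forget_texts, retain_texts):
    # Negate the forget loss (gradient ascent) and keep the retain loss
    # (gradient descent) so behavior on retained data is roughly preserved.
    loss = -lm_loss(forget) + lm_loss(retain)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In practice the two terms are often weighted, and evaluation checks both that the forget documents can no longer be regurgitated and that performance on held-out retain data is unchanged.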
- Anthology ID:
- 2025.semeval-1.329
- Volume:
- Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2584–2596
- URL:
- https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.329/
- Cite (ACL):
- Anil Ramakrishna, Yixin Wan, Xiaomeng Jin, Kai-Wei Chang, Zhiqi Bu, Bhanukiran Vinzamuri, Volkan Cevher, Mingyi Hong, and Rahul Gupta. 2025. SemEval-2025 Task 4: Unlearning sensitive content from Large Language Models. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2584–2596, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- SemEval-2025 Task 4: Unlearning sensitive content from Large Language Models (Ramakrishna et al., SemEval 2025)
- PDF:
- https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.329.pdf