Atyaephyra at SemEval-2025 Task 4: Low-Rank Negative Preference Optimization

Jan Bronec, Jindřich Helcl


Abstract
We present a submission to the SemEval-2025 shared task on unlearning sensitive content from LLMs. Our approach applies negative preference optimization with low-rank adaptation. We show that this combination lets us cheaply compute additional regularization terms, which help stabilize unlearning. Our results significantly exceed the shared task baselines.
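The abstract names negative preference optimization (NPO) but does not state the objective. As a rough illustration only, the sketch below implements the commonly published form of the NPO loss, (2/β) · log(1 + (π_θ(y|x) / π_ref(y|x))^β), averaged over forget-set sequences. The function names, the default `beta`, and the plain-list inputs are assumptions made for this example; it does not reproduce the authors' implementation or their additional regularization terms.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def npo_loss(policy_logps, ref_logps, beta=0.1):
    """Mean NPO loss over a batch of forget-set sequences.

    policy_logps, ref_logps: sequence-level log-probabilities log pi(y|x)
    under the model being unlearned and under a frozen reference model.
    Both are hypothetical inputs for illustration; beta=0.1 is an assumed
    default, not a value taken from the paper.
    """
    if len(policy_logps) != len(ref_logps) or len(policy_logps) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # (2 / beta) * log(1 + exp(beta * (log pi_theta - log pi_ref)))
    losses = [
        (2.0 / beta) * softplus(beta * (lp - lr))
        for lp, lr in zip(policy_logps, ref_logps)
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # The loss shrinks as the policy assigns less probability to the
    # forget sequence than the reference does.
    print(npo_loss([-5.0], [-5.0]))   # policy equals reference
    print(npo_loss([-50.0], [-5.0]))  # policy has "forgotten" the sequence
```

In this form the loss is bounded below by zero and flattens out once the policy's log-probability falls well below the reference, which is the usual argument for why NPO diverges more slowly than plain gradient ascent on the forget set.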
Anthology ID: 2025.semeval-1.187
Volume: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 1415–1422
URL: https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.187/
Cite (ACL): Jan Bronec and Jindřich Helcl. 2025. Atyaephyra at SemEval-2025 Task 4: Low-Rank Negative Preference Optimization. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1415–1422, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Atyaephyra at SemEval-2025 Task 4: Low-Rank Negative Preference Optimization (Bronec & Helcl, SemEval 2025)
PDF: https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.187.pdf