DE-CLIP: Few-Shot Anomaly Detection via Difference-Guided Embedding Editing

Yage Zhang, Yukun Jiang, Michael Backes, Yang Zhang


Abstract
Anomaly detection (AD) plays a critical role in applications such as automated industrial inspection and medical image analysis. Building on the strong pre-trained vision-language model CLIP, several CLIP-based few-shot AD methods have emerged in recent years. Because the embedding distributions of normal and anomalous samples overlap, many existing approaches introduce additional model training to obtain more discriminative text embeddings. However, we demonstrate that such training is not necessary. Specifically, we find that the overlapping embeddings can be separated by introducing a Difference-guided vector for embedding Editing (DiffEdit). Based on this finding, we propose DE-CLIP, a simple yet effective framework based on DiffEdit, which directly edits text embeddings based on the textual and visual differences between normal and anomalous samples, resulting in more discriminative embeddings for AD. Extensive experiments on industrial and medical datasets demonstrate the superiority of our proposed DE-CLIP over existing baselines. For instance, on the MVTec dataset, DE-CLIP achieves 96.6% and 96.7% AUROC on anomaly classification and segmentation, respectively, surpassing both training-based and training-free methods. In addition, we observe that introducing DiffEdit into other training-free baselines also significantly improves their performance, highlighting the potential of DiffEdit to promote better AD.
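The abstract does not give the DiffEdit formulation, so the following is only a minimal sketch of the general idea of training-free, difference-guided editing of text embeddings. Everything in it is an assumption made for illustration: the function names, the use of a mean-difference direction, the step size `alpha`, and the softmax scoring with `temperature` are hypothetical and are not taken from the paper. Embeddings are stand-in NumPy arrays rather than real CLIP outputs.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to (approximately) unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def difference_vector(normal_emb, anomalous_emb):
    """Assumed form: unit vector from the normal mean to the anomalous mean."""
    diff = anomalous_emb.mean(axis=0) - normal_emb.mean(axis=0)
    return l2_normalize(diff)

def edit_text_embeddings(t_normal, t_anomalous, diff, alpha=0.5):
    """Hypothetical edit: push the two text embeddings apart along `diff`.

    No parameters are trained; the edit is a single vector shift.
    """
    edited_normal = l2_normalize(t_normal - alpha * diff)
    edited_anomalous = l2_normalize(t_anomalous + alpha * diff)
    return edited_normal, edited_anomalous

def anomaly_score(image_emb, t_normal, t_anomalous, temperature=0.07):
    """Softmax over cosine similarities; returns the 'anomalous' probability."""
    image_emb = l2_normalize(image_emb)
    logits = np.stack(
        [image_emb @ t_normal, image_emb @ t_anomalous], axis=-1
    ) / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    return probs[..., 1]

# Toy data: two text embeddings that overlap completely before editing.
t_normal = np.array([1.0, 0.0, 0.0, 0.0])
t_anomalous = np.array([1.0, 0.0, 0.0, 0.0])
normal_emb = np.array([[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]])
anomalous_emb = np.array([[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]])

diff = difference_vector(normal_emb, anomalous_emb)
edited_normal, edited_anomalous = edit_text_embeddings(t_normal, t_anomalous, diff)

score_before = anomaly_score(anomalous_emb[0], t_normal, t_anomalous)
score_after = anomaly_score(anomalous_emb[0], edited_normal, edited_anomalous)
```

In this toy setup the unedited text embeddings cannot distinguish the anomalous sample at all, while the edited pair assigns it a higher anomaly probability. How DE-CLIP actually derives and applies its difference-guided vector is described in the paper itself.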
Anthology ID:
2026.acl-long.110
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2396–2407
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.110/
Cite (ACL):
Yage Zhang, Yukun Jiang, Michael Backes, and Yang Zhang. 2026. DE-CLIP: Few-Shot Anomaly Detection via Difference-Guided Embedding Editing. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2396–2407, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
DE-CLIP: Few-Shot Anomaly Detection via Difference-Guided Embedding Editing (Zhang et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.110.pdf
Checklist:
 2026.acl-long.110.checklist.pdf