Text Anomaly Detection with Simplified Isolation Kernel

Yang Cao, Sikun Yang, Yujiu Yang, Lianyong Qi, Ming Liu


Abstract
Two-step approaches combining pre-trained large language model embeddings and anomaly detectors demonstrate strong performance in text anomaly detection by leveraging rich semantic representations. However, high-dimensional dense embeddings extracted by large language models pose challenges due to substantial memory requirements and high computation time. To address this challenge, we introduce the Simplified Isolation Kernel (SIK), which maps high-dimensional dense embeddings to lower-dimensional sparse representations while preserving crucial anomaly characteristics. SIK has linear-time complexity and significantly reduces space complexity through its innovative boundary-focused feature mapping. Experiments across 7 datasets demonstrate that SIK achieves better detection performance than 11 state-of-the-art anomaly detection algorithms while maintaining computational efficiency and low memory cost. All code and demonstrations are available at https://github.com/charles-cao/SIK.
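The abstract does not spell out SIK's boundary-focused mapping, so the sketch below is not SIK itself. It only illustrates the general idea of turning dense vectors into a sparse, isolation-style representation and scoring anomalies from it, using a generic nearest-reference, hypersphere-style partitioning. Every name here (fit_partitions, sparse_map, anomaly_scores), the parameters psi and t, and the toy 2-D points standing in for LLM embeddings are assumptions made for illustration.

```python
import math
import random

def fit_partitions(points, psi=4, t=50, seed=0):
    """Draw t random subsamples of psi reference points (assumed parameters).
    Each reference point gets a radius equal to the distance to its
    nearest neighbour inside the same subsample."""
    rng = random.Random(seed)
    partitions = []
    for _ in range(t):
        refs = rng.sample(points, psi)
        radii = [
            min(math.dist(r, q) for k, q in enumerate(refs) if k != j)
            for j, r in enumerate(refs)
        ]
        partitions.append((refs, radii))
    return partitions

def sparse_map(x, partitions):
    """Map x to one cell index per partition, or None if x lies outside
    the sphere of its nearest reference point. This is equivalent to a
    (t * psi)-dimensional binary vector with at most t ones."""
    cells = []
    for refs, radii in partitions:
        j = min(range(len(refs)), key=lambda k: math.dist(x, refs[k]))
        cells.append(j if math.dist(x, refs[j]) <= radii[j] else None)
    return cells

def anomaly_scores(points, partitions):
    """Score = 1 - similarity between a point's sparse map and the
    average sparse map of the whole dataset; higher means more anomalous."""
    maps = [sparse_map(x, partitions) for x in points]
    n, t = len(points), len(partitions)
    # counts[i][j]: number of points that fall into cell j of partition i
    counts = [{} for _ in range(t)]
    for m in maps:
        for i, j in enumerate(m):
            if j is not None:
                counts[i][j] = counts[i].get(j, 0) + 1
    scores = []
    for m in maps:
        sim = sum(counts[i][j] for i, j in enumerate(m) if j is not None) / (n * t)
        scores.append(1.0 - sim)
    return scores

# Toy data: 200 points around the origin plus one far-away point (assumed).
data_rng = random.Random(1)
inliers = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
points = inliers + [(50.0, 50.0)]

partitions = fit_partitions(points)
scores = anomaly_scores(points, partitions)
```

In this toy setup the far-away point should receive a higher score than the average inlier, because it rarely lands inside any reference sphere shared with other points. In the two-step setting the abstract describes, the 2-D tuples would be replaced by embeddings from a pre-trained language model; the repository linked in the abstract is the place to look for the authors' actual method.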
Anthology ID: 2025.findings-emnlp.680
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 12702–12713
URL: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.680/
DOI: 10.18653/v1/2025.findings-emnlp.680
Cite (ACL): Yang Cao, Sikun Yang, Yujiu Yang, Lianyong Qi, and Ming Liu. 2025. Text Anomaly Detection with Simplified Isolation Kernel. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 12702–12713, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Text Anomaly Detection with Simplified Isolation Kernel (Cao et al., Findings 2025)
PDF: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.680.pdf
Checklist: 2025.findings-emnlp.680.checklist.pdf