AutoAnoEval: Semantic-Aware Model Selection via Tree-Guided LLM Reasoning for Tabular Anomaly Detection

Suhee Yoon; Sanghyu Yoon; Ye Seul Sim; Seungdong Yoa; Dongmin Kim; Soonyoung Lee; Hankook Lee; Woohyung Lim

Abstract

Tabular data is the predominant data format in real-world applications, yet anomalies in it are extremely rare and difficult to collect, since identifying them often requires domain expertise. Consequently, evaluating tabular anomaly detection (AD) models is challenging, because anomalies may be absent even from evaluation sets. To tackle this challenge, prior works have generated synthetic anomalies. However, because existing synthetic anomaly generation methods rely on statistical patterns, they often overlook domain semantics and struggle to reflect the complex, domain-specific nature of real-world anomalies. We propose AutoAnoEval, a novel evaluation framework for tabular AD that constructs pseudo-evaluation sets with semantically grounded synthetic anomalies. Our approach leverages an iterative interaction between a large language model (LLM) and a decision tree (DT): the LLM generates realistic anomaly conditions based on contextual semantics, while the DT provides structural guidance by capturing feature interactions inherent in the tabular data. This iterative loop yields diverse anomaly conditions, ranging from easily detectable outliers to subtle cases near the decision boundary. Extensive experiments on 20 tabular AD benchmarks demonstrate that AutoAnoEval achieves superior model selection performance, with high ranking alignment and minimal performance gaps relative to evaluations on anomalies encountered in practical applications.

Anthology ID: 2026.findings-eacl.183
Volume: Findings of the Association for Computational Linguistics: EACL 2026
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3546–3560
URL: https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.183/
Cite (ACL): Suhee Yoon, Sanghyu Yoon, Ye Seul Sim, Seungdong Yoa, Dongmin Kim, Soonyoung Lee, Hankook Lee, and Woohyung Lim. 2026. AutoAnoEval: Semantic-Aware Model Selection via Tree-Guided LLM Reasoning for Tabular Anomaly Detection. In Findings of the Association for Computational Linguistics: EACL 2026, pages 3546–3560, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): AutoAnoEval: Semantic-Aware Model Selection via Tree-Guided LLM Reasoning for Tabular Anomaly Detection (Yoon et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.183.pdf
Checklist: 2026.findings-eacl.183.checklist.pdf
