Beishui Liao
2025
Out-of-Distribution Detection via LLM-Guided Outlier Generation for Text-attributed Graph
Xiangwei Lv | Mengze Li | Jingyuan Chen | Zhiang Dong | Sirui Han | Beishui Liao
Findings of the Association for Computational Linguistics: ACL 2025
Text-Attributed Graphs (TAGs), which are characterized by text attributes, are widely used in real-world applications. Fully trained models for TAG prediction can perform poorly on samples outside the In-Distribution (ID) data, which may raise serious security issues. To address this, Out-Of-Distribution (OOD) detection has been introduced to the TAG field; it aims to use a detector to distinguish OOD samples from ID samples. Recent studies introduce extra OOD datasets to regularize the detection model. However, because the OOD data space is vast, high-quality OOD samples for training the detector are scarce and difficult to obtain in the real world. We therefore use Large Language Models (LLMs) to generate high-quality OOD training samples. Two issues arise in this process: (1) LLMs tend to generate OOD node samples that differ significantly from ID ones, so they offer limited value for learning the relation between OOD and ID data. (2) Because of the inherent structure of TAGs, the generated OOD nodes need to be integrated with existing nodes by generating edges using LLMs, but the large number of nodes makes reasoning over each node pair computationally prohibitive. To address these issues, we introduce LLMGuard, which combines challenging OOD-node generation with lightweight edge predictors. Extensive experiments demonstrate the effectiveness of LLMGuard. The source code is available.
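The cost argument behind issue (2) can be illustrated with a small sketch. This is not LLMGuard's implementation: the abstract does not specify how its edge predictors work, so the cosine-similarity scorer below, along with the names `pairwise_llm_calls`, `cosine`, and `top_k_edges` and the toy embeddings, are assumptions chosen only to contrast per-pair LLM reasoning with a cheap learned or similarity-based scorer.

```python
import math


def pairwise_llm_calls(num_existing: int, num_generated: int) -> int:
    """LLM queries needed if every (generated, existing) node pair is judged separately."""
    return num_existing * num_generated


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_edges(new_vec, existing_vecs, k=2):
    """Hypothetical lightweight edge predictor: link the new node to its
    k most similar existing nodes, using embeddings instead of LLM calls."""
    ranked = sorted(
        range(len(existing_vecs)),
        key=lambda i: cosine(new_vec, existing_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-d text embeddings (made up for illustration).
existing = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
new_node = [1.0, 0.05]

edges = top_k_edges(new_node, existing, k=2)   # indices of the two nearest nodes
calls = pairwise_llm_calls(10_000, 500)        # 5,000,000 pairwise LLM queries

print(edges, calls)
```

With 10,000 existing nodes and 500 generated ones, exhaustive pairwise reasoning would need five million LLM queries, whereas a scorer of this kind only needs one embedding per node plus inexpensive vector arithmetic.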