Can LLMs Recognize Their Own Analogical Hallucinations? Evaluating Uncertainty Estimation for Analogical Reasoning

Zheng Chen, Zhaoxin Feng, Jianfei Ma, Jiexi Xu, Bo Li


Abstract
Large language models (LLMs) often demonstrate strong performance by leveraging implicit knowledge acquired during pretraining. Analogical reasoning, which solves new problems by referencing similar known examples, offers a structured way to utilize this knowledge, but can also lead to subtle factual errors and hallucinations. In this work, we investigate whether LLMs can recognize the reliability of their own analogical outputs using black-box uncertainty estimation (UE). We evaluate six UE metrics across two reasoning-intensive tasks: mathematical problem solving (GSM8K) and code generation (Codeforces). Our results show that Kernel Language Entropy (KLE) and Lexical Similarity (LexSim) are the most robust indicators of correctness. Moreover, while analogical prompting increases model confidence over direct prompting, most uncertainty arises during the analogy transfer step. These findings highlight the limitations of analogical knowledge transfer in LLMs and demonstrate the potential of UE methods for detecting hallucinated reasoning in black-box settings.
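As an informal illustration of the black-box setting described in the abstract, the sketch below scores uncertainty from agreement among several sampled answers to the same prompt, in the spirit of the Lexical Similarity (LexSim) baseline. It is a minimal sketch under stated assumptions, not the paper's implementation: sample_responses is a hypothetical stand-in for any LLM sampling API, and difflib's SequenceMatcher ratio is used as a crude substitute for the overlap measure (e.g., a ROUGE-style score) a real LexSim implementation would likely use.

# Minimal black-box uncertainty sketch in the spirit of LexSim:
# sample several answers to one prompt and score uncertainty as
# one minus the mean pairwise lexical similarity of the samples.
from difflib import SequenceMatcher
from itertools import combinations
from typing import Callable, List


def lexical_similarity_uncertainty(
    prompt: str,
    sample_responses: Callable[[str, int], List[str]],  # hypothetical sampler stand-in
    num_samples: int = 5,
) -> float:
    """Return a score in [0, 1]; higher means the sampled answers disagree more."""
    answers = sample_responses(prompt, num_samples)
    if len(answers) < 2:
        return 0.0
    sims = [
        SequenceMatcher(None, a, b).ratio()  # crude stand-in for a ROUGE-style overlap
        for a, b in combinations(answers, 2)
    ]
    return 1.0 - sum(sims) / len(sims)


# Dummy sampler that always returns the same answer: uncertainty is 0.0.
print(lexical_similarity_uncertainty("2 + 2 = ?", lambda p, n: ["4"] * n))

Per the abstract, low scores from consistency-based metrics such as LexSim (and the semantically aware KLE) are the most robust indicators that an analogical solution is correct, which is what makes them usable as hallucination detectors in the black-box setting.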
Anthology ID: 2025.knowfm-1.8
Volume: Proceedings of the 3rd Workshop on Towards Knowledgeable Foundation Models (KnowFM)
Month: August
Year: 2025
Address: Vienna, Austria
Editors: Yuji Zhang, Canyu Chen, Sha Li, Mor Geva, Chi Han, Xiaozhi Wang, Shangbin Feng, Silin Gao, Isabelle Augenstein, Mohit Bansal, Manling Li, Heng Ji
Venues: KnowFM | WS
Publisher: Association for Computational Linguistics
Pages: 84–93
URL: https://preview.aclanthology.org/acl25-workshop-ingestion/2025.knowfm-1.8/
Cite (ACL): Zheng Chen, Zhaoxin Feng, Jianfei Ma, Jiexi Xu, and Bo Li. 2025. Can LLMs Recognize Their Own Analogical Hallucinations? Evaluating Uncertainty Estimation for Analogical Reasoning. In Proceedings of the 3rd Workshop on Towards Knowledgeable Foundation Models (KnowFM), pages 84–93, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Can LLMs Recognize Their Own Analogical Hallucinations? Evaluating Uncertainty Estimation for Analogical Reasoning (Chen et al., KnowFM 2025)
PDF: https://preview.aclanthology.org/acl25-workshop-ingestion/2025.knowfm-1.8.pdf