Shaofei Wang
2026
Logical Consistency as a Bridge: Improving LLM Hallucination Detection via Label Constraint Modeling between Responses and Self-Judgments
Hao Mi | Qiang Sheng | Shaofei Wang | Beizhe Hu | Yifan Sun | Zhengjia Wang | Hengqi Zeng | Yang Li | Danding Wang | Juan Cao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Hao Mi | Qiang Sheng | Shaofei Wang | Beizhe Hu | Yifan Sun | Zhengjia Wang | Hengqi Zeng | Yang Li | Danding Wang | Juan Cao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) are prone to factual hallucinations, risking their reliability in real-world applications. Existing hallucination detectors mainly extract micro-level intrinsic patterns for uncertainty quantification or elicit macro-level self-judgments through verbalized prompts. However, these methods address only a single facet of the hallucination, focusing either on implicit neural uncertainty or explicit symbolic reasoning, thereby treating these inherently coupled behaviors in isolation and failing to exploit their interdependence for a holistic view. In this paper, we propose LaaB (Logical Consistency-as-a-Bridge), a framework that bridges neural features and symbolic judgments for hallucination detection. LaaB introduces a "meta-judgment" process to map symbolic labels back into the feature space. By leveraging the inherent logical bridge where response and meta-judgment labels are either the same or opposite based on the self-judgment’s semantics, LaaB aligns and integrates dual-view signals via mutual learning and enhances the hallucination detection. Extensive experiments on 4 public datasets, across 4 LLMs, against 8 baselines demonstrate the superiority of LaaB.
2025
Logic-Thinker: Teaching Large Language Models to Think more Logically.
Chengyao Wen | Qiang Cheng | Shaofei Wang | Zhizhen Liu | Deng Zhao | Lei Liang
Findings of the Association for Computational Linguistics: EMNLP 2025
Chengyao Wen | Qiang Cheng | Shaofei Wang | Zhizhen Liu | Deng Zhao | Lei Liang
Findings of the Association for Computational Linguistics: EMNLP 2025
Recent Large Reasoning Models (LRMs) have demonstrated the ability to generate long chains of thought (LongCoT) before arriving at a final conclusion. Despite remarkable breakthroughs in complex reasoning capabilities, LongCoT still faces challenges such as redundancy and logical incoherence. To address these issues, we aim to equip large language models (LLMs) with rigorous and concise logical reasoning capabilities. In this work, we propose Logic-Thinker, a neural-symbolic reasoning framework that employs symbolic solvers to precisely solve problems and transforms their internal solving processes into concise and rigorous chains of thought, referred to as ThinkerCoT. Our experimental results demonstrate that Logic-Thinker achieves state-of-the-art performance in logical reasoning problems. Additionally, LLMs fine-tuned with ThinkerCoT outperform models distilled from QwQ32B on logic reasoning tasks, achieving an overall accuracy improvement of 3.6% while reducing token output by 73%-91%. Furthermore, ThinkerCoT enhances the comprehensive reasoning capabilities of LLMs, as evidenced by performance improvements on reasoning benchmarks such as GPQA and AIME.