Xiaohong Su


2026

With the surge of online misinformation, Large Language Models (LLMs) and Reasoning Large Language Models (RLMs) serving as Automatic Fact-Checking (AFC) systems have emerged as a prominent paradigm for reliable, explainable verification. However, our empirical study reveals that this paradigm faces a critical risk asymmetry challenge when deployed in real-world, resource-constrained environments. While Hotspot Perception Ability (HPA), the capacity to dynamically allocate reasoning resources based on social impact, is essential to mitigate this risk, existing benchmarks lack the social metadata and evaluation framework to meet this urgent evaluation need, thereby hindering the advancement of these AFC systems. To bridge this gap, we introduce TrendFact, the first benchmark capable of evaluating HPA and three fact-checking tasks. It consists of 7,643 curated samples sourced from trending platforms and professional datasets, with an evidence library containing 366,634 entries. To enable HPA assessment, we propose two novel metrics: the Explanation Consistency Score (ECS) to evaluate the reliability of verification reasoning, and the Hotspot Claim Perception Index (HCPI) to quantify the overall HPA of AFC systems. Extensive experiments demonstrate that existing AFC systems exhibit limited performance on TrendFact. Furthermore, our proposed FactISR framework effectively enhances HPA and computational efficiency for RLM-driven systems.

2024

Fact verification is a pivotal application in the effort to combat the spread of disinformation, a concern that has recently attracted considerable attention. However, previous studies in fact verification, particularly those focused on question-answering dialogue, have exhibited limitations, such as failing to fully exploit question structures and ignoring relevant label information during the verification process. In this paper, we introduce Label-Infused Iterative Information Interacting (LI4), a novel approach designed for question-answering-dialogue-based fact verification. LI4 consists of two components, namely the Iterative Information Refining and Filtering Module (IIRF) and the Fact Label Embedding Module (FLEM). The IIRF uses the Interactive Gating Mechanism to iteratively filter out noise in the question and evidence while refining the claim information. The FLEM strengthens the model's understanding of labels by injecting label knowledge. We evaluate the proposed LI4 on HEALTHVER, FAVIQ, and COLLOQUIAL. The experimental results confirm that LI4 achieves new state-of-the-art performance.