Xiaoming Li


2026

Recent advances in large vision-language models (VLMs) suggest their potential for multimodal misinformation detection. However, existing multimodal misinformation detectors often fail to integrate VLMs effectively, relying instead on passive aggregation of multimodal features and social signals. Such correlation-driven paradigms are vulnerable to spurious associations and multimodal noise, and they lack explicit verification mechanisms. In this paper, we propose Logic-Guided Adaptive Reasoning (LoGAR), a verification-oriented framework that integrates VLMs into multimodal misinformation detection through explicit rationale-guided reasoning. LoGAR uses a VLM to generate an explicit verification rationale, which serves as a global semantic anchor that conditions the entire reasoning process. Concretely, the rationale acts as an active query that guides multimodal feature fusion and as a conditioning signal that modulates message passing over heterogeneous social graphs, enabling hypothesis-aware evidence aggregation. Furthermore, LoGAR introduces an instance-aware adaptive depth mechanism that dynamically determines the reasoning depth each instance requires. Experimental results on multiple multimodal misinformation benchmarks demonstrate that LoGAR consistently outperforms state-of-the-art methods while significantly reducing computational cost.
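The idea of a rationale acting as an active query over multimodal features can be illustrated with a minimal sketch. This is not the LoGAR implementation: it assumes the rationale and each feature have already been embedded as vectors of the same dimension, and it swaps in plain scaled dot-product attention as the fusion step. The function name `rationale_guided_fusion` and all shapes are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - x.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def rationale_guided_fusion(rationale, features):
    """Fuse feature vectors using the rationale embedding as the query.

    Hypothetical illustration (scaled dot-product attention), not the
    method from the paper.
    rationale: array of shape (d,), the embedded verification rationale.
    features:  array of shape (n, d), one row per text/image feature.
    Returns (fused, weights) with shapes (d,) and (n,).
    """
    d = rationale.shape[0]
    # Score each feature by its alignment with the rationale.
    scores = features @ rationale / np.sqrt(d)
    weights = softmax(scores)
    # Weighted sum: features that agree with the rationale dominate.
    fused = weights @ features
    return fused, weights


rationale = np.array([1.0, 0.0])
features = np.array([[1.0, 0.0],   # aligned with the rationale
                     [0.0, 1.0]])  # orthogonal to the rationale
fused, weights = rationale_guided_fusion(rationale, features)
```

In this toy input the first feature receives the larger weight because it points in the same direction as the rationale, which is the sense in which the query "guides" the fusion rather than averaging features passively.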

2017

Recent years have witnessed the proliferation of Massive Open Online Courses (MOOCs). As MOOCs reach very large numbers of learners, there is a growing need to classify the forum content within them in order to help both learners and instructors. We therefore investigate an important application: associating forum threads with the subtitles of video clips. This task can be regarded as a document ranking problem, and the key is how to learn a discriminative text representation from word sequences and learners’ behavior sequences. In this paper, we propose a novel cascade model that captures both latent semantics and latent similarity by modeling MOOC data. Experimental results on two real-world datasets demonstrate that our textual representation outperforms state-of-the-art unsupervised counterparts on this application.
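Framing the task as document ranking can be sketched as follows, assuming a forum thread and each subtitle segment have already been mapped to fixed-length, nonzero vectors by some representation model. The cascade model itself is not reproduced here; cosine similarity is an assumed scoring function, and `rank_by_cosine` is a hypothetical helper name.

```python
import numpy as np


def rank_by_cosine(query, candidates):
    """Rank candidate vectors by cosine similarity to the query.

    Hypothetical illustration: cosine scoring is an assumption, not
    necessarily the scoring used in the paper.
    query:      array of shape (d,), e.g. a forum-thread representation.
    candidates: array of shape (n, d), e.g. subtitle representations.
    Assumes all vectors are nonzero.
    Returns (order, scores): candidate indices from best to worst and
    their similarity scores in the same order.
    """
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sims = c @ q
    order = np.argsort(-sims)  # descending similarity
    return order, sims[order]


thread = np.array([1.0, 0.0])
subtitles = np.array([[0.0, 1.0],
                      [1.0, 0.0],
                      [1.0, 1.0]])
order, scores = rank_by_cosine(thread, subtitles)
```

The quality of such a ranking depends entirely on the learned representation, which is why the abstract identifies representation learning as the key problem.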

2016

2015

2014

2013

2012

2011

2010