Yang Liu

Other people with similar names: Yang Janet Liu (Georgetown University; 刘洋), Yang Liu (Tsinghua), Yang Liu (Fudan), Yang Liu (BIGAI), Yang Liu, Yang Liu (Hunan), Yang Liu, Yang Liu (3M Health Information Systems), Yang Liu, Yang Liu (UC Santa Cruz), Yang Liu (South China University of Technology), Yang Liu, Yang Liu, Yang Liu (NTU), Yang Liu (Sun Yat-sen University), Yang Liu (North Carolina Central University), Yang Liu (Beijing Language and Culture University), Yang Liu (National University of Defense Technology), Yang Liu (Edinburgh Ph.D., Microsoft), Yang Liu (University of Helsinki), Yang Liu (The Chinese University of Hong Kong (Shenzhen)), Yang Liu (刘扬) (Ph.D. Purdue; ICSI, Dallas, Facebook, Liulishuo, Amazon), Yang Liu (刘洋) (ICT, Tsinghua, Beijing Academy of Artificial Intelligence), Yang Liu (Microsoft Cognitive Services Research), Yang Liu (刘扬) (Peking University), Yang Liu (Samsung Research Center Beijing), Yang Liu (Tianjin University, China), Yang Liu (Univ. of Michigan, UC Santa Cruz), Yang Liu (Wilfrid Laurier University)

Unverified author pages with similar names: Yang Liu


2026

Multimodal content combining textual and visual information poses significant challenges for rumor detection on social media. Compared to traditional spatial domain features, frequency domain features have attracted increasing attention due to their stronger discriminative capabilities. However, existing methods still fall short in capturing cross-modal semantic inconsistencies and often overlook inherent noise in multimodal features, which limits overall detection performance. To address these issues, we propose a novel multimodal rumor detection method based on multi-scale spectral selection and entropy-guided uncertainty fusion. Specifically, we first apply the Discrete Cosine Transform (DCT) to image and text features to convert them into the frequency domain. Then, multi-scale convolutional filters are employed to extract fine-grained information across different frequency scales. Next, modality separation is performed to capture both shared and modality-specific features, enabling more effective cross-modal representation learning. Finally, the entropy of each prediction branch is used to estimate its uncertainty, the uncertainty is converted into a confidence score, and the branches are fused with adaptive weights derived from those scores. Experimental results on multiple benchmark datasets show that our method outperforms existing state-of-the-art approaches in multimodal rumor detection, with stronger detection capability and robustness.
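The first and last steps of this abstract (DCT of feature vectors, then entropy-based weighting of prediction branches) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `dct2_ortho` and `entropy_guided_fusion`, the choice of "one minus normalised entropy" as the confidence score, and the softmax over branches are all assumptions made for illustration, since the abstract does not specify them. The multi-scale filtering and modality separation steps are omitted.

```python
import numpy as np


def dct2_ortho(x):
    """Orthonormal DCT-II along the last axis of x (shape (..., N))."""
    x = np.asarray(x, dtype=float)
    n_len = x.shape[-1]
    n = np.arange(n_len)
    # basis[k, n] = cos(pi * (n + 0.5) * k / N)
    basis = np.cos(np.pi * (n[None, :] + 0.5) * n[:, None] / n_len)
    scale = np.full(n_len, np.sqrt(2.0 / n_len))
    scale[0] = np.sqrt(1.0 / n_len)  # DC term gets the smaller scale
    return x @ (basis * scale[:, None]).T


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def entropy_guided_fusion(branch_logits):
    """Fuse branch predictions, weighting low-entropy branches more.

    branch_logits: array of shape (num_branches, batch, num_classes).
    Returns (fused_probs, weights) with shapes (batch, num_classes)
    and (num_branches, batch).
    """
    probs = softmax(np.asarray(branch_logits, dtype=float), axis=-1)
    num_classes = probs.shape[-1]
    # Shannon entropy per branch and sample, normalised to [0, 1].
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1) / np.log(num_classes)
    # Assumption: confidence is one minus the normalised entropy.
    confidence = 1.0 - entropy
    # Assumption: a softmax across branches turns confidences into weights.
    weights = softmax(confidence, axis=0)
    fused = (weights[..., None] * probs).sum(axis=0)
    return fused, weights
```

Under these assumptions, a branch that outputs a near-uniform distribution receives a smaller weight than a branch that outputs a peaked one, so the fused prediction leans toward the more certain branch.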
With the widespread proliferation of the Internet, the spread of fake news has accelerated significantly, evolving from text-only content to multimodal forms that include images and videos. The task of Multimodal Fake News Detection (MFND) takes both text and relevant images as input for fake news identification. However, issues such as image noise and poorly focused visual features often lead to insufficient attention to critical information within images during multimodal fusion. To effectively address these challenges, we propose a covariance matrix-driven image channel allocation method. This method first expands the number of original channel maps, then evaluates the importance of image channels through the covariance matrix and assigns importance scores to the expanded channel maps, thereby redirecting the focus of visual features. Subsequently, we design a multimodal fusion strategy based on a multilayer co-attention mechanism to achieve dynamic fusion across modalities. Finally, a contrastive learning loss is introduced to enhance the alignment between textual and visual modalities. Extensive experiments demonstrate that our method achieves state-of-the-art performance on three public multimodal fake news detection benchmark datasets.
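The channel-allocation idea in this abstract (expand the channel maps, score channels from a covariance matrix, reweight) can be sketched as follows. This is a rough NumPy illustration under stated assumptions, not the paper's method: the abstract does not say how channels are expanded or how scores are derived from the covariance matrix, so the random linear projection in `expand_channels` and the "sum of absolute covariances" score in `covariance_channel_scores` are hypothetical stand-ins, as are all function names. The co-attention fusion and contrastive loss are not shown.

```python
import numpy as np


def expand_channels(feat, factor, rng):
    """Expand a (C, H, W) feature map to (C * factor, H, W).

    Assumption: a fixed random linear mix across channels stands in for
    a learned 1x1 convolution.
    """
    c = feat.shape[0]
    proj = rng.standard_normal((c * factor, c)) / np.sqrt(c)
    return np.einsum("oc,chw->ohw", proj, feat)


def covariance_channel_scores(feat):
    """Score each channel from the channel covariance matrix.

    feat: array of shape (C, H, W) with C >= 2.
    Returns (scores, cov) with shapes (C,) and (C, C).
    """
    c = feat.shape[0]
    flat = feat.reshape(c, -1)  # each row holds one channel's activations
    cov = np.cov(flat)  # rows are treated as variables, so cov is (C, C)
    # Assumption: a channel's importance is its total absolute covariance
    # with all channels (itself included), normalised to sum to 1.
    strength = np.abs(cov).sum(axis=1)
    scores = strength / (strength.sum() + 1e-12)
    return scores, cov


def reweight_channels(feat, scores):
    """Scale each channel map by its importance score."""
    return feat * scores[:, None, None]
```

A possible usage, with a random array standing in for real image features:

```python
rng = np.random.default_rng(0)
image_feat = rng.standard_normal((4, 8, 8))      # hypothetical (C, H, W) features
expanded = expand_channels(image_feat, factor=2, rng=rng)
scores, cov = covariance_channel_scores(expanded)
refocused = reweight_channels(expanded, scores)  # same shape as expanded
```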