He Geng
2026
ProMedical: Hierarchical Fine-Grained Criteria Modeling for Medical LLM Alignment via Explicit Injection
He Geng | Yangmin Huang | Lixian Lai | Qianyun Du | Hui Chu | Zhiyang He | Jiaxue Hu | Xiaodong Tao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Aligning Large Language Models (LLMs) with high-stakes medical standards remains a significant challenge, primarily due to the dissonance between coarse-grained preference signals and the complex, multi-dimensional nature of clinical protocols. To bridge this gap, we introduce ProMedical, a unified alignment framework grounded in fine-grained clinical criteria. We first construct ProMedical-Preference-50k, a dataset generated via a human-in-the-loop pipeline that augments medical instructions with rigorous, physician-derived rubrics. Leveraging this corpus, we propose the Explicit Criteria Injection paradigm to train a multi-dimensional reward model. Unlike traditional scalar reward models, our approach explicitly disentangles safety constraints from general proficiency, enabling precise guidance during reinforcement learning. To rigorously validate this framework, we establish ProMedical-Bench, a held-out evaluation suite anchored by double-blind expert adjudication. Empirical evaluations demonstrate that optimizing the Qwen3-8B base model via ProMedical-RM-guided GRPO yields substantial gains, improving overall accuracy by 22.3% and safety compliance by 21.7%, effectively rivaling proprietary frontier models. Furthermore, the aligned policy generalizes robustly to external benchmarks, demonstrating performance comparable to state-of-the-art models on UltraMedical. We publicly release our datasets, reward models, and benchmarks to facilitate reproducible research in safety-aware medical alignment.
2025
A Survey on Proactive Defense Strategies Against Misinformation in Large Language Models
Shuliang Liu | Hongyi Liu | Aiwei Liu | Duan Bingchen | Zheng Qi | Yibo Yan | He Geng | Peijie Jiang | Jia Liu | Xuming Hu
Findings of the Association for Computational Linguistics: ACL 2025
The widespread deployment of large language models (LLMs) across critical domains has amplified the societal risks posed by algorithmically generated misinformation. Unlike traditional false content, LLM-generated misinformation can be self-reinforcing, highly plausible, and capable of rapid propagation across multiple languages, which traditional detection methods fail to mitigate effectively. This paper introduces a proactive defense paradigm, shifting from passive post hoc detection to anticipatory mitigation strategies. We propose a Three Pillars framework: (1) Knowledge Credibility, fortifying the integrity of training and deployed data; (2) Inference Reliability, embedding self-corrective mechanisms during reasoning; and (3) Input Robustness, enhancing the resilience of model interfaces against adversarial attacks. Through a comprehensive survey of existing techniques and a comparative meta-analysis, we demonstrate that proactive defense strategies offer up to 63% improvement over conventional methods in misinformation prevention, despite non-trivial computational overhead and generalization challenges. We argue that future research should focus on co-designing robust knowledge foundations, reasoning certification, and attack-resistant interfaces to ensure LLMs can effectively counter misinformation across varied domains.
VLA-Mark: A cross modal watermark for large vision-language alignment models
Shuliang Liu | Zheng Qi | Jesse Jiaxi Xu | Yibo Yan | Junyan Zhang | He Geng | Aiwei Liu | Peijie Jiang | Jia Liu | Yik-Cheung Tam | Xuming Hu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Vision-language models demand watermarking solutions that protect intellectual property without compromising multimodal coherence. Existing text watermarking methods disrupt visual-textual alignment through biased token selection and static strategies, leaving semantic-critical concepts vulnerable. We propose VLA-Mark, a vision-aligned framework that embeds detectable watermarks while preserving semantic fidelity through cross-modal coordination. Our approach integrates multiscale visual-textual alignment metrics, combining localized patch affinity, global semantic coherence, and contextual attention patterns, to guide watermark injection without model retraining. An entropy-sensitive mechanism dynamically balances watermark strength and semantic preservation, prioritizing visual grounding during low-uncertainty generation phases. Experiments show 7.4% lower PPL and 26.6% higher BLEU than conventional methods, with near-perfect detection (98.8% AUC). The framework demonstrates 96.1% resilience against attacks such as paraphrasing and synonym substitution while maintaining text-visual consistency, establishing new standards for quality-preserving multimodal watermarking.