Xuan Kan
2026
Multi-Task Reinforcement Learning for Enhanced Multimodal LLM-as-a-Judge
Junjie Wu | Xuan Kan | Zihao He | Shunwen Tan | Bo Pan | Kaitai Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Multimodal Large Language Models (MLLMs) have been widely adopted as MLLM-as-a-Judge models due to their strong alignment with human judgment across various visual tasks. However, most existing judge models are optimized for single-task scenarios and struggle to generalize to diverse contexts, which is a critical requirement for reliable evaluation. To address this limitation, we propose Multi-Task Reinforcement Learning for MLLM-as-a-Judge (MT-RL-Judge), a framework that jointly optimizes the judge model across multiple tasks, leveraging the generalization capabilities of RL. Experimental results demonstrate that MT-RL-Judge outperforms several strong baselines in both judgment consistency and correlation with human preferences. Furthermore, our approach exhibits robust generalization on out-of-distribution tasks, further validating its effectiveness.
2024
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
Ran Xu | Hejie Cui | Yue Yu | Xuan Kan | Wenqi Shi | Yuchen Zhuang | May Dongmei Wang | Wei Jin | Joyce Ho | Carl Yang
Findings of the Association for Computational Linguistics: ACL 2024
Clinical natural language processing faces challenges such as complex medical terminology and clinical contexts. Recently, large language models (LLMs) have shown promise in this domain. Yet their direct deployment can raise privacy issues and is constrained by resources. To address this challenge, we delve into synthetic clinical text generation with LLMs for clinical NLP tasks. We propose an innovative, resource-efficient approach, ClinGen, which infuses knowledge into the process. Our model involves clinical knowledge extraction and context-informed LLM prompting. Both clinical topics and writing styles are drawn from external domain-specific knowledge graphs and LLMs to guide data generation. Our extensive empirical study across 8 clinical NLP tasks and 18 datasets reveals that ClinGen consistently enhances performance across various tasks by 7.7%-8.7% on average, effectively aligning with the distribution of real datasets and enriching the diversity of generated training instances.