Nijia Han
2026
MEUR: A Benchmark for Evaluating Vision-Language Models on Multimodal Event Understanding and Reasoning
Zimu Wang | Yuqi Wang | Tong Chen | Changyu Zeng | Hongbin Na | Nijia Han | Fuyu Xing | Qi Chen | Qiufeng Wang | Anh Nguyen | Shuihua Wang | Ling Chen | Jionglong Su | Haiyang Zhang | Wei Wang
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Event understanding and reasoning play critical roles in thoroughly evaluating the capabilities of Vision-Language Models (VLMs); however, existing Visual Question Answering (VQA) datasets predominantly focus on entity-centric questions, while event- or action-related questions are limited in scale and suffer from significant shortcut issues. We introduce MEUR, the first Multimodal Event Understanding and Reasoning dataset, consisting of 1,200 images and 4,217 questions that require VLMs to exercise a diverse range of multimodal understanding and reasoning capabilities, from basic event recognition to more complex tasks such as counting and comparison. To streamline the annotation process, we propose a novel semi-automated pipeline that combines advanced VLMs with human annotators, achieving high quality and efficiency. We conduct extensive experiments on state-of-the-art non-thinking and thinking VLMs to demonstrate their capabilities and limitations in multimodal event understanding and reasoning. Furthermore, we provide a detailed error analysis that points out promising directions for future research.
TCMPHal: A Large-scale Dataset for Hallucination Detection in Traditional Chinese Medicine Pharmacy
Nijia Han | Zimu Wang | Ziwen Xie | Wei Wang | Jia Meng | John Moraros | Shuihua Wang
Proceedings of the Fifteenth Language Resources and Evaluation Conference
The rapid proliferation of large language models (LLMs) in medicine highlights their potential to revolutionize research in Traditional Chinese Medicine (TCM). While these models have shown great promise in assisting TCM practitioners by answering herb-related questions, generating syndrome-differentiation reports, and recommending classical formulas, a persistent challenge is hallucination, where LLMs produce content that appears plausible yet is inaccurate. This issue has received limited attention within the context of TCM research, leaving a significant gap in understanding how hallucination manifests within TCM's unique theoretical frameworks and diagnostic principles. Motivated by this phenomenon, we present TCMPHal, the first dataset specifically curated for hallucination detection in TCM pharmacy, comprising 10,000 high-quality question-answer pairs with hallucination annotations. Our experimental results across diverse LLMs, under standard, knowledge-based, and search engine-augmented conditions, demonstrate the capabilities and limitations of these models. A notable observation is that, for thinking LLMs, incorporating search engine results yields minimal improvement over their intrinsic reasoning abilities. We further conduct an in-depth error analysis, paving the way for future research directions in this domain. We release the TCMPHal dataset at https://github.com/hanninaa/TCMP.
2025
FinDebate: Multi-Agent Collaborative Intelligence for Financial Analysis
Tianshi Cai | Guanxu Li | Nijia Han | Ce Huang | Zimu Wang | Changyu Zeng | Yuqi Wang | Jingshi Zhou | Haiyang Zhang | Qi Chen | Yushan Pan | Shuihua Wang | Wei Wang
Proceedings of The 10th Workshop on Financial Technology and Natural Language Processing
2024
Knowledge Distillation from Monolingual to Multilingual Models for Intelligent and Interpretable Multilingual Emotion Detection
Yuqi Wang | Zimu Wang | Nijia Han | Wei Wang | Qi Chen | Haiyang Zhang | Yushan Pan | Anh Nguyen
Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis
Emotion detection from text is a crucial task in understanding natural language, with wide-ranging applications. Existing approaches to multilingual emotion detection face challenges with data scarcity across many languages and a lack of interpretability. We propose a novel method that leverages both monolingual and multilingual pre-trained language models to improve performance and interpretability. Our approach involves 1) training a high-performing English monolingual model in parallel with a multilingual model and 2) using knowledge distillation to transfer the emotion detection capabilities from the monolingual teacher to the multilingual student model. Experiments on a multilingual dataset demonstrate significant performance gains for refined multilingual models like XLM-RoBERTa and E5 after distillation. Furthermore, our approach enhances interpretability by enabling better identification of emotion-trigger words. Our work presents a promising direction for building accurate, robust, and explainable multilingual emotion detection systems.
MTSwitch: A Web-based System for Translation between Molecules and Texts
Nijia Han | Zimu Wang | Yuqi Wang | Haiyang Zhang | Daiyun Huang | Wei Wang
Proceedings of the 17th International Natural Language Generation Conference: System Demonstrations
We introduce MTSwitch, a web-based system for bidirectional translation between molecules and texts, leveraging various large language models (LLMs). It supports two crucial tasks: molecule captioning (explaining the properties of a molecule) and molecule generation (designing a molecule based on specific properties). To the best of our knowledge, MTSwitch is the first accessible system that allows users to translate between molecular representations and descriptive text. The system and a screencast can be found at https://github.com/hanninaa/MTSwitch.
Exploring Faithful and Informative Commonsense Reasoning and Moral Understanding in Children’s Stories
Zimu Wang | Yuqi Wang | Nijia Han | Qi Chen | Haiyang Zhang | Yushan Pan | Qiufeng Wang | Wei Wang
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
Commonsense reasoning and moral understanding are crucial tasks in artificial intelligence (AI) and natural language processing (NLP). However, existing research often falls short in terms of faithfulness and informativeness during the reasoning process. We propose a novel framework for performing commonsense reasoning and moral understanding using large language models (LLMs), constructing guided prompts by incorporating relevant knowledge for commonsense reasoning and extracting facts from stories for moral understanding. We conduct extensive experiments on the Commonsense Reasoning and Moral Understanding in Children's Stories (CRMUS) dataset with widely recognised LLMs under both zero-shot and fine-tuning settings, demonstrating the effectiveness of our proposed method. Furthermore, we analyse the adaptability of different LLMs in extracting facts for moral understanding.