Lei Cao
2026
DORA: A Dual-Objective Reinforcement Learning Framework for Effective and Efficient Multimodal Agentic Search
Guangming Qin | Yuhao Deng | Yukun Zhao | Zhenyang Li | Junfeng Wang | Dawei Yin | Ye Yuan | Guoren Wang | Yizhou Yan | Chengliang Chai | Lei Cao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent research uses reinforcement learning (RL) to post-train Multimodal Large Language Models (MLLMs) so that they can iteratively call search engines to dynamically access external knowledge when handling complex Visual Question Answering (VQA) tasks. However, existing methods face two major limitations, one in effectiveness and one in efficiency: i) For effectiveness, their objective considers only the correctness of the final generated response and overlooks the quality of intermediate search results, which leads to suboptimal search strategies. ii) For efficiency, existing methods often invoke unnecessary search calls during reasoning, which makes inference inefficient. To address these issues, we propose DORA, a customized dual-objective reinforcement learning framework that improves the search strategies of MLLMs, enhancing their search quality while minimizing search frequency. The key ideas include (1) a reward function that promotes correct reasoning trajectories with fewer search calls; and (2) dual optimization objectives that jointly optimize search quality and answer correctness. Extensive experiments on 3 real-world datasets demonstrate that DORA outperforms state-of-the-art methods, achieving up to 8.4% higher accuracy while reducing the number of search calls by 9.7%.
2019
Latent Suicide Risk Detection on Microblog via Suicide-Oriented Word Embeddings and Layered Attention
Lei Cao | Huijun Zhang | Ling Feng | Zihan Wei | Xin Wang | Ningyun Li | Xiaohao He
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Although detection of suicidal ideation on social media has made great progress in recent years, posts that express feelings implicitly or contrary to reality remain an obstacle that keeps detectors from achieving satisfactory performance. Inspired by the hidden “tree hole” phenomenon on microblogs, where people at risk of suicide tend to disclose their real inner feelings and thoughts in the microblog space of authors who have committed suicide, we explore the use of tree holes to enhance microblog-based suicide risk detection from two perspectives. (1) We build suicide-oriented word embeddings based on tree hole contents to strengthen sensitivity to suicide-related lexicons and context. (2) We deploy a two-layered attention mechanism to capture intermittently changing points in an individual’s open blog stream, revealing, to some extent, that person’s inner emotional world. Our experimental results show that with suicide-oriented word embeddings and attention, microblog-based suicide risk detection can achieve over 91% accuracy. The paper also reports a large-scale, well-labelled suicide data set.