Long Li
2026
MMAC: A Multilingual, Multimodal Alignment Framework for Cultural Grounding Evaluation
Weihua Zheng | Zhengyuan Liu | Tanmoy Chakraborty | Weiwen Xu | Xiaoxue Gao | Bryan Chen Zhengyu Tan | Bowei Zou | Chang Liu | Yujia Hu | Xing Xie | Xiaoyuan Yi | Jing Yao | Chaojun Wang | Long Li | Rui Liu | Huiyao Liu | Koji Inoue | Ryuichi Sumida | Tatsuya Kawahara | Fan Xu | Lingyu Ye | Wei Tian | Dongjun Kim | Jimin Jung | Jaehyung Seo | Nadya Yuki Wangsajaya | Pham Minh Duc | Ojasva Saxena | Palash Nandi | Xiyan Tao | Wiwik Karlina | Tuan Luong | Keertana Arun Vasan | Roy Ka-Wei Lee | Nancy F. Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The global deployment of Large Language Models (LLMs) underscores the urgent need to evaluate their cultural alignment. However, assessing genuine "cultural awareness" across modalities (text, vision, speech) and languages remains a significant challenge. To comprehensively investigate this domain, we propose MMAC, a systematic framework that encompasses a tri-modally aligned cultural benchmark creation pipeline and a five-dimensional evaluation protocol to assess cross-country awareness disparities, evaluate cross-lingual and cross-modal consistency, and verify cultural knowledge generalization and grounding validity. Given the prevailing Western cultural bias in current models, we focus on 8 Asian countries as our dataset foundation to more acutely reveal potential cultural deficiencies in LLMs. Our dataset, MMAC-bench, features 27,000 human-curated questions across 10 languages. Crucially, it is the first dataset aligned at the input level across text, image, and speech, enabling direct cross-modal transfer tests. Each question consists of multiple-choice options accompanied by open-ended generated explanations, where 79% require multi-step reasoning grounded in cultural context, moving beyond simple memorization. We probe the causes of modal divergence, offering insights into fostering culturally robust MLLMs.
I²B-LPO: Latent Policy Optimization via Iterative Information Bottleneck
Huilin Deng | Hongchen Luo | Yue Zhu | Long Li | Zhuoyue Chen | Xinghao Zhao | Ming LI | Chuyang Zhao | Jihai Zhang | MengChang Wang | Yang Cao | Yu Kang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Despite recent advances in reinforcement learning with verifiable rewards (RLVR) for large language model (LLM) reasoning, most methods suffer from exploration collapse, as the semantic homogeneity of random rollouts traps models in narrow, over-optimized behaviors. Existing methods leverage policy entropy to encourage exploration, but face inherent limitations: global entropy regularization is susceptible to reward hacking, inducing meaningless verbosity, whereas local token-selective updates struggle with the strong inductive bias of pre-trained models. To this end, we propose Latent Policy Optimization via Iterative Information Bottleneck (I²B-LPO), which shifts from statistical perturbation of token distributions to topological branching of reasoning trajectories. I²B-LPO triggers latent branching at high-entropy states to diversify reasoning trajectories and applies the Information Bottleneck as a trajectory filter and self-reward to ensure concise and informative exploration. Empirical results on four mathematical benchmarks demonstrate that I²B-LPO achieves state-of-the-art performance, with margins of up to 5.3% in accuracy and 7.4% in diversity metrics. Code is available at https://github.com/denghuilin-cyber/IIB-LPO.
2025
Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents
Long Li | Weiwen Xu | Jiayan Guo | Ruochen Zhao | Xingxuan Li | Yuqian Yuan | Boqiang Zhang | Yuming Jiang | Yifei Xin | Ronghao Dang | Yu Rong | Deli Zhao | Tian Feng | Lidong Bing
Findings of the Association for Computational Linguistics: EMNLP 2025
Research ideation is crucial for scientific progress, but the exponential increase in scientific literature makes it challenging to stay updated and identify impactful directions. Recent developments in large language models (LLMs) offer a promising avenue to automate this process. However, existing methods for idea generation either trivially prompt LLMs or expose LLMs to extensive literature without indicating useful information. Inspired by human research processes, we propose a Chain-of-Ideas (CoI) agent, an LLM-based agent that organizes relevant literature in a chain structure to effectively mirror the progressive development in a research domain. This organization helps LLMs better grasp current advancements, thereby improving ideation capabilities. Further, we present Idea Arena, a protocol for evaluating idea-generation methods from different perspectives, which aligns closely with the preferences of human researchers. Experiments show that the CoI agent consistently outperforms existing methods and matches human quality in idea generation. Moreover, the CoI agent is budget-friendly, requiring only $0.50 to generate a candidate idea and its experimental design.
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
Yu Sun | Xingyu Qian | Weiwen Xu | Hao Zhang | Chenghao Xiao | Long Li | Deli Zhao | Wenbing Huang | Tingyang Xu | Qifeng Bai | Yu Rong
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Reasoning-based large language models have excelled in mathematics and programming, yet their potential in knowledge-intensive medical question answering remains underexplored and insufficiently validated in clinical contexts. To bridge this gap, we introduce ReasonMed, the largest medical reasoning dataset to date, comprising 370k high-quality examples distilled from 1.75 million initial reasoning paths generated by complementary LLMs and curated through a cost-efficient easy-medium-difficult (EMD) pipeline. ReasonMed is built through a multi-agent generation, verification, and refinement process, in which an Error Refiner improves reasoning paths by correcting error-prone steps identified by a verifier. Using ReasonMed, we investigate effective strategies for training medical reasoning models and find that integrating detailed chain-of-thought (CoT) reasoning with concise answer summaries yields the most robust fine-tuning results. Models trained on ReasonMed set a new benchmark: ReasonMed-7B surpasses the prior best sub-10B models by 4.17% and even exceeds LLaMA3.1-70B on PubMedQA by 4.60%. When scaled to ReasonMed-14B, it remains highly competitive, underscoring consistent scaling potential. The code and datasets are available at https://github.com/YuSun-Work/ReasonMed.
Co-authors
- Weiwen Xu 3
- Yu Rong 2
- Deli Zhao 2
- Qifeng Bai 1
- Lidong Bing 1
- Yang Cao 1
- Tanmoy Chakraborty 1
- Nancy Chen 1
- Zhuoyue Chen 1
- Ronghao Dang 1
- Huilin Deng 1
- Pham Minh Duc 1
- Tian Feng 1
- Xiaoxue Gao 1
- Jiayan Guo 1
- Yujia Hu 1
- Wenbing Huang 1
- Koji Inoue 1
- Yuming Jiang 1
- Jimin Jung 1
- Yu Kang 1
- Wiwik Karlina 1
- Tatsuya Kawahara 1
- Dongjun Kim 1
- Ming LI 1
- Roy Ka-Wei Lee 1
- Xingxuan Li 1
- Chang Liu 1
- Huiyao Liu 1
- Rui Liu 1
- Zhengyuan Liu 1
- Hongchen Luo 1
- Tuan Luong 1
- Palash Nandi 1
- Xingyu Qian 1
- Ojasva Saxena 1
- Jaehyung Seo 1
- Ryuichi Sumida 1
- Yu Sun 1
- Bryan Chen Zhengyu Tan 1
- Xiyan Tao 1
- Wei Tian (田巍) 1
- Keertana Arun Vasan 1
- Chaojun Wang 1
- MengChang Wang 1
- Nadya Yuki Wangsajaya 1
- Chenghao Xiao 1
- Xing Xie 1
- Yifei Xin 1
- Fan Xu (徐凡) 1
- Tingyang Xu 1
- Jing Yao 1
- Lingyu Ye 1
- Xiaoyuan Yi 1
- Yuqian Yuan 1
- Boqiang Zhang 1
- Hao Zhang 1
- Jihai Zhang 1
- Chuyang Zhao 1
- Ruochen Zhao 1
- Xinghao Zhao 1
- Weihua Zheng 1
- Yue Zhu 1
- Bowei Zou (邹博伟) 1