Xing Xie
2026
MMAC: A Multilingual, Multimodal Alignment Framework for Cultural Grounding Evaluation
Weihua Zheng | Zhengyuan Liu | Tanmoy Chakraborty | Weiwen Xu | Xiaoxue Gao | Bryan Chen Zhengyu Tan | Bowei Zou | Chang Liu | Yujia Hu | Xing Xie | Xiaoyuan Yi | Jing Yao | Chaojun Wang | Long Li | Rui Liu | Huiyao Liu | Koji Inoue | Ryuichi Sumida | Tatsuya Kawahara | Fan Xu | Lingyu Ye | Wei Tian | Dongjun Kim | Jimin Jung | Jaehyung Seo | Nadya Yuki Wangsajaya | Pham Minh Duc | Ojasva Saxena | Palash Nandi | Xiyan Tao | Wiwik Karlina | Tuan Luong | Keertana Arun Vasan | Roy Ka-Wei Lee | Nancy F. Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The global deployment of Large Language Models (LLMs) underscores the urgent need to evaluate their cultural alignment. However, assessing genuine "cultural awareness" across modalities (text, vision, speech) and languages remains a significant challenge. To comprehensively investigate this domain, we propose MMAC, a systematic framework that encompasses a tri-modally aligned cultural benchmark creation pipeline and a five-dimensional evaluation protocol to assess cross-country awareness disparities, evaluate cross-lingual and cross-modal consistency, and verify cultural knowledge generalization and grounding validity. Given the prevailing Western cultural bias in current models, we focus on 8 Asian countries as our dataset foundation to more acutely reveal potential cultural deficiencies in LLMs. Our dataset, MMAC-bench, features 27,000 human-curated questions across 10 languages. Crucially, it is the first dataset aligned at the input level across text, image, and speech, enabling direct cross-modal transfer tests. Each question consists of multiple-choice options accompanied by open-ended generated explanations, where 79% require multi-step reasoning grounded in cultural context, moving beyond simple memorization. We probe the causes of modal divergence, offering insights into fostering culturally robust MLLMs.
Measuring Human Contribution in AI-Assisted Content Generation
Yueqi Xie | Tao Qi | Jingwei Yi | Xiyuan Yang | Ryan Whalen | Junming Huang | Qian Ding | Yu Xie | Xing Xie | Fangzhao Wu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
With the growing prevalence of generative AI, an increasing amount of content is no longer exclusively generated by humans but by generative AI models with human guidance. This shift presents notable challenges for the delineation of originality due to the varying degrees of human contribution in AI-assisted works. This study raises the research question of measuring human contribution in AI-assisted content generation and introduces a framework to address this question that is grounded in information theory. By calculating mutual information between human input and AI-assisted output relative to self-information of AI-assisted output, we quantify the proportional information contribution of humans in content generation. Our experimental results demonstrate that the proposed measure effectively discriminates between varying degrees of human contribution across multiple creative domains. To further enhance real-world applicability, we extend the framework to estimate the minimal necessary human contribution for any text without requiring human input and validate its effectiveness. We hope that this work lays a foundation for measuring human contributions in AI-assisted content generation in the era of generative AI.
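The ratio described in this abstract (mutual information between human input and output, relative to the self-information of the output) can be illustrated with a pointwise version. This is only a minimal sketch, not the paper's estimator: the function name `human_contribution_ratio` is hypothetical, the pointwise form is an assumption, and the log-probabilities are made-up toy values that a language model would supply in practice.

```python
import math


def human_contribution_ratio(logp_output: float, logp_output_given_input: float) -> float:
    """Pointwise mutual information I(x; y) divided by self-information I(y).

    logp_output:             log p(y), the output's log-probability without the human input
    logp_output_given_input: log p(y | x), its log-probability given the human input
    Both are assumed to be natural-log values from some language model (hypothetical).
    """
    self_info = -logp_output              # I(y) = -log p(y)
    cond_info = -logp_output_given_input  # I(y | x) = -log p(y | x)
    if self_info <= 0:
        raise ValueError("output must have positive self-information")
    pmi = self_info - cond_info           # I(x; y) = I(y) - I(y | x)
    return pmi / self_info


# Toy example: the human input raises the output's probability from 1e-6 to 1e-2.
ratio = human_contribution_ratio(math.log(1e-6), math.log(1e-2))
print(round(ratio, 4))
```

Under these assumptions, a ratio near 1 means the human input accounts for almost all of the information in the output, and a ratio near 0 means it accounts for almost none.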
Influence-based Online Experience Selection for Effective RLHF
Yifan Gong | Jing Yao | Xiting Wang | Xunlong Wang | Xiaoyuan Yi | Xing Xie
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Reinforcement Learning from Human Feedback (RLHF) has emerged as a crucial technique for aligning large language models (LLMs) with human preferences. However, existing RLHF methods face key challenges, including poor sample efficiency, high computational overhead, and slow convergence. Recent studies highlight the importance of data selection in RL, but how to effectively select the most beneficial experiences for RL training remains an open problem. Existing data selection methods for RL rely on heuristic metrics, failing to establish an interpretable connection between data and optimization objectives. To address this problem, we propose InfOES (Influence-based Online Experience Selection), a novel data selection method for RLHF that dynamically estimates the influence of individual training samples on policy optimization. By incorporating data attribution into the policy gradient, InfOES can identify and filter out detrimental samples on the fly, ensuring effective convergence toward alignment objectives. Our approach is compatible with various RL algorithms (e.g., PPO, GRPO, REINFORCE++). Extensive experiments demonstrate that InfOES significantly enhances training effectiveness, achieving superior alignment performance with fewer optimization steps.
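The idea of filtering detrimental samples by their estimated influence can be sketched with a generic first-order approximation: score each sample by the dot product between its gradient and a reference gradient for the target objective, and drop samples that score at or below a threshold. This is an assumed stand-in for illustration, not the InfOES algorithm; `select_by_influence`, the threshold, and the toy gradients are all hypothetical.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def select_by_influence(sample_grads, reference_grad, threshold=0.0):
    """Return (keep_mask, scores) using a first-order influence proxy.

    sample_grads:   one gradient vector per training sample (toy lists here)
    reference_grad: gradient of the objective we want to improve (assumption)
    A sample is kept only if its score is strictly above `threshold`.
    """
    scores = [dot(g, reference_grad) for g in sample_grads]
    keep = [s > threshold for s in scores]
    return keep, scores


# Toy gradients: the second sample points against the reference direction.
grads = [[1.0, 0.0], [-1.0, 0.5], [0.2, 0.2]]
reference = [1.0, 1.0]
keep, scores = select_by_influence(grads, reference)
print(keep, scores)
```

In an online setting, a filter of this kind would presumably be re-evaluated at each update, since both the per-sample and reference gradients change as the policy changes.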
Can Persona-Prompted LLMs Emulate Subgroup Values? An Empirical Analysis of Generalisability and Fairness in Cultural Alignment
Bryan Chen Zhengyu Tan | Zhengyuan Liu | Xiaoyuan Yi | Jing Yao | Xing Xie | Nancy F. Chen | Roy Ka-Wei Lee
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Despite their global prevalence, many Large Language Models (LLMs) are aligned to a monolithic, often Western-centric set of values. This paper investigates the more challenging task of fine-grained value alignment: examining whether LLMs can emulate the distinct cultural values of demographic subgroups. Using Singapore as a case study and the World Values Survey (WVS), we examine the value landscape and show that even state-of-the-art models like GPT-4.1 achieve only 57.4% accuracy in predicting subgroup modal preferences. We construct a dataset of over 20,000 samples to train and evaluate a range of models. We demonstrate that simple fine-tuning on structured numerical preferences yields substantial gains, improving accuracy on unseen, out-of-distribution subgroups by an average of 17.4%. These gains partially transfer to open-ended generation. However, we find significant pre-existing performance biases, where models better emulate young, male, Chinese, and Christian personas. Furthermore, while fine-tuning improves average performance, it widens the disparity between subgroups when measured by distance-aware metrics. Our work offers insights into the limits and fairness implications of subgroup-level cultural alignment.
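The contrast between accuracy on modal preferences and a distance-aware metric can be made concrete with a small sketch. It assumes ordinal survey answers coded as integers and uses mean absolute distance as the distance-aware measure; the paper's exact metrics are not given in the abstract, so the function names, the tie-breaking rule, and the data below are illustrative assumptions.

```python
from collections import Counter


def modal_response(responses):
    """Most common answer in a subgroup; ties go to the smallest value (assumption)."""
    counts = Counter(responses)
    top = max(counts.values())
    return min(value for value, count in counts.items() if count == top)


def evaluate(predictions, subgroup_responses):
    """Return (accuracy, mean absolute distance) against each subgroup's modal answer."""
    modes = [modal_response(r) for r in subgroup_responses]
    n = len(modes)
    accuracy = sum(p == m for p, m in zip(predictions, modes)) / n
    mean_abs_dist = sum(abs(p - m) for p, m in zip(predictions, modes)) / n
    return accuracy, mean_abs_dist


# Toy data: three questions, answers on a 1-5 ordinal scale.
responses = [[1, 1, 2, 4], [3, 3, 4], [2, 5, 5]]
predictions = [1, 4, 2]
acc, dist = evaluate(predictions, responses)
print(acc, dist)
```

Accuracy treats every miss alike, while the distance measure penalizes a prediction of 2 against a mode of 5 more than a prediction of 4 against a mode of 3, which is why the two can move in different directions across subgroups.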
2025
MoVa: Towards Generalizable Classification of Human Morals and Values
Ziyu Chen | Junfei Sun | Chenxi Li | Tuan Dung Nguyen | Jing Yao | Xiaoyuan Yi | Xing Xie | Chenhao Tan | Lexing Xie
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Identifying human morals and values embedded in language is essential to empirical studies of communication. However, researchers often face substantial difficulty navigating the diversity of theoretical frameworks and data available for their analysis. Here, we contribute MoVa, a well-documented suite of resources for generalizable classification of human morals and values, consisting of (1) 16 labeled datasets and benchmarking results from four theoretically-grounded frameworks; (2) a lightweight LLM prompting strategy that outperforms fine-tuned models across multiple domains and frameworks; and (3) a new application that helps evaluate psychological surveys. In practice, we specifically recommend a classification strategy, all@once, that scores all related concepts simultaneously, resembling the well-known multi-label classifier chain. The data and methods in MoVa can facilitate many fine-grained interpretations of human and machine communication, with potential implications for the alignment of machine behavior.
Co-authors
- Jing Yao 4
- Xiaoyuan Yi 4
- Nancy Chen 2
- Roy Ka-Wei Lee 2
- Zhengyuan Liu 2
- Bryan Chen Zhengyu Tan 2
- Tanmoy Chakraborty 1
- Ziyu Chen 1
- Qian Ding 1
- Pham Minh Duc 1
- Xiaoxue Gao 1
- Yifan Gong 1
- Yujia Hu 1
- Junming Huang 1
- Koji Inoue 1
- Jimin Jung 1
- Wiwik Karlina 1
- Tatsuya Kawahara 1
- Dongjun Kim 1
- Chenxi Li 1
- Long Li 1
- Chang Liu 1
- Huiyao Liu 1
- Rui Liu 1
- Tuan Luong 1
- Palash Nandi 1
- Tuan Dung Nguyen 1
- Tao Qi 1
- Ojasva Saxena 1
- Jaehyung Seo 1
- Ryuichi Sumida 1
- Junfei Sun 1
- Chenhao Tan 1
- Xiyan Tao 1
- Wei Tian 1
- Keertana Arun Vasan 1
- Chaojun Wang 1
- Xiting Wang 1
- Xunlong Wang 1
- Nadya Yuki Wangsajaya 1
- Ryan Whalen 1
- Fangzhao Wu 1
- Lexing Xie 1
- Yu Xie 1
- Yueqi Xie 1
- Fan Xu (徐凡) 1
- Weiwen Xu 1
- Xiyuan Yang 1
- Lingyu Ye 1
- Jingwei Yi 1
- Weihua Zheng 1
- Bowei Zou (邹博伟) 1