Geyang Guo


2025

pdf bib
CARE: Multilingual Human Preference Learning for Cultural Awareness
Geyang Guo | Tarek Naous | Hiromi Wakaki | Yukiko Nishimura | Yuki Mitsufuji | Alan Ritter | Wei Xu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Language Models (LMs) are typically tuned with human preferences to produce helpful responses, but the impact of preference tuning on the ability to handle culturally diverse queries remains understudied. In this paper, we systematically analyze how native human cultural preferences can be incorporated into the preference learning process to train more culturally aware LMs. We introduce CARE, a multilingual resource containing 3,490 culturally specific questions and 31.7k responses with human judgments. We demonstrate how a modest amount of high-quality native preferences improves cultural awareness across various LMs, outperforming larger generic preference data. Our analyses reveal that models with stronger initial cultural performance benefit more from alignment, leading to gaps among models developed in different regions with varying access to culturally relevant data. CARE is publicly available at https://github.com/Guochry/CARE.

2024

pdf bib
LLMBox: A Comprehensive Library for Large Language Models
Tianyi Tang | Hu Yiwen | Bingqian Li | Wenyang Luo | ZiJing Qin | Haoxiang Sun | Jiapeng Wang | Shiyi Xu | Xiaoxue Cheng | Geyang Guo | Han Peng | Bowen Zheng | Yiru Tang | Yingqian Min | Yushuo Chen | Jie Chen | Ranchi Zhao | Luran Ding | Yuhao Wang | Zican Dong | Xia Chunxuan | Junyi Li | Kun Zhou | Xin Zhao | Ji-Rong Wen
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

To facilitate the research on large language models (LLMs), this paper presents a comprehensive and unified library, LLMBox, to ease the development, use, and evaluation of LLMs. This library is featured with three main merits: (1) a unified data interface that supports the flexible implementation of various training strategies, (2) a comprehensive evaluation that covers extensive tasks, datasets, and models, and (3) more practical consideration, especially on user-friendliness and efficiency. With our library, users can easily reproduce existing methods, train new models, and conduct comprehensive performance comparisons. To rigorously test LLMBox, we conduct extensive experiments in a diverse coverage of evaluation settings, and experimental results demonstrate the effectiveness and efficiency of our library in supporting various implementations related to LLMs. The detailed introduction and usage guidance can be found at https://github.com/RUCAIBox/LLMBox.