Jingrui He


2025

Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision?
Zihao Li | Lecheng Zheng | Bowen Jin | Dongqi Fu | Baoyu Jing | Yikun Ban | Jingrui He | Jiawei Han
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

While great success has been achieved in building vision models with Contrastive Language-Image Pre-training (CLIP) over Internet-scale image-text pairs, building transferable Graph Neural Networks (GNNs) with the CLIP pipeline is challenging because of the scarcity of labeled data and text supervision, different levels of downstream tasks, and the conceptual gaps between domains. In this work, to address these issues, we propose a multi-modal prompt learning paradigm to effectively adapt pre-trained GNNs to downstream tasks and data, given only a few semantically labeled samples, each with extremely weak text supervision. Our new paradigm embeds the graphs directly in the same space as the Large Language Models (LLMs) by learning both graph prompts and text prompts simultaneously. We demonstrate the superior performance of our paradigm in few-shot, multi-task-level, and cross-domain settings. Moreover, we build the first CLIP-style zero-shot classification prototype that can generalize GNNs to unseen classes with extremely weak text supervision.
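As a rough illustration of the CLIP-style zero-shot setup described above (not the paper's implementation; the encoders, dimensions, and class prompts below are placeholder assumptions), classification reduces to cosine similarity between a graph embedding and text embeddings of class descriptions in a shared space:

```python
# Illustrative sketch only: a CLIP-style zero-shot classifier over graph embeddings.
# The embedding sources and dimensions are assumptions for illustration.
import torch
import torch.nn.functional as F

def zero_shot_classify(graph_emb, class_text_embs):
    """Pick the class whose text embedding is closest (cosine) to the graph embedding.

    graph_emb:       (d,) tensor from a pre-trained GNN + learned graph prompt
    class_text_embs: (C, d) tensor from an LLM/text encoder over class descriptions
    """
    g = F.normalize(graph_emb, dim=-1)
    t = F.normalize(class_text_embs, dim=-1)
    logits = t @ g                      # (C,) cosine similarities
    return logits.argmax().item()

# Toy usage with random embeddings in a shared 128-dim space.
graph_emb = torch.randn(128)
class_text_embs = torch.randn(5, 128)   # e.g., 5 unseen classes, one description each
print(zero_shot_classify(graph_emb, class_text_embs))
```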

LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation
Xinrui He | Yikun Ban | Jiaru Zou | Tianxin Wei | Curtiss Cook | Jingrui He
Findings of the Association for Computational Linguistics: ACL 2025

Missing data imputation is a critical challenge in various domains, such as healthcare and finance, where data completeness is vital for accurate analysis. Large language models (LLMs), trained on vast corpora, have shown strong potential in data generation, making them a promising tool for data imputation. However, challenges persist in designing effective prompts for a finetuning-free process and in mitigating biases and uncertainty in LLM outputs. To address these issues, we propose a novel framework, LLM-Forest, which introduces a “forest” of few-shot learning LLM “trees” whose outputs are aggregated via confidence-based weighted voting based on LLM self-assessment, inspired by ensemble learning (Random Forest). The framework is built on a new concept of bipartite information graphs that identifies high-quality, relevant neighboring entries at both feature and value granularity. Extensive experiments on 9 real-world datasets demonstrate the effectiveness and efficiency of LLM-Forest. The implementation is available at https://github.com/Xinrui17/LLM-Forest.
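A minimal sketch of the confidence-based weighted voting step, assuming each LLM “tree” returns a candidate value plus a self-assessed confidence; the values below are hard-coded placeholders rather than outputs of the released code:

```python
# Minimal sketch of confidence-weighted voting over candidate imputations from
# several LLM "trees". In practice the (value, confidence) pairs would come from
# separate few-shot LLM calls; here they are placeholders for illustration.
from collections import defaultdict

def weighted_vote(candidates):
    """candidates: list of (imputed_value, confidence) pairs, one per LLM 'tree'."""
    scores = defaultdict(float)
    for value, conf in candidates:
        scores[value] += conf
    return max(scores, key=scores.get)

# Example: three trees propose a value for a missing (hypothetical) 'blood_pressure' field.
candidates = [("120/80", 0.9), ("130/85", 0.6), ("120/80", 0.7)]
print(weighted_vote(candidates))  # -> "120/80" (total weight 1.6 vs. 0.6)
```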

Not All Voices Are Rewarded Equally: Probing and Repairing Reward Models across Human Diversity
Zihao Li | Feihao Fang | Xitong Zhang | Jiaru Zou | Zhining Liu | Wei Xiong | Ziwei Wu | Baoyu Jing | Jingrui He
Findings of the Association for Computational Linguistics: EMNLP 2025

The advancement of Large Language Models (LLMs) has made ensuring their trustworthiness increasingly critical, especially in terms of fairness across diverse human groups. While modern LLMs are aligned with user preferences through Reinforcement Learning from Human Feedback (RLHF), the reward models used for alignment are trained on preference data that may both reflect societal biases and suffer from demographic skewness, as labeler populations are often uneven due to systemic accessibility or participation gaps. In this work, we reveal that reward models can exhibit significant discrepancies across different demographic groups, posing a fundamental challenge to fair and robust alignment. Using real-world datasets, we conduct the most comprehensive study to date, auditing various state-of-the-art reward models across nine sensitive attributes, including age, gender, ethnicity, etc. Our evaluation spans both (1) the agreement level between reward models and specific user groups, and (2) the reward model’s preference toward responses associated with different groups. Based on these findings, we propose the first method to mitigate group disparities in reward modeling. Code is available at https://github.com/Violet24K/FaRM.
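To make the first evaluation axis concrete, the sketch below probes per-group agreement between a reward model and human preference labels; the field names and the toy reward function are assumptions for illustration, not the FaRM codebase:

```python
# Illustrative probe of per-group agreement between a reward model and labeled
# human preferences. Field names and the stand-in reward function are assumptions.
from collections import defaultdict

def group_agreement(examples, reward_fn):
    """examples: dicts with 'group', 'chosen', 'rejected' (human-preferred / dispreferred).
    reward_fn(text) -> scalar reward. Returns per-group agreement rates."""
    hits, totals = defaultdict(int), defaultdict(int)
    for ex in examples:
        agrees = reward_fn(ex["chosen"]) > reward_fn(ex["rejected"])
        hits[ex["group"]] += int(agrees)
        totals[ex["group"]] += 1
    return {g: hits[g] / totals[g] for g in totals}

# Toy usage with response length standing in for a real reward model.
data = [
    {"group": "age_18_29", "chosen": "a longer helpful reply", "rejected": "ok"},
    {"group": "age_60_plus", "chosen": "short", "rejected": "a much longer reply"},
]
print(group_agreement(data, reward_fn=len))
```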

Learning to Instruct: Fine-Tuning a Task-Aware Instruction Optimizer for Black-Box LLMs
Yunzhe Qi | Jinjin Tian | Tianci Liu | Ruirui Li | Tianxin Wei | Hui Liu | Xianfeng Tang | Monica Xiao Cheng | Jingrui He
Findings of the Association for Computational Linguistics: EMNLP 2025

The performance of Large Language Models (LLMs) critically depends on designing effective instructions, which is particularly challenging for black-box LLMs with inaccessible internal states. To this end, we introduce Learning to Instruct, a novel paradigm that formulates instruction optimization as an LLM fine-tuning objective for a white-box “instruction engineer” LLM, leveraging its rich learning capacity and vast pre-trained knowledge to enable efficient and effective instruction optimization. Within this paradigm, we propose Automatic Instruction Optimizer (AIO), a novel framework that fine-tunes a white-box LLM into a capable instruction engineer. AIO learns to optimize task-aware, human-comprehensible instructions by incorporating task nuances and feedback from the task-solving black-box LLM. To overcome the challenges of inaccessible black-box gradients and high API costs, AIO introduces a novel zeroth-order (ZO) gradient approximation mechanism guided by Thompson Sampling (TS), which reuses informative black-box LLM feedback for improved query efficiency. Extensive experiments show that AIO generally outperforms strong baselines in both effectiveness and efficiency, establishing Learning to Instruct as a promising new direction for black-box LLM instruction optimization.
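For intuition about the zeroth-order component, here is a standard two-point ZO gradient estimate of a black-box objective; it illustrates only the generic idea and does not model AIO's Thompson-Sampling-guided feedback reuse:

```python
# Minimal sketch of a two-point zeroth-order gradient estimate of a black-box
# objective f (e.g., a task score returned by the black-box LLM as a function of
# the instruction engineer's parameters). Generic ZO only; not AIO's mechanism.
import numpy as np

def zo_gradient(f, theta, mu=1e-2, num_dirs=8, rng=np.random.default_rng(0)):
    """Estimate grad f(theta) from finite differences along random directions."""
    grad = np.zeros_like(theta)
    for _ in range(num_dirs):
        u = rng.standard_normal(theta.shape)
        grad += (f(theta + mu * u) - f(theta - mu * u)) / (2 * mu) * u
    return grad / num_dirs

# Toy usage: a "black-box" quadratic standing in for an expensive LLM evaluation.
f = lambda x: -np.sum((x - 1.0) ** 2)
theta = np.zeros(4)
for _ in range(100):
    theta += 0.05 * zo_gradient(f, theta)   # ascend the estimated gradient
print(theta.round(2))                        # approaches [1, 1, 1, 1]
```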