Zeyuan Chen


2025

Towards Boosting LLMs-driven Relevance Modeling with Progressive Retrieved Behavior-augmented Prompting
Zeyuan Chen | Haiyan Wu | Kaixin Wu | Wei Chen | Mingjie Zhong | Jia Xu | Zhongyi Liu | Wei Zhang
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track

This paper studies the relevance modeling problem by integrating world knowledge stored in the parameters of LLMs with specialized domain knowledge represented by user behavior data. We propose ProRBP, a framework that develops a user-driven behavior neighbor retrieval module to learn domain-specific knowledge in a timely manner and introduces a progressive prompting and aggregation module to account for diverse aspects of relevance and for prediction stability. We explore an industrial implementation that deploys LLMs to handle the full-scale search traffic of Alipay with acceptable cost and latency. Comprehensive experiments on real-world industry data and online A/B testing validate the superiority of our proposal and the effectiveness of its main modules.

xLAM: A Family of Large Action Models to Empower AI Agent Systems
Jianguo Zhang | Tian Lan | Ming Zhu | Zuxin Liu | Thai Quoc Hoang | Shirley Kokane | Weiran Yao | Juntao Tan | Akshara Prabhakar | Haolin Chen | Zhiwei Liu | Yihao Feng | Tulika Manoj Awalgaonkar | Rithesh R N | Zeyuan Chen | Ran Xu | Juan Carlos Niebles | Shelby Heinecke | Huan Wang | Silvio Savarese | Caiming Xiong
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Autonomous agents powered by large language models (LLMs) have attracted significant research interest. However, the open-source community faces many challenges in developing specialized models for agent tasks, driven by the scarcity of high-quality agent datasets and the absence of standard protocols in this area. We introduce xLAM, a series of large action models designed for AI agent tasks. The xLAM series includes five models with both dense and mixture-of-experts architectures, ranging from 1B to 8x22B parameters, trained using a scalable, flexible pipeline that unifies, augments, and synthesizes diverse datasets to enhance AI agents’ generalizability and performance across varied environments. Our experimental results demonstrate that xLAM consistently delivers exceptional performance across multiple agent ability benchmarks, notably securing the 1st position on the Berkeley Function-Calling Leaderboard and outperforming GPT-4, Claude-3, and many other models in terms of tool use. By releasing the xLAM series, we aim to advance the performance of open-source LLMs for autonomous AI agents, potentially accelerating progress and democratizing access to high-performance models for agent tasks.

CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search
Kaixin Wu | Yixin Ji | Zeyuan Chen | Qiang Wang | Cunxiang Wang | Hong Liu | Baijun Ji | Xu Jia | Zhongyi Liu | Jinjie Gu | Yuan Zhou | Linjian Mo
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)

Relevance modeling between queries and items is a pivotal component of commercial search engines, directly affecting the user experience. Given the remarkable achievements of large language models (LLMs) in various natural language processing (NLP) tasks, LLM-based relevance modeling is gradually being adopted within industrial search systems. Nevertheless, foundational LLMs lack domain-specific knowledge and do not fully exploit the potential of in-context learning. Furthermore, structured item text remains underutilized, and corresponding queries and background knowledge are in short supply. We therefore propose CPRM (Continual Pre-training for Relevance Modeling), a framework designed for the continual pre-training of LLMs to address these issues. Our CPRM framework includes three modules: 1) jointly pre-training on queries and multi-field items to enhance domain knowledge, 2) applying in-context pre-training, a novel approach where LLMs are pre-trained on a sequence of related queries or items, and 3) conducting reading comprehension on items to produce associated domain knowledge and background information (e.g., generating summaries and corresponding queries) to further strengthen LLMs. Results on offline experiments and online A/B testing demonstrate that our model achieves convincing performance compared to strong baselines.

2022

Field Extraction from Forms with Unlabeled Data
Mingfei Gao | Zeyuan Chen | Nikhil Naik | Kazuma Hashimoto | Caiming Xiong | Ran Xu
Proceedings of the 1st Workshop on Semiparametric Methods in NLP: Decoupling Logic from Knowledge

We propose a novel framework to conduct field extraction from forms with unlabeled data. To bootstrap the training process, we develop a rule-based method for mining noisy pseudo-labels from unlabeled forms. Using the supervisory signal from the pseudo-labels, we extract a discriminative token representation from a transformer-based model by modeling the interaction between text in the form. To prevent the model from overfitting to label noise, we introduce a refinement module based on a progressive pseudo-label ensemble. Experimental results demonstrate the effectiveness of our framework.