2025
pdf
bib
abs
Data Interpreter: An LLM Agent for Data Science
Sirui Hong
|
Yizhang Lin
|
Bang Liu
|
Bangbang Liu
|
Binhao Wu
|
Ceyao Zhang
|
Danyang Li
|
Jiaqi Chen
|
Jiayi Zhang
|
Jinlin Wang
|
Li Zhang
|
Lingyao Zhang
|
Min Yang
|
Mingchen Zhuge
|
Taicheng Guo
|
Tuo Zhou
|
Wei Tao
|
Robert Tang
|
Xiangtao Lu
|
Xiawu Zheng
|
Xinbing Liang
|
Yaying Fei
|
Yuheng Cheng
|
Yongxin Ni
|
Zhibin Gou
|
Zongze Xu
|
Yuyu Luo
|
Chenglin Wu
Findings of the Association for Computational Linguistics: ACL 2025
Large Language Model (LLM)-based agents have excelled in various domains but face significant challenges when applied to data science workflows due to their complex, multi-stage nature. Current LLM-based agents struggle with non-linear relationships, recursive dependencies, implicit data- and logic-dependent reasoning, and managing extensive context. In this paper, we introduce Data Interpreter, an LLM-based agent that addresses these challenges through hierarchical graph-based modeling to represent the complexity and a progressive strategy for step-by-step verification, refinement, and consistent context management. Extensive experiments confirm the effectiveness of Data Interpreter. On InfiAgent-DABench, it boosts performance by 25% (from 75.9% to 94.9%), and on machine learning and open-ended tasks, it lifts accuracy from 88% to 95% and from 60% to 97%, respectively. Moreover, our method surpasses state-of-the-art baselines by 26% on the MATH dataset. We will release the code upon publication.
pdf
bib
abs
SOLAR: Serendipity Optimized Language Model Aligned for Recommendation
Zichen Yuan
|
Lifan Sun
|
Yucen Zhuang
|
Yue Wang
|
Xinyuan Song
|
Tianqi Xu
|
Siyuan Li
|
Junchen Fu
|
Youhua Li
|
Sirui Hong
|
Jiaqi Chen
|
Joemon M. Jose
|
Yongxin Ni
Findings of the Association for Computational Linguistics: EMNLP 2025
Recently, Large Language Models (LLMs) have shown strong potential in recommendation tasks due to their broad world knowledge and reasoning capabilities. However, applying them to serendipity-oriented recommendation remains challenging, mainly due to a domain gap of LLMs in modeling personalized user behavior and the scarcity of labeled serendipitous interactions. In this paper, we introduce **SOLAR** (**S**erendipity-**O**ptimized **L**anguage model **A**ligned for **R**ecommendation), a two-stage framework that addresses these challenges. To alleviate label scarcity, we adopt a weak supervision strategy: a sequential ID-based recommender generates candidate items, which are then reranked by an LLM acting as a preference judge to produce serendipity-aware pseudo-labels. To bridge the domain gap, we propose a domain-adaptive instruction tuning method (SUN) that aligns LLMs with recommendation tasks. Experiments on three real-world datasets show that **SOLAR** consistently improves both accuracy and serendipity over strong baselines, showing its effectiveness in enabling more diverse, user-centric recommendations. Code and dataset are released at [https://github.com/SOLAR2025ARR/SOLAR](https://github.com/SOLAR2025ARR/SOLAR).