Xiaojiang Huang
2026
Analyzing and Internalizing Complex Policy Documents for LLM Agents
Jiateng Liu | Zhenhailong Wang | Xiaojiang Huang | Yingjie Li | Xiang Li | Chenlei Guo | Xing Fan | Ruhi Sarikaya | Heng Ji
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language model agents rely on in-context policy documents encoding diverse business rules. As businesses scale, these documents grow, creating substantial computational overhead and motivating internalization methods that embed policy into model priors. Prior work focuses on generic prompts, but we find agentic policies span multiple complexity levels and demand heavier reasoning, posing greater challenges. We introduce an agentic benchmark generator with Controllable Complexity in agent policy across four levels, enabling systematic evaluation of agents under increasing complexity and providing a testbed for policy internalization. Our analysis shows that workflow-governing policy specifications are the hardest to reason over, and that SFT on gold trajectories with chain-of-thought is data-hungry and struggles at high complexity. We propose Category-Aware Policy Continued Pretraining, an automated pipeline that analyzes policies, extracts key specifications, categorizes them into factual, behavioral, and conditional types, and isolates those driving workflow complexity. This enables targeted “therapy” by synthesizing specialized training data for each type and improving internalization via an autoregressive pretraining loss. Extensive experiments show our synthetic data and objective consistently improve performance. Combined with SFT, our method outperforms the baseline across different settings, especially in data-sparse and high-complexity regimes, with gains up to 41% and 22% on Qwen-3-32B. Overall, we achieve 97.3% prompt reduction on our benchmark, and on 𝜏-Bench we further improve performance while reducing prompt requirements with very limited SFT data.
2024
RecMind: Large Language Model Powered Agent For Recommendation
Yancheng Wang | Ziyan Jiang | Zheng Chen | Fan Yang | Yingxue Zhou | Eunah Cho | Xing Fan | Yanbin Lu | Xiaojiang Huang | Yingzhen Yang
Findings of the Association for Computational Linguistics: NAACL 2024
While the recommendation system (RS) has advanced significantly through deep learning, current RS approaches usually train and fine-tune models on task-specific datasets, limiting their generalizability to new recommendation tasks and their ability to leverage external knowledge due to model scale and data size constraints. Thus, we designed an LLM-powered autonomous recommender agent, RecMind, which is capable of leveraging external knowledge, utilizing tools with careful planning to provide zero-shot personalized recommendations. We propose a Self-Inspiring algorithm to improve the planning ability. At each intermediate step, the LLM “self-inspires” to consider all previously explored states to plan for the next step. This mechanism greatly improves the model’s ability to comprehend and utilize historical information in planning for recommendation. We evaluate RecMind’s performance in various recommendation scenarios. Our experiment shows that RecMind outperforms existing zero/few-shot LLM-based recommendation baseline methods in various tasks and achieves comparable performance to a fully trained recommendation model P5.
2023
Graph Meets LLM: A Novel Approach to Collaborative Filtering for Robust Conversational Understanding
Zheng Chen | Ziyan Jiang | Fan Yang | Eunah Cho | Xing Fan | Xiaojiang Huang | Yanbin Lu | Aram Galstyan
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
A Personalized Query Rewriting system strives to minimize defective queries to ensure robust conversational functionality by considering individual user behavior and preferences. It is designed as a search-based system, maintaining a user index of past successful interactions with the conversational AI. However, this method faces challenges with unseen interactions, that is, novel user interactions not covered by the user’s historical index. This paper introduces our Collaborative Query Rewriting approach, which utilizes underlying topological information to assist in rewriting defective queries arising from unseen user interactions. This approach begins by constructing a “User Feedback Interaction Graph” (FIG) using historical user-entity interactions. Subsequently, we traverse the graph edges to establish an enhanced user index, referred to as the “collaborative user index”. This paper then further explores the use of Large Language Models (LLMs) in conjunction with graph traversal, leading to a significant increase in index coverage for unseen interactions. The effectiveness of our proposed approach is demonstrated through experiments on a large-scale real-world dataset and online A/B experiments.
2015
Joint Entity Recognition and Disambiguation
Gang Luo | Xiaojiang Huang | Chin-Yew Lin | Zaiqing Nie
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
2014
Collective Tweet Wikification based on Semi-supervised Graph Regularization
Hongzhao Huang | Yunbo Cao | Xiaojiang Huang | Heng Ji | Chin-Yew Lin
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2011
Co-authors
- Xing Fan 3
- Zheng Chen 2
- Eunah Cho 2
- Heng Ji 2
- Ziyan Jiang 2
- Chin-Yew Lin 2
- Yanbin Lu 2
- Xiaojun Wan 2
- Jianguo Xiao 2
- Fan Yang 2
- Yunbo Cao 1
- Aram Galstyan 1
- Chenlei Guo 1
- Hongzhao Huang 1
- Houping Jia 1
- Xiang Li 1
- Yingjie Li 1
- Jiateng Liu 1
- Gang Luo 1
- Tengfei Ma 1
- Zaiqing Nie 1
- Ruhi Sarikaya 1
- Yancheng Wang 1
- Zhenhailong Wang 1
- Yuqian Wu 1
- Yingzhen Yang 1
- Yingxue Zhou 1
- Liang Zong 1