Hao Sun
2026
DUET: Joint Exploration of User–Item Profiles in Recommendation System
Yue Chen | Yifei Sun | Lu Wang | Fangkai Yang | Pu Zhao | Minjie Hong | Yifei Dong | Minghua He | Nan Hu | Jianjin Zhang | Zhiwei Dai | Yuefeng Zhan | Weihao Han | Hao Sun | Qingwei Lin | Weiwei Deng | Feng Sun | Qi Zhang | Saravan Rajmohan | Dongmei Zhang
Findings of the Association for Computational Linguistics: ACL 2026
Traditional recommendation systems represent users and items as dense vectors and learn to align them in a shared latent space for relevance estimation. Recent LLM-based recommenders instead leverage natural-language representations that are easier to interpret and integrate with downstream reasoning modules. This paper studies how to construct effective textual profiles for users and items, and how to align them for recommendation. A central difficulty is that the best profile format is not known a priori: manually designed templates can be brittle and misaligned with task objectives. Moreover, generating user and item profiles independently may produce descriptions that are individually plausible yet semantically inconsistent for a specific user–item pair. We propose Duet, an interaction-aware profile generator that jointly produces user and item profiles conditioned on both user history and item evidence. Duet follows a three-stage procedure: it first turns raw histories and metadata into compact cues, then expands these cues into paired profile prompts and generates profiles from them, and finally optimizes the generation policy with reinforcement learning using downstream recommendation performance as feedback. Experiments on three real-world datasets show that Duet consistently outperforms strong baselines, demonstrating the benefits of template-free profile exploration and joint user–item textual alignment. Project page: https://duet-rec.github.io/.
2025
MAIN: Mutual Alignment Is Necessary for instruction tuning
Fanyi Yang | Jianfeng Liu | Xin Zhang | Haoyu Liu | Xixin Cao | Yuefeng Zhan | Hao Sun | Weiwei Deng | Feng Sun | Qi Zhang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Instruction tuning has empowered large language models (LLMs) to achieve remarkable performance, yet its success heavily depends on the availability of large-scale, high-quality instruction-response pairs. To meet this demand, various methods have been developed to synthesize data at scale. However, current methods for scaling up data generation often overlook a crucial aspect: the alignment between instructions and responses. We hypothesize that the quality of instruction-response pairs is determined not by the individual quality of each component, but by the degree of mutual alignment. To address this, we propose a Mutual Alignment Framework (MAIN) which enforces coherence between instructions and responses through mutual constraints. We demonstrate that MAIN generalizes well across model architectures and sizes, achieving state-of-the-art performance on LLaMA, Mistral, and Qwen models across diverse benchmarks. This work underscores the critical role of instruction-response alignment in enabling generalizable and high-quality instruction tuning for LLMs. All code is available from our repository.
Alleviating Performance Degradation Caused by Out-of-Distribution Issues in Embedding-Based Retrieval
Haotong Bao | Jianjin Zhang | Qi Chen | Weihao Han | Zhengxin Zeng | Ruiheng Chang | Mingzheng Li | Hao Sun | Weiwei Deng | Feng Sun | Qi Zhang
Findings of the Association for Computational Linguistics: EMNLP 2025
In Embedding-Based Retrieval (EBR), Approximate Nearest Neighbor (ANN) algorithms are widely adopted for efficient large-scale search. However, recent studies reveal a query out-of-distribution (OOD) issue, where query and base embeddings follow mismatched distributions, significantly degrading ANN performance. In this work, we empirically verify the generality of this phenomenon and provide a quantitative analysis. To mitigate the distributional gap, we introduce a distribution regularizer into the encoder training objective, encouraging alignment between query and base embeddings. Extensive experiments across multiple datasets, encoders, and ANN indices show that our method consistently improves retrieval performance.
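The abstract does not say which form the distribution regularizer takes. As a rough illustration of the general idea only, the sketch below assumes a simple moment-matching penalty (squared gaps between the per-dimension means and variances of a query batch and a base batch) added to an existing retrieval loss. The names `moment_gap`, `regularized_loss`, and `weight` are hypothetical and are not taken from the paper.

```python
from statistics import fmean, pvariance


def column_stats(embeddings):
    """Return per-dimension means and population variances of a batch of vectors."""
    columns = list(zip(*embeddings))
    return [fmean(c) for c in columns], [pvariance(c) for c in columns]


def moment_gap(query_embs, base_embs):
    """Assumed regularizer: squared gap between first and second moments.

    This is a stand-in for "a distribution regularizer"; the paper may use a
    different divergence.
    """
    q_mean, q_var = column_stats(query_embs)
    b_mean, b_var = column_stats(base_embs)
    mean_gap = sum((q - b) ** 2 for q, b in zip(q_mean, b_mean))
    var_gap = sum((q - b) ** 2 for q, b in zip(q_var, b_var))
    return mean_gap + var_gap


def regularized_loss(retrieval_loss, query_embs, base_embs, weight=0.1):
    """Add the weighted penalty to whatever retrieval loss the encoder already uses."""
    return retrieval_loss + weight * moment_gap(query_embs, base_embs)
```

With this penalty, a query batch whose statistics match the base batch contributes nothing extra to the loss, while a shifted or rescaled query batch is penalized in proportion to the mismatch.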
Token-level Proximal Policy Optimization for Query Generation
Yichen Ouyang | Lu Wang | Fangkai Yang | Pu Zhao | Chenghua Huang | Jianfeng Liu | Bochen Pang | Yaming Yang | Yuefeng Zhan | Hao Sun | Qingwei Lin | Saravan Rajmohan | Weiwei Deng | Dongmei Zhang | Feng Sun
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Query generation is a critical task for web search engines (e.g. Google, Bing) and recommendation systems. Recently, state-of-the-art query generation methods leverage Large Language Models (LLMs) for their strong capabilities in context understanding and text generation. However, they still face challenges in generating high-quality queries in terms of inferring user intent based on their web search interaction history. In this paper, we propose Token-level Proximal Policy Optimization (TPPO), a novel approach designed to empower LLMs to perform better in query generation through fine-tuning. TPPO is based on the Reinforcement Learning from AI Feedback (RLAIF) paradigm, consisting of a token-level reward model and a token-level proximal policy optimization module to address the sparse reward challenge in traditional RLAIF frameworks. We conducted experiments on both an open-source dataset and an industrial dataset that was collected from a globally-used search engine, demonstrating that TPPO significantly improves the performance of query generation for LLMs and outperforms its existing competitors.
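To make the token-level idea concrete, the sketch below applies the standard PPO clipped surrogate with one probability ratio and one advantage per generated token, instead of a single sequence-level reward. It is a minimal illustration under that assumption; the abstract does not give TPPO's exact objective or reward model, and the function name and `clip_eps` default are hypothetical.

```python
import math


def clipped_token_objective(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate over the tokens of one generated query.

    Each token has its own log-probability under the new and old policies and
    its own advantage, so every token receives a learning signal.
    """
    terms = []
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(token) / pi_old(token)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic value so large policy jumps are not rewarded.
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)
```

When the new and old policies agree, every ratio is 1 and the objective reduces to the mean token advantage; when a ratio leaves the interval [1 - clip_eps, 1 + clip_eps] in the direction favored by the advantage, clipping caps that token's contribution.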
GeAR: Generation Augmented Retrieval
Haoyu Liu | Shaohan Huang | Jianfeng Liu | Yuefeng Zhan | Hao Sun | Weiwei Deng | Feng Sun | Furu Wei | Qi Zhang
Findings of the Association for Computational Linguistics: ACL 2025
Document retrieval techniques are essential for developing large-scale information systems. The common approach involves using a bi-encoder to compute the semantic similarity between a query and documents. However, the scalar similarity often fails to reflect enough information, hindering the interpretation of retrieval results. In addition, this process primarily focuses on global semantics, overlooking the finer-grained semantic relationships between the query and the document’s content. In this paper, we introduce a novel method, Generation Augmented Retrieval (GeAR), which not only improves the global document-query similarity through contrastive learning, but also integrates well-designed fusion and decoding modules. This enables GeAR to generate relevant context within the documents based on a given query, facilitating learning to retrieve local fine-grained information. Furthermore, when used as a retriever, GeAR does not incur any additional computational cost over bi-encoders. GeAR exhibits competitive retrieval performance across diverse scenarios and tasks. Moreover, qualitative analysis and the results generated by GeAR provide novel insights into the interpretation of retrieval results. The code, data, and models will be released at https://github.com/microsoft/LMOps.
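The bi-encoder baseline that the abstract contrasts against can be sketched as follows, assuming the query and documents have already been encoded into vectors and that cosine similarity is the scoring function (a common choice, not confirmed by the abstract). The hand-written vectors and the helper names are hypothetical; this illustrates the baseline, not GeAR itself.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_documents(query_vec, doc_vecs):
    """Score each document with a single scalar and sort from best to worst."""
    scores = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy embeddings standing in for encoder outputs.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_documents([1.0, 0.0], docs)
```

Each document ends up with one number, which shows the limitation the abstract describes: the score says how similar a document is overall but not which part of the document matches the query.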
Co-authors
- Weiwei Deng 5
- Feng Sun 5
- Yuefeng Zhan 4
- Jianfeng Liu 3
- Qi Zhang 3
- Weihao Han 2
- Qingwei Lin 2
- Haoyu Liu 2
- Saravan Rajmohan 2
- Fangkai Yang 2
- Jianjin Zhang 2
- Dongmei Zhang 2
- Pu Zhao 2
- Haotong Bao 1
- Xixin Cao 1
- Ruiheng Chang 1
- Yue Chen 1
- Qi Chen 1
- Zhiwei Dai 1
- Yifei Dong 1
- Minghua He 1
- Minjie Hong 1
- Nan Hu 1
- Chenghua Huang 1
- Shaohan Huang 1
- Mingzheng Li 1
- Yichen Ouyang 1
- Bochen Pang 1
- Yifei Sun 1
- Lu Wang 1
- Lu Wang 1
- Furu Wei 1
- Fanyi Yang 1
- Yaming Yang 1
- Zhengxin Zeng 1
- Qi Zhang 1
- Xin Zhang 1