Jianzhao Huang
2026
RedOne 2.0: Rethinking Domain-specific LLM Post-Training in Social Networking Services
Fei Zhao | Chonggang Lu | Haofu Qian | Fangcheng Shi | Zijie Meng | Jianzhao Huang | Zheyong Xie | Shaosheng Cao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
As a primary medium for human interaction and information exchange, social networking services (SNS) present distinct challenges for large language models (LLMs): rapidly evolving norms and slang, and culturally diverse content that causes knowledge distribution shift. While supervised fine-tuning (SFT) can improve in-domain performance, it often induces a “seesaw” trade-off with out-of-domain robustness, especially for smaller models. To address these challenges, we present RedOne 2.0, an SNS-oriented LLM developed with a progressive, RL-prioritized post-training paradigm for fast and stable adaptation. Our pipeline has three stages: (1) Exploratory Learning on curated SNS corpora to establish initial alignment and surface systematic weaknesses; (2) Targeted Fine-Tuning that applies SFT only to diagnosed gaps while mixing a small amount of general data to reduce forgetting; and (3) Refinement Learning that re-applies RL with SNS-centric signals to consolidate gains and balance trade-offs across tasks. Across various tasks in three categories, our 4B model improves by 2.41 on average over the prior 7B RedOne baseline. It also yields an 8.74 average gain over its Qwen3-4B base while using less than half the data required by the SFT-centric method, demonstrating superior data efficiency and stability at compact scales. Overall, RedOne 2.0 provides a competitive, cost-effective baseline for SNS-specific LLMs, improving capability without sacrificing robustness.
2025
RedOne: Revealing Domain-specific LLM Post-Training in Social Networking Services
Fei Zhao | Chonggang Lu | Wangyue | Zheyong Xie | Ziyan Liu | Haofu Qian | Jianzhao Huang | Fangcheng Shi | Zijie Meng | Hongcheng Guo | Mingqian He | Xinze Lyu | Zheyu Ye | Weiting Liu | Boyang Wang | Shaosheng Cao
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
As a primary medium for modern information dissemination, social networking services (SNS) have experienced rapid growth, which has posed significant challenges for platform content management and interaction quality improvement. Recently, the development of large language models (LLMs) has offered potential solutions, but existing studies focus on isolated tasks, which not only encounter diminishing benefits from data scaling within individual scenarios but also fail to adapt flexibly to diverse real-world contexts. To address these challenges, we introduce RedOne, a domain-specific LLM designed to break the performance bottleneck of single-task baselines and establish a comprehensive foundation for SNS. RedOne was developed through a three-stage training strategy consisting of continued pretraining, supervised fine-tuning, and preference optimization, using a large-scale real-world dataset. In extensive experiments, RedOne maintains strong general capabilities and achieves an average improvement of up to 14.02% across 8 major SNS tasks and 7.56% on the SNS bilingual evaluation benchmark, compared with base models. Furthermore, in online testing, RedOne reduced the exposure rate in harmful content detection by 11.23% and improved the click page rate in post-view search by 14.95% compared with single-task baseline models. These results establish RedOne as a robust domain-specific LLM for SNS, demonstrating excellent generalization across various tasks and promising applicability in real-world scenarios.
PreAct: Prediction Enhances Agent’s Planning Ability
Dayuan Fu | Jianzhao Huang | Siyuan Lu | Guanting Dong | Yejie Wang | Keqing He | Weiran Xu
Proceedings of the 31st International Conference on Computational Linguistics
Addressing the disparity between predictions and actual results can enable individuals to expand their thought processes and stimulate self-reflection, thus promoting accurate planning. In this research, we present **PreAct**, an agent framework that integrates **pre**diction, **rea**soning, and **act**ion. By utilizing the information derived from predictions, the large language model (LLM) agent can provide a wider range and more strategically focused reasoning. This leads to more efficient actions that aid the agent in accomplishing intricate tasks. Our experimental results show that PreAct surpasses the ReAct method in completing complex tasks and that PreAct’s performance can be further improved when paired with other memory or selection strategy techniques. We presented the model with varying quantities of historical predictions and discovered that these predictions consistently enhance LLM planning. The variances in single-step reasoning between PreAct and ReAct indicate that PreAct indeed has benefits in terms of diversity and strategic orientation over ReAct.
2024
Towards Low-Resource Harmful Meme Detection with LMM Agents
Jianzhao Huang | Hongzhan Lin | Liu Ziyan | Ziyang Luo | Guang Chen | Jing Ma
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The proliferation of Internet memes in the age of social media necessitates effective identification of harmful ones. Due to the dynamic nature of memes, existing data-driven models may struggle in low-resource scenarios where only a few labeled examples are available. In this paper, we propose an agency-driven framework for low-resource harmful meme detection, employing both outward and inward analysis with few-shot annotated samples. Inspired by the powerful capacity of Large Multimodal Models (LMMs) for multimodal reasoning, we first retrieve relevant memes with annotations to leverage label information as auxiliary signals for the LMM agent. Then, we elicit knowledge-revising behavior within the LMM agent to derive well-generalized insights into meme harmfulness. By combining these strategies, our approach enables dialectical reasoning over intricate and implicit harm-indicative patterns. Extensive experiments conducted on three meme datasets demonstrate that our proposed approach outperforms state-of-the-art methods on the low-resource harmful meme detection task.