Kaiming Liu
2026
Writing-RL: Advancing Long-form Writing via Adaptive Curriculum Reinforcement Learning
Xuanyu Lei | Chenliang Li | Yuning Wu | Kaiming Liu | Weizhou Shen | Peng Li | Ming Yan | Fei Huang | Ya-Qin Zhang | Yang Liu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Xuanyu Lei | Chenliang Li | Yuning Wu | Kaiming Liu | Weizhou Shen | Peng Li | Ming Yan | Fei Huang | Ya-Qin Zhang | Yang Liu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent advances in Large Language Models (LLMs) have enabled strong performance in long-form writing, but current training paradigms remain limited: Supervised Fine-Tuning (SFT) remains constrained by data saturation and performance ceilings, while Reinforcement Learning with Verifiable Reward (RLVR), though successful in verifiable domains like math and code, cannot be directly migrated to open-ended long-form writing due to a lack of ground-truths. To further advance long-form writing, we present Writing-RL: an Adaptive Curriculum Reinforcement Learning framework to advance long-form writing capabilities beyond SFT. The framework consists of three key components: Margin-aware Data Selection strategy that prioritizes samples with high learning potential, Pairwise Comparison Reward mechanism that provides discriminative learning signals in the absence of verifiable rewards, and Dynamic Reference Scheduling approach, which plays a critical role by adaptively adjusting task difficulty based on evolving model performance. Experiments on 7B-scale writer models show that Writing-RL effectively improves long-form writing performance over strong SFT baselines. Furthermore, we observe that models trained with long-output RL generalize surprisingly well to long-input reasoning tasks, potentially offering a promising perspective for rethinking long-context training.
2025
Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation
Weitao Li | Xiangyu Zhang | Kaiming Liu | Xuanyu Lei | Weizhi Ma | Yang Liu
Findings of the Association for Computational Linguistics: EMNLP 2025
Weitao Li | Xiangyu Zhang | Kaiming Liu | Xuanyu Lei | Weizhi Ma | Yang Liu
Findings of the Association for Computational Linguistics: EMNLP 2025
Retrieval-Augmented Generation (RAG) has emerged as a widely adopted approach for knowledge injection during large language model (LLM) inference in recent years. However, due to their limited ability to exploit fine-grained inter-document relationships, current RAG implementations face challenges in effectively addressing the retrieved noise and redundancy content, which may cause error in the generation results. To address these limitations, we propose an **E**fficient **D**ynamic **C**lustering-based document **C**ompression framework (**EDC2-RAG**) that utilizes latent inter-document relationships while simultaneously removing irrelevant information and redundant content. We validate our approach, built upon GPT-3.5-Turbo and GPT-4o-mini, on widely used knowledge-QA and Hallucination-Detection datasets. Experimental results show that our method achieves consistent performance improvements across various scenarios and experimental settings, demonstrating strong robustness and applicability. Our code and datasets are available at https://github.com/Tsinghua-dhy/EDC-2-RAG.