Feng Liu
2026
Towards Efficient Large Language Model Serving: A Survey on System-Aware KV Cache Optimization
Jiantong Jiang | Peiyu Yang | Rui Zhang | Feng Liu
Findings of the Association for Computational Linguistics: ACL 2026
Despite the rapid advancements of large language models (LLMs), LLM serving systems remain memory-intensive and costly. The key-value (KV) cache, which stores KV tensors during autoregressive decoding, is crucial for enabling low-latency, high-throughput LLM inference serving. In this survey, we focus on system-aware KV infrastructure for serving LLMs (abbreviated as sKis). We revisit recent work from a system behavior perspective, organizing existing efforts into three dimensions: execution and scheduling (temporal), placement and migration (spatial), and representation and retention (structural). Furthermore, we analyze cross-behavior co-design affinity and behavior-objective links, highlighting future opportunities. Our work systematizes a rapidly evolving area, providing a foundation for understanding and innovating KV cache designs in modern LLM serving infrastructure.
2025
‘No’ Matters: Out-of-Distribution Detection in Multimodality Multi-Turn Interactive Dialogue
Rena Wei Gao | Xuetong Wu | Siwen Luo | Caren Han | Feng Liu
Findings of the Association for Computational Linguistics: ACL 2025
Out-of-distribution (OOD) detection in multimodal contexts is essential for identifying deviations in different modalities, particularly for interactive dialogue systems in real-life interactions, where deploying large language models (LLMs) to generate dialogue responses is usually infeasible due to data privacy and ethical issues. This paper aims to improve label detection that involves multi-round long dialogues by efficiently detecting OOD dialogues and images. We introduce a novel scoring framework named Dialogue Image Aligning and Enhancing Framework (DIAEF) that integrates visual language models with newly proposed scores that detect OOD inputs in two key scenarios: (1) mismatches between the dialogue and image input pair and (2) input pairs with previously unseen labels. Our experimental results, derived from various benchmarks, demonstrate that integrating image and multi-round dialogue OOD detection is more effective with previously unseen labels than using either modality independently. In the presence of mismatched pairs, our proposed score effectively identifies these mismatches and demonstrates strong robustness in long dialogues. This approach enhances domain-aware, adaptive conversational agents and establishes baselines for future studies.