Yijun Liu

Also published as: 议骏刘

2025

Large language models (LLMs) rely on key-value cache (KV cache) to accelerate decoding by reducing redundant computations. However, the KV cache memory usage grows substantially with longer text sequences, posing challenges for efficient deployment. Existing KV cache eviction methods prune tokens using prefilling-stage attention scores, causing inconsistency with actual inference queries, especially under tight memory budgets. In this paper, we propose Lookahead Q-Cache (LAQ), a novel eviction framework that generates low-cost pseudo lookahead queries to better approximate the true decoding-stage queries. By using these lookahead queries as the observation window for importance estimation, LAQ achieves more consistent and accurate KV cache eviction aligned with real inference scenarios. Experimental results on LongBench and Needle-in-a-Haystack benchmarks show that LAQ outperforms existing methods across various budget levels, achieving a 1 4 point improvement on LongBench under limited cache budget. Moreover, LAQ is complementary to existing approaches and can be flexibly combined to yield further improvements.

pdf bib abs
From Crafting Text to Crafting Thought: Grounding Intelligent Writing Support to Writing Center Pedagogy
Yijun Liu | Tal August
Proceedings of the Fourth Workshop on Intelligent and Interactive Writing Assistants (In2Writing 2025)

Intelligent writing support tools have evolved from solving surface-level issues to collaborating and creating language with writers. Along with these new capabilities come concerns that generated fluent text can impact writers’ processes in unintended ways, especially for students. In this workshop paper, we look to a similar transition that writing centers experienced over the last century, which shifted focus from fixing surface-level issues to maintaining student writer voices. We interviewed 10 current writing tutors and grounded their described practices with ideas proposed in writing center literature. We employed these strategies in developing an intelligent writing tool prototype. We describe the design of our tool and discuss potential evaluations along with how to foster deeper relationships between writers and writing centers using intelligent writing tools.

2024

pdf bib abs
中文图文多模态理解评测
Yuxuan Wang (王宇轩) | Yijun Liu (刘议骏) | Zhiguo Wan (万志国) | Wanxiang Che (车万翔)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

“中文图文多模态理解评测任务旨在从多角度评价中文图文多模态预训练模型的图文多模态建模和理解能力。本任务共包括五个子任务:图片检索、文本检索、视觉问答、视觉定位和视觉对话,最终成绩根据这五个任务的得分综合计算。本文首先介绍了任务的背景和动机,然后从任务介绍、评价指标、比赛结果、参赛方法等方面介绍并展示了本次评测任务的相关信息。本次任务共有11支队伍报名参赛,其中3支队伍提交了结果。”

Existing speculative decoding methods typically require additional model structure and training processes to assist the model for draft token generation. This makes the migration of acceleration methods to the new model more costly and more demanding on device memory. To address this problem, we propose the Make Some Noise (MSN) training framework as a replacement for the supervised fine-tuning stage of the large language model. The training method simply introduces some noise at the input for the model to learn the denoising task. It significantly enhances the parallel decoding capability of the model without affecting the original task capability. In addition, we propose a tree-based retrieval-augmented Jacobi (TR-Jacobi) decoding strategy to further improve the inference speed of MSN models. Experiments in both the general and code domains have shown that MSN can improve inference speed by 2.3-2.7x times without compromising model performance. The MSN model also achieves comparable acceleration ratios to the SOTA model with additional model structure on Spec-Bench.

Nowadays, data augmentation through synthetic data has been widely used in the field of Grammatical Error Correction (GEC) to alleviate the problem of data scarcity. However, these synthetic data are mainly used in the pre-training phase rather than the data-limited fine tuning phase due to inconsistent error distribution and noisy labels. In this paper, we propose a synthetic data construction method based on contextual augmentation, which can ensure an efficient augmentation of the original data with a more consistent error distribution. Specifically, we combine rule-based substitution with model-based generation, using the generation model to generate a richer context for the extracted error patterns. Besides, we also propose a relabeling-based data cleaning method to mitigate the effects of noisy labels in synthetic data. Experiments on CoNLL14 and BEA19-Test show that our proposed augmentation method consistently and substantially outperforms strong baselines and achieves the state-of-the-art level with only a few synthetic data.

pdf bib abs
Domain-aware and Co-adaptive Feature Transformation for Domain Adaption Few-shot Relation Extraction
Yijun Liu | Feifei Dai | Xiaoyan Gu | Minghui Zhai | Bo Li | Meiou Zhang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Few-shot relation extraction (FSRE) can alleviate the data scarcity problem in relation extraction. However, FSRE models often suffer a significant decline in performance when adapting to new domains. To overcome this issue, many researchers have focused on domain adaption FSRE (DAFSRE). Nevertheless, existing approaches primarily concentrate on the source domain, which makes it difficult to accurately transfer useful knowledge to the target domain. Additionally, the lack of distinction between relations further restricts the model performance. In this paper, we propose the domain-aware and co-adaptive feature transformation approach to address these issues. Specifically, we introduce a domain-aware transformation module that leverages the target domain distribution features to guide the domain-aware feature transformations. This can enhance the model’s adaptability across domains, leading to improved target domain performance. Furthermore, we design co-adaptive prototypical networks to perform co-adaptive feature transformation through a transformer mechanism. This results in more robust and distinguishable relation prototypes. Experiments on DAFSRE benchmark datasets demonstrate the effectiveness of our method, which outperforms existing models and achieves state-of-the-art performance.

pdf bib abs
LM-Combiner: A Contextual Rewriting Model for Chinese Grammatical Error Correction
Yixuan Wang | Baoxin Wang | Yijun Liu | Dayong Wu | Wanxiang Che
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Over-correction is a critical problem in Chinese grammatical error correction (CGEC) task. Recent work using model ensemble methods based on voting can effectively mitigate over-correction and improve the precision of the GEC system. However, these methods still require the output of several GEC systems and inevitably lead to reduced error recall. In this light, we propose the LM-Combiner, a rewriting model that can directly modify the over-correction of GEC system outputs without a model ensemble. Specifically, we train the model on an over-correction dataset constructed through the proposed K-fold cross inference method, which allows it to directly generate filtered sentences by combining the original and the over-corrected text. In the inference stage, we directly take the original sentences and the output results of other systems as input and then obtain the filtered sentences through LM-Combiner. Experiments on the FCGEC dataset show that our proposed method effectively alleviates the over-correction of the original system (+18.2 Precision) while ensuring the error recall remains unchanged. Besides, we find that LM-Combiner still has a good rewriting performance even with small parameters and few training data, and thus can cost-effectively mitigate the over-correction of black-box GEC systems (e.g., ChatGPT).

2023

pdf bib abs
System Report for CCL23-Eval Task 8: Chinese Grammar Error Detection and Correction Using Multi-Granularity Information
Yixuan Wang | Yijun Liu | Bo Sun | Wanxiang Che
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

“This paper introduces our system at CCL-2023 Task: Chinese Essay Fluency Evaluation (CEFE).The CEFE task aims to study the identification and correction of grammatical errors in primaryand middle school students’ test compositions. The evaluation has three tracks to examine therecognition of wrong sentence types, character-level error correction, and wrong sentence rewrit-ing. According to the task characteristics and data distribution of each track, we propose a token-level discriminative model based on sequence labeling for the multi-label classification task ofwrong sentences, an auto-encoder model based on edited labels for character-level error correc-tion and a seq2seq model obtained by pre-training on pseudo data and fine-tuning on labeleddata to solve the wrong sentence rewriting task. In the final evaluation results, the method weproposed won the first place in all three tracks according to the corresponding evaluation metrics.”