Zilei Wang
2026
From Completion to Editing: Unlocking Context-Aware Code Infilling via Search-and-Replace Instruction Tuning
Jiajun Zhang | Zeyu Cui | Jiaxi Yang | Lei Zhang | Yuheng Jing | Zeyao Ma | Tianyi Bai | Zilei Wang | Qiang Liu | Liang Wang | Binyuan Hui | Junyang Lin
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The dominant Fill-in-the-Middle (FIM) paradigm for code completion is constrained by its rigid inability to correct contextual errors and reliance on unaligned, insecure Base models. While Chat LLMs offer safety and Agentic workflows provide flexibility, they suffer from performance degradation and prohibitive latency, respectively. To resolve this dilemma, we propose Search-and-Replace Infilling (SRI), a framework that internalizes the agentic verification-and-editing mechanism into a unified, single-pass inference process. By structurally grounding edits via an explicit search phase, SRI harmonizes completion tasks with the instruction-following priors of Chat LLMs, extending the paradigm from static infilling to dynamic context-aware editing. We synthesize a high-quality dataset, SRI-200K, and fine-tune the SRI-Coder series. Extensive evaluations demonstrate that with minimal data (20k samples), SRI-Coder enables Chat models to surpass the completion performance of their Base counterparts. Crucially, unlike FIM-style tuning, SRI preserves general coding competencies and maintains inference latency comparable to standard FIM. We release our dataset and models, establishing SRI as a robust, secure, and efficient alignment recipe for next-generation interactive development.
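As a rough illustration of the general search-and-replace editing idea named in this abstract, the sketch below applies a single edit only when its search text can be located unambiguously in the source. The function name `apply_search_replace` and the exactly-one-match rule are assumptions made for this example; the abstract does not specify the SRI edit format or its grounding rules.

```python
def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Apply one search-and-replace edit to `source`.

    Hypothetical helper: the edit is accepted only if `search` occurs
    exactly once, so the replacement is grounded in existing text.
    """
    if not search:
        raise ValueError("search text must be non-empty")
    occurrences = source.count(search)
    if occurrences != 1:
        raise ValueError(f"expected exactly one match, found {occurrences}")
    return source.replace(search, replace, 1)


# Example: the surrounding context contains an error that a pure
# fill-in-the-middle completion could not touch.
buggy = "def add(a, b):\n    return a - b\n"
fixed = apply_search_replace(buggy, "return a - b", "return a + b")
print(fixed)
```

Rejecting ambiguous or missing matches is one simple way to keep an edit tied to text that actually exists in the file; other policies, such as picking the first match, are equally possible.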
RealChart2Code: Bridging the Gap in Real-World Chart-to-Code Generation via Multi-Task Evaluation
Jiajun Zhang | Yuying Li | Zhixun Li | Xingyu Guo | Jingzhuo Wu | Leqi Zheng | Yiran Yang | Jianke Zhang | Qingbin Li | Shannan Yan | Changguo Jia | Junfei Wu | Zilei Wang | Qiang Liu | Liang Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Vision-Language Models (VLMs) have demonstrated impressive capabilities in code generation across various domains. However, their ability to replicate complex, multi-panel visualizations from real-world data remains largely unassessed. To address this gap, we introduce RealChart2Code, a new large-scale benchmark with over 2,800 instances grounded in authentic datasets and featuring tasks with clear analytical intent. Crucially, it is the first benchmark to systematically evaluate chart generation from large-scale raw data and assess iterative code refinement in a multi-turn conversational setting. Our comprehensive evaluation of 14 leading VLMs on RealChart2Code reveals significant performance degradation compared to simpler benchmarks, highlighting their struggles with complex plot structures and authentic data. Our analysis uncovers a substantial performance gap between proprietary and open-weight models and confirms that even state-of-the-art VLMs often fail to accurately replicate intricate, multi-panel charts. These findings provide valuable insights into the current limitations of VLMs and guide future research directions.
2025
Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG
Xin Sun | Jianan Xie | Zhongqi Chen | Qiang Liu | Shu Wu | Yuehe Chen | Bowen Song | Zilei Wang | Weiqiang Wang | Liang Wang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) augmented with retrieval systems have significantly advanced natural language processing tasks by integrating external knowledge sources, enabling more accurate and contextually rich responses. To improve the robustness of such systems against noisy retrievals, Retrieval-Augmented Fine-Tuning (RAFT) has emerged as a widely adopted method. However, RAFT conditions models to generate answers even in the absence of reliable knowledge. This behavior undermines their reliability in high-stakes domains, where acknowledging uncertainty is critical. To address this issue, we propose Divide-Then-Align (DTA), a post-training approach designed to endow RAG systems with the ability to respond with “I don’t know” when the query is out of the knowledge boundary of both the retrieved passages and the model’s internal knowledge. DTA divides data samples into four knowledge quadrants and constructs tailored preference data for each quadrant, resulting in a curated dataset for Direct Preference Optimization (DPO). Experimental results on three benchmark datasets demonstrate that DTA effectively balances accuracy with appropriate abstention, enhancing the reliability and trustworthiness of retrieval-augmented systems.
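The four-quadrant division described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Quadrant`, `assign_quadrant`, and `preference_pair`, the two boolean inputs, and the specific chosen/rejected rule are all hypothetical. The abstract only states that samples are split by whether the query falls inside the knowledge boundary of the retrieved passages and of the model, and that abstention is the desired response when it falls outside both.

```python
from enum import Enum

IDK = "I don't know"


class Quadrant(Enum):
    BOTH = "model knows, retrieval knows"
    MODEL_ONLY = "model knows, retrieval does not"
    RETRIEVAL_ONLY = "retrieval knows, model does not"
    NEITHER = "neither knows"


def assign_quadrant(model_knows: bool, retrieval_knows: bool) -> Quadrant:
    """Place a sample in one of four knowledge quadrants."""
    if model_knows and retrieval_knows:
        return Quadrant.BOTH
    if model_knows:
        return Quadrant.MODEL_ONLY
    if retrieval_knows:
        return Quadrant.RETRIEVAL_ONLY
    return Quadrant.NEITHER


def preference_pair(quadrant: Quadrant, gold: str, wrong: str) -> dict:
    """Build a DPO-style (chosen, rejected) pair for a sample.

    Assumed rule for illustration: abstain only when neither knowledge
    source covers the query; otherwise prefer the gold answer over
    abstaining.
    """
    if quadrant is Quadrant.NEITHER:
        return {"chosen": IDK, "rejected": wrong}
    return {"chosen": gold, "rejected": IDK}


pair = preference_pair(assign_quadrant(False, False), "Paris", "Lyon")
print(pair)
```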
2023
Noise-Robust Semi-Supervised Learning for Distantly Supervised Relation Extraction
Xin Sun | Qiang Liu | Shu Wu | Zilei Wang | Liang Wang
Findings of the Association for Computational Linguistics: EMNLP 2023
Distantly supervised relation extraction (DSRE) aims to extract relational facts from texts but suffers from noisy instances. To mitigate the influence of noisy labels, current methods typically use the Multi-Instance-Learning framework to extract relations for each bag. However, these approaches are not capable of extracting relation labels for individual sentences. Several studies have focused on sentence-level DSRE to solve the above problem. These studies primarily aim to develop methods for identifying noisy samples and filtering them out to mitigate the impact of noise. However, discarding noisy samples directly leads to the loss of useful information. To this end, we propose SSLRE, a novel Semi-Supervised-Learning Relation Extraction framework for sentence-level DSRE. We discard only the labels of the noisy samples and utilize these instances without labels as unlabeled samples. Our SSLRE framework utilizes a weighted K-NN graph to select confident samples as labeled data and the rest as unlabeled. We then design a robust semi-supervised learning framework that can efficiently handle remaining label noise present in the labeled dataset, while also making effective use of unlabeled samples. Based on our experiments on two real-world datasets, the SSLRE framework we proposed has achieved significant enhancements in sentence-level relation extraction performance compared to the existing state-of-the-art methods. Moreover, it has also attained a state-of-the-art level of performance in bag-level relation extraction with ONE aggregation strategy.
Co-authors
- Qiang Liu 4
- Xin Sun 2
- Liang Wang 2
- Liang Wang 2
- Shu Wu 2
- Jiajun Zhang 2
- Tianyi Bai 1
- Yuehe Chen 1
- Zhongqi Chen 1
- Zeyu Cui 1
- Xingyu Guo 1
- Binyuan Hui 1
- Changguo Jia 1
- Yuheng Jing 1
- Qingbin Li 1
- Yuying Li 1
- Zhixun Li 1
- Junyang Lin 1
- Zeyao Ma 1
- Bowen Song 1
- Weiqiang Wang (王维强) 1
- Jingzhuo Wu 1
- Junfei Wu 1
- Jianan Xie 1
- Shannan Yan 1
- Jiaxi Yang 1
- Yiran Yang 1
- Jianke Zhang 1
- Lei Zhang 1
- Leqi Zheng 1