Yiming Zeng
2026
TreeDiff: AST-Guided Code Generation with Diffusion LLMs
Yiming Zeng | Jinghan Cao | Zexin Li | Yiming Chen | Tao Ren | Zhuochun Li | Dawei Xiang | Xidong Wu | Shangqian Gao | Tingting Yu
Proceedings of the First Workshop on Structured Understanding, Retrieval, and Generation in the LLM Era (SURGeLLM 2026)
Code generation is increasingly critical for real-world applications. Still, diffusion-based large language models continue to struggle with this demand. Unlike free-form text, code requires syntactic precision; even minor structural inconsistencies can render a program non-executable. Existing diffusion-based large language models rely on random token masking for corruption, leading to two key failures: they lack awareness of syntactic boundaries during the iterative denoising process, and they fail to capture the long-range hierarchical dependencies essential for program correctness. We propose TreeDiff, a syntax-aware diffusion framework that addresses both issues by incorporating structural priors from Abstract Syntax Trees (ASTs) into the corruption process. Instead of masking individual tokens at random, we selectively mask tokens belonging to key AST nodes. By aligning the corruption process with the underlying structure of code, our method encourages the model to internalize the compositional nature of programming languages, enabling it to reconstruct programs that respect grammatical boundaries and capture long-range dependencies. Our method achieves a 13.3% relative improvement over training with random masking, demonstrating the effectiveness of leveraging underlying structure in code generation tasks.
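As a rough illustration of masking by AST node rather than by individual token, the sketch below uses Python's standard ast and tokenize modules. It is not the paper's implementation: the choice of KEY_NODES, the MASK symbol, and the function names are all hypothetical, and the abstract does not say which node types TreeDiff treats as key or how spans are sampled.

```python
import ast
import io
import tokenize

MASK = "<mask>"  # hypothetical mask symbol, not taken from the paper
# Assumption: which node types count as "key" is chosen here for illustration only.
KEY_NODES = (ast.If, ast.For, ast.While, ast.Return, ast.Call)
# Layout-only tokens that are dropped from the output sequence.
SKIP = (tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
        tokenize.DEDENT, tokenize.ENDMARKER)


def key_spans(source):
    """Return sorted ((line, col), (line, col)) spans of key AST nodes."""
    spans = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, KEY_NODES):
            spans.append(((node.lineno, node.col_offset),
                          (node.end_lineno, node.end_col_offset)))
    return sorted(spans)


def ast_guided_mask(source, span_index=0):
    """Mask every token that lies inside one selected key-node span.

    Note: ast reports column offsets in UTF-8 bytes while tokenize uses
    characters, so this sketch assumes ASCII source code.
    """
    start, end = key_spans(source)[span_index]
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIP:
            continue
        inside = start <= tok.start and tok.end <= end
        out.append(MASK if inside else tok.string)
    return out


if __name__ == "__main__":
    code = "def f(x):\n    return g(x) + 1\n"
    # Span 0 is the whole return statement; span 1 is the nested call g(x).
    print(ast_guided_mask(code, 0))
    print(ast_guided_mask(code, 1))
```

Here the masked region always coincides with a complete syntactic unit (a whole return statement or a whole call), in contrast to random masking, which can cut across such boundaries. A training setup would presumably sample spans rather than take a fixed index.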
HyperEdit: Unlocking Instruction-based Text Editing in LLMs via Hypernetworks
Yiming Zeng | Jinghan Cao | Zexin Li | Wanhao Yu | Zhankai Ye | Dawei Xiang | Ting Hua | Xin Liu | Shangqian Gao | Tingting Yu
Findings of the Association for Computational Linguistics: ACL 2026
Large Language Models (LLMs) have fundamentally transformed natural language processing (NLP), demonstrating remarkable capabilities across a wide spectrum of tasks. However, when applied to instruction-based text editing, LLMs still exhibit limitations. Unlike free-form generation, instruction-based editing requires precise, targeted modifications that respect two essential properties: faithful implementation of the specific instruction and local fidelity. Existing approaches often overlook these properties, treating editing as a generic text generation problem. As a result, they either over-edit or fail to apply modifications consistently. To address this gap, we propose HyperEdit, a framework that adapts its processing to each editing request. To achieve this, HyperEdit generates request-specific dynamic weights that guide the editing process. The computational overhead of producing these weights is minimized through a carefully designed hypernetwork. With this design, HyperEdit achieves a 9% relative improvement over the state-of-the-art editing model.
2025
Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications
Yiming Zeng | Wanhao Yu | Zexin Li | Tao Ren | Yu Ma | Jinghan Cao | Xiyan Chen | Tingting Yu
Findings of the Association for Computational Linguistics: EMNLP 2025
Large Language Models (LLMs) have significantly advanced natural language processing, demonstrating strong capabilities in tasks such as text generation, summarization, and reasoning. Recently, their potential for automating precise text editing tasks across specialized domains, such as programming code, LaTeX, and structured database languages, has gained attention. However, current state-of-the-art LLMs still struggle with executing precise, instruction-driven edits, particularly when structural accuracy and strict adherence to domain conventions are required. To address these challenges, we introduce InstrEditBench, an automated benchmark dataset comprising over 30,000 structured editing tasks spanning diverse domains, including Wikipedia articles, LaTeX documents, source code, and database languages. Using this benchmark, we develop FineEdit, a specialized editing model explicitly trained for accurate, context-aware text modifications. Experimental evaluations demonstrate that FineEdit outperforms state-of-the-art models, achieving improvements of approximately 10% over Gemini models on single-turn edits, up to 30% over Llama-3.2-3B, and exceeding Mistral-7B-OpenOrca performance by over 40% on direct editing tasks. FineEdit also generalizes effectively to realistic multi-turn editing scenarios, highlighting its practical applicability. To facilitate further research and reproducibility, we release FineEdit at https://github.com/StuRinDQB/FineEdit and https://huggingface.co/datasets/YimingZeng/FineEdit_bench.
PromptSculptor: Multi-Agent Based Text-to-Image Prompt Optimization
Dawei Xiang | Wenyan Xu | Kexin Chu | Tianqi Ding | Zixu Shen | Yiming Zeng | Jianchang Su | Wei Zhang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
The rapid advancement of generative AI has democratized access to powerful tools such as Text-to-Image (T2I) models. However, to generate high-quality images, users must still craft detailed prompts specifying scene, style, and context—often through multiple rounds of refinement. We propose PromptSculptor, a novel multi-agent framework that automates this iterative prompt optimization process. Our system decomposes the task into four specialized agents that work collaboratively to transform a short, vague user prompt into a comprehensive, refined prompt. By leveraging Chain-of-Thought (CoT) reasoning, our framework effectively infers hidden context and enriches scene and background details. To iteratively refine the prompt, a self-evaluation agent aligns the modified prompt with the original input, while a feedback-tuning agent incorporates user feedback for further refinement. Experimental results demonstrate that PromptSculptor significantly enhances output quality and reduces the number of iterations needed for user satisfaction. Moreover, its model-agnostic design allows seamless integration with various T2I models, paving the way for industrial applications.