Jinghan Cao

2026

Large Language Models (LLMs) have fundamentally transformed natural language processing (NLP), demonstrating remarkable capabilities across a wide spectrum of tasks. However, when applied to instruction-based text editing, LLMs still exhibit limitations. Unlike free-form generation, instruction-based editing requires precise, targeted modifications that respect two essential properties: faithful implementation of the specific instruction and local fidelity. Existing approaches often overlook these properties, treating editing as a generic text generation problem. As a result, they either over-edit or fail to apply modifications consistently. To address this gap, we propose HyperEdit, a framework that adapts its processing to each editing request. To achieve this, HyperEdit generates request-specific dynamic weights that guide the editing process. The computational overhead of producing these weights is minimized through a carefully designed hypernetwork. With this design, HyperEdit achieves a 9% relative improvement over the state-of-the-art editing model.

2025

Large Language Models (LLMs) have significantly advanced natural language processing, demonstrating strong capabilities in tasks such as text generation, summarization, and reasoning. Recently, their potential for automating precise text editing tasks across specialized domains, such as programming code, LaTeX, and structured database languages, has gained attention. However, current state-of-the-art LLMs still struggle with executing precise, instruction-driven edits, particularly when structural accuracy and strict adherence to domain conventions are required. To address these challenges, we introduce InstrEditBench, an automated benchmark dataset comprising over 30,000 structured editing tasks spanning diverse domains, including Wikipedia articles, LaTeX documents, source code, and database languages. Using this benchmark, we develop FineEdit, a specialized editing model explicitly trained for accurate, context-aware text modifications. Experimental evaluations demonstrate that FineEdit outperforms state-of-the-art models, achieving improvements of approximately 10% over Gemini models on single-turn edits, up to 30% over Llama-3.2-3B, and over 40% over Mistral-7B-OpenOrca on direct editing tasks. FineEdit also generalizes effectively to realistic multi-turn editing scenarios, highlighting its practical applicability. To facilitate further research and reproducibility, we release FineEdit at https://github.com/StuRinDQB/FineEdit and https://huggingface.co/datasets/YimingZeng/FineEdit_bench.

Co-authors

Shangqian Gao 1

Ting Hua 1

Xin Liu 1

Yu Ma 1

Tao Ren 1

Dawei Xiang 1

Zhankai Ye 1

Venues

Findings 2
