Sachit Kuhar

2026

Large Language Models (LLMs) excel at code-related tasks but often struggle in realistic software repositories, where project-specific APIs and cross-file dependencies are crucial. Retrieval-augmented methods mitigate this by injecting repository context at inference time. However, a tight inference-time latency budget either limits retrieval quality, or the added retrieval latency degrades the user experience. We address this limitation with SpecAgent, an agent that improves both latency and code-generation quality by proactively exploring repository files during indexing and constructing speculative context that anticipates future edits in each file. This indexing-time asynchrony allows thorough context computation while masking its latency, and the speculative nature of the context improves code-generation quality. Additionally, we identify the problem of future context leakage in existing benchmarks, which can inflate reported performance. To address this, we construct a synthetic, leakage-free benchmark that enables a more realistic evaluation of our agent against baselines. Experiments show that SpecAgent consistently achieves absolute gains of 9–11% (48–58% relative) compared to the best-performing baselines, while significantly reducing inference latency.

2025

LibEvolutionEval: A Benchmark and Study for Version-Specific Code Generation
Sachit Kuhar | Wasi Uddin Ahmad | Zijian Wang | Nihal Jain | Haifeng Qian | Baishakhi Ray | Murali Krishna Ramanathan | Xiaofei Ma | Anoop Deoras
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Recent advancements in code completion models have primarily focused on local file contexts. However, these studies do not fully capture the complexity of real-world software development, which often requires the use of rapidly evolving public libraries. To address this gap, we introduce LibEvolutionEval, a comprehensive study that emphasizes the need to understand library evolution to perform accurate in-line code completions. LibEvolutionEval offers a version-specific code-completion task across eight libraries as they evolve over the years, along with an in-depth analysis of the evolution of two widely used and well-maintained public libraries: PyTorch and Matplotlib. We evaluate several popular models and find that public library evolution significantly affects their performance. To mitigate this, we explore how retrieving version-specific library documentation and prompt-based techniques can enhance model capability in dealing with these fast-evolving packages. This suggests a promising path forward for better handling fast-evolving libraries. Our tasks will be made publicly available upon acceptance.

Co-authors

Yu Yu 1

Venues

ACL 1
NAACL 1
