On the Impacts of Contexts on Repository-Level Code Generation

Nam Le Hai; Dung Manh Nguyen; Nghi D. Q. Bui

Abstract

CodeLLMs are widely used for code generation, yet their ability to handle repository-level dependencies remains underexplored. We introduce RepoExec, a benchmark for evaluating repository-level code generation, focusing on executability, functional correctness, and dependency utilization. Our study evaluates 18 models, revealing that retaining full dependency context yields the best performance, while smaller context sizes can be misleading. Pretrained LLMs excel in correctness but often reimplement dependencies, while instruction-tuned models better utilize dependencies but sometimes introduce unnecessary complexity. We propose an instruction-tuning dataset that improves dependency handling and introduce a new metric, Dependency Invocation Rate (DIR), to measure context utilization. Experiments show that instruction-tuned models improve DIR by over 10%, and multi-round debugging further enhances both correctness and dependency use. RepoExec provides a comprehensive framework to advance CodeLLMs for real-world applications. The dataset and source code are available at https://github.com/FSoft-AI4Code/RepoExec.
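
The abstract names the Dependency Invocation Rate (DIR) but does not give its formula. As a rough illustration only, the sketch below assumes DIR is the fraction of the dependency names supplied in the context that the generated code actually references. The function name `dependency_invocation_rate`, the name-matching heuristic, and the sample inputs are hypothetical and are not taken from the RepoExec code base.

```python
import ast


def dependency_invocation_rate(generated_code: str, dependencies: set[str]) -> float:
    """Assumed reading of DIR: share of provided dependency names that the
    generated code references. This is an illustrative guess, not the
    paper's official implementation."""
    if not dependencies:
        return 0.0
    tree = ast.parse(generated_code)
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            # Plain identifiers, e.g. normalize(x)
            used.add(node.id)
        elif isinstance(node, ast.Attribute):
            # Attribute access, e.g. utils.clip(x)
            used.add(node.attr)
    return len(used & dependencies) / len(dependencies)


# Hypothetical example: three dependencies are provided, two are used.
sample_code = "def f(x):\n    return normalize(x) + utils.clip(x)\n"
sample_deps = {"normalize", "clip", "pad"}
rate = dependency_invocation_rate(sample_code, sample_deps)
```

Matching on names alone is a simplification: it cannot distinguish a real call to a repository function from an unrelated identifier that happens to share its name, so a faithful implementation would likely need to resolve imports as well.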

Anthology ID: 2025.findings-naacl.82
Volume: Findings of the Association for Computational Linguistics: NAACL 2025
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1496–1524
URL: https://preview.aclanthology.org/fix-sig-urls/2025.findings-naacl.82/
Cite (ACL): Nam Le Hai, Dung Manh Nguyen, and Nghi D. Q. Bui. 2025. On the Impacts of Contexts on Repository-Level Code Generation. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 1496–1524, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): On the Impacts of Contexts on Repository-Level Code Generation (Hai et al., Findings 2025)
PDF: https://preview.aclanthology.org/fix-sig-urls/2025.findings-naacl.82.pdf
