Automated and Context-Aware Code Documentation Leveraging Advanced LLMs

Swapnil Sharma Sarker, Tanzina Taher Ifty


Abstract
Code documentation is essential for improving software maintainability and comprehension. The tedious nature of manual code documentation has led to much research on automated documentation generation. Existing automated approaches have primarily focused on code summarization, leaving a gap in template-based documentation generation (e.g., Javadoc), particularly with publicly available Large Language Models (LLMs). Furthermore, progress in this area has been hindered by the lack of a Javadoc-specific dataset that incorporates modern language features, provides broad framework/library coverage, and includes necessary contextual information. This study aims to address these gaps by developing a tailored dataset and assessing the capabilities of publicly available LLMs for context-aware, template-based Javadoc generation. In this work, we present a novel, context-aware dataset for Javadoc generation that includes critical structural and semantic information from modern Java codebases. We evaluate five open-source LLMs (LLaMA-3.1, Gemma-2, Phi-3, Mistral, and Qwen-2.5) using zero-shot, few-shot, and fine-tuned setups and provide a comparative analysis of their performance. Our results demonstrate that LLaMA-3.1 performs consistently well and is a reliable candidate for practical, automated Javadoc generation, offering a viable alternative to proprietary systems.
Anthology ID:
2025.inlg-main.29
Volume:
Proceedings of the 18th International Natural Language Generation Conference
Month:
October
Year:
2025
Address:
Hanoi, Vietnam
Editors:
Lucie Flek, Shashi Narayan, Lê Hồng Phương, Jiahuan Pei
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
486–498
URL:
https://preview.aclanthology.org/ingest-luhme/2025.inlg-main.29/
Cite (ACL):
Swapnil Sharma Sarker and Tanzina Taher Ifty. 2025. Automated and Context-Aware Code Documentation Leveraging Advanced LLMs. In Proceedings of the 18th International Natural Language Generation Conference, pages 486–498, Hanoi, Vietnam. Association for Computational Linguistics.
Cite (Informal):
Automated and Context-Aware Code Documentation Leveraging Advanced LLMs (Sarker & Ifty, INLG 2025)
PDF:
https://preview.aclanthology.org/ingest-luhme/2025.inlg-main.29.pdf