Abstract
Recent research on code summarization relies on the structural information from the abstract syntax tree (AST) of source codes. It is, however, questionable whether it is the most effective to use AST for expressing the structural information. We find that a program dependency graph (PDG) can represent the structure of a code more effectively. We propose PDG Boosting Module (PBM) that encodes PDG into graph embedding and the framework to implement the proposed PBM with the existing models. PBM achieves improvements of 6.67% (BLEU) and 7.47% (ROUGE) on average. We then analyze the experimental results, and examine how PBM helps the training of baseline models and its performance robustness. For the validation of robustness, we measure the performance of an out-of-domain benchmark dataset, and confirm its robustness. In addition, we apply a new evaluation measure, SBERT score, to evaluate the semantic performance. The models implemented with PBM improve the performance of SBERT score. This implies that they generate summaries that are semantically more similar to the reference summary.- Anthology ID:
- 2022.coling-1.521
- Volume:
- Proceedings of the 29th International Conference on Computational Linguistics
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Venue:
- COLING
- SIG:
- Publisher:
- International Committee on Computational Linguistics
- Note:
- Pages:
- 5966–5977
- Language:
- URL:
- https://aclanthology.org/2022.coling-1.521
- DOI:
- Cite (ACL):
- Jikyoeng Son, Joonghyuk Hahn, HyeonTae Seo, and Yo-Sub Han. 2022. Boosting Code Summarization by Embedding Code Structures. In Proceedings of the 29th International Conference on Computational Linguistics, pages 5966–5977, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal):
- Boosting Code Summarization by Embedding Code Structures (Son et al., COLING 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.coling-1.521.pdf
- Data
- CodeSearchNet