SEK: Self-Explained Keywords Empower Large Language Models for Code Generation

Lishui Fan, Mouxiang Chen, Zhongxin Liu


Abstract
Large language models (LLMs) have achieved impressive performance in code generation. Despite this remarkable success, we observed that LLMs often misunderstand or overlook problem-specific, undertrained keywords during code generation, compromising the accuracy of the generated code. After these undertrained keywords are explicitly explained with well-trained terms in the prompt, LLMs are more likely to generate correct implementations. Inspired by this observation, we propose a novel technique named SEK (Self-Explained Keywords), which empowers an LLM for better code generation by extracting and explaining the key terms in the problem description with the LLM itself. Comprehensive experiments across four benchmarks, i.e., HumanEval(+), MBPP(+), APPS and BigCodeBench, with five representative LLMs show that SEK significantly improves LLMs in code generation, yielding substantial and consistent gains. For instance, SEK improves the Pass@1 of DeepSeek-Coder-V2-Instruct from 85.4% to 93.3% on the HumanEval benchmark. Further analysis confirms that SEK enables LLMs to shift their attention from low-frequency keywords to their corresponding explanations.
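The two-stage pipeline described in the abstract (first extract and explain the problem's keywords with the LLM itself, then generate code with those explanations prepended to the task) can be approximated with any chat-style LLM API. The sketch below is a minimal illustration of that idea only, not the authors' released implementation: the prompt wording, the client setup, and the model name are all assumptions.

    # Minimal sketch of a SEK-style pipeline. Assumptions: an
    # OpenAI-compatible client, a placeholder model name, and
    # illustrative prompt wording (not the paper's exact prompts).
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    MODEL = "gpt-4o-mini"  # placeholder model name

    def ask(prompt: str) -> str:
        """Send one user message and return the assistant's reply text."""
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def generate_code_with_sek(problem: str) -> str:
        # Step 1: ask the LLM to extract the problem-specific keywords
        # and explain each one using plain, common terms (the
        # "self-explained keywords").
        explanations = ask(
            "Extract the problem-specific keywords from the programming "
            "task below and explain each one in simple, common terms.\n\n"
            f"Task:\n{problem}"
        )
        # Step 2: generate code with the explanations prepended to the
        # task, so the model can rely on the well-trained explanations
        # rather than the rare keywords alone.
        return ask(
            f"Keyword explanations:\n{explanations}\n\n"
            f"Task:\n{problem}\n\n"
            "Write a correct Python solution."
        )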
Anthology ID: 2025.findings-acl.324
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 6249–6278
URL: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.324/
Cite (ACL): Lishui Fan, Mouxiang Chen, and Zhongxin Liu. 2025. SEK: Self-Explained Keywords Empower Large Language Models for Code Generation. In Findings of the Association for Computational Linguistics: ACL 2025, pages 6249–6278, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): SEK: Self-Explained Keywords Empower Large Language Models for Code Generation (Fan et al., Findings 2025)
PDF: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.324.pdf