ODASim: Ordered, Distinctive and Absolute Semantic Similarity for Code Explanation Evaluation
Prince Kumar, Vitobha Munigala, Jaydeep Sen, Ashish Mittal, Vishwajeet Kumar, Srikanth G. Tamilselvam
Abstract
Code explanations are increasingly generated by large language models and used in software engineering workflows, making reliable evaluation essential. However, existing model-based and embedding-based methods often fail to distinguish correct explanations from partially or fully incorrect ones, and their similarity scores are poorly calibrated and do not reflect meaningful differences in explanation quality. To address this, we propose ODASim (Ordered, Distinctive, and Absolute Similarity), a model-agnostic graded fine-tuning framework for embedding models that learns calibrated similarity representations between code and explanations. To support fine-grained supervision and evaluation, we also introduce ODA-X, a novel benchmark for code-to-explanation quality grading, comprising code–explanation pairs with graded similarity labels derived from strategic perturbations of gold explanations. We apply ODASim to multiple embedding models and evaluate it on two benchmarks, the widely used CodeXGLUE and our proposed ODA-X, spanning four programming languages: Python, Java, JavaScript, and Go. Results show that our method achieves up to 35% improvement in F1 score and 85% reduction in Expected Calibration Error (ECE), enabling reliable evaluation of code-to-explanation quality.
- Anthology ID:
- 2026.findings-acl.1415
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 28390–28403
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.findings-acl.1415/
- Cite (ACL):
- Prince Kumar, Vitobha Munigala, Jaydeep Sen, Ashish Mittal, Vishwajeet Kumar, and Srikanth G. Tamilselvam. 2026. ODASim: Ordered, Distinctive and Absolute Semantic Similarity for Code Explanation Evaluation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 28390–28403, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- ODASim: Ordered, Distinctive and Absolute Semantic Similarity for Code Explanation Evaluation (Kumar et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.findings-acl.1415.pdf