Effectiveness of Chain-of-Thought in Distilling Reasoning Capability from Large Language Models

Cong Thanh Do, Rama Sanand Doddipatla, Kate Knill


Abstract
Chain-of-Thought (CoT) prompting is a widely used method for improving the reasoning capability of Large Language Models (LLMs). More recently, CoT has been leveraged in Knowledge Distillation (KD) to transfer reasoning capability from a larger LLM to a smaller one. This paper examines the role of CoT in white-box KD, analyzing how it affects the performance of the distilled models on a range of natural language reasoning and understanding tasks. We conduct white-box KD experiments with LLMs from the Qwen and Llama2 families, using CoT data from the CoT-Collection dataset. The distilled models are then evaluated on natural language reasoning and understanding tasks from the BIG-Bench-Hard (BBH) benchmark, which poses complex challenges for smaller LLMs. Experimental results demonstrate that CoT improves the effectiveness of white-box KD, enabling the distilled models to achieve better average performance on the BBH tasks.
Anthology ID:
2025.inlg-main.49
Volume:
Proceedings of the 18th International Natural Language Generation Conference
Month:
October
Year:
2025
Address:
Hanoi, Vietnam
Editors:
Lucie Flek, Shashi Narayan, Lê Hồng Phương, Jiahuan Pei
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
833–845
URL:
https://preview.aclanthology.org/ingest-luhme/2025.inlg-main.49/
Cite (ACL):
Cong Thanh Do, Rama Sanand Doddipatla, and Kate Knill. 2025. Effectiveness of Chain-of-Thought in Distilling Reasoning Capability from Large Language Models. In Proceedings of the 18th International Natural Language Generation Conference, pages 833–845, Hanoi, Vietnam. Association for Computational Linguistics.
Cite (Informal):
Effectiveness of Chain-of-Thought in Distilling Reasoning Capability from Large Language Models (Do et al., INLG 2025)
PDF:
https://preview.aclanthology.org/ingest-luhme/2025.inlg-main.49.pdf