Abstract
Stance detection aims to determine the position of an author toward a target and provides insight into people's views on controversial topics such as marijuana legalization. Despite recent progress on this task, most existing approaches train with hard labels (one-hot vectors), which ignores the meaningful inter-category signals offered by soft labels. In this work, we explore knowledge distillation for stance detection and present a comprehensive analysis. Our contributions are: 1) we propose knowledge distillation over multiple generations, in which each student is taken as a new teacher that transfers knowledge to a fresh student; 2) we propose a novel dynamic temperature scaling for knowledge distillation that calibrates teacher predictions at each generation step. Extensive results on three stance detection datasets show that knowledge distillation benefits stance detection and that a teacher transfers knowledge to a student more smoothly via calibrated guiding signals. We publicly release our code to facilitate future research.
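To make the setup concrete, below is a minimal PyTorch sketch of temperature-scaled distillation iterated over generations, where each trained student serves as the teacher for the next. The toy data, the `fresh_model` helper, and the fixed temperature `T = 2.0` are illustrative assumptions; in particular, the fixed temperature merely stands in for the paper's dynamic temperature scaling and none of this is the authors' released code.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Standard distillation loss: KL divergence between temperature-softened
    # teacher and student distributions, mixed with hard-label cross-entropy.
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # T*T rescales gradients so the soft term stays comparable across temperatures.
    distill = F.kl_div(log_student, soft_teacher, reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1.0 - alpha) * hard

# Toy data: 3-way stance labels (e.g., favor / against / none) over random features.
X = torch.randn(256, 32)
y = torch.randint(0, 3, (256,))

def fresh_model():
    # A fresh randomly initialized student for each generation (hypothetical stand-in
    # for the actual stance classifier).
    return torch.nn.Linear(32, 3)

def train(model, teacher=None, T=2.0, epochs=50):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt.zero_grad()
        logits = model(X)
        if teacher is None:
            loss = F.cross_entropy(logits, y)  # generation 0: hard labels only
        else:
            with torch.no_grad():
                t_logits = teacher(X)  # frozen teacher provides soft targets
            loss = kd_loss(logits, t_logits, y, T=T)
        loss.backward()
        opt.step()
    return model

teacher = train(fresh_model())  # initial teacher trained on hard labels
for gen in range(3):
    # Each student becomes the teacher for the next generation. A calibration-driven,
    # per-generation choice of T (the paper's dynamic temperature scaling) would go
    # here; a constant is used purely for illustration.
    teacher = train(fresh_model(), teacher=teacher, T=2.0)
```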
- Anthology ID:
- 2023.findings-acl.393
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2023
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 6316–6329
- URL:
- https://aclanthology.org/2023.findings-acl.393
- Cite (ACL):
- Yingjie Li and Cornelia Caragea. 2023. Distilling Calibrated Knowledge for Stance Detection. In Findings of the Association for Computational Linguistics: ACL 2023, pages 6316–6329, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Distilling Calibrated Knowledge for Stance Detection (Li & Caragea, Findings 2023)
- PDF:
- https://aclanthology.org/2023.findings-acl.393.pdf