Intent-calibrated Self-training for Answer Selection in Open-domain Dialogues
Wentao Deng, Jiahuan Pei, Zhaochun Ren, Zhumin Chen, Pengjie Ren
Abstract
Answer selection in open-domain dialogues aims to select an accurate answer from candidates. The recent success of answer selection models hinges on training with large amounts of labeled data. However, collecting large-scale labeled data is labor-intensive and time-consuming. In this paper, we introduce predicted intent labels to calibrate answer labels in a self-training paradigm. Specifically, we propose intent-calibrated self-training (ICAST) to improve the quality of pseudo answer labels through the intent-calibrated answer selection paradigm, in which we employ pseudo intent labels to help improve pseudo answer labels. We carry out extensive experiments on two benchmark datasets with open-domain dialogues. The experimental results show that ICAST consistently outperforms baselines with 1%, 5%, and 10% labeled data. Specifically, it improves the F1 score by 2.06% and 1.00% on the two datasets, respectively, compared with the strongest baseline with only 5% labeled data.
- Anthology ID:
- 2023.tacl-1.70
- Volume:
- Transactions of the Association for Computational Linguistics, Volume 11
- Year:
- 2023
- Address:
- Cambridge, MA
- Venue:
- TACL
- Publisher:
- MIT Press
- Pages:
- 1232–1249
- URL:
- https://aclanthology.org/2023.tacl-1.70
- DOI:
- 10.1162/tacl_a_00599
- Cite (ACL):
- Wentao Deng, Jiahuan Pei, Zhaochun Ren, Zhumin Chen, and Pengjie Ren. 2023. Intent-calibrated Self-training for Answer Selection in Open-domain Dialogues. Transactions of the Association for Computational Linguistics, 11:1232–1249.
- Cite (Informal):
- Intent-calibrated Self-training for Answer Selection in Open-domain Dialogues (Deng et al., TACL 2023)
- PDF:
- https://preview.aclanthology.org/revert-3132-ingestion-checklist/2023.tacl-1.70.pdf