Attend, Select and Eliminate: Accelerating Multi-turn Response Selection with Dual-attention-based Content Elimination
Jianxin Liang, Chang Liu, Chongyang Tao, Jiazhan Feng, Dongyan Zhao
Abstract
Although the incorporation of pre-trained language models (PLMs) significantly pushes the research frontier of multi-turn response selection, it brings a new issue of heavy computation costs. To alleviate this problem and make the PLM-based response selection model both effective and efficient, we propose an inference framework together with a post-training strategy that builds upon any pre-trained transformer-based response selection models to accelerate inference by progressively selecting and eliminating unimportant content under the guidance of context-response dual-attention. Specifically, at each transformer layer, we first identify the importance of each word based on context-to-response and response-to-context attention, then select a number of unimportant words to be eliminated following a retention configuration derived from evolutionary search while passing the rest of the representations into deeper layers. To mitigate the training-inference gap posed by content elimination, we introduce a post-training strategy where we use knowledge distillation to force the model with progressively eliminated content to mimic the predictions of the original model with no content elimination. Experiments on three benchmarks indicate that our method can effectively speed up SOTA models without much performance degradation and achieves a better trade-off between speed and performance than previous methods.
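The abstract describes three mechanisms: scoring token importance from context-to-response and response-to-context attention, eliminating the lowest-scoring tokens at each layer according to a retention configuration, and distilling the pruned model from the unpruned one during post-training. Below is a minimal PyTorch sketch of these ideas; it is not the authors' released code, and all function names, shapes, and hyperparameters (e.g. `keep_ratio`, `temperature`) are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation) of dual-attention-based
# token importance scoring, per-layer elimination, and distillation loss.
import torch
import torch.nn.functional as F

def dual_attention_importance(attn, ctx_mask, resp_mask):
    """Score each token by the attention mass it receives from the *other* segment.

    attn:      [batch, heads, seq, seq] attention weights of one layer
    ctx_mask:  [batch, seq] 1 for context tokens, 0 otherwise
    resp_mask: [batch, seq] 1 for response tokens, 0 otherwise
    Returns:   [batch, seq] importance scores.
    """
    attn = attn.mean(dim=1)  # average over heads -> [B, S, S]
    # response-to-context: mass context tokens receive from response queries
    r2c = (attn * resp_mask.unsqueeze(-1)).sum(dim=1) * ctx_mask
    # context-to-response: mass response tokens receive from context queries
    c2r = (attn * ctx_mask.unsqueeze(-1)).sum(dim=1) * resp_mask
    return r2c + c2r

def eliminate_tokens(hidden, scores, keep_ratio):
    """Keep only the top `keep_ratio` fraction of tokens by importance.

    hidden: [batch, seq, dim] layer output; scores: [batch, seq].
    Returns the pruned hidden states and the kept token indices.
    """
    batch, seq, dim = hidden.shape
    k = max(1, int(seq * keep_ratio))  # per-layer retention configuration
    idx = scores.topk(k, dim=-1).indices.sort(dim=-1).values  # preserve order
    kept = hidden.gather(1, idx.unsqueeze(-1).expand(-1, -1, dim))
    return kept, idx

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Post-training objective: the pruned model mimics the unpruned model."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)
```

In this sketch, the retention ratio per layer would be supplied by the evolutionary search mentioned in the abstract; here it is simply a constant argument.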
- Anthology ID: 2023.findings-acl.422
- Volume: Findings of the Association for Computational Linguistics: ACL 2023
- Month: July
- Year: 2023
- Address: Toronto, Canada
- Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 6758–6770
- URL: https://aclanthology.org/2023.findings-acl.422
- DOI: 10.18653/v1/2023.findings-acl.422
- Cite (ACL): Jianxin Liang, Chang Liu, Chongyang Tao, Jiazhan Feng, and Dongyan Zhao. 2023. Attend, Select and Eliminate: Accelerating Multi-turn Response Selection with Dual-attention-based Content Elimination. In Findings of the Association for Computational Linguistics: ACL 2023, pages 6758–6770, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal): Attend, Select and Eliminate: Accelerating Multi-turn Response Selection with Dual-attention-based Content Elimination (Liang et al., Findings 2023)
- PDF: https://preview.aclanthology.org/naacl24-info/2023.findings-acl.422.pdf