FASTMATCH: Accelerating the Inference of BERT-based Text Matching
Shuai Pang, Jianqiang Ma, Zeyu Yan, Yang Zhang, Jianping Shen
Abstract
Recently, pre-trained language models such as BERT have achieved state-of-the-art accuracy in text matching. When applied to IR (or QA), BERT-based matching models must compute the representations and interactions for all query-candidate pairs online. This high inference cost has prevented the deployment of BERT-based matching models in many practical applications. To address this issue, we propose a novel BERT-based text matching model in which the representations and the interactions are decoupled. The representations of the candidates can then be computed and stored offline, and retrieved directly during the online matching phase. A lightweight attention network is designed to perform the interactions and generate the final matching scores. Experiments on several large-scale text matching datasets show that the proposed model, called FASTMATCH, can achieve up to a 100X speed-up over BERT and RoBERTa in the online matching phase, while retaining up to 98.7% of the performance.
- Anthology ID:
- 2020.coling-main.568
- Volume:
- Proceedings of the 28th International Conference on Computational Linguistics
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 6459–6469
- URL:
- https://aclanthology.org/2020.coling-main.568
- DOI:
- 10.18653/v1/2020.coling-main.568
- Cite (ACL):
- Shuai Pang, Jianqiang Ma, Zeyu Yan, Yang Zhang, and Jianping Shen. 2020. FASTMATCH: Accelerating the Inference of BERT-based Text Matching. In Proceedings of the 28th International Conference on Computational Linguistics, pages 6459–6469, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal):
- FASTMATCH: Accelerating the Inference of BERT-based Text Matching (Pang et al., COLING 2020)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2020.coling-main.568.pdf
- Data:
- GLUE, QNLI, TrecQA
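
The decoupling described in the abstract (candidate representations computed and stored offline, then only a lightweight interaction step at query time) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `toy_encode` is a hypothetical hash-based stand-in for a BERT encoder, `interaction_score` is a plain scaled dot-product attention followed by cosine similarity rather than the paper's attention network, and the candidate sentences are invented.

```python
import hashlib
import math

DIM = 8  # toy embedding size (assumption, unrelated to BERT's hidden size)


def toy_encode(text):
    """Hypothetical stand-in for a BERT encoder: one deterministic vector per token."""
    vectors = []
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        # Map the first DIM bytes of the hash to floats in [-1, 1).
        vectors.append([b / 128.0 - 1.0 for b in digest[:DIM]])
    return vectors


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def interaction_score(query_vecs, cand_vecs):
    """Illustrative interaction step: each query token attends over candidate tokens."""
    attended = []
    for q in query_vecs:
        weights = softmax([dot(q, c) / math.sqrt(DIM) for c in cand_vecs])
        attended.append(
            [sum(w * c[i] for w, c in zip(weights, cand_vecs)) for i in range(DIM)]
        )
    # Compare the pooled query with the pooled, query-aware view of the candidate.
    return cosine(mean_pool(query_vecs), mean_pool(attended))


# Offline phase: encode every candidate once and store the result.
candidates = [
    "how to reset a password",
    "opening hours of the library",
    "steps to change an account password",
]
candidate_store = {text: toy_encode(text) for text in candidates}


def rank(query):
    """Online phase: encode only the query, then score it against stored candidates."""
    query_vecs = toy_encode(query)
    scored = [
        (interaction_score(query_vecs, vecs), text)
        for text, vecs in candidate_store.items()
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, text in rank("change my password"):
        print(f"{score:+.3f}  {text}")
```

In this sketch the expensive encoder runs once per candidate in the offline phase and once per query online, so the per-pair online cost is limited to the small attention computation. Because the toy encoder carries no semantics, the printed ranking only demonstrates the data flow, not matching quality.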