Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents

Yicheng Zou, Hongwei Liu, Tao Gui, Junzhe Wang, Qi Zhang, Meng Tang, Haixiang Li, Daniell Wang


Abstract
Text semantic matching is a fundamental task that has been widely used in various scenarios, such as community question answering, information retrieval, and recommendation. Most state-of-the-art matching models, e.g., BERT, directly perform text comparison by processing each word uniformly. However, a query sentence generally comprises content that calls for different levels of matching granularity. Specifically, keywords represent factual information such as action, entity, and event that should be strictly matched, while intents convey abstract concepts and ideas that can be paraphrased into various expressions. In this work, we propose a simple yet effective training strategy for text semantic matching in a divide-and-conquer manner by disentangling keywords from intents. Our approach can be easily combined with pre-trained language models (PLMs) without influencing their inference efficiency, achieving stable performance improvements across a wide range of PLMs on three benchmarks.
Anthology ID:
2022.findings-acl.287
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3622–3632
URL:
https://aclanthology.org/2022.findings-acl.287
DOI:
10.18653/v1/2022.findings-acl.287
Cite (ACL):
Yicheng Zou, Hongwei Liu, Tao Gui, Junzhe Wang, Qi Zhang, Meng Tang, Haixiang Li, and Daniell Wang. 2022. Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3622–3632, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents (Zou et al., Findings 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.findings-acl.287.pdf
Code:
rowitzou/dc-match
Data:
GLUE, MRPC