CycleKQR: Unsupervised Bidirectional Keyword-Question Rewriting
Andrea Iovine, Anjie Fang, Besnik Fetahu, Jie Zhao, Oleg Rokhlenko, Shervin Malmasi
Abstract
Users expect their queries to be answered by search systems regardless of the query’s surface form, which may be a keyword query or a natural question. Natural Language Understanding (NLU) components of Search and QA systems may fail to correctly interpret semantically equivalent inputs if their form deviates from what the system was trained on, leading to suboptimal understanding capabilities. We propose the keyword-question rewriting task to improve query understanding capabilities of NLU systems for all surface forms. To achieve this, we present CycleKQR, an unsupervised approach that enables effective rewriting between keyword and question queries using non-parallel data. Empirically, we show the impact of unfamiliar query forms on QA performance for open domain and Knowledge Base QA systems (trained on either keywords or natural language questions). We demonstrate how CycleKQR significantly improves QA performance by rewriting queries into the appropriate form while retaining the original semantic meaning of input queries, allowing CycleKQR to improve performance by up to 3% over supervised baselines. Finally, we release a dataset of 66k keyword-question pairs.
- Anthology ID:
- 2022.emnlp-main.814
- Volume:
- Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 11875–11886
- URL:
- https://aclanthology.org/2022.emnlp-main.814
- DOI:
- 10.18653/v1/2022.emnlp-main.814
- Cite (ACL):
- Andrea Iovine, Anjie Fang, Besnik Fetahu, Jie Zhao, Oleg Rokhlenko, and Shervin Malmasi. 2022. CycleKQR: Unsupervised Bidirectional Keyword-Question Rewriting. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 11875–11886, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- CycleKQR: Unsupervised Bidirectional Keyword-Question Rewriting (Iovine et al., EMNLP 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2022.emnlp-main.814.pdf