MarkQA: A large scale KBQA dataset with numerical reasoning
Xiang Huang, Sitao Cheng, Yuheng Bao, Shanshan Huang, Yuzhong Qu
Abstract
While question answering over knowledge bases (KBQA) has made progress on factoid questions, KBQA with numerical reasoning remains relatively unexplored. In this paper, we focus on complex numerical reasoning in KBQA and propose a new task, NR-KBQA, which requires both multi-hop reasoning and numerical reasoning. We also design a logic form in Python format, called PyQL, to represent the reasoning process of numerical reasoning questions. To facilitate the development of NR-KBQA, we present a large NR-KBQA dataset called MarkQA, which is automatically constructed from a small set of seeds. Each question in MarkQA is annotated with its corresponding SPARQL query, alongside the step-by-step reasoning path in the QDMR format and a PyQL program. Experiments with several state-of-the-art QA methods on MarkQA show that complex numerical reasoning in KBQA remains highly challenging.
- Anthology ID:
- 2023.emnlp-main.633
- Original:
- 2023.emnlp-main.633v1
- Version 2:
- 2023.emnlp-main.633v2
- Volume:
- Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 10241–10259
- URL:
- https://aclanthology.org/2023.emnlp-main.633
- Cite (ACL):
- Xiang Huang, Sitao Cheng, Yuheng Bao, Shanshan Huang, and Yuzhong Qu. 2023. MarkQA: A large scale KBQA dataset with numerical reasoning. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 10241–10259, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- MarkQA: A large scale KBQA dataset with numerical reasoning (Huang et al., EMNLP 2023)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2023.emnlp-main.633.pdf
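
The abstract describes NR-KBQA as requiring both multi-hop reasoning and numerical reasoning. The following is a minimal Python sketch of what such a question might involve. It is not PyQL and does not use the MarkQA data: the toy knowledge base, the entity names, the numbers, and the helpers `follow` and `population_gap` are all hypothetical and chosen only for illustration.

```python
# Toy knowledge base stored as (subject, relation) -> object.
# All entities and values are made up; none come from MarkQA.
TOY_KB = {
    ("CountryA", "capital"): "CityA",
    ("CountryB", "capital"): "CityB",
    ("CityA", "population"): 1_200_000,
    ("CityB", "population"): 800_000,
}


def follow(entity, relations):
    """Multi-hop step: follow a chain of relations starting from an entity."""
    current = entity
    for relation in relations:
        current = TOY_KB[(current, relation)]
    return current


def population_gap(country_x, country_y):
    """Numerical step: subtract two values, each reached by a two-hop lookup."""
    pop_x = follow(country_x, ["capital", "population"])
    pop_y = follow(country_y, ["capital", "population"])
    return pop_x - pop_y


# Question: "How many more people live in the capital of CountryA
# than in the capital of CountryB?"
answer = population_gap("CountryA", "CountryB")
print(answer)  # 400000
```

Answering the question takes two lookups per country (country to capital, capital to population) followed by an arithmetic operation on the retrieved values, which is the combination of skills the task is said to require.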