CausalQA: A Benchmark for Causal Question Answering
Alexander Bondarenko, Magdalena Wolska, Stefan Heindorf, Lukas Blübaum, Axel-Cyrille Ngonga Ngomo, Benno Stein, Pavel Braslavski, Matthias Hagen, Martin Potthast
Abstract
At least 5% of questions submitted to search engines ask about cause-effect relationships in some way. To support the development of tailored approaches that can answer such questions, we construct Webis-CausalQA-22, a benchmark corpus of 1.1 million causal questions with answers. We distinguish different types of causal questions using a novel typology derived from a data-driven, manual analysis of questions from ten large question answering (QA) datasets. Using high-precision lexical rules, we extract causal questions of each type from these datasets to create our corpus. As an initial baseline, the state-of-the-art QA model UnifiedQA achieves a ROUGE-L F1 score of 0.48 on our new benchmark.
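The actual extraction rules and evaluation scripts are released in the webis-de/coling-22 repository; the sketch below is only a hypothetical illustration of the two steps the abstract describes. The regex cues for spotting causal questions are illustrative assumptions (the published high-precision rules are more elaborate), while the ROUGE-L F1 scoring uses the rouge-score package, the metric reported for the UnifiedQA baseline.

```python
import re
from rouge_score import rouge_scorer  # pip install rouge-score

# Illustrative causal cue patterns -- NOT the paper's actual high-precision rules.
CAUSAL_PATTERNS = [
    re.compile(r"^why\b", re.IGNORECASE),
    re.compile(r"\bwhat (causes|caused|happens if)\b", re.IGNORECASE),
    re.compile(r"\b(cause[sd]?|effects? of|leads? to|results? (in|of))\b", re.IGNORECASE),
]

def looks_causal(question: str) -> bool:
    """Return True if the question matches any causal cue pattern."""
    return any(p.search(question) for p in CAUSAL_PATTERNS)

# ROUGE-L F1 between a gold answer and a predicted answer.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def rouge_l_f1(gold: str, predicted: str) -> float:
    return scorer.score(gold, predicted)["rougeL"].fmeasure

if __name__ == "__main__":
    print(looks_causal("Why does ice float on water?"))   # True
    print(looks_causal("Who wrote Hamlet?"))               # False
    print(rouge_l_f1("ice is less dense than water",
                     "because ice is less dense than liquid water"))
```

To approximate the reported 0.48 baseline, one would generate answers with a UnifiedQA checkpoint (e.g., allenai/unifiedqa-t5-base on Hugging Face) and average the per-question ROUGE-L F1 over the benchmark.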
- Anthology ID: 2022.coling-1.291
- Volume: Proceedings of the 29th International Conference on Computational Linguistics
- Month: October
- Year: 2022
- Address: Gyeongju, Republic of Korea
- Venue: COLING
- Publisher: International Committee on Computational Linguistics
- Pages: 3296–3308
- URL: https://aclanthology.org/2022.coling-1.291
- Cite (ACL): Alexander Bondarenko, Magdalena Wolska, Stefan Heindorf, Lukas Blübaum, Axel-Cyrille Ngonga Ngomo, Benno Stein, Pavel Braslavski, Matthias Hagen, and Martin Potthast. 2022. CausalQA: A Benchmark for Causal Question Answering. In Proceedings of the 29th International Conference on Computational Linguistics, pages 3296–3308, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal): CausalQA: A Benchmark for Causal Question Answering (Bondarenko et al., COLING 2022)
- PDF: https://aclanthology.org/2022.coling-1.291.pdf
- Code: webis-de/coling-22
- Data: CommonsenseQA, ELI5, GooAQ, HotpotQA, MS MARCO, Natural Questions, NewsQA, PAQ, SQuAD, SearchQA, TriviaQA