LiveQA: A Question Answering Dataset over Sports Live

Liu Qianying, Jiang Sicong, Wang Yizhong, Li Sujian


Abstract
In this paper, we introduce LiveQA, a new question answering dataset constructed from play-by-play live broadcasts. It contains 117k multiple-choice questions written by human commentators for over 1,670 NBA games, collected from the Chinese Hupu website. Owing to the characteristics of sports games, LiveQA can potentially test reasoning ability across timeline-based live broadcasts, which is challenging compared with existing datasets. The questions in LiveQA require understanding the timeline, tracking events, or performing mathematical computations. Our preliminary experiments show that the dataset poses a challenging problem for question answering models: a strong baseline model achieves an accuracy of only 53.1% and cannot beat the dominant option rule. We release the code and data of this paper for future research.
Anthology ID:
2020.ccl-1.98
Volume:
Proceedings of the 19th Chinese National Conference on Computational Linguistics
Month:
October
Year:
2020
Address:
Haikou, China
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
1057–1067
Language:
English
URL:
https://aclanthology.org/2020.ccl-1.98
Cite (ACL):
Liu Qianying, Jiang Sicong, Wang Yizhong, and Li Sujian. 2020. LiveQA: A Question Answering Dataset over Sports Live. In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 1057–1067, Haikou, China. Chinese Information Processing Society of China.
Cite (Informal):
LiveQA: A Question Answering Dataset over Sports Live (Qianying et al., CCL 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.ccl-1.98.pdf
Code:
PKU-TANGENT/LiveQA
Data:
LiveQA, CBT, MS MARCO, RACE, SQuAD