An empirical analysis of existing systems and datasets toward general simple question answering

Namgi Han, Goran Topic, Hiroshi Noji, Hiroya Takamura, Yusuke Miyao


Abstract
In this paper, we evaluate the progress of our field toward solving simple factoid questions over a knowledge base, a practically important problem for natural language interfaces to databases. As in other natural language understanding tasks, common practice for this task is to train and evaluate a model on a single dataset, and recent studies suggest that SimpleQuestions, the largest and most popular dataset, is nearly solved under this setting. However, this setting does not evaluate the robustness of systems outside the distribution of their training data. We rigorously evaluate the robustness of existing systems using different datasets. Our analysis, which includes shifting the training and test datasets and training on a union of the datasets, suggests that progress in solving the SimpleQuestions dataset does not indicate success in more general simple question answering. We discuss a possible future direction toward this goal.
Anthology ID:
2020.coling-main.465
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
5321–5334
URL:
https://aclanthology.org/2020.coling-main.465
DOI:
10.18653/v1/2020.coling-main.465
Cite (ACL):
Namgi Han, Goran Topic, Hiroshi Noji, Hiroya Takamura, and Yusuke Miyao. 2020. An empirical analysis of existing systems and datasets toward general simple question answering. In Proceedings of the 28th International Conference on Computational Linguistics, pages 5321–5334, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
An empirical analysis of existing systems and datasets toward general simple question answering (Han et al., COLING 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.coling-main.465.pdf
Code
aistairc/simple-qa-analysis
Data
FreebaseQA, SimpleQuestions