HAWP: a Dataset for Hindi Arithmetic Word Problem Solving

Harshita Sharma, Pruthwik Mishra, Dipti Sharma


Abstract
Word problem solving remains a challenging and interesting task in NLP. In recent years, a lot of research has been carried out to solve different genres of word problems at various complexity levels. However, most of the publicly available datasets and existing work target English. Recently, there has been a surge of work on word problem solving in Chinese, driven by the creation of large benchmark datasets. Apart from these two languages, labeled benchmark datasets for low-resource languages are very scarce. This is the first attempt to address this issue for any Indian language, specifically Hindi. In this paper, we present HAWP (Hindi Arithmetic Word Problems), a dataset consisting of 2336 arithmetic word problems in Hindi. We also develop baseline systems for solving these word problems and propose a new evaluation technique for word problem solvers that takes equation equivalence into account.
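To make the idea of equivalence-aware evaluation concrete, the following is a minimal sketch and not the procedure used in the paper, which the abstract does not specify. It assumes that each equation has a single unknown named `X`, that the equation is linear in `X`, and that two equations count as equivalent when they have the same exact solution. The function names `solve_linear` and `equivalent` are hypothetical and chosen only for illustration.

```python
import ast
import operator
from fractions import Fraction

# Binary operators allowed in an equation (assumption: basic arithmetic only).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node, x):
    """Evaluate a parsed arithmetic expression exactly, with the unknown X bound to x."""
    if isinstance(node, ast.Expression):
        return _eval(node.body, x)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        # Going through str() keeps decimals such as 2.5 exact.
        return Fraction(str(node.value))
    if isinstance(node, ast.Name) and node.id == "X":
        return x
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left, x), _eval(node.right, x))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, x)
    raise ValueError("unsupported expression")


def solve_linear(equation):
    """Solve 'lhs = rhs' for X, assuming the equation is linear in X."""
    lhs, rhs = (ast.parse(side.strip(), mode="eval") for side in equation.split("="))

    def residual(x):
        return _eval(lhs, x) - _eval(rhs, x)

    # A linear residual a*X + b is fully determined by its values at X=0 and X=1.
    r0, r1 = residual(Fraction(0)), residual(Fraction(1))
    slope = r1 - r0
    if slope == 0:
        raise ValueError("equation does not determine X")
    return -r0 / slope


def equivalent(gold, predicted):
    """Treat two equations as equivalent when they yield the same exact solution."""
    return solve_linear(gold) == solve_linear(predicted)


print(equivalent("X = 12 - 5", "X = -5 + 12"))  # reordered operands
print(equivalent("X = 12 - 5", "X + 5 = 12"))   # unknown moved across the equals sign
print(equivalent("X = 12 - 5", "X = 12 + 5"))   # different operator
```

Under this sketch, a predicted equation that differs from the gold equation only in operand order or in the placement of the unknown is scored as correct, whereas a plain string match would reject it. Nonlinear equations, or equations with `X` in a denominator, fall outside the stated assumptions.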
Anthology ID:
2022.lrec-1.373
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3479–3490
URL:
https://aclanthology.org/2022.lrec-1.373
Cite (ACL):
Harshita Sharma, Pruthwik Mishra, and Dipti Sharma. 2022. HAWP: a Dataset for Hindi Arithmetic Word Problem Solving. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3479–3490, Marseille, France. European Language Resources Association.
Cite (Informal):
HAWP: a Dataset for Hindi Arithmetic Word Problem Solving (Sharma et al., LREC 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.lrec-1.373.pdf
Code:
Pruthwik/Hindi-Word-Problem-Solver
Data:
MAWPS