Abstract
Existing Math Word Problem (MWP) solvers have achieved high accuracy on benchmark datasets. However, prior work has shown that such solvers do not generalize well and rely on superficial cues to achieve high performance. In this paper, we first conduct experiments to show that this behaviour is mainly associated with the limited size and diversity of existing MWP datasets. Next, we propose several data augmentation techniques, broadly categorized into Substitution and Paraphrasing based methods. By deploying these methods, we increase the size of existing datasets fivefold. Extensive experiments on two benchmark datasets across three state-of-the-art MWP solvers show that the proposed methods increase the generalization and robustness of existing solvers. On average, the proposed methods significantly improve the state-of-the-art results by over five percentage points on benchmark datasets. Further, solvers trained on the augmented dataset perform comparatively better on the challenge test set. We also show the effectiveness of the proposed techniques through ablation studies and verify the quality of augmented samples through human evaluation.
- Anthology ID:
- 2022.naacl-main.310
- Volume:
- Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Editors:
- Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4194–4206
- URL:
- https://aclanthology.org/2022.naacl-main.310
- DOI:
- 10.18653/v1/2022.naacl-main.310
- Cite (ACL):
- Vivek Kumar, Rishabh Maheshwary, and Vikram Pudi. 2022. Practice Makes a Solver Perfect: Data Augmentation for Math Word Problem Solvers. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4194–4206, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- Practice Makes a Solver Perfect: Data Augmentation for Math Word Problem Solvers (Kumar et al., NAACL 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2022.naacl-main.310.pdf
- Code:
- kevivk/mwp-augmentation
- Data:
- ASDiv, MAWPS, SVAMP