ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples
Yilun Zhao, Linyong Nan, Zhenting Qi, Rui Zhang, Dragomir Radev
Abstract
Reasoning over tabular data requires both table structure understanding and a broad set of table reasoning skills. Current models with table-specific architectures and pre-training methods perform well at understanding table structures, but they still struggle with tasks that require various table reasoning skills. In this work, we develop ReasTAP to show that high-level table reasoning skills can be injected into models during pre-training without a complex table-specific architecture design. We define 7 table reasoning skills, such as numerical operation, temporal comparison, and conjunction. Each reasoning skill is associated with one example generator, which synthesizes questions over semi-structured tables according to the sampled templates. We model the table pre-training task as a sequence generation task and pre-train ReasTAP to generate precise answers to the synthetic examples. ReasTAP is evaluated on four benchmarks covering three downstream tasks: 1) WikiSQL-Weak and WikiTQ for Table Question Answering, 2) TabFact for Table Fact Verification, and 3) LogicNLG for Faithful Table-to-Text Generation. Experimental results demonstrate that ReasTAP achieves new state-of-the-art results on all of them and delivers a significant improvement in the low-resource setting. Our code is publicly available at https://github.com/Yale-LILY/ReasTAP.
- Anthology ID:
- 2022.emnlp-main.615
- Volume:
- Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9006–9018
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2022.emnlp-main.615/
- DOI:
- 10.18653/v1/2022.emnlp-main.615
- Cite (ACL):
- Yilun Zhao, Linyong Nan, Zhenting Qi, Rui Zhang, and Dragomir Radev. 2022. ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 9006–9018, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples (Zhao et al., EMNLP 2022)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2022.emnlp-main.615.pdf
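
As a rough illustration of the template-based example generation described in the abstract, the sketch below fills a question template from a small table, computes the answer by executing the sampled operation, and packs both into a source/target pair for a sequence generation model. It is not the authors' implementation: the table, the template wording, the `col : ... row 1 : ...` linearization, and the function names `flatten_table`, `answer_question`, and `generate_example` are all assumptions made for this example, and only one skill (a numerical operation) is shown.

```python
import random

# Hypothetical toy table; the column names and values are illustrative only.
TABLE = {
    "header": ["city", "year", "population"],
    "rows": [
        ["Avon", 2001, 1200],
        ["Brill", 2005, 3400],
        ["Corby", 2010, 2100],
    ],
}

# Hypothetical template for a numerical-operation skill (not taken from the paper).
TEMPLATE = "What is the {op} of {col} when {cond_col} is greater than {value}?"

# Aggregations that the template's {op} slot can take.
OPS = {"sum": sum, "maximum": max, "minimum": min}


def flatten_table(table):
    """Linearize the table into one string (an assumed format, for illustration)."""
    head = "col : " + " | ".join(table["header"])
    rows = [
        f"row {i} : " + " | ".join(str(cell) for cell in row)
        for i, row in enumerate(table["rows"], start=1)
    ]
    return " ".join([head] + rows)


def answer_question(table, op, col, cond_col, value):
    """Execute the filled template on the table and return the answer as text."""
    col_idx = table["header"].index(col)
    cond_idx = table["header"].index(cond_col)
    selected = [row[col_idx] for row in table["rows"] if row[cond_idx] > value]
    return str(OPS[op](selected))


def generate_example(table, rng, col="population", cond_col="year"):
    """Sample the template slots and build one synthetic (source, target) pair."""
    op = rng.choice(sorted(OPS))
    cond_idx = table["header"].index(cond_col)
    # Exclude the largest value so that at least one row passes the filter
    # (this assumes the condition column holds at least two distinct values).
    candidates = sorted({row[cond_idx] for row in table["rows"]})[:-1]
    value = rng.choice(candidates)
    question = TEMPLATE.format(op=op, col=col, cond_col=cond_col, value=value)
    return {
        "source": question + " " + flatten_table(table),
        "target": answer_question(table, op, col, cond_col, value),
    }


example = generate_example(TABLE, random.Random(0))
print(example["source"])
print(example["target"])
```

Because the answer is computed by running the sampled operation on the table, each synthetic pair is correct by construction. Covering further skills in this sketch would mean adding one template and one executor per skill.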