M3TQA: Massively Multilingual Multitask Table Question Answering
Daixin Shu, Jian Yang, Zhenhe Wu, Xianjie Wu, Xianfu Cheng, Guan Xiangyuan, Yanghai Wang, Pengfei Wu, Tingyang Yang, Hualei Zhu, Wei Zhang, Ge Zhang, Jiaheng Liu, Zhoujun Li
Abstract
Tabular data is a fundamental component of real-world information systems. However, existing multilingual table benchmarks suffer from geolinguistic imbalance: they overrepresent certain languages and lack sufficient scale for rigorous cross-lingual analysis. To address these limitations, we introduce M3TQA, a comprehensive framework for massively multilingual multitask table question answering, which includes the datasets M3TQA-BENCH and M3TQA-INSTRUCT, featuring tables expanded to 97 languages from Chinese and English sources. M3TQA-BENCH includes 6,606 professionally annotated question-answer pairs across four tasks designed to evaluate nuanced table reasoning capabilities. Additionally, we synthesized the training set M3TQA-INSTRUCT in 97 languages using a large language model (LLM). Experiments on state-of-the-art LLMs reveal critical insights into cross-lingual generalization, demonstrating that synthetically generated, unannotated training data can significantly boost performance, particularly for low-resource languages. M3TQA establishes a new standard for multilingual table understanding, providing both a challenging evaluation platform and a scalable methodology for future research.
- Anthology ID:
- 2026.findings-acl.1134
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 22578–22602
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1134/
- Cite (ACL):
- Daixin Shu, Jian Yang, Zhenhe Wu, Xianjie Wu, Xianfu Cheng, Guan Xiangyuan, Yanghai Wang, Pengfei Wu, Tingyang Yang, Hualei Zhu, Wei Zhang, Ge Zhang, Jiaheng Liu, and Zhoujun Li. 2026. M3TQA: Massively Multilingual Multitask Table Question Answering. In Findings of the Association for Computational Linguistics: ACL 2026, pages 22578–22602, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- M3TQA: Massively Multilingual Multitask Table Question Answering (Shu et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1134.pdf