Continual Pre-training of Language Models for Math Problem Understanding with Syntax-Aware Memory Network
Zheng Gong, Kun Zhou, Xin Zhao, Jing Sha, Shijin Wang, Ji-Rong Wen
Abstract
In this paper, we study how to continually pre-train language models for improving the understanding of math problems. Specifically, we focus on solving a fundamental challenge in modeling math problems, how to fuse the semantics of textual description and formulas, which are highly different in essence. To address this issue, we propose a new approach called COMUS to continually pre-train language models for math problem understanding with syntax-aware memory network. In this approach, we first construct the math syntax graph to model the structural semantic information, by combining the parsing trees of the text and formulas, and then design the syntax-aware memory networks to deeply fuse the features from the graph and text. With the help of syntax relations, we can model the interaction between the token from the text and its semantic-related nodes within the formulas, which is helpful to capture fine-grained semantic correlations between texts and formulas. Besides, we devise three continual pre-training tasks to further align and fuse the representations of the text and math syntax graph. Experimental results on four tasks in the math domain demonstrate the effectiveness of our approach. Our code and data are publicly available at the link: bluehttps://github.com/RUCAIBox/COMUS.- Anthology ID:
- 2022.acl-long.408
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- ACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 5923–5933
- Language:
- URL:
- https://aclanthology.org/2022.acl-long.408
- DOI:
- 10.18653/v1/2022.acl-long.408
- Cite (ACL):
- Zheng Gong, Kun Zhou, Xin Zhao, Jing Sha, Shijin Wang, and Ji-Rong Wen. 2022. Continual Pre-training of Language Models for Math Problem Understanding with Syntax-Aware Memory Network. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5923–5933, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Continual Pre-training of Language Models for Math Problem Understanding with Syntax-Aware Memory Network (Gong et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/2022.acl-long.408.pdf
- Code
- rucaibox/comus
- Data
- MATH