Syntax-Aware Retrieval Augmentation for Neural Symbolic Regression

Canmiao Zhou, Han Huang


Abstract
Symbolic regression is a powerful technique for discovering mathematical expressions that best fit observed data. While neural symbolic regression methods based on large-scale pre-trained models perform well on simple tasks, the reliance on fixed parametric knowledge typically limits their generalization to complex and diverse data distributions. To address this challenge, we propose a syntax-aware retrieval-augmented mechanism that leverages the syntactic structure of symbolic expressions to perform context-aware retrieval from a pre-constructed token datastore during inference. This mechanism enables the model to incorporate highly relevant non-parametric prior information to assist in expression generation. Additionally, we design an entropy-based confidence network that dynamically adjusts the fusion strength between neural and retrieved components by estimating predictive uncertainty. Extensive experiments on multiple symbolic regression benchmarks demonstrate that the proposed method significantly outperforms representative baselines, validating the effectiveness of retrieval augmentation in enhancing the generalization performance of neural symbolic regression models.
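The fusion step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `(token_id, distance)` neighbor format, the distance softmax, and the temperature are all hypothetical. The learned confidence network is replaced here by a fixed heuristic that uses the normalized entropy of the model's next-token distribution as the fusion weight, and the syntax-aware retrieval step is omitted, so the neighbors are taken as given.

```python
import math

def normalized_entropy(probs):
    """Shannon entropy of `probs` divided by log(len(probs)), giving a value in [0, 1].

    Assumes len(probs) >= 2.
    """
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

def retrieval_distribution(neighbors, vocab_size, temperature=1.0):
    """Turn retrieved (token_id, distance) pairs into a distribution over the vocabulary.

    Closer neighbors get more mass via a softmax over negative distances.
    """
    weights = [math.exp(-d / temperature) for _, d in neighbors]
    total = sum(weights)
    probs = [0.0] * vocab_size
    for (token_id, _), w in zip(neighbors, weights):
        probs[token_id] += w / total
    return probs

def fuse(model_probs, neighbors, temperature=1.0):
    """Interpolate model and retrieval distributions.

    The weight `lam` is a stand-in for a learned confidence network:
    the more uncertain the model (higher entropy), the more the
    retrieved neighbors contribute.
    """
    lam = normalized_entropy(model_probs)
    knn_probs = retrieval_distribution(neighbors, len(model_probs), temperature)
    fused = [(1.0 - lam) * p + lam * q for p, q in zip(model_probs, knn_probs)]
    return fused, lam
```

Under this heuristic, a uniform model distribution gives a weight near 1 and the output follows the retrieved tokens, while a one-hot model distribution gives a weight of 0 and retrieval is ignored. A trained confidence network would presumably learn a less rigid mapping from uncertainty to fusion strength.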
Anthology ID:
2025.emnlp-main.664
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
13148–13158
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.664/
Cite (ACL):
Canmiao Zhou and Han Huang. 2025. Syntax-Aware Retrieval Augmentation for Neural Symbolic Regression. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 13148–13158, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Syntax-Aware Retrieval Augmentation for Neural Symbolic Regression (Zhou & Huang, EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.664.pdf
Checklist:
 2025.emnlp-main.664.checklist.pdf