MaRF: Leveraging Representation-Level Fusion of Formula Semantics for Mathematical Information Retrieval

Suyuan Wang, Hongbo Zheng, Nickvash Kani


Abstract
Mathematical information retrieval (MIR) depends on jointly modeling natural-language context and mathematical expressions. While BERT-based dense retrievers are effective, they often dilute mathematical semantics because textual content dominates most training data and mathematical formulas differ fundamentally from natural language in structure and composition. Consequently, these models rely heavily on surrounding text, which reduces robustness in math-intensive scenarios with limited textual description. We propose MaRF, a dual-encoder representation-level fusion framework for MIR that explicitly integrates formula semantics into context-aware dense retrieval. By combining contextual and formula-specific representations, MaRF captures complementary information from both textual and symbolic views. Experiments on the ARQMath-3 benchmark demonstrate that MaRF substantially improves retrieval performance and robustness, outperforming strong baselines across MIR tasks. The source code and datasets are available at https://github.com/MLPgroup/MaRF.
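As a rough illustration of what representation-level fusion in a dual-encoder retriever can look like, the sketch below combines a contextual embedding and a formula embedding into a single vector and ranks documents by dot product. The abstract does not state MaRF's actual fusion operator, encoders, or training objective, so everything here is an assumption made for illustration: the weighted-concatenation scheme, the mixing weight `alpha`, and the helper names `l2_normalize`, `fuse`, `score`, and `rank` are hypothetical and are not taken from the paper or its repository.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse(context_vec, formula_vec, alpha=0.5):
    """Fuse two embeddings by weighted concatenation (an assumed scheme).

    Each view is L2-normalized, scaled by sqrt(alpha) or sqrt(1 - alpha),
    and concatenated. The dot product of two fused vectors then equals
    alpha * cos(context views) + (1 - alpha) * cos(formula views).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    context_unit = l2_normalize(context_vec)
    formula_unit = l2_normalize(formula_vec)
    w_context = math.sqrt(alpha)
    w_formula = math.sqrt(1.0 - alpha)
    return [w_context * x for x in context_unit] + [w_formula * x for x in formula_unit]


def score(query_fused, doc_fused):
    """Dot-product relevance score between two fused representations."""
    return sum(q * d for q, d in zip(query_fused, doc_fused))


def rank(query_fused, docs_fused):
    """Return document indices sorted from most to least relevant."""
    return sorted(
        range(len(docs_fused)),
        key=lambda i: score(query_fused, docs_fused[i]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy 2-d embeddings standing in for real encoder outputs (hypothetical).
    query = fuse([1.0, 0.0], [0.0, 1.0], alpha=0.3)
    text_match = fuse([2.0, 0.0], [3.0, 0.0], alpha=0.3)     # same text, other formula
    formula_match = fuse([0.0, 1.0], [0.0, 5.0], alpha=0.3)  # other text, same formula
    print(rank(query, [text_match, formula_match]))
```

With `alpha` below 0.5 the formula view carries more weight than the surrounding text, so in the toy example the document that matches only on the formula ranks above the one that matches only on the text. This mirrors the abstract's motivation of keeping formula semantics from being diluted by textual content, without claiming to reproduce the paper's architecture.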
Anthology ID:
2026.findings-acl.1277
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
25570–25585
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1277/
Cite (ACL):
Suyuan Wang, Hongbo Zheng, and Nickvash Kani. 2026. MaRF: Leveraging Representation-Level Fusion of Formula Semantics for Mathematical Information Retrieval. In Findings of the Association for Computational Linguistics: ACL 2026, pages 25570–25585, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
MaRF: Leveraging Representation-Level Fusion of Formula Semantics for Mathematical Information Retrieval (Wang et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1277.pdf
Checklist:
 2026.findings-acl.1277.checklist.pdf