Language Modeling for the Future of Finance: A Survey into Metrics, Tasks, and Data Opportunities

Nikita Tatarinov, Siddhant Sukhani, Agam Shah, Sudheer Chava


Abstract
Recent advances in language modeling have led to a growing number of papers related to finance in top-tier Natural Language Processing (NLP) venues. To systematically examine this trend, we review 374 NLP research papers published between 2017 and 2024 across 38 conferences and workshops, with a focused analysis of 221 papers that directly address finance-related tasks. We evaluate these papers across 11 quantitative and qualitative dimensions, with particular attention to evaluation practices, metric choices, dataset coverage, and reproducibility in a high-stakes applied language modeling domain. Our study identifies the following opportunities for NLP researchers: (i) expanding the scope of forecasting tasks; (ii) enriching evaluation with finance-specific metrics; (iii) leveraging multilingual and crisis-period datasets for robustness-oriented evaluation; and (iv) balancing pretrained language models (PLMs) with efficient or interpretable alternatives. We identify actionable directions supported by dataset and tool recommendations, with implications for both academic evaluation practices and industry deployment.
Anthology ID:
2026.gem-main.65
Volume:
Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Simon Mille, Sebastian Gehrmann, Patrícia Schmidtová, Ondřej Dušek, Marzieh Fadaee, Kyle Lo, Enrico Santus, Gabriel Stanovsky
Venues:
GEM | WS
Publisher:
Association for Computational Linguistics
Pages:
718–744
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.gem-main.65/
Cite (ACL):
Nikita Tatarinov, Siddhant Sukhani, Agam Shah, and Sudheer Chava. 2026. Language Modeling for the Future of Finance: A Survey into Metrics, Tasks, and Data Opportunities. In Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM), pages 718–744, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Language Modeling for the Future of Finance: A Survey into Metrics, Tasks, and Data Opportunities (Tatarinov et al., GEM 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.gem-main.65.pdf