Siddhant Sukhani
2026
Language Modeling for the Future of Finance: A Survey into Metrics, Tasks, and Data Opportunities
Nikita Tatarinov | Siddhant Sukhani | Agam Shah | Sudheer Chava
Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)
Recent advances in language modeling have led to a growing number of papers related to finance in top-tier Natural Language Processing (NLP) venues. To systematically examine this trend, we review 374 NLP research papers published between 2017 and 2024 across 38 conferences and workshops, with a focused analysis of 221 papers that directly address finance-related tasks. We evaluate these papers across 11 quantitative and qualitative dimensions, with particular attention to evaluation practices, metric choices, dataset coverage, and reproducibility in a high-stakes applied language modeling (LM) domain. Our study identifies the following opportunities for NLP researchers: (i) expanding the scope of forecasting tasks; (ii) enriching evaluation with finance-specific metrics; (iii) leveraging multilingual and crisis-period datasets for robustness-oriented evaluation; and (iv) balancing pretrained language models (PLMs) with efficient or interpretable alternatives. For each opportunity, we outline actionable directions supported by dataset and tool recommendations, with implications for both academic evaluation practices and industry deployment.