KRX Bench: Automating Financial Benchmark Creation via Large Language Models

Guijin Son, Hyunjun Jeon, Chami Hwang, Hanearl Jung


Abstract
We introduce KRX-Bench, an automated pipeline for creating financial benchmarks with GPT-4. To demonstrate the pipeline's effectiveness, we create KRX-Bench-POC, a benchmark that assesses LLMs' knowledge of real-world companies. The dataset comprises 1,002 questions, each focusing on companies across the U.S., Japanese, and Korean stock markets. We make our pipeline and dataset publicly available and integrate the evaluation code into EleutherAI's Language Model Evaluation Harness.
Anthology ID:
2024.finnlp-1.2
Volume:
Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Chung-Chi Chen, Xiaomo Liu, Udo Hahn, Armineh Nourbakhsh, Zhiqiang Ma, Charese Smiley, Veronique Hoste, Sanjiv Ranjan Das, Manling Li, Mohammad Ghassemi, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen
Venue:
FinNLP
Publisher:
Association for Computational Linguistics
Pages:
10–20
URL:
https://aclanthology.org/2024.finnlp-1.2
Cite (ACL):
Guijin Son, Hyunjun Jeon, Chami Hwang, and Hanearl Jung. 2024. KRX Bench: Automating Financial Benchmark Creation via Large Language Models. In Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing, pages 10–20, Torino, Italia. Association for Computational Linguistics.
Cite (Informal):
KRX Bench: Automating Financial Benchmark Creation via Large Language Models (Son et al., FinNLP 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2024.finnlp-1.2.pdf