Hyunjun Jeon


2024

In this work, we introduce KRX-Bench, an automated pipeline for creating financial benchmarks via GPT-4. To demonstrate the effectiveness of the pipeline, we create KRX-Bench-POC, a benchmark assessing the knowledge of LLMs in real-world companies. This dataset comprises 1,002 questions, each focusing on companies across the U.S., Japanese, and Korean stock markets. We make our pipeline and dataset publicly available and integrate the evaluation code into EleutherAI’s Language Model Evaluation Harness.