HATS : Hindi Analogy Test Set for Evaluating Reasoning in Large Language Models

Ashray Gupta, Rohan Joseph, Sunny Rai


Abstract
Analogies test a model’s ability to infer implicit relationships between concepts, making them a key benchmark for evaluating reasoning capabilities. While large language models (LLMs) are widely evaluated for reasoning in English, their abilities in Indic languages remain understudied, limiting our understanding of whether these models generalize across languages. To address this gap, we introduce a new Hindi Analogy Test Set (HATS), comprising 405 multiple-choice questions sourced from Indian government exams. We benchmark state-of-the-art multilingual LLMs using various prompting strategies and introduce a grounded Chain of Thought approach that leverages cognitive theories of analogical reasoning. This approach improves model performance on Hindi analogy questions. Our experiments show that models perform best with English prompts, irrespective of the prompting strategy. Our test set addresses the lack of a critical resource for evaluating LLM reasoning capabilities in Hindi. The test set is publicly available for research purposes at https://github.com/Inequilazitive/HATS-Hindi_Analogy_Test_Set
Anthology ID:
2025.analogyangle-1.6
Volume:
Proceedings of the 2nd Workshop on Analogical Abstraction in Cognition, Perception, and Language (Analogy-Angle II)
Month:
August
Year:
2025
Address:
Vienna, Austria
Editors:
Giulia Rambelli, Filip Ilievski, Marianna Bolognesi, Pia Sommerauer
Venues:
Analogy-Angle | WS
Publisher:
Association for Computational Linguistics
Pages:
57–80
URL:
https://preview.aclanthology.org/landing_page/2025.analogyangle-1.6/
DOI:
10.18653/v1/2025.analogyangle-1.6
Cite (ACL):
Ashray Gupta, Rohan Joseph, and Sunny Rai. 2025. HATS : Hindi Analogy Test Set for Evaluating Reasoning in Large Language Models. In Proceedings of the 2nd Workshop on Analogical Abstraction in Cognition, Perception, and Language (Analogy-Angle II), pages 57–80, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
HATS : Hindi Analogy Test Set for Evaluating Reasoning in Large Language Models (Gupta et al., Analogy-Angle 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.analogyangle-1.6.pdf