Persian in a Court: Benchmarking VLMs In Persian Multi-Modal Tasks
Farhan Farsi, Shahriar Shariati Motlagh, Shayan Bali, Sadra Sabouri, Saeedeh Momtazi
Abstract
This study introduces a novel framework for evaluating Large Language Models (LLMs) and Vision-Language Models (VLMs) in Persian, a low-resource language. We develop comprehensive datasets to assess reasoning, linguistic understanding, and multimodal capabilities. Our datasets include Persian-OCR-QA for optical character recognition, Persian-VQA for visual question answering, Persian world-image puzzle for multimodal integration, Visual-Abstraction-Reasoning for abstract reasoning, and Iran-places for visual knowledge of Iranian figures and locations. We evaluate models like GPT-4o, Claude 3.5 Sonnet, and Llama 3.2 90B Vision, revealing their strengths and weaknesses in processing Persian. This research contributes to inclusive language processing by addressing the unique challenges of low-resource language evaluation.
- Anthology ID:
- 2025.evalmg-1.5
- Volume:
- Proceedings of the First Workshop of Evaluation of Multi-Modal Generation
- Month:
- Jan
- Year:
- 2025
- Address:
- Abu Dhabi, UAE
- Editors:
- Wei Emma Zhang, Xiang Dai, Desmond Elliot, Byron Fang, Mongyuan Sim, Haojie Zhuang, Weitong Chen
- Venues:
- EvalMG | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 52–56
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2025.evalmg-1.5/
- Cite (ACL):
- Farhan Farsi, Shahriar Shariati Motlagh, Shayan Bali, Sadra Sabouri, and Saeedeh Momtazi. 2025. Persian in a Court: Benchmarking VLMs In Persian Multi-Modal Tasks. In Proceedings of the First Workshop of Evaluation of Multi-Modal Generation, pages 52–56, Abu Dhabi, UAE. Association for Computational Linguistics.
- Cite (Informal):
- Persian in a Court: Benchmarking VLMs In Persian Multi-Modal Tasks (Farsi et al., EvalMG 2025)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2025.evalmg-1.5.pdf