PerCoR: Evaluating Commonsense Reasoning in Persian via Multiple-Choice Sentence Completion

Morteza Alikhani, Mohammadtaha Bagherifard, Erfan Zinvandi, Mehran Sarmadi


Abstract
We introduced PerCoR—Persian Commonsense Reasoning—the first large-scale Persian benchmark for commonsense reasoning. PerCoR contains 106K multiple-choice sentence-completion problems drawn from more than forty news, cultural and other web sources. We adopt a linguistically grounded, conjunction-based segmentation strategy to generate coherent prefix–continuation pairs. To create challenging distractors, we propose DRESS-AF—Distractor Ranking via Embedding Similarity Scoring and Adversarial Filtering—a generation-free adversarial filtering method that selects distractors from the pool of gold continuations while maximising model confusion. Human annotators score 89% on PerCoR, while OpenAI-o3 achieves the highest performance at 92.18%, followed closely by Claude-Sonnet-3.7 (91.17%). The strongest open-source model, DeepSeek-R1, reaches 82.51%, underscoring both the dataset’s difficulty and the remaining performance gap in Persian commonsense reasoning. We further show that DRESS-AF transfers to the English HellaSwag benchmark, increasing its difficulty without hurting human solvability. The dataset is available at https://huggingface.co/datasets/MCINext/PerCoR .
Anthology ID:
2025.ijcnlp-long.120
Volume:
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh
Venues:
IJCNLP | AACL
SIG:
Publisher:
The Asian Federation of Natural Language Processing and The Association for Computational Linguistics
Note:
Pages:
2205–2224
Language:
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.ijcnlp-long.120/
DOI:
Bibkey:
Cite (ACL):
Morteza Alikhani, Mohammadtaha Bagherifard, Erfan Zinvandi, and Mehran Sarmadi. 2025. PerCoR: Evaluating Commonsense Reasoning in Persian via Multiple-Choice Sentence Completion. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 2205–2224, Mumbai, India. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics.
Cite (Informal):
PerCoR: Evaluating Commonsense Reasoning in Persian via Multiple-Choice Sentence Completion (Alikhani et al., IJCNLP-AACL 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.ijcnlp-long.120.pdf