IndicJR: A Judge-Free Benchmark of Jailbreak Robustness in South Asian Languages

Priyaranjan Pattnayak, Sanchari Chowdhuri


Abstract
Safety alignment of large language models (LLMs) is mostly evaluated in English and contract-bound, leaving multilingual vulnerabilities understudied. We introduce Indic Jailbreak Robustness (IJR) a judge-free benchmark for adversarial safety across 12 Indic and South Asian languages (~2.09B speakers), covering 45,216 prompts in JSON (contract-bound) and Free (naturalistic) tracks.IJR reveals three patterns. (1) Contracts inflate refusals but do not stop jailbreaks: in JSON, LLaMA and Sarvam exceed 0.92 JSR, and in Free all models reach ~1.0 with refusals collapsing. (2) English→Indic attacks transfer strongly, with format wrappers often outperforming instruction wrappers. (3) Orthography matters: romanized/mixed inputs reduce JSR under JSON, with correlations to romanization share and tokenization ρ ≈ 0.28–0.32 indicating systematic effects. Human audits confirm detector reliability, and lite-to-full comparisons preserve conclusions. IJR offers a reproducible multilingual stress test revealing risks hidden by English-only, contract-focused evaluations, especially for South Asian users who frequently code-switch and romanize.
Anthology ID:
2026.eacl-industry.50
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Yevgen Matusevych, Gülşen Eryiğit, Nikolaos Aletras
Venue:
EACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
649–668
Language:
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-industry.50/
DOI:
Bibkey:
Cite (ACL):
Priyaranjan Pattnayak and Sanchari Chowdhuri. 2026. IndicJR: A Judge-Free Benchmark of Jailbreak Robustness in South Asian Languages. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track), pages 649–668, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
IndicJR: A Judge-Free Benchmark of Jailbreak Robustness in South Asian Languages (Pattnayak & Chowdhuri, EACL 2026)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-industry.50.pdf