LQM: Linguistically Motivated Multidimensional Quality Metrics for Machine Translation
Samar M. Magdy, Fakhraddin Alwajih, Abdellah EL Mekki, Wesam El-Sayed, Muhammad Abdul-Mageed
Abstract
Existing MT evaluation frameworks, including automatic metrics and human evaluation schemes such as Multidimensional Quality Metrics (MQM), are largely language-agnostic. However, they often fail to capture dialect- and culture-specific errors in diglossic languages (e.g., Arabic), where translation failures stem from mismatches in language variety, content coverage, and pragmatic appropriateness rather than surface form alone. We introduce LQM: Linguistically Motivated Multidimensional Quality Metrics for MT. LQM is a hierarchical error taxonomy for diagnosing MT errors through six linguistically grounded levels: sociolinguistics, pragmatics, semantics, morphosyntax, orthography, and graphetics (Figure 1). We construct a bidirectional parallel corpus of 3,850 sentences (550 per variety) spanning seven Arabic dialects (Egyptian, Emirati, Jordanian, Mauritanian, Moroccan, Palestinian, and Yemeni), derived from conversational, culturally rich content. We evaluate six LLMs in a zero-shot setting and conduct expert span-level human annotation using LQM, producing 6,113 labeled error spans across 3,495 unique erroneous sentences, along with severity-weighted quality scores. We complement this analysis with an automatic metric (spBLEU). Though validated here on Arabic, LQM is a language-agnostic framework designed to be easily applied to or adapted for other languages. The LQM-annotated error data, prompts, and annotation guidelines are publicly available at https://github.com/UBC-NLP/LQM_MT
- Anthology ID:
- 2026.findings-acl.2012
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 40470–40493
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2012/
- Cite (ACL):
- Samar M. Magdy, Fakhraddin Alwajih, Abdellah EL Mekki, Wesam El-Sayed, and Muhammad Abdul-Mageed. 2026. LQM: Linguistically Motivated Multidimensional Quality Metrics for Machine Translation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 40470–40493, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- LQM: Linguistically Motivated Multidimensional Quality Metrics for Machine Translation (Magdy et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2012.pdf