Quality Estimation and Post-Editing Using LLMs For Indic Languages: How Good Is It?

Anushka Singh; Aarya Pakhale; Mitesh M. Khapra; Raj Dabre

Abstract

Recent work has increasingly applied Large Language Models (LLMs) to Quality Estimation (QE) and Post-Editing (PE) for Machine Translation (MT). However, the focus has mainly been on high-resource languages, and existing approaches either rely on prompting or combine existing QE models with LLMs rather than using a single end-to-end system. In this paper, we investigate the efficacy of end-to-end QE and PE systems for low-resource languages, taking 5 Indian languages as a use case. We augment existing QE data containing Multidimensional Quality Metrics (MQM) error annotations with error explanations and PEs generated with the help of a proprietary LLM (GPT-4), and then fine-tune Gemma-2-9B, an open-source multilingual LLM, to perform QE and PE jointly. While our models attain QE capabilities competitive with or surpassing existing models in both referenceful and referenceless settings, we observe that they still struggle with PE. Further investigation reveals that this occurs because our models cannot accurately identify fine-grained errors in the translation, despite being excellent indicators of overall quality. This opens up opportunities for research in end-to-end QE and PE for low-resource languages.

Anthology ID: 2025.mtsummit-1.30
Volume: Proceedings of Machine Translation Summit XX: Volume 1
Month: June
Year: 2025
Address: Geneva, Switzerland
Editors: Pierrette Bouillon, Johanna Gerlach, Sabrina Girletti, Lise Volkart, Raphael Rubino, Rico Sennrich, Ana C. Farinha, Marco Gaido, Joke Daems, Dorothy Kenny, Helena Moniz, Sara Szoc
Venue: MTSummit
Publisher: European Association for Machine Translation
Pages: 388–398
URL: https://preview.aclanthology.org/mtsummit-25-ingestion/2025.mtsummit-1.30/
Cite (ACL): Anushka Singh, Aarya Pakhale, Mitesh M. Khapra, and Raj Dabre. 2025. Quality Estimation and Post-Editing Using LLMs For Indic Languages: How Good Is It?. In Proceedings of Machine Translation Summit XX: Volume 1, pages 388–398, Geneva, Switzerland. European Association for Machine Translation.
Cite (Informal): Quality Estimation and Post-Editing Using LLMs For Indic Languages: How Good Is It? (Singh et al., MTSummit 2025)
PDF: https://preview.aclanthology.org/mtsummit-25-ingestion/2025.mtsummit-1.30.pdf
