Kang Wang


2025

An Evaluation Resource for Grounding Translation Errors
Sujin Chen | Kang Wang | Zixuan Zhou | Xiangyu Duan | Wanqun Zhang | Hao Yang | Jinsong Su | Min Zhang
Findings of the Association for Computational Linguistics: EMNLP 2025

Fine-grained error analyses of machine translation by LLMs are attracting increasing attention, but these analyses do not ground the errors in the reasons why the annotated text spans are erroneous. If LLMs do not know such reasons, their corrections or refinements will be untrustworthy. In this paper, we check whether LLMs know such reasons through a translation error grounding task. We manually build an evaluation resource using a bi-directional grounding scheme. In the forward direction, we annotate an explanation of the reason for each error span. In the backward direction, we annotate the error span given its explanation, with the error span masked. If the error spans of both directions are consistent, we deem the explanation valid. This grounding process regulates the explanations and thereby avoids subjective bias. Evaluation results on this resource show that LLMs perform significantly worse than humans in both directions. Furthermore, we apply error grounding to filter falsely alarmed errors and achieve a significant improvement in translation error detection.
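As a rough illustration of the bi-directional scheme the abstract describes, the following Python sketch shows one way the consistency check could be operationalized. Everything here is a hypothetical stand-in, not the paper's actual implementation: the `ErrorAnnotation` fields, the `locate_span` callable (an annotator or LLM that recovers the span in the backward direction), and the overlap-based consistency criterion are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class ErrorAnnotation:
    source: str             # source-language sentence
    translation: str        # MT output containing the error
    span: Tuple[int, int]   # (start, end) character offsets of the error span
    explanation: str        # forward-direction explanation of why the span is wrong

def mask_in_explanation(explanation: str, span_text: str, mask: str = "[MASK]") -> str:
    """Hide quotations of the error span inside the explanation so the
    backward pass cannot trivially copy the span back."""
    return explanation.replace(span_text, mask)

def spans_consistent(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """One possible consistency criterion: the two spans overlap.
    The paper's actual matching rule may be stricter (e.g., exact match)."""
    return max(a[0], b[0]) < min(a[1], b[1])

def ground_error(ann: ErrorAnnotation,
                 locate_span: Callable[[str, str, str], Tuple[int, int]]) -> bool:
    """Bi-directional check: given the source, the translation, and the
    masked explanation, ask locate_span to recover the error span, then
    compare it with the forward-direction span."""
    span_text = ann.translation[ann.span[0]:ann.span[1]]
    masked = mask_in_explanation(ann.explanation, span_text)
    backward_span = locate_span(ann.source, ann.translation, masked)
    return spans_consistent(ann.span, backward_span)

def filter_false_alarms(annotations: List[ErrorAnnotation],
                        locate_span: Callable[[str, str, str], Tuple[int, int]]) -> List[ErrorAnnotation]:
    """Grounding as a filter: keep only errors whose explanations
    survive the consistency check."""
    return [ann for ann in annotations if ground_error(ann, locate_span)]
```

In this reading, an explanation that cannot guide an annotator (or model) back to the original span fails grounding, and the corresponding error can be discarded as a likely false alarm, which matches the filtering application reported in the abstract.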