Human-AI Annotation Error Auditing for Hebrew Diacritization with Frontier LLMs

Hillel Gershuni; Avi Shmidman

Abstract

Large annotated datasets inevitably contain errors that are costly to identify through manual review. We study a human-AI workflow for auditing annotation errors with frontier Large Language Models (LLMs), focusing on Hebrew nikud (diacritization). We take the EACL 2023 Hebrew Homograph Challenge Set as our test case. In a focused evaluation on 12 of its homograph sets, containing 271 confirmed errors (verified through exhaustive manual review of all 7,241 sentences), Gemini 3 Pro achieves 83.6% recall (95% confidence interval: [79.3%, 88.2%]) and 99.1% precision, substantially higher than other frontier LLMs. Two independent human experts achieved 62.4% and 42.8% recall, respectively, a spread of nearly 20 percentage points that reflects the difficulty of sparse-target error search. Even the union of both experts' findings (73.4% recall) falls short of a single LLM run (83.6%), while LLM-aided auditing reduces review effort by over 95%. We analyze the trade-offs between batch size and recall, and release both a human-verified Gold Standard with per-error difficulty annotations and a globally corrected version of the Challenge Set.
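As a rough illustration of how the recall, precision, and expert-union figures in the abstract are defined, the following Python sketch scores sets of flagged sentence IDs against a gold set of confirmed errors. The function name `precision_recall` and every ID below are hypothetical toy values, not data from the Challenge Set, and this is not the authors' evaluation code.

```python
def precision_recall(flagged, gold):
    """Return (precision, recall) for flagged item IDs against gold error IDs."""
    true_positives = len(flagged & gold)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data (assumed for illustration only): ten confirmed errors, IDs 0..9.
gold = set(range(10))
expert_a = {0, 1, 2, 3, 4, 5}        # six true errors, no false alarms
expert_b = {4, 5, 6, 7, 11}          # four true errors, one false alarm (11)
model = {0, 1, 2, 3, 4, 5, 6, 7, 8}  # nine true errors, no false alarms

# The union pools everything either expert flagged, including false alarms.
expert_union = expert_a | expert_b

for name, flagged in [("expert A", expert_a), ("expert B", expert_b),
                      ("union", expert_union), ("model", model)]:
    p, r = precision_recall(flagged, gold)
    print(f"{name}: precision={p:.3f} recall={r:.3f}")
```

In this toy setup the union of the two experts recovers more errors than either expert alone yet still fewer than the single model run, which mirrors the kind of comparison the abstract reports.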

Anthology ID: 2026.law-main.4
Volume: Proceedings of the 20th Linguistic Annotation Workshop (LAW XX)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Yang Janet Liu, Luke Gessler
Venues: LAW | WS
Publisher: Association for Computational Linguistics
Pages: 33–46
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.law-main.4/
Cite (ACL): Hillel Gershuni and Avi Shmidman. 2026. Human-AI Annotation Error Auditing for Hebrew Diacritization with Frontier LLMs. In Proceedings of the 20th Linguistic Annotation Workshop (LAW XX), pages 33–46, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Human-AI Annotation Error Auditing for Hebrew Diacritization with Frontier LLMs (Gershuni & Shmidman, LAW 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.law-main.4.pdf
