RE-AD: Real-Time Requirement Adherence for Data Labeling

Siddarth Malreddy, Ishan Nigam, Akshay Arora, Nikhil Mittal, Subrat Sahu


Abstract
Human-annotated data remains fundamental to training frontier Large Language Models (LLMs). However, crowd-sourced annotations often suffer from quality issues stemming from annotator misunderstanding or lack of engagement. To address this, we introduce a real-time requirement adherence (RE-AD) framework that leverages LLMs to proactively validate labeling quality. Our methodology involves decomposing Standard Operating Procedures (SOPs) into atomic rules via self-reflection, categorizing them by complexity, and applying tiered validation strategies. Evaluated on a synthetic benchmark, the system achieved an F1 score of 0.749. Furthermore, production deployment resulted in annotators accepting and fixing 82% of the errors flagged by the framework. We include ablation studies to demonstrate the impact of our core design decisions.
Anthology ID:
2026.gem-main.17
Volume:
Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Simon Mille, Sebastian Gehrmann, Patrícia Schmidtová, Ondřej Dušek, Marzieh Fadaee, Kyle Lo, Enrico Santus, Gabriel Stanovsky
Venues:
GEM | WS
Publisher:
Association for Computational Linguistics
Pages:
148–154
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.gem-main.17/
Cite (ACL):
Siddarth Malreddy, Ishan Nigam, Akshay Arora, Nikhil Mittal, and Subrat Sahu. 2026. RE-AD: Real-Time Requirement Adherence for Data Labeling. In Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM), pages 148–154, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
RE-AD: Real-Time Requirement Adherence for Data Labeling (Malreddy et al., GEM 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.gem-main.17.pdf