Efficient Annotator Reliability Assessment with EffiARA

Owen Cook, Jake A Vasilakes, Ian Roberts, Xingyi Song


Abstract
Data annotation is an essential component of the machine learning pipeline; it is also a costly and time-consuming process. With the introduction of transformer-based models, annotation at the document level is increasingly popular; however, there is no standard framework for structuring such tasks. The EffiARA annotation framework is, to our knowledge, the first project to support the whole annotation pipeline, from understanding the resources required for an annotation task to compiling the annotated dataset and gaining insights into the reliability of individual annotators as well as the dataset as a whole. The framework’s efficacy is supported by two previous studies: one improving classification performance through annotator-reliability-based soft-label aggregation and sample weighting, and the other increasing overall agreement among annotators by identifying and replacing an unreliable annotator. This work introduces the EffiARA Python package and its accompanying webtool, which provides an accessible graphical user interface for the system. We open-source the EffiARA Python package at https://github.com/MiniEggz/EffiARA; the webtool is publicly accessible at https://effiara.gate.ac.uk.
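The idea behind reliability-based soft-label aggregation can be illustrated with a minimal sketch. This is not the EffiARA package API; the function name, data layout, and reliability weights below are illustrative assumptions. Each annotator contributes a label distribution, which is weighted by that annotator's reliability score and then combined into a single normalised soft label.

```python
# Hypothetical sketch of annotator-reliability-based soft-label
# aggregation (illustrative only, not the EffiARA package API).

def aggregate_soft_label(annotations, reliabilities):
    """Combine per-annotator label distributions into one soft label.

    annotations: dict mapping annotator id -> {label: probability}
    reliabilities: dict mapping annotator id -> reliability weight
    """
    # Normalise reliability weights over the annotators of this sample.
    total_weight = sum(reliabilities[a] for a in annotations)
    combined = {}
    for annotator, dist in annotations.items():
        w = reliabilities[annotator] / total_weight
        for label, p in dist.items():
            combined[label] = combined.get(label, 0.0) + w * p
    return combined

# Example: a reliable annotator's confident label outweighs an
# unreliable annotator's uncertain one.
annotations = {
    "ann1": {"A": 1.0, "B": 0.0},
    "ann2": {"A": 0.5, "B": 0.5},
}
reliabilities = {"ann1": 0.9, "ann2": 0.3}
soft = aggregate_soft_label(annotations, reliabilities)
```

A dataset built this way can also pass the per-sample weight (e.g. the summed reliability) to the loss function, down-weighting samples annotated only by less reliable annotators.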
Anthology ID:
2025.acl-demo.52
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Pushkar Mishra, Smaranda Muresan, Tao Yu
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
542–550
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-demo.52/
Cite (ACL):
Owen Cook, Jake A Vasilakes, Ian Roberts, and Xingyi Song. 2025. Efficient Annotator Reliability Assessment with EffiARA. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 542–550, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Efficient Annotator Reliability Assessment with EffiARA (Cook et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-demo.52.pdf
Copyright agreement:
2025.acl-demo.52.copyright_agreement.pdf