Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators

Feng Gu; Zongxia Li; Carlos R. Colon; Benjamin Evans; Ishani Mondal; Jordan Lee Boyd-Graber

Abstract

Event annotation is important for identifying, monitoring, and understanding sociological trends. Although expert annotators set the gold standard, they are expensive and inefficient. While state-of-the-art NLP models are an attractive alternative, they are often evaluated on standalone subtasks rather than entire workflows. Thus, we evaluate a holistic workflow that summarizes news with event coreference resolution and argument extraction in three modes: AI-only, AI-assisted, and human-only. Although AI’s recall is seven times higher than the tf-idf baseline at coreference resolution, it is far from replacing experts. However, experts adopt AI-extracted arguments 60% of the time, reducing extraction time by 25%. Our code and data are available at https://github.com/Obertura777/gtd-data.

Anthology ID: 2026.findings-acl.4
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 71–89
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.4/
Cite (ACL): Feng Gu, Zongxia Li, Carlos R. Colon, Benjamin Evans, Ishani Mondal, and Jordan Lee Boyd-Graber. 2026. Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators. In Findings of the Association for Computational Linguistics: ACL 2026, pages 71–89, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators (Gu et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.4.pdf
Checklist: 2026.findings-acl.4.checklist.pdf
