Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators
Feng Gu, Zongxia Li, Carlos R. Colon, Benjamin Evans, Ishani Mondal, Jordan Lee Boyd-Graber
Abstract
Event annotation is important for identifying, monitoring, and understanding sociological trends. Although expert annotators set the gold standard, they are expensive and inefficient. While state-of-the-art NLP models are an attractive alternative, they are often evaluated on standalone subtasks rather than entire workflows. Thus, we evaluate a holistic workflow that summarizes news with event coreference resolution and argument extraction in three modes: AI-only, AI assistance, and human-only. Although AI’s recall is seven times higher than the tf-idf baseline at coreference resolution, it is far from replacing experts. However, experts adopt AI-extracted arguments 60% of the time, reducing extraction time by 25%. Our code and data are available at https://github.com/Obertura777/gtd-data.
- Anthology ID:
- 2026.findings-acl.4
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 71–89
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.4/
- Cite (ACL):
- Feng Gu, Zongxia Li, Carlos R. Colon, Benjamin Evans, Ishani Mondal, and Jordan Lee Boyd-Graber. 2026. Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators. In Findings of the Association for Computational Linguistics: ACL 2026, pages 71–89, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators (Gu et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.4.pdf