Abstract
Recent studies have explored the capability of Large Language Models (LLMs) for data annotation. Our work first offers a comparative overview of twelve such studies that investigate labelling with LLMs, with a particular focus on classification tasks. Second, we present an empirical analysis examining the degree of alignment between the opinion distributions returned by GPT and those provided by human annotators across four subjective datasets. Our analysis supports the minority of studies that consider diverse perspectives when evaluating data annotation tasks, and highlights the need for further research in this direction.
- Anthology ID:
- 2024.nlperspectives-1.11
- Volume:
- Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ LREC-COLING 2024
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Gavin Abercrombie, Valerio Basile, Davide Bernardi, Shiran Dudy, Simona Frenda, Lucy Havens, Sara Tonelli
- Venues:
- NLPerspectives | WS
- Publisher:
- ELRA and ICCL
- Pages:
- 100–110
- URL:
- https://aclanthology.org/2024.nlperspectives-1.11
- Cite (ACL):
- Maja Pavlovic and Massimo Poesio. 2024. The Effectiveness of LLMs as Annotators: A Comparative Overview and Empirical Analysis of Direct Representation. In Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ LREC-COLING 2024, pages 100–110, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- The Effectiveness of LLMs as Annotators: A Comparative Overview and Empirical Analysis of Direct Representation (Pavlovic & Poesio, NLPerspectives-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.nlperspectives-1.11.pdf