Abstract
We investigate annotator variation for the novel task of Entity-Level Sentiment Analysis (ELSA), which annotates the aggregated sentiment directed towards volitional entities in a text. More specifically, we analyze the annotations of a newly constructed Norwegian ELSA dataset and release additional data with each annotator’s labels for the 247 entities in the dataset’s test split. We also perform a number of experiments prompting ChatGPT for sentiment labels for each entity in the text and compare the generated annotations with the human labels. Cohen’s Kappa for agreement between the best LLM-generated labels and curated gold was 0.425, which indicates that these labels are not of high quality. Our analyses further investigate the errors that ChatGPT makes and compare them with the variation that we find among the five trained annotators who all annotated the same test data.
- Anthology ID:
- 2024.law-1.13
- Volume:
- Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII)
- Month:
- March
- Year:
- 2024
- Address:
- St. Julians, Malta
- Editors:
- Sophie Henning, Manfred Stede
- Venues:
- LAW | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 133–139
- URL:
- https://aclanthology.org/2024.law-1.13
- Cite (ACL):
- Egil Rønningstad, Erik Velldal, and Lilja Øvrelid. 2024. A GPT among Annotators: LLM-based Entity-Level Sentiment Annotation. In Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII), pages 133–139, St. Julians, Malta. Association for Computational Linguistics.
- Cite (Informal):
- A GPT among Annotators: LLM-based Entity-Level Sentiment Annotation (Rønningstad et al., LAW-WS 2024)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2024.law-1.13.pdf