The Elsagate Corpus: Characterising Commentary on Alarming Video Content

Panagiotis Soustas, Matthew Edwards


Abstract
Identifying disturbing online content targeted at children is an important content moderation problem. However, previous approaches to this problem have focused on features of the content itself and neglected potentially helpful insights from the reactions expressed by its online audience. To help remedy this, we present the Elsagate Corpus, a collection of over 22 million comments on more than 18,000 videos that have been associated with disturbing content. We describe how we collected this corpus and present some insights from our initial explorations, including the surprisingly positive reactions from audiences to this content, some unusual non-linguistic commenting behavior of uncertain purpose, and references to some concerning themes.
Anthology ID:
2024.nlpaics-1.17
Volume:
Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security
Month:
July
Year:
2024
Address:
Lancaster, UK
Editors:
Ruslan Mitkov, Saad Ezzini, Tharindu Ranasinghe, Ignatius Ezeani, Nouran Khallaf, Cengiz Acarturk, Matthew Bradbury, Mo El-Haj, Paul Rayson
Venue:
NLPAICS
Publisher:
International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security
Pages:
147–152
URL:
https://preview.aclanthology.org/fix-sig-urls/2024.nlpaics-1.17/
Cite (ACL):
Panagiotis Soustas and Matthew Edwards. 2024. The Elsagate Corpus: Characterising Commentary on Alarming Video Content. In Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security, pages 147–152, Lancaster, UK. International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security.
Cite (Informal):
The Elsagate Corpus: Characterising Commentary on Alarming Video Content (Soustas & Edwards, NLPAICS 2024)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2024.nlpaics-1.17.pdf