Abstract
We present CUT, a dataset for studying Civil Unrest on Twitter. Our dataset includes 4,381 tweets related to civil unrest, hand-annotated with information related to the study of civil unrest discussion and events. Our dataset is drawn from 42 countries from 2014 to 2019. We present baseline systems trained on this data for the identification of tweets related to civil unrest. We include a discussion of ethical issues related to research on this topic.

- Anthology ID: 2020.wnut-1.28
- Volume: Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
- Month: November
- Year: 2020
- Address: Online
- Editors: Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
- Venue: WNUT
- Publisher: Association for Computational Linguistics
- Pages: 215–221
- URL: https://aclanthology.org/2020.wnut-1.28
- DOI: 10.18653/v1/2020.wnut-1.28
- Cite (ACL): Justin Sech, Alexandra DeLucia, Anna L. Buczak, and Mark Dredze. 2020. Civil Unrest on Twitter (CUT): A Dataset of Tweets to Support Research on Civil Unrest. In Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020), pages 215–221, Online. Association for Computational Linguistics.
- Cite (Informal): Civil Unrest on Twitter (CUT): A Dataset of Tweets to Support Research on Civil Unrest (Sech et al., WNUT 2020)
- PDF: https://preview.aclanthology.org/improve-issue-templates/2020.wnut-1.28.pdf
- Code: aadelucia/jhu-cut