Justin Sech


2020

Civil Unrest on Twitter (CUT): A Dataset of Tweets to Support Research on Civil Unrest
Justin Sech | Alexandra DeLucia | Anna L. Buczak | Mark Dredze
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)

We present CUT, a dataset for studying Civil Unrest on Twitter. The dataset includes 4,381 tweets related to civil unrest, hand-annotated with information relevant to the study of civil unrest discussion and events. The tweets are drawn from 42 countries and span 2014 to 2019. We present baseline systems trained on this data to identify tweets related to civil unrest, and we discuss ethical issues raised by research on this topic.