Weibo-COV: A Large-Scale COVID-19 Social Media Dataset from Weibo

Yong Hu, Heyan Huang, Anfan Chen, Xian-Ling Mao


Abstract
With the rapid spread of COVID-19 around the world, people are asked to maintain “social distance” and “stay at home”. In this scenario, many social interactions have moved to cyberspace, especially to social media platforms such as Twitter and Sina Weibo. People post to share information, express opinions, and seek help during the pandemic, and these social media data are valuable for studies aimed at preventing COVID-19 transmission, such as early warning and outbreak detection. Therefore, in this paper, we release a novel, fine-grained, large-scale COVID-19 social media dataset collected from Sina Weibo, named Weibo-COV, which contains more than 40 million posts ranging from December 1, 2019 to April 30, 2020. Moreover, this dataset includes comprehensive information nuggets such as post-level information, interactive information, location information, and the repost network. We hope this dataset can promote studies of COVID-19 from multiple perspectives and enable better and faster research to suppress the spread of this pandemic.
Anthology ID:
2020.nlpcovid19-2.34
Volume:
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Month:
December
Year:
2020
Address:
Online
Editors:
Karin Verspoor, Kevin Bretonnel Cohen, Michael Conway, Berry de Bruijn, Mark Dredze, Rada Mihalcea, Byron Wallace
Venue:
NLP-COVID19
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/2020.nlpcovid19-2.34
DOI:
10.18653/v1/2020.nlpcovid19-2.34
Cite (ACL):
Yong Hu, Heyan Huang, Anfan Chen, and Xian-Ling Mao. 2020. Weibo-COV: A Large-Scale COVID-19 Social Media Dataset from Weibo. In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, Online. Association for Computational Linguistics.
Cite (Informal):
Weibo-COV: A Large-Scale COVID-19 Social Media Dataset from Weibo (Hu et al., NLP-COVID19 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2020.nlpcovid19-2.34.pdf
Code:
nghuyong/weibo-public-opinion-datasets
Data:
Weibo-COV