WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets

Dat Quoc Nguyen, Thanh Vu, Afshin Rahimi, Mai Hoang Dao, Linh The Nguyen, Long Doan


Abstract
In this paper, we provide an overview of the WNUT-2020 shared task on the identification of informative COVID-19 English Tweets. We describe how we construct a corpus of 10K Tweets and organize the development and evaluation phases for this task. We also present a brief summary of the results obtained from the final system evaluation submissions of 55 teams, finding that (i) many systems obtain very high performance, with F1 scores of up to 0.91, (ii) the majority of the submissions achieve substantially higher results than the fastText baseline (Joulin et al., 2017), and (iii) fine-tuning pre-trained language models on relevant language data followed by supervised training performs well in this task.
Anthology ID:
2020.wnut-1.41
Volume:
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
Month:
November
Year:
2020
Address:
Online
Editors:
Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue:
WNUT
Publisher:
Association for Computational Linguistics
Pages:
314–318
URL:
https://preview.aclanthology.org/build-pipeline-with-new-library/2020.wnut-1.41/
DOI:
10.18653/v1/2020.wnut-1.41
Cite (ACL):
Dat Quoc Nguyen, Thanh Vu, Afshin Rahimi, Mai Hoang Dao, Linh The Nguyen, and Long Doan. 2020. WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets. In Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020), pages 314–318, Online. Association for Computational Linguistics.
Cite (Informal):
WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets (Nguyen et al., WNUT 2020)
PDF:
https://preview.aclanthology.org/build-pipeline-with-new-library/2020.wnut-1.41.pdf
Data
WNUT-2020 Task 2