UIT-HSE at WNUT-2020 Task 2: Exploiting CT-BERT for Identifying COVID-19 Information on the Twitter Social Network

Khiem Tran, Hao Phan, Kiet Nguyen, Ngan Luu Thuy Nguyen


Abstract
Recently, COVID-19 has affected many aspects of daily life around the world and led to dreadful consequences. More and more tweets about COVID-19 have been shared publicly on Twitter. However, most of those tweets are uninformative, which makes it challenging to build automatic systems that detect the informative ones for useful AI applications. In this paper, we present our results at the W-NUT 2020 Shared Task 2: Identification of Informative COVID-19 English Tweets. In particular, we propose a simple but effective approach using transformer-based models built on COVID-Twitter-BERT (CT-BERT) with different fine-tuning techniques. As a result, we achieve an F1-score of 90.94%, placing third on the leaderboard of this task, which attracted 56 submitting teams in total.
Anthology ID:
2020.wnut-1.53
Volume:
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
Month:
November
Year:
2020
Address:
Online
Venues:
EMNLP | WNUT
Publisher:
Association for Computational Linguistics
Pages:
383–387
URL:
https://aclanthology.org/2020.wnut-1.53
DOI:
10.18653/v1/2020.wnut-1.53
Cite (ACL):
Khiem Tran, Hao Phan, Kiet Nguyen, and Ngan Luu Thuy Nguyen. 2020. UIT-HSE at WNUT-2020 Task 2: Exploiting CT-BERT for Identifying COVID-19 Information on the Twitter Social Network. In Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020), pages 383–387, Online. Association for Computational Linguistics.
Cite (Informal):
UIT-HSE at WNUT-2020 Task 2: Exploiting CT-BERT for Identifying COVID-19 Information on the Twitter Social Network (Tran et al., WNUT 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.wnut-1.53.pdf
Data
WNUT-2020 Task 2