imec-ETRO-VUB at W-NUT 2020 Shared Task-3: A multilabel BERT-based system for predicting COVID-19 events

Xiangyu Yang; Giannis Bekoulis; Nikos Deligiannis

doi:10.18653/v1/2020.wnut-1.77

Abstract

In this paper, we present our system for the W-NUT 2020 shared task on COVID-19 Event Extraction from Twitter. To mitigate the noisy nature of the Twitter stream, our system uses COVID-Twitter-BERT (CT-BERT), a language model pre-trained on a large corpus of COVID-19 related Twitter messages. Our system is trained on the COVID-19 Twitter Event Corpus and identifies relevant text spans that answer pre-defined questions (i.e., slot types) for five COVID-19 related events (i.e., TESTED POSITIVE, TESTED NEGATIVE, CAN-NOT-TEST, DEATH and CURE & PREVENTION). We experimented with different architectures; our best-performing model places a multilabel classifier on top of the CT-BERT model and jointly trains all the slot types for a single event. Our experimental results indicate that our Multilabel-CT-BERT system outperforms the baseline methods by 7 percentage points in micro-averaged F1 score. Our model ranked 4th on the shared task leaderboard.
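The multilabel idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the slot names, the threshold of 0.5, and the function names are assumptions made for illustration, and the logits are supplied by hand where a real system would obtain them from an encoder such as CT-BERT. The sketch shows one independent sigmoid decision per slot type, a binary cross-entropy loss averaged over all slot types (so they are trained jointly), and a micro-averaged F1 score that pools decisions across slots.

```python
import math

# Hypothetical slot types for a single event; the real inventory is defined by the corpus.
SLOTS = ["who", "where", "when"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_predict(logits, threshold=0.5):
    """Make one independent yes/no decision per slot type for a candidate span."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def bce_loss(logits, targets):
    """Mean binary cross-entropy over all slot types (a joint training signal)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(logits)


def micro_f1(gold, pred):
    """Micro-averaged F1: pool TP/FP/FN over every (span, slot) decision."""
    tp = fp = fn = 0
    for gold_row, pred_row in zip(gold, pred):
        for g, p in zip(gold_row, pred_row):
            if g == 1 and p == 1:
                tp += 1
            elif g == 0 and p == 1:
                fp += 1
            elif g == 1 and p == 0:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hand-written logits for one candidate span, one value per entry in SLOTS.
labels = multilabel_predict([2.0, -1.0, 0.3])
print(dict(zip(SLOTS, labels)))
```

Because each slot type gets its own sigmoid rather than sharing a softmax, a single span may be labeled as answering several slot questions at once, or none.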

Anthology ID: 2020.wnut-1.77
Volume: Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
Month: November
Year: 2020
Address: Online
Editors: Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue: WNUT
Publisher: Association for Computational Linguistics
Pages: 505–513
URL: https://aclanthology.org/2020.wnut-1.77
DOI: 10.18653/v1/2020.wnut-1.77
Cite (ACL): Xiangyu Yang, Giannis Bekoulis, and Nikos Deligiannis. 2020. imec-ETRO-VUB at W-NUT 2020 Shared Task-3: A multilabel BERT-based system for predicting COVID-19 events. In Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020), pages 505–513, Online. Association for Computational Linguistics.
Cite (Informal): imec-ETRO-VUB at W-NUT 2020 Shared Task-3: A multilabel BERT-based system for predicting COVID-19 events (Yang et al., WNUT 2020)
PDF: https://preview.aclanthology.org/nschneid-patch-1/2020.wnut-1.77.pdf
