Anupam Kumar


2020

pdf
Linguist Geeks on WNUT-2020 Task 2: COVID-19 Informative Tweet Identification using Progressive Trained Language Models and Data Augmentation
Vasudev Awatramani | Anupam Kumar
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)

Since the outbreak of COVID-19, there has been a surge of digital content on social media. The content ranges from news articles, academic reports, tweets, videos, and even memes. Among such an overabundance of data, it is crucial to distinguish which information is actually informative or merely sensational, redundant or false. This work focuses on developing such a language system that can differentiate between Informative or Uninformative tweets associated with COVID-19 for WNUT-2020 Shared Task 2. For this purpose, we employ deep transfer learning models such as BERT along other techniques such as Noisy Data Augmentation and Progress Training. The approach achieves a competitive F1-score of 0.8715 on the final testing dataset.
Search
Co-authors
Venues