Workshop on NLP for COVID-19 (2020)


bib (full) Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 2020

pdf bib
Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 2020
Karin Verspoor | Kevin Bretonnel Cohen | Mark Dredze | Emilio Ferrara | Jonathan May | Robert Munro | Cecile Paris | Byron Wallace

pdf bib
CORD-19: The COVID-19 Open Research Dataset
Lucy Lu Wang | Kyle Lo | Yoganand Chandrasekhar | Russell Reas | Jiangjiang Yang | Doug Burdick | Darrin Eide | Kathryn Funk | Yannis Katsis | Rodney Michael Kinney | Yunyao Li | Ziyang Liu | William Merrill | Paul Mooney | Dewey A. Murdick | Devvret Rishi | Jerry Sheehan | Zhihong Shen | Brandon Stilson | Alex D. Wade | Kuansan Wang | Nancy Xin Ru Wang | Christopher Wilhelm | Boya Xie | Douglas M. Raymond | Daniel S. Weld | Oren Etzioni | Sebastian Kohlmeier

The COVID-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on COVID-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development of text mining and information retrieval systems over its rich collection of metadata and structured full text papers. Since its release, CORD-19 has been downloaded over 200K times and has served as the basis of many COVID-19 text mining and discovery systems. In this article, we describe the mechanics of dataset construction, highlighting challenges and key design decisions, provide an overview of how CORD-19 has been used, and describe several shared tasks built around the dataset. We hope this resource will continue to bring together the computing community, biomedical experts, and policy makers in the search for effective treatments and management policies for COVID-19.

pdf bib
Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset
Edwin Zhang | Nikhil Gupta | Rodrigo Nogueira | Kyunghyun Cho | Jimmy Lin

The Neural Covidex is a search engine that exploits the latest neural ranking architectures to provide information access to the COVID-19 Open Research Dataset (CORD-19) curated by the Allen Institute for AI. It exists as part of a suite of tools we have developed to help domain experts tackle the ongoing global pandemic. We hope that improved information access capabilities to the scientific literature can inform evidence-based decision making and insight generation.

Document Classification for COVID-19 Literature
Bernal Jiménez Gutiérrez | Juncheng Zeng | Dongdong Zhang | Ping Zhang | Yu Su

The global pandemic has made it more important than ever to quickly and accurately retrieve relevant scientific literature for effective consumption by researchers in a wide range of fields. We provide an analysis of several multi-label document classification models on the LitCovid dataset. We find that pre-trained language models outperform other models in both low and high data regimes, achieving a maximum F1 score of around 86%. We note that even the highest performing models still struggle with label correlation, distraction from introductory text and CORD-19 generalization. Both data and code are available on GitHub.

Enabling Low-Resource Transfer Learning across COVID-19 Corpora by Combining Event-Extraction and Co-Training
Alexander Spangher | Nanyun Peng | Jonathan May | Emilio Ferrara

Self-supervised context-aware COVID-19 document exploration through atlas grounding
Dusan Grujicic | Gorjan Radevski | Tinne Tuytelaars | Matthew Blaschko

In this paper, we aim to develop a self-supervised grounding of Covid-related medical text based on the actual spatial relationships between the referred anatomical concepts. More specifically, we learn to project sentences into a physical space defined by a three-dimensional anatomical atlas, allowing for a visual approach to navigating Covid-related literature. We design a straightforward and empirically effective training objective to reduce the curated data dependency issue. We use BERT as the main building block of our model and perform a quantitative analysis that demonstrates that the model learns a context-aware mapping. We illustrate two potential use-cases for our approach, one in interactive, 3D data exploration, and the other in document retrieval. To accelerate research in this direction, we make public all trained models, codebase and the developed tools, which can be accessed at

CODA-19: Using a Non-Expert Crowd to Annotate Research Aspects on 10,000+ Abstracts in the COVID-19 Open Research Dataset
Ting-Hao Kenneth Huang | Chieh-Yang Huang | Chien-Kuang Cornelia Ding | Yen-Chia Hsu | C. Lee Giles

This paper introduces CODA-19, a human-annotated dataset that codes the Background, Purpose, Method, Finding/Contribution, and Other sections of 10,966 English abstracts in the COVID-19 Open Research Dataset. CODA-19 was created by 248 crowd workers from Amazon Mechanical Turk within 10 days, and achieved labeling quality comparable to that of experts. Each abstract was annotated by nine different workers, and the final labels were acquired by majority vote. The inter-annotator agreement (Cohen’s kappa) between the crowd and the biomedical expert (0.741) is comparable to inter-expert agreement (0.788). CODA-19’s labels have an accuracy of 82.2% when compared to the biomedical expert’s labels, while the accuracy between experts was 85.0%. Reliable human annotations help scientists access and integrate the rapidly accelerating coronavirus literature, and also serve as the battery of AI/NLP research, but obtaining expert annotations can be slow. We demonstrated that a non-expert crowd can be rapidly employed at scale to join the fight against COVID-19.

Information Retrieval and Extraction on COVID-19 Clinical Articles Using Graph Community Detection and Bio-BERT Embeddings
Debasmita Das | Yatin Katyal | Janu Verma | Shashank Dubey | AakashDeep Singh | Kushagra Agarwal | Sourojit Bhaduri | RajeshKumar Ranjan

In this paper, we present an information retrieval system on a corpus of scientific articles related to COVID-19. We build a similarity network on the articles where similarity is determined via shared citations and biological domain-specific sentence embeddings. Ego-splitting community detection on the article network is employed to cluster the articles and then the queries are matched with the clusters. Extractive summarization using BERT and PageRank methods is used to provide responses to the query. We also provide a Question-Answer bot on a small set of intents to demonstrate the efficacy of our model for an information extraction module.

What Are People Asking About COVID-19? A Question Classification Dataset
Jerry Wei | Chengyu Huang | Soroush Vosoughi | Jason Wei

We present COVID-Q, a set of 1,690 questions about COVID-19 from 13 sources, which we annotate into 15 question categories and 207 question clusters. The most common questions in our dataset asked about transmission, prevention, and societal effects of COVID, and we found that many questions that appeared in multiple sources were not answered by any FAQ websites of reputable organizations such as the CDC and FDA. We post our dataset publicly at For classifying questions into 15 categories, a BERT baseline scored 58.1% accuracy when trained on 20 examples per category, and for a question clustering task, a BERT + triplet loss baseline achieved 49.5% accuracy. We hope COVID-Q can help either for direct use in developing applied systems or as a domain-specific resource for model evaluation.

Jennifer for COVID-19: An NLP-Powered Chatbot Built for the People and by the People to Combat Misinformation
Yunyao Li | Tyrone Grandison | Patricia Silveyra | Ali Douraghy | Xinyu Guan | Thomas Kieselbach | Chengkai Li | Haiqi Zhang

Just as SARS-CoV-2, a new form of coronavirus continues to infect a growing number of people around the world, harmful misinformation about the outbreak also continues to spread. With the goal of combating misinformation, we designed and built Jennifer–a chatbot maintained by a global group of volunteers. With Jennifer, we hope to learn whether public information from reputable sources could be more effectively organized and shared in the wake of a crisis as well as to understand issues that the public were most immediately curious about. In this paper, we introduce Jennifer and describe the design of this proof-of-principle system. We also present lessons learned and discuss open challenges. Finally, to facilitate future research, we release COVID-19 Question Bank, a dataset of 3,924 COVID-19-related questions in 944 groups, gathered from our users and volunteers.

A Natural Language Processing System for National COVID-19 Surveillance in the US Department of Veterans Affairs
Alec Chapman | Kelly Peterson | Augie Turano | Tamára Box | Katherine Wallace | Makoto Jones

Timely and accurate accounting of positive cases has been an important part of the response to the COVID-19 pandemic. While most positive cases within Veterans Affairs (VA) are identified through structured laboratory results, some patients are tested or diagnosed outside VA so their clinical status is documented only in free-text narratives. We developed a Natural Language Processing pipeline for identifying positively diagnosed COVID19 patients and deployed this system to accelerate chart review. As part of the VA national response to COVID-19, this process identified 6,360 positive cases which did not have corresponding laboratory data. These cases accounted for 36.1% of total confirmed positive cases in VA to date. With available data, performance of the system is estimated as 82.4% precision and 94.2% recall. A public-facing implementation is released as open source and available to the community.

Measuring Emotions in the COVID-19 Real World Worry Dataset
Bennett Kleinberg | Isabelle van der Vegt | Maximilian Mozes

The COVID-19 pandemic is having a dramatic impact on societies and economies around the world. With various measures of lockdowns and social distancing in place, it becomes important to understand emotional responses on a large scale. In this paper, we present the first ground truth dataset of emotional responses to COVID-19. We asked participants to indicate their emotions and express these in text. This resulted in the Real World Worry Dataset of 5,000 texts (2,500 short + 2,500 long texts). Our analyses suggest that emotional responses correlated with linguistic measures. Topic modeling further revealed that people in the UK worry about their family and the economic situation. Tweet-sized texts functioned as a call for solidarity, while longer texts shed light on worries and concerns. Using predictive modeling approaches, we were able to approximate the emotional responses of participants from text within 14% of their actual value. We encourage others to use the dataset and improve how we can use automated methods to learn about emotional responses and worries about an urgent problem.

Estimating the effect of COVID-19 on mental health: Linguistic indicators of depression during a global pandemic
JT Wolohan

This preliminary analysis uses a deep LSTM neural network with fastText embeddings to predict population rates of depression on Reddit in order to estimate the effect of COVID-19 on mental health. We find that year over year, depression rates on Reddit are up 50% , suggesting a 15-million person increase in the number of depressed Americans and a $7.5 billion increase in depression related spending. This finding suggests that utility in NLP approaches to longitudinal public-health surveillance.

Exploration of Gender Differences in COVID-19 Discourse on Reddit
Jai Aggarwal | Ella Rabinovich | Suzanne Stevenson

Decades of research on differences in the language of men and women have established postulates about the nature of lexical, topical, and emotional preferences between the two genders, along with their sociological underpinnings. Using a novel dataset of male and female linguistic productions collected from the Reddit discussion platform, we further confirm existing assumptions about gender-linked affective distinctions, and demonstrate that these distinctions are amplified in social media postings involving emotionally-charged discourse related to COVID-19. Our analysis also confirms considerable differences in topical preferences between male and female authors in pandemic-related discussions.

Cross-language sentiment analysis of European Twitter messages during the COVID-19 pandemic
Anna Kruspe | Matthias Häberle | Iona Kuhn | Xiao Xiang Zhu

In this paper, we analyze Twitter messages (tweets) collected during the first months of the COVID-19 pandemic in Europe with regard to their sentiment. This is implemented with a neural network for sentiment analysis using multilingual sentence embeddings. We separate the results by country of origin, and correlate their temporal development with events in those countries. This allows us to study the effect of the situation on people’s moods. We see, for example, that lockdown announcements correlate with a deterioration of mood in almost all surveyed countries, which recovers within a short time span.

Cross-lingual Transfer Learning for COVID-19 Outbreak Alignment
Sharon Levy | William Yang Wang

The spread of COVID-19 has become a significant and troubling aspect of society in 2020. With millions of cases reported across countries, new outbreaks have occurred and followed patterns of previously affected areas. Many disease detection models do not incorporate the wealth of social media data that can be utilized for modeling and predicting its spread. It is useful to ask, can we utilize this knowledge in one country to model the outbreak in another? To answer this, we propose the task of cross-lingual transfer learning for epidemiological alignment. Utilizing both macro and micro text features, we train on Italy’s early COVID-19 outbreak through Twitter and transfer to several other countries. Our experiments show strong results with up to 0.85 Spearman correlation in cross-country predictions.

COVID-19 and Arabic Twitter: How can Arab World Governments and Public Health Organizations Learn from Social Media?
Lama Alsudias | Paul Rayson

In March 2020, the World Health Organization announced the COVID-19 outbreak as a pandemic. Most previous social media related research has been on English tweets and COVID-19. In this study, we collect approximately 1 million Arabic tweets from the Twitter streaming API related to COVID-19. Focussing on outcomes that we believe will be useful for Public Health Organizations, we analyse them in three different ways: identifying the topics discussed during the period, detecting rumours, and predicting the source of the tweets. We use the k-means algorithm for the first goal with k=5. The topics discussed can be grouped as follows: COVID-19 statistics, prayers for God, COVID-19 locations, advise and education for prevention, and advertising. We sample 2000 tweets and label them manually for false information, correct information, and unrelated. Then, we apply three different machine learning algorithms, Logistic Regression, Support Vector Classification, and Naïve Bayes with two sets of features, word frequency approach and word embeddings. We find that Machine Learning classifiers are able to correctly identify the rumour related tweets with 84% accuracy. We also try to predict the source of the rumour related tweets depending on our previous model which is about classifying tweets into five categories: academic, media, government, health professional, and public. Around (60%) of the rumour related tweets are classified as written by health professionals and academics.

NLP-based Feature Extraction for the Detection of COVID-19 Misinformation Videos on YouTube
Juan Carlos Medina Serrano | Orestis Papakyriakopoulos | Simon Hegelich

We present a simple NLP methodology for detecting COVID-19 misinformation videos on YouTube by leveraging user comments. We use transfer learning pre-trained models to generate a multi-label classifier that can categorize conspiratorial content. We use the percentage of misinformation comments on each video as a new feature for video classification.

COVID-QA: A Question Answering Dataset for COVID-19
Timo Möller | Anthony Reina | Raghavan Jayakumar | Malte Pietsch

We present COVID-QA, a Question Answering dataset consisting of 2,019 question/answer pairs annotated by volunteer biomedical experts on scientific articles related to COVID-19. To evaluate the dataset we compared a RoBERTa base model fine-tuned on SQuAD with the same model trained on SQuAD and our COVID-QA dataset. We found that the additional training on this domain-specific data leads to significant gains in performance. Both the trained model and the annotated dataset have been open-sourced at:


bib (full) Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

pdf bib
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Karin Verspoor | Kevin Bretonnel Cohen | Michael Conway | Berry de Bruijn | Mark Dredze | Rada Mihalcea | Byron Wallace

pdf bib
Answering Questions on COVID-19 in Real-Time
Jinhyuk Lee | Sean S. Yi | Minbyul Jeong | Mujeen Sung | WonJin Yoon | Yonghwa Choi | Miyoung Ko | Jaewoo Kang

The recent outbreak of the novel coronavirus is wreaking havoc on the world and researchers are struggling to effectively combat it. One reason why the fight is difficult is due to the lack of information and knowledge. In this work, we outline our effort to contribute to shrinking this knowledge vacuum by creating covidAsk, a question answering (QA) system that combines biomedical text mining and QA techniques to provide answers to questions in real-time. Our system also leverages information retrieval (IR) approaches to provide entity-level answers that are complementary to QA models. Evaluation of covidAsk is carried out by using a manually created dataset called COVID-19 Questions which is based on information from various sources, including the CDC and the WHO. We hope our system will be able to aid researchers in their search for knowledge and information not only for COVID-19, but for future pandemics as well.

pdf bib
CORA: A Deep Active Learning Covid-19 Relevancy Algorithm to Identify Core Scientific Articles
Zubair Afzal | Vikrant Yadav | Olga Fedorova | Vaishnavi Kandala | Janneke van de Loo | Saber A. Akhondi | Pascal Coupet | George Tsatsaronis

Ever since the COVID-19 pandemic broke out, the academic and scientific research community, as well as industry and governments around the world have joined forces in an unprecedented manner to fight the threat. Clinicians, biologists, chemists, bioinformaticians, nurses, data scientists, and all of the affiliated relevant disciplines have been mobilized to help discover efficient treatments for the infected population, as well as a vaccine solution to prevent further the virus spread. In this combat against the virus responsible for the pandemic, key for any advancements is the timely, accurate, peer-reviewed, and efficient communication of any novel research findings. In this paper we present a novel framework to address the information need of filtering efficiently the scientific bibliography for relevant literature around COVID-19. The contributions of the paper are summarized in the following: we define and describe the information need that encompasses the major requirements for COVID-19 articles relevancy, we present and release an expert-curated benchmark set for the task, and we analyze the performance of several state-of-the-art machine learning classifiers that may distinguish the relevant from the non-relevant COVID-19 literature.

Frugal neural reranking: evaluation on the Covid-19 literature
Tiago Almeida | Sérgio Matos

The Covid-19 pandemic urged the scientific community to join efforts at an unprecedented scale, leading to faster than ever dissemination of data and results, which in turn motivated more research works. This paper presents and discusses information retrieval models aimed at addressing the challenge of searching the large number of publications that stem from these studies. The model presented, based on classical baselines followed by an interaction based neural ranking model, was evaluated and evolved within the TREC Covid challenge setting. Results on this dataset show that, when starting with a strong baseline, our light neural ranking model can achieve results that are comparable to other model architectures that use very large number of parameters.

COVID-19 Literature Topic-Based Search via Hierarchical NMF
Rachel Grotheer | Longxiu Huang | Yihuan Huang | Alona Kryshchenko | Oleksandr Kryshchenko | Pengyu Li | Xia Li | Elizaveta Rebrova | Kyung Ha | Deanna Needell

A dataset of COVID-19-related scientific literature is compiled, combining the articles from several online libraries and selecting those with open access and full text available. Then, hierarchical nonnegative matrix factorization is used to organize literature related to the novel coronavirus into a tree structure that allows researchers to search for relevant literature based on detected topics. We discover eight major latent topics and 52 granular subtopics in the body of literature, related to vaccines, genetic structure and modeling of the disease and patient studies, as well as related diseases and virology. In order that our tool may help current researchers, an interactive website is created that organizes available literature using this hierarchical structure.

TICO-19: the Translation Initiative for COvid-19
Antonios Anastasopoulos | Alessandro Cattelan | Zi-Yi Dou | Marcello Federico | Christian Federmann | Dmitriy Genzel | Franscisco Guzmán | Junjie Hu | Macduff Hughes | Philipp Koehn | Rosie Lazar | Will Lewis | Graham Neubig | Mengmeng Niu | Alp Öktem | Eric Paquin | Grace Tang | Sylwia Tur

The COVID-19 pandemic is the worst pandemic to strike the world in over a century. Crucial to stemming the tide of the SARS-CoV-2 virus is communicating to vulnerable populations the means by which they can protect themselves. To this end, the collaborators forming the Translation Initiative for COvid-19 (TICO-19) have made test and development data available to AI and MT researchers in 35 different languages in order to foster the development of tools and resources for improving access to information about COVID-19 in these languages. In addition to 9 high-resourced, ”pivot” languages, the team is targeting 26 lesser resourced languages, in particular languages of Africa, South Asia and South-East Asia, whose populations may be the most vulnerable to the spread of the virus. The same data is translated into all of the languages represented, meaning that testing or development can be done for any pairing of languages in the set. Further, the team is converting the test and development data into translation memories (TMXs) that can be used by localizers from and to any of the languages.

Expressive Interviewing: A Conversational System for Coping with COVID-19
Charles Welch | Allison Lahnala | Veronica Perez-Rosas | Siqi Shen | Sarah Seraj | Larry An | Kenneth Resnicow | James Pennebaker | Rada Mihalcea

The ongoing COVID-19 pandemic has raised concerns for many regarding personal and public health implications, financial security and economic stability. Alongside many other unprecedented challenges, there are increasing concerns over social isolation and mental health. We introduce Expressive Interviewing – an interview-style conversational system that draws on ideas from motivational interviewing and expressive writing. Expressive Interviewing seeks to encourage users to express their thoughts and feelings through writing by asking them questions about how COVID-19 has impacted their lives. We present relevant aspects of the system’s design and implementation as well as quantitative and qualitative analyses of user interactions with the system. In addition, we conduct a comparative evaluation with a general purpose dialogue system for mental health that shows our system potential in helping users to cope with COVID-19 issues.

Temporal Mental Health Dynamics on Social Media
Tom Tabak | Matthew Purver

We describe a set of experiments for building a temporal mental health dynamics system. We utilise a pre-existing methodology for distant- supervision of mental health data mining from social media platforms and deploy the system during the global COVID-19 pandemic as a case study. Despite the challenging nature of the task, we produce encouraging results, both explicit to the global pandemic and implicit to a global phenomenon, Christmas Depres- sion, supported by the literature. We propose a methodology for providing insight into tem- poral mental health dynamics to be utilised for strategic decision-making.

Quantifying the Effects of COVID-19 on Mental Health Support Forums
Laura Biester | Katie Matton | Janarthanan Rajendran | Emily Mower Provost | Rada Mihalcea

The COVID-19 pandemic, like many of the disease outbreaks that have preceded it, is likely to have a profound effect on mental health. Understanding its impact can inform strategies for mitigating negative consequences. In this work, we seek to better understand the effects of COVID-19 on mental health by examining discussions within mental health support communities on Reddit. First, we quantify the rate at which COVID-19 is discussed in each community, or subreddit, in order to understand levels of pandemic-related discussion. Next, we examine the volume of activity in order to determine whether the number of people discussing mental health has risen. Finally, we analyze how COVID-19 has influenced language use and topics of discussion within each subreddit.

COVID-19 Surveillance through Twitter using Self-Supervised and Few Shot Learning
Brandon Lwowski | Peyman Najafirad

Public health surveillance and tracking virus via social media can be a useful digital tool for contact tracing and preventing the spread of the virus. Nowadays, large volumes of COVID-19 tweets can quickly be processed in real-time to offer information to researchers. Nonetheless, due to the absence of labeled data for COVID-19, the preliminary supervised classifier or semi-supervised self-labeled methods will not handle non-spherical data with adequate accuracy. With the seasonal influenza and novel Coronavirus having many similar symptoms, we propose using few shot learning to fine-tune a semi-supervised model built on unlabeled COVID-19 and previously labeled influenza dataset that can provide in- sights into COVID-19 that have not been investigated. The experimental results show the efficacy of the proposed model with an accuracy of 86%, identification of Covid-19 related discussion using recently collected tweets.

Explaining the Trump Gap in Social Distancing Using COVID Discourse
Austin Van Loon | Sheridan Stewart | Brandon Waldon | Shrinidhi K Lakshmikanth | Ishan Shah | Sharath Chandra Guntuku | Garrick Sherman | James Zou | Johannes Eichstaedt

Our ability to limit the future spread of COVID-19 will in part depend on our understanding of the psychological and sociological processes that lead people to follow or reject coronavirus health behaviors. We argue that the virus has taken on heterogeneous meanings in communities across the United States and that these disparate meanings shaped communities’ response to the virus during the early, vital stages of the outbreak in the U.S. Using word embeddings, we demonstrate that counties where residents socially distanced less on average (as measured by residential mobility) more semantically associated the virus in their COVID discourse with concepts of fraud, the political left, and more benign illnesses like the flu. We also show that the different meanings the virus took on in different communities explains a substantial fraction of what we call the “”Trump Gap”, or the empirical tendency for more Trump-supporting counties to socially distance less. This work demonstrates that community-level processes of meaning-making in part determined behavioral responses to the COVID-19 pandemic and that these processes can be measured unobtrusively using Twitter.

COVIDLies: Detecting COVID-19 Misinformation on Social Media
Tamanna Hossain | Robert L. Logan IV | Arjuna Ugarte | Yoshitomo Matsubara | Sean Young | Sameer Singh

The ongoing pandemic has heightened the need for developing tools to flag COVID-19-related misinformation on the internet, specifically on social media such as Twitter. However, due to novel language and the rapid change of information, existing misinformation detection datasets are not effective for evaluating systems designed to detect misinformation on this topic. Misinformation detection can be divided into two sub-tasks: (i) retrieval of misconceptions relevant to posts being checked for veracity, and (ii) stance detection to identify whether the posts Agree, Disagree, or express No Stance towards the retrieved misconceptions. To facilitate research on this task, we release COVIDLies ( ), a dataset of 6761 expert-annotated tweets to evaluate the performance of misinformation detection systems on 86 different pieces of COVID-19 related misinformation. We evaluate existing NLP systems on this dataset, providing initial benchmarks and identifying key challenges for future models to improve upon.

Improved Topic Representations of Medical Documents to Assist COVID-19 Literature Exploration
Yulia Otmakhova | Karin Verspoor | Timothy Baldwin | Simon Šuster

Efficient discovery and exploration of biomedical literature has grown in importance in the context of the COVID-19 pandemic, and topic-based methods such as latent Dirichlet allocation (LDA) are a useful tool for this purpose. In this study we compare traditional topic models based on word tokens with topic models based on medical concepts, and propose several ways to improve topic coherence and specificity.

A System for Worldwide COVID-19 Information Aggregation
Akiko Aizawa | Frederic Bergeron | Junjie Chen | Fei Cheng | Katsuhiko Hayashi | Kentaro Inui | Hiroyoshi Ito | Daisuke Kawahara | Masaru Kitsuregawa | Hirokazu Kiyomaru | Masaki Kobayashi | Takashi Kodama | Sadao Kurohashi | Qianying Liu | Masaki Matsubara | Yusuke Miyao | Atsuyuki Morishima | Yugo Murawaki | Kazumasa Omura | Haiyue Song | Eiichiro Sumita | Shinji Suzuki | Ribeka Tanaka | Yu Tanaka | Masashi Toyoda | Nobuhiro Ueda | Honai Ueoka | Masao Utiyama | Ying Zhong

The global pandemic of COVID-19 has made the public pay close attention to related news, covering various domains, such as sanitation, treatment, and effects on education. Meanwhile, the COVID-19 condition is very different among the countries (e.g., policies and development of the epidemic), and thus citizens would be interested in news in foreign countries. We build a system for worldwide COVID-19 information aggregation containing reliable articles from 10 regions in 7 languages sorted by topics. Our reliable COVID-19 related website dataset collected through crowdsourcing ensures the quality of the articles. A neural machine translation module translates articles in other languages into Japanese and English. A BERT-based topic-classifier trained on our article-topic pair dataset helps users find their interested information efficiently by putting articles into different categories.

CAiRE-COVID: A Question Answering and Query-focused Multi-Document Summarization System for COVID-19 Scholarly Information Management
Dan Su | Yan Xu | Tiezheng Yu | Farhad Bin Siddique | Elham Barezi | Pascale Fung

We present CAiRE-COVID, a real-time question answering (QA) and multi-document summarization system, which won one of the 10 tasks in the Kaggle COVID-19 Open Research Dataset Challenge, judged by medical experts. Our system aims to tackle the recent challenge of mining the numerous scientific articles being published on COVID-19 by answering high priority questions from the community and summarizing salient question-related information. It combines information extraction with state-of-the-art QA and query-focused multi-document summarization techniques, selecting and highlighting evidence snippets from existing literature given a query. We also propose query-focused abstractive and extractive multi-document summarization methods, to provide more relevant information related to the question. We further conduct quantitative experiments that show consistent improvements on various metrics for each module. We have launched our website CAiRE-COVID for broader use by the medical community, and have open-sourced the code for our system, to bootstrap further study by other researches.

Automatic Evaluation vs. User Preference in Neural Textual QuestionAnswering over COVID-19 Scientific Literature
Arantxa Otegi | Jon Ander Campos | Gorka Azkune | Aitor Soroa | Eneko Agirre

We present a Question Answering (QA) system that won one of the tasks of the Kaggle CORD-19 Challenge, according to the qualitative evaluation of experts. The system is a combination of an Information Retrieval module and a reading comprehension module that finds the answers in the retrieved passages. In this paper we present a quantitative and qualitative analysis of the system. The quantitative evaluation using manually annotated datasets contradicted some of our design choices, e.g. the fact that using QuAC for fine-tuning provided better answers over just using SQuAD. We analyzed this mismatch with an additional A/B test which showed that the system using QuAC was indeed preferred by users, confirming our intuition. Our analysis puts in question the suitability of automatic metrics and its correlation to user preferences. We also show that automatic metrics are highly dependent on the characteristics of the gold standard, such as the average length of the answers.

A Multilingual Neural Machine Translation Model for Biomedical Data
Alexandre Bérard | Zae Myung Kim | Vassilina Nikoulina | Eunjeong Lucy Park | Matthias Gallé

We release a multilingual neural machine translation model, which can be used to translate text in the biomedical domain. The model can translate from 5 languages (French, German, Italian, Korean and Spanish) into English. It is trained with large amounts of generic and biomedical data, using domain tags. Our benchmarks show that it performs near state-of-the-art both on news (generic domain) and biomedical test sets, and that it outperforms the existing publicly released models. We believe that this release will help the large-scale multilingual analysis of the digital content of the COVID-19 crisis and of its effects on society, economy, and healthcare policies. We also release a test set of biomedical text for Korean-English. It consists of 758 sentences from official guidelines and recent papers, all about COVID-19.

Public Sentiment on Governmental COVID-19 Measures in Dutch Social Media
Shihan Wang | Marijn Schraagen | Erik Tjong Kim Sang | Mehdi Dastani

Public sentiment (the opinion, attitude or feeling that the public expresses) is a factor of interest for government, as it directly influences the implementation of policies. Given the unprecedented nature of the COVID-19 crisis, having an up-to-date representation of public sentiment on governmental measures and announcements is crucial. In this paper, we analyse Dutch public sentiment on governmental COVID-19 measures from text data collected across three online media sources (Twitter, Reddit and from February to September 2020. We apply sentiment analysis methods to analyse polarity over time, as well as to identify stance towards two specific pandemic policies regarding social distancing and wearing face masks. The presented preliminary results provide valuable insights into the narratives shown in vast social media text data, which help understand the influence of COVID-19 measures on the general public.

Exploratory Analysis of COVID-19 Related Tweets in North America to Inform Public Health Institutes
Hyeju Jang | Emily Rempel | Giuseppe Carenini | Naveed Janjua

Social media is a rich source where we can learn about people’s reactions to social issues. As COVID-19 has significantly impacted on people’s lives, it is essential to capture how people react to public health interventions and understand their concerns. In this paper, we aim to investigate people’s reactions and concerns about COVID-19 in North America, especially focusing on Canada. We analyze COVID-19 related tweets using topic modeling and aspect-based sentiment analysis, and interpret the results with public health experts. We compare timeline of topics discussed with timing of implementation of public health interventions for COVID-19. We also examine people’s sentiment about COVID-19 related issues. We discuss how the results can be helpful for public health agencies when designing a policy for new interventions. Our work shows how Natural Language Processing (NLP) techniques could be applied to public health questions with domain expert involvement.

Twitter Data Augmentation for Monitoring Public Opinion on COVID-19 Intervention Measures
Lin Miao | Mark Last | Marina Litvak

The COVID-19 outbreak is an ongoing worldwide pandemic that was announced as a global health crisis in March 2020. Due to the enormous challenges and high stakes of this pandemic, governments have implemented a wide range of policies aimed at containing the spread of the virus and its negative effect on multiple aspects of our life. Public responses to various intervention measures imposed over time can be explored by analyzing the social media. Due to the shortage of available labeled data for this new and evolving domain, we apply data distillation methodology to labeled datasets from related tasks and a very small manually labeled dataset. Our experimental results show that data distillation outperforms other data augmentation methods on our task.

COVID-19: A Semantic-Based Pipeline for Recommending Biomedical Entities
Marcia Afonso Barros | Andre Lamurias | Diana Sousa | Pedro Ruas | Francisco M. Couto

With the increasing number of publications about COVID-19, it is a challenge to extract personalized knowledge suitable for each researcher. This work aims to build a new semantic-based pipeline for recommending biomedical entities to scientific researchers. To this end, we developed a pipeline that creates an implicit feedback matrix based on Named Entity Recognition (NER) on a corpus of documents, using multidisciplinary ontologies for recognizing and linking the entities. Our hypothesis is that by using ontologies from different fields in the NER phase, we can improve the results for state-of-the-art collaborative-filtering recommender systems applied to the dataset created. The tests performed using the COVID-19 Open Research Dataset (CORD-19) dataset show that when using four ontologies, the results for precision@k, for example, reach the 80%, whereas when using only one ontology, the results for precision@k drops to 20%, for the same users. Furthermore, the use of multi-fields entities may help in the discovery of new items, even if the researchers do not have items from that field in their set of preferences.

Vapur: A Search Engine to Find Related Protein - Compound Pairs in COVID-19 Literature
Abdullatif Köksal | Hilal Dönmez | Rıza Özçelik | Elif Ozkirimli | Arzucan Özgür

Coronavirus Disease of 2019 (COVID-19) created dire consequences globally and triggered an intense scientific effort from different domains. The resulting publications created a huge text collection in which finding the studies related to a biomolecule of interest is challenging for general purpose search engines because the publications are rich in domain specific terminology. Here, we present Vapur: an online COVID-19 search engine specifically designed to find related protein - chemical pairs. Vapur is empowered with a relation-oriented inverted index that is able to retrieve and group studies for a query biomolecule with respect to its related entities. The inverted index of Vapur is automatically created with a BioNLP pipeline and integrated with an online user interface. The online interface is designed for the smooth traversal of the current literature by domain researchers and is publicly available at

Knowledge Discovery in COVID-19 Research Literature
Alejandro Piad-Morffis | Suilan Estevez-Velarde | Ernesto Luis Estevanell-Valladares | Yoan Gutiérrez | Andrés Montoyo | Rafael Muñoz | Yudivián Almeida-Cruz

This paper presents the preliminary results of an ongoing project that analyzes the growing body of scientific research published around the COVID-19 pandemic. In this research, a general-purpose semantic model is used to double annotate a batch of $500$ sentences that were manually selected by the researchers from the CORD-19 corpus. Afterwards, a baseline text-mining pipeline is designed and evaluated via a large batch of $100,959$ sentences. We present a qualitative analysis of the most interesting facts automatically extracted and highlight possible future lines of development. The preliminary results show that general-purpose semantic models are a useful tool for discovering fine-grained knowledge in large corpora of scientific documents.

Identifying pandemic-related stress factors from social-media posts – Effects on students and young-adults
Sachin Thukral | Suyash Sangwan | Arnab Chatterjee | Lipika Dey

The COVID-19 pandemic has thrown natural life out of gear across the globe. Strict measures are deployed to curb the spread of the virus that is causing it, and the most effective of them have been social isolation. This has led to wide-spread gloom and depression across society but more so among the young and the elderly. There are currently more than 200 million college students in 186 countries worldwide, affected due to the pandemic. The mode of education has changed suddenly, with the rapid adaptation of e-learning, whereby teaching is undertaken remotely and on digital platforms. This study presents insights gathered from social media posts that were posted by students and young adults during the COVID times. Using statistical and NLP techniques, we analyzed the behavioural issues reported by users themselves in their posts in depression related communities on Reddit. We present methodologies to systematically analyze content using linguistic techniques to find out the stress-inducing factors. Online education, losing jobs, isolation from friends and abusive families emerge as key stress factors

Tracking And Understanding Public Reaction During COVID-19: Saudi Arabia As A Use Case
Aseel Addawood | Alhanouf Alsuwailem | Ali Alohali | Dalal Alajaji | Mashail Alturki | Jaida Alsuhaibani | Fawziah Aljabli

The coronavirus disease of 2019 (COVID-19) has a huge impact on economies and societies around the world. While governments are taking extreme measures to reduce the spread of the virus, people are getting affected by these new measures. With restrictions like lockdown and social distancing, it became important to understand the emotional response of the public towards the pandemic. In this paper, we study the reaction of Saudi Arabia citizens towards the pandemic. We utilize a collection of Arabic tweets that were sent during 2020, primarily through hashtags that were originated from Saudi Arabia. Our results showed that people had kept a positive reaction towards the pandemic. This positive reaction was at its highest at the beginning of the COVID-19 crisis and started to decline as time passes. Overall, the results showed that people were so supportive of each other through this pandemic. This research can help researchers and policymakers in understanding the emotional effect of a pandemic on societies.

Characterizing drug mentions in COVID-19 Twitter Chatter
Ramya Tekumalla | Juan M Banda

Since the classification of COVID-19 as a global pandemic, there have been many attempts to treat and contain the virus. Although there is no specific antiviral treatment recommended for COVID-19, there are several drugs that can potentially help with symptoms. In this work, we mined a large twitter dataset of 424 million tweets of COVID-19 chatter to identify discourse around drug mentions. While seemingly a straightforward task, due to the informal nature of language use in Twitter, we demonstrate the need of machine learning alongside traditional automated methods to aid in this task. By applying these complementary methods, we are able to recover almost 15% additional data, making misspelling handling a needed task as a pre-processing step when dealing with social media data.

Content analysis of Persian/Farsi Tweets during COVID-19 pandemic in Iran using NLP
Pedram Hosseini | Poorya Hosseini | David Broniatowski

Iran, along with China, South Korea, and Italy was among the countries that were hit hard in the first wave of the COVID-19 spread. Twitter is one of the widely-used online platforms by Iranians inside and abroad for sharing their opinion, thoughts, and feelings about a wide range of issues. In this study, using more than 530,000 original tweets in Persian/Farsi on COVID-19, we analyzed the topics discussed among users, who are mainly Iranians, to gauge and track the response to the pandemic and how it evolved over time. We applied a combination of manual annotation of a random sample of tweets and topic modeling tools to classify the contents and frequency of each category of topics. We identified the top 25 topics among which living experience under home quarantine emerged as a major talking point. We additionally categorized the broader content of tweets that shows satire, followed by news, is the dominant tweet type among Iranian users. While this framework and methodology can be used to track public response to ongoing developments related to COVID-19, a generalization of this framework can become a useful framework to gauge Iranian public reaction to ongoing policy measures or events locally and internationally.

Annotating the Pandemic: Named Entity Recognition and Normalisation in COVID-19 Literature
Nico Colic | Lenz Furrer | Fabio Rinaldi

The COVID-19 pandemic has been accompanied by such an explosive increase in media coverage and scientific publications that researchers find it difficult to keep up. We are presenting a publicly available pipeline to perform named entity recognition and normalisation in parallel to help find relevant publications and to aid in downstream NLP tasks such as text summarisation. In our approach, we are using a dictionary-based system for its high recall in conjunction with two models based on BioBERT for their accuracy. Their outputs are combined according to different strategies depending on the entity type. In addition, we are using a manually crafted dictionary to increase performance for new concepts related to COVID-19. We have previously evaluated our work on the CRAFT corpus, and make the output of our pipeline available on two visualisation platforms.

AskMe: A LAPPS Grid-based NLP Query and Retrieval System for Covid-19 Literature
Keith Suderman | Nancy Ide | Verhagen Marc | Brent Cochran | James Pustejovsky

In a recent project, the Language Application Grid was augmented to support the mining of scientific publications. The results of that ef- fort have now been repurposed to focus on Covid-19 literature, including modification of the LAPPS Grid “AskMe” query and retrieval engine. We describe the AskMe system and discuss its functionality as compared to other query engines available to search covid-related publications.

Concept Wikification for COVID-19
Panagiotis Lymperopoulos | Haoling Qiu | Bonan Min

Understanding scientific articles related to COVID-19 requires broad knowledge about concepts such as symptoms, diseases and medicine. Given the very large and ever-growing scientific articles related to COVID-19, it is a daunting task even for experts to recognize the large set of concepts mentioned in these articles. In this paper, we address the problem of concept wikification for COVID-19, which is to automatically recognize mentions of concepts related to COVID-19 in text and resolve them into Wikipedia titles. We develop an approach to curate a COVID-19 concept wikification dataset by mining Wikipedia text and the associated intra-Wikipedia links. We also develop an end-to-end system for concept wikification for COVID-19. Preliminary experiments show very encouraging results. Our dataset, code and pre-trained model are available at

Developing a Curated Topic Model for COVID-19 Medical Research Literature
Philip Resnik | Katherine E. Goodman | Mike Moran

Topic models can facilitate search, navigation, and knowledge discovery in large document collections. However, automatic generation of topic models can produce results that fail to meet the needs of users. We advocate for a set of user-focused desiderata in topic modeling for the COVID-19 literature, and describe an effort in progress to develop a curated topic model for COVID-19 articles informed by subject matter expertise and the way medical researchers engage with medical literature.

Collecting Verified COVID-19 Question Answer Pairs
Adam Poliak | Max Fleming | Cash Costello | Kenton Murray | Mahsa Yarmohammadi | Shivani Pandya | Darius Irani | Milind Agarwal | Udit Sharma | Shuo Sun | Nicola Ivanov | Lingxi Shang | Kaushik Srinivasan | Seolhwa Lee | Xu Han | Smisha Agarwal | João Sedoc

We release a dataset of over 2,100 COVID19 related Frequently asked Question-Answer pairs scraped from over 40 trusted websites. We include an additional 24, 000 questions pulled from online sources that have been aligned by experts with existing answered questions from our dataset. This paper describes our efforts in collecting the dataset and summarizes the resulting data. Our dataset is automatically updated daily and available at scraping-qas. So far, this data has been used to develop a chatbot providing users information about COVID-19. We encourage others to build analytics and tools upon this dataset as well.

A Comprehensive Dictionary and Term Variation Analysis for COVID-19 and SARS-CoV-2
Robert Leaman | Zhiyong Lu

The number of unique terms in the scientific literature used to refer to either SARS-CoV-2 or COVID-19 is remarkably large and has continued to increase rapidly despite well-established standardized terms. This high degree of term variation makes high recall identification of these important entities difficult. In this manuscript we present an extensive dictionary of terms used in the literature to refer to SARS-CoV-2 and COVID-19. We use a rule-based approach to iteratively generate new term variants, then locate these variants in a large text corpus. We compare our dictionary to an extensive collection of terminological resources, demonstrating that our resource provides a substantial number of additional terms. We use our dictionary to analyze the usage of SARS-CoV-2 and COVID-19 terms over time and show that the number of unique terms continues to grow rapidly. Our dictionary is freely available at

Using the Poly-encoder for a COVID-19 Question Answering System
Seolhwa Lee | João Sedoc

To combat misinformation regarding COVID- 19 during this unprecedented pandemic, we propose a conversational agent that answers questions related to COVID-19. We adapt the Poly-encoder (Humeau et al., 2020) model for informational retrieval from FAQs. We show that after fine-tuning, the Poly-encoder can achieve a higher F1 score. We make our code publicly available for other researchers to use.

Weibo-COV: A Large-Scale COVID-19 Social Media Dataset from Weibo
Yong Hu | Heyan Huang | Anfan Chen | Xian-Ling Mao

With the rapid development of COVID-19 around the world, people are requested to maintain “social distance” and “stay at home”. In this scenario, extensive social interactions transfer to cyberspace, especially on social media platforms like Twitter and Sina Weibo. People generate posts to share information, express opinions and seek help during the pandemic outbreak, and these kinds of data on social media are valuable for studies to prevent COVID-19 transmissions, such as early warning and outbreaks detection. Therefore, in this paper, we release a novel and fine-grained large-scale COVID-19 social media dataset collected from Sina Weibo, named Weibo-COV, contains more than 40 million posts ranging from December 1, 2019 to April 30, 2020. Moreover, this dataset includes comprehensive information nuggets like post-level information, interactive information, location information, and repost network. We hope this dataset can promote studies of COVID-19 from multiple perspectives and enable better and rapid researches to suppress the spread of this pandemic.

Detecting Emerging Symptoms of COVID-19 using Context-based Twitter Embeddings
Roshan Santosh | H. Andrew Schwartz | Johannes Eichstaedt | Lyle Ungar | Sharath Chandra Guntuku

In this paper, we present an iterative graph-based approach for the detection of symptoms of COVID-19, the pathology of which seems to be evolving. More generally, the method can be applied to finding context-specific words and texts (e.g. symptom mentions) in large imbalanced corpora (e.g. all tweets mentioning }#COVID-19). Given the novelty of COVID-19, we also test if the proposed approach generalizes to the problem of detecting Adverse Drug Reaction (ADR). We find that the approach applied to Twitter data can detect symptom mentions substantially before to their being reported by the Centers for Disease Control (CDC).

Hate and Toxic Speech Detection in the Context of Covid-19 Pandemic using XAI: Ongoing Applied Research
David Hardage | Peyman Najafirad

As social distancing, self-quarantines, and travel restrictions have shifted a lot of pandemic conversations to social media so does the spread of hate speech. While recent machine learning solutions for automated hate and offensive speech identification are available on Twitter, there are issues with their interpretability. We propose a novel use of learned feature importance which improves upon the performance of prior state-of-the-art text classification techniques, while producing more easily interpretable decisions. We also discuss both technical and practical challenges that remain for this task.

Real-time Classification, Geolocation and Interactive Visualization of COVID-19 Information Shared on Social Media to Better Understand Global Developments
Andrei Mircea

As people communicate on social media during COVID-19, it can be an invaluable source of useful and up-to-date information. However, the large volume and noise-to-signal ratio of social media can make this impractical. We present a prototype dashboard for the real-time classification, geolocation and interactive visualization of COVID-19 tweets that addresses these issues. We also describe a novel L2 classification layer that outperforms linear layers on a dataset of respiratory virus tweets.