Laura Biester

2022

Recent studies have shown that for subjective annotation tasks, the demographics, lived experiences, and identity of annotators can have a large impact on how items are labeled. We expand on this work, hypothesizing that gender may correlate with differences in annotations for a number of NLP benchmarks, including those that are fairly subjective (e.g., affect in text) and those that are typically considered to be objective (e.g., natural language inference). We develop a robust framework to test for differences in annotation across genders for four benchmark datasets. While our results largely show a lack of statistically significant differences in annotation by males and females for these tasks, the framework can be used to analyze differences in annotation between various other demographic groups in future work. Finally, we note that most datasets are collected without annotator demographics and released only in aggregate form; we call on the community to consider annotator demographics as data is collected, and to release dis-aggregated data to allow for further work analyzing variability among annotators.

2020

pdf abs
Quantifying the Effects of COVID-19 on Mental Health Support Forums
Laura Biester | Katie Matton | Janarthanan Rajendran | Emily Mower Provost | Rada Mihalcea
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

The COVID-19 pandemic, like many of the disease outbreaks that have preceded it, is likely to have a profound effect on mental health. Understanding its impact can inform strategies for mitigating negative consequences. In this work, we seek to better understand the effects of COVID-19 on mental health by examining discussions within mental health support communities on Reddit. First, we quantify the rate at which COVID-19 is discussed in each community, or subreddit, in order to understand levels of pandemic-related discussion. Next, we examine the volume of activity in order to determine whether the number of people discussing mental health has risen. Finally, we analyze how COVID-19 has influenced language use and topics of discussion within each subreddit.

pdf abs
Building Location Embeddings from Physical Trajectories and Textual Representations
Laura Biester | Carmen Banea | Rada Mihalcea
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Word embedding methods have become the de-facto way to represent words, having been successfully applied to a wide array of natural language processing tasks. In this paper, we explore the hypothesis that embedding methods can also be effectively used to represent spatial locations. Using a new dataset consisting of the location trajectories of 729 students over a seven month period and text data related to those locations, we implement several strategies to create location embeddings, which we then use to create embeddings of the sequences of locations a student has visited. To identify the surface level properties captured in the representations, we propose a number of probing tasks such as the presence of a specific location in a sequence or the type of activities that take place at a location. We then leverage the representations we generated and employ them in more complex downstream tasks ranging from predicting a student’s area of study to a student’s depression level, showing the effectiveness of these location embeddings.

Co-authors

Joel Tetreault 1

Lu Xiao 1

Jieyu Zhao 1

Katie Matton 1

Janarthanan Rajendran 1

Emily Mower Provost 1

Laura Biester

2022

2020

Co-authors

Venues