Laura Burdick

Also published as: Laura Wendlandt

2022

Using Paraphrases to Study Properties of Contextual Embeddings
Laura Burdick | Jonathan K. Kummerfeld | Rada Mihalcea
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

We use paraphrases as a unique source of data to analyze contextualized embeddings, with a particular focus on BERT. Because paraphrases naturally encode consistent word and phrase semantics, they provide a unique lens for investigating properties of embeddings. Using the Paraphrase Database’s alignments, we study words within paraphrases as well as phrase representations. We find that contextual embeddings effectively handle polysemous words, but give synonyms surprisingly different representations in many cases. We confirm previous findings that BERT is sensitive to word order, but find slightly different patterns than prior work in terms of the level of contextualization across BERT’s layers.

2021

pdf bib abs

Analyzing the Surprising Variability in Word Embedding Stability Across Languages
Laura Burdick | Jonathan K. Kummerfeld | Rada Mihalcea
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Word embeddings are powerful representations that form the foundation of many natural language processing architectures, both in English and in other languages. To gain further insight into word embeddings, we explore their stability (e.g., overlap between the nearest neighbors of a word in different embedding spaces) in diverse languages. We discuss linguistic properties that are related to stability, drawing out insights about correlations with affixing, language gender systems, and other features. This has implications for embedding use, particularly in research that uses them to study language trends.

2019

pdf bib

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop
Sudipta Kar | Farah Nadeem | Laura Burdick | Greg Durrett | Na-Rae Han
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

pdf bib abs

Identifying Visible Actions in Lifestyle Vlogs
Oana Ignat | Laura Burdick | Jia Deng | Rada Mihalcea
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We consider the task of identifying human actions visible in online videos. We focus on the widely spread genre of lifestyle vlogs, which consist of videos of people performing actions while verbally describing them. Our goal is to identify if actions mentioned in the speech description of a video are visually present. We construct a dataset with crowdsourced manual annotations of visible actions, and introduce a multimodal algorithm that leverages information derived from visual and linguistic clues to automatically infer which actions are visible in a video.

2018

pdf bib abs

Factors Influencing the Surprising Instability of Word Embeddings
Laura Wendlandt | Jonathan K. Kummerfeld | Rada Mihalcea
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Despite the recent popularity of word embedding methods, there is only a small body of work exploring the limitations of these representations. In this paper, we consider one aspect of embedding spaces, namely their stability. We show that even relatively high frequency words (100-200 occurrences) are often unstable. We provide empirical evidence for how various factors contribute to the stability of word embeddings, and we analyze the effects of stability on downstream tasks.

Co-authors

Oana Ignat 1

Sudipta Kar 1

Farah Nadeem 1

Venues

Fix author