Tessa Masis

2026

Coordinates from Context: Using LLMs to Ground Complex Location References
Tessa Masis | Brendan O'Connor
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

Geocoding is the task of linking a location reference to an actual geographic location and is essential for many downstream analyses of unstructured text. In this paper, we explore the challenging setting of geocoding compositional location references. Building on recent work demonstrating LLMs’ abilities to reason over geospatial data, we evaluate LLMs’ geospatial knowledge versus reasoning skills relevant to our task. Based on these insights, we propose an LLM-based strategy for geocoding compositional location references. We show that our approach improves performance for the task and that a relatively small fine-tuned LLM can achieve comparable performance with much larger off-the-shelf models.

2024

pdf bib abs

Where on Earth Do Users Say They Are?: Geo-Entity Linking for Noisy Multilingual User Input
Tessa Masis | Brendan O’Connor
Proceedings of the Sixth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS 2024)

Geo-entity linking is the task of linking a location mention to the real-world geographic location. In this we explore the challenging task of geo-entity linking for noisy, multilingual social media data. There are few open-source multilingual geo-entity linking tools available and existing ones are often rule-based, which break easily in social media settings, or LLM-based, which are too expensive for large-scale datasets. We present a method which represents real-world locations as averaged embeddings from labeled user-input location names and allows for selective prediction via an interpretable confidence score. We show that our approach improves geo-entity linking on a global and multilingual social media dataset, and discuss progress and problems with evaluating at different geographic granularities.

2022

pdf bib abs

Corpus-Guided Contrast Sets for Morphosyntactic Feature Detection in Low-Resource English Varieties
Tessa Masis | Anissa Neal | Lisa Green | Brendan O’Connor
Proceedings of the First Workshop on NLP applications to field linguistics

The study of language variation examines how language varies between and within different groups of speakers, shedding light on how we use language to construct identities and how social contexts affect language use. A common method is to identify instances of a certain linguistic feature - say, the zero copula construction - in a corpus, and analyze the feature’s distribution across speakers, topics, and other variables, to either gain a qualitative understanding of the feature’s function or systematically measure variation. In this paper, we explore the challenging task of automatic morphosyntactic feature detection in low-resource English varieties. We present a human-in-the-loop approach to generate and filter effective contrast sets via corpus-guided edits. We show that our approach improves feature detection for both Indian English and African American English, demonstrate how it can assist linguistic research, and release our fine-tuned models for use by other researchers.

2021

pdf bib abs

ProSPer: Probing Human and Neural Network Language Model Understanding of Spatial Perspective
Tessa Masis | Carolyn Jane Anderson
Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

Understanding perspectival language is important for applications like dialogue systems and human-robot interaction. We propose a probe task that explores how well language models understand spatial perspective. We present a dataset for evaluating perspective inference in English, ProSPer, and use it to explore how humans and Transformer-based language models infer perspective. Although the best bidirectional model performs similarly to humans, they display different strengths: humans outperform neural networks in conversational contexts, while RoBERTa excels at written genres.

Co-authors

Venues

Fix author