Michela Marchini
2025
Does Context Matter? A Prosodic Comparison of English and Spanish in Monolingual and Multilingual Discourse Settings
Debasmita Bhattacharya | David Sasu | Michela Marchini | Natalie Schluter | Julia Hirschberg
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Different languages are known to have typical and distinctive prosodic profiles. However, the majority of work on prosody across languages has been restricted to monolingual discourse contexts. We build on prior studies by asking: how does the nature of the discourse context influence variations in the prosody of monolingual speech? To answer this question, we compare the prosody of spontaneous, conversational monolingual English and Spanish both in monolingual and in multilingual speech settings. For both languages, we find that monolingual speech produced in a monolingual context is prosodically different from that produced in a multilingual context, with the differences becoming more marked as proximity to multilingual discourse increases. Our work is the first to incorporate multilingual discourse contexts into the study of native-level monolingual prosody, and has potential downstream applications for the recognition and synthesis of multilingual speech.
2024
Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles
Julia Kruk | Michela Marchini | Rijul Magu | Caleb Ziems | David Muchlinski | Diyi Yang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
A dog whistle is a form of coded communication that carries a secondary meaning to specific audiences and is often weaponized for racial and socioeconomic discrimination. Dog whistling historically originated from United States politics, but in recent years has taken root in social media as a means of evading hate speech detection systems and maintaining plausible deniability. In this paper, we present an approach for word-sense disambiguation of dog whistles from standard speech using Large Language Models (LLMs), and leverage this technique to create a dataset of 16,550 high-confidence coded examples of dog whistles used in formal and informal communication. Silent Signals is the largest dataset of disambiguated dog whistle usage, created for applications in hate speech detection, neology, and political science.