Julia Sammartino
2026
The Multilingual Euphemism Benchmark: Datasets and Baselines for Pragmatic Language Understanding
Whitney Poh | Julia Sammartino | Jasper Andrew | Witold Kieraś | Natalia Zawadzka-Paluektau | Iryna Dilai | Libby Barak | Jing Peng | Anna Feldman
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Euphemisms are words or phrases used to soften or indirectly refer to taboo or sensitive topics. They pose interpretation challenges because the same expression may appear in different senses depending on context: literal, figurative but non-euphemistic, or euphemistic. For example, "pull the plug" may refer euphemistically to ending a patient’s life support, figuratively to canceling a project or funding, or literally to unplugging a device. Euphemisms also vary across languages and cultures in both their surface forms and the contexts in which they are conventionally used. Previous work introduced datasets for the computational study of euphemisms in five languages. We extend this line of work by introducing two new annotated datasets for euphemism detection in Polish and Ukrainian and by standardizing resources for all seven languages into a unified benchmark format that supports cross-lingual evaluation. Finally, we provide zero-shot and few-shot baselines using GPT-5-nano. We run each configuration five times and report the average score, establishing reference scores for multilingual pragmatic understanding. We also perform pilot tests using Qwen3-4B on the English and Chinese datasets.
2025
When Does Language Transfer Help? Sequential Fine-Tuning for Cross-Lingual Euphemism Detection
Julia Sammartino | Libby Barak | Jing Peng | Anna Feldman
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Euphemisms are culturally variable and often ambiguous, posing challenges for language models, especially in low-resource settings. This paper investigates how cross-lingual transfer via sequential fine-tuning affects euphemism detection across five languages: English, Spanish, Chinese, Turkish, and Yorùbá. We compare sequential fine-tuning with monolingual and simultaneous fine-tuning using XLM-R and mBERT, analyzing how performance is shaped by language pairings, typological features, and pretraining coverage. Results show that sequential fine-tuning with a high-resource L1 improves L2 performance, especially for low-resource languages like Yorùbá and Turkish. XLM-R achieves larger gains but is more sensitive to pretraining gaps and catastrophic forgetting, while mBERT yields more stable, though lower, results. These findings highlight sequential fine-tuning as a simple yet effective strategy for improving euphemism detection in multilingual models, particularly when low-resource languages are involved.