Sarah Ruth Brogden Payne


2025

Lemmas Matter, But Not Like That: Predictors of Lemma-Based Generalization in Morphological Inflection
Sarah Ruth Brogden Payne | Jordan Kodner
Findings of the Association for Computational Linguistics: ACL 2025

Recent work has suggested that overlap (whether a given lemma or feature set is attested independently in the training set) drives model performance on morphological inflection tasks. The impact of lemma overlap, however, is debated, with recent work reporting accuracy drops ranging from 0% to 30% between seen and unseen test lemmas. In this paper, we introduce a novel splitting algorithm designed to investigate predictors of accuracy on seen and unseen lemmas. We find only an 11% average drop from seen to unseen test lemmas, but show that the number of lemmas in the training set has a much stronger effect on accuracy on unseen lemmas than on seen lemmas. We also show that the previously reported 30% drop is inflated, because the novel splits used in that work introduce a near-30% drop in the number of training lemmas relative to the original splits.

2023

Exploring Linguistic Probes for Morphological Generalization
Jordan Kodner | Salam Khalifa | Sarah Ruth Brogden Payne
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Modern work on the cross-linguistic computational modeling of morphological inflection has typically employed language-independent data splitting algorithms. In this paper, we supplement that approach with language-specific probes designed to test aspects of morphological generalization. Testing these probes on three morphologically distinct languages (English, Spanish, and Swahili), we find evidence that three leading morphological inflection systems employ distinct generalization strategies over conjugational classes and feature sets, on both orthographic and phonologically transcribed inputs.