Lemmas Matter, But Not Like That: Predictors of Lemma-Based Generalization in Morphological Inflection

Sarah Ruth Brogden Payne; Jordan Kodner


Abstract

Recent work has suggested that overlap (whether a given lemma or feature set is attested independently in the training set) drives model performance on morphological inflection tasks. The impact of lemma overlap, however, is debated, with recent work reporting accuracy drops ranging from 0% to 30% between seen and unseen test lemmas. In this paper, we introduce a novel splitting algorithm designed to investigate predictors of accuracy on seen and unseen lemmas. We find only an 11% average drop from seen to unseen test lemmas, but show that the number of lemmas in the training set has a much stronger effect on accuracy on unseen lemmas than on seen lemmas. We also show that the previously reported 30% drop is inflated by a near-30% drop in the number of training lemmas between the original splits and the novel splits used in that work.
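The notion of lemma overlap described in the abstract can be illustrated with a minimal sketch. It is not the paper's splitting algorithm. It only partitions test items by whether their lemma is attested in the training set and compares accuracy on the two groups. The `Item` type, the function names, and the toy data are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Item(NamedTuple):
    """One inflection example (hypothetical layout): lemma, feature set, inflected form."""
    lemma: str
    features: str
    form: str


def split_by_lemma_overlap(train: List[Item], test: List[Item]) -> Tuple[List[Item], List[Item]]:
    """Partition test items into those whose lemma is seen in train and those whose lemma is not."""
    train_lemmas = {item.lemma for item in train}
    seen = [item for item in test if item.lemma in train_lemmas]
    unseen = [item for item in test if item.lemma not in train_lemmas]
    return seen, unseen


def accuracy(items: List[Item], predictions: Dict[Tuple[str, str], str]) -> float:
    """Exact-match accuracy; predictions maps (lemma, features) to a predicted form."""
    if not items:
        return float("nan")
    correct = sum(predictions.get((i.lemma, i.features)) == i.form for i in items)
    return correct / len(items)


# Toy data, invented for illustration only.
train = [Item("walk", "V;PST", "walked"), Item("sing", "V;PST", "sang")]
test = [Item("walk", "V;PRS;3;SG", "walks"), Item("jump", "V;PST", "jumped")]
predictions = {("walk", "V;PRS;3;SG"): "walks", ("jump", "V;PST"): "jumpd"}

seen, unseen = split_by_lemma_overlap(train, test)
seen_acc = accuracy(seen, predictions)
unseen_acc = accuracy(unseen, predictions)
n_train_lemmas = len({item.lemma for item in train})
```

Reporting `n_train_lemmas` alongside the two accuracies matters here because the abstract argues that the number of training lemmas, and not only seen versus unseen status, predicts accuracy on unseen lemmas.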

Anthology ID: 2025.findings-acl.1296
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 25270–25286
URL: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.1296/
Cite (ACL): Sarah Ruth Brogden Payne and Jordan Kodner. 2025. Lemmas Matter, But Not Like That: Predictors of Lemma-Based Generalization in Morphological Inflection. In Findings of the Association for Computational Linguistics: ACL 2025, pages 25270–25286, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Lemmas Matter, But Not Like That: Predictors of Lemma-Based Generalization in Morphological Inflection (Payne & Kodner, Findings 2025)
PDF: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.1296.pdf
