Emily Morgan


2025

The Role of Abstract Representations and Observed Preferences in the Ordering of Binomials in Large Language Models
Zachary Nicholas Houghton | Kenji Sagae | Emily Morgan
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

To what extent do large language models learn abstract representations as opposed to more superficial aspects of their very large training corpora? We examine this question in the context of binomial ordering preferences involving two conjoined nouns in English. When choosing a binomial ordering (radio and television vs television and radio), humans rely on more than simply the observed frequency of each option. Humans also rely on abstract ordering preferences (e.g., preferences for short words before long words). We investigate whether large language models simply rely on the observed preference in their training data, or whether they are capable of learning the abstract ordering preferences (i.e., abstract representations) that humans rely on. Our results suggest that both smaller and larger models' ordering preferences are driven exclusively by their experience with each individual item in the training data. Our study offers further insight into how large language models differ from humans in representing and using language, particularly in their reliance on abstract representations versus observed preferences.

2021

Frequency-Dependent Regularization in Syntactic Constructions
Zoey Liu | Emily Morgan
Proceedings of the Society for Computation in Linguistics 2021

2020

Processing effort is a poor predictor of cross-linguistic word order frequency
Brennan Gonering | Emily Morgan
Proceedings of the 24th Conference on Computational Natural Language Learning

Some have argued that word orders which are more difficult to process should be rarer cross-linguistically. Our study fails to replicate the results of Maurits, Navarro, and Perfors (2010), whose entropy-based Uniform Information Density (UID) measure moderately predicted the Greenbergian typology of transitive word orders. We additionally report that three measures of processing difficulty (entropy-based UID, surprisal-based UID, and pointwise mutual information) fail to predict the correct typological distribution when applied to transitive constructions from 20 languages in the Universal Dependencies project (version 2.5). However, our conclusions are limited by data sparsity.
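For orientation only, and not necessarily the paper's exact operationalization, pointwise mutual information and per-word surprisal (the quantity whose evenness UID-based measures assess) are standardly defined as:

$$\mathrm{PMI}(x, y) = \log \frac{P(x, y)}{P(x)\,P(y)}, \qquad \mathrm{surprisal}(w_i) = -\log P(w_i \mid w_1, \dots, w_{i-1})$$

Under a UID account, word orders that distribute surprisal more evenly across an utterance are hypothesized to be easier to process and hence, on the account tested here, more frequent cross-linguistically.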

Frequency-(in)dependent regularization in language production and cultural transmission
Emily Morgan | Roger Levy
Proceedings of the Society for Computation in Linguistics 2020