Eva Portelance
2026
Revisiting Age of Acquisition in Curriculum Learning: Disentangling Lexical Features and Semantic Structure
Ian Gifford | Aaron Shah | Catherine Chen | Taimaa Kassab Bachi | Eva Portelance
Proceedings of the 30th Conference on Computational Natural Language Learning
Previous work has found that ordering training data by children’s Age of Acquisition (AoA) for words increases the stability of distributional word embeddings, suggesting that early-learned words play a privileged role in shaping semantic structure. In this study, we determine whether AoA itself drives these effects, or whether they emerge from correlated lexical factors such as frequency, concreteness, and phonological complexity. Using incremental Word2Vec training, we construct curricula ordered by AoA and by individual lexical features, while systematically controlling for vocabulary growth and deterministic ordering effects. We show that AoA-ordered curricula produce greater early-phase stability than shuffled baselines, even under controlled exposure conditions. We find that the advantage observed with AoA can be largely explained by correlated factors like overall word frequency. Despite limited gains on general similarity benchmarks, AoA-ordered embeddings outperform shuffled embeddings on a proxy domain-specific task: predicting human AoA norms. This advantage persists after debiasing timestamp effects, implying that AoA curricula induce developmentally meaningful semantic structure.
2023
Grammar induction pretraining for language modeling in low resource contexts
Xuanda Chen | Eva Portelance
Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning
2021
The Emergence of the Shape Bias Results from Communicative Efficiency
Eva Portelance | Michael C. Frank | Dan Jurafsky | Alessandro Sordoni | Romain Laroche
Proceedings of the 25th Conference on Computational Natural Language Learning
By the age of two, children tend to assume that new word categories are based on objects’ shape, rather than their color or texture; this assumption is called the shape bias. They are thought to learn this bias by observing that their caregiver’s language is biased towards shape-based categories. This presents a chicken-and-egg problem: if the shape bias must be present in the language in order for children to learn it, how did it arise in language in the first place? In this paper, we propose that communicative efficiency explains both how the shape bias emerged and why it persists across generations. We model this process with neural emergent language agents that learn to communicate about raw pixelated images. First, we show that the shape bias emerges as a result of efficient communication strategies employed by agents. Second, we show that pressure brought on by communicative need is also necessary for it to persist across generations; simply having a shape bias in an agent’s input language is insufficient. These results suggest that, over and above the operation of other learning strategies, the shape bias in human learners may emerge from and be sustained by communicative pressures.