Thomas Hikaru Clark

2025

Resource-Rational Noisy-Channel Language Processing: Testing the Effect of Algorithmic Constraints on Inferences
Thomas Hikaru Clark | Jacob Hoover Vigly | Edward Gibson | Roger P. Levy
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Human language use is robust to errors: comprehenders can and do mentally correct utterances that are implausible or anomalous. How are humans able to solve these problems in real time, picking out alternatives from an unbounded space of options using limited cognitive resources? And can language models trained on next-word prediction for typical language be augmented to handle language anomalies in a human-like way? Using a language model as a prior and an error model to encode likelihoods, we use Sequential Monte Carlo with optional rejuvenation to perform incremental and approximate probabilistic inference over intended sentences and production errors. We demonstrate that the model captures previously established patterns in human sentence processing, and that a trade-off between human-like noisy-channel inferences and computational resources falls out of this model. From a psycholinguistic perspective, our results offer a candidate algorithmic model of rational inference in language processing. From an NLP perspective, our results showcase how to elicit human-like noisy-channel inference behavior from a relatively small LLM while controlling the amount of computation available during inference. Our model is implemented in the Gen.jl probabilistic programming language, and our code is available at https://github.com/thomashikaru/noisy_channel_model.

2023

While natural languages differ widely in both canonical word order and word order flexibility, their word orders still follow shared cross-linguistic statistical patterns, often attributed to functional pressures. In the effort to identify these pressures, prior work has compared real and counterfactual word orders. Yet one functional pressure has been overlooked in such investigations: The uniform information density (UID) hypothesis, which holds that information should be spread evenly throughout an utterance. Here, we ask whether a pressure for UID may have influenced word order patterns cross-linguistically. To this end, we use computational models to test whether real orders lead to greater information uniformity than counterfactual orders. In our empirical study of 10 typologically diverse languages, we find that: (i) among SVO languages, real word orders consistently have greater uniformity than reverse word orders, and (ii) only linguistically implausible counterfactual orders consistently exceed the uniformity of real orders. These findings are compatible with a pressure for information uniformity in the development and usage of natural languages.

Co-authors

Clara Meister 1

Tiago Pimentel 1

Jacob Hoover Vigly 1

Venues

EMNLP 1
TACL 1
