Emily Cheng
2025
Geometric Signatures of Compositionality Across a Language Model’s Lifetime
Jin Hwa Lee | Thomas Jiralerspong | Lei Yu | Yoshua Bengio | Emily Cheng
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
By virtue of linguistic compositionality, few syntactic rules and a finite lexicon can generate an unbounded number of sentences. That is, language, though seemingly high-dimensional, can be explained using relatively few degrees of freedom. An open question is whether contemporary language models (LMs) reflect the intrinsic simplicity of language that is enabled by compositionality. We take a geometric view of this problem by relating the degree of compositionality in a dataset to the intrinsic dimension (ID) of its representations under an LM, a measure of feature complexity. We find not only that the degree of dataset compositionality is reflected in representations’ ID, but that the relationship between compositionality and geometric complexity arises due to learned linguistic features over training. Finally, our analyses reveal a striking contrast between nonlinear and linear dimensionality, showing they respectively encode semantic and superficial aspects of linguistic composition.
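The paper's central quantity is the intrinsic dimension (ID) of a dataset's LM representations, contrasted with a linear notion of dimensionality. As a rough illustration of the kind of measurement involved (not the paper's exact pipeline; the model, last-token pooling, and tiny sentence sample below are assumptions), the sketch compares a nonlinear TwoNN ID estimate with a PCA-based linear dimension:

```python
# Rough sketch, not the paper's pipeline: contrast a nonlinear intrinsic-dimension
# estimate (TwoNN; Facco et al., 2017) with a linear, PCA-based dimension of
# sentence representations. Model (gpt2), last-token pooling, and the tiny
# sentence sample are illustrative assumptions; real estimates need many
# more sentences.
import numpy as np
import torch
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors
from transformers import AutoModel, AutoTokenizer

def twonn_id(X: np.ndarray) -> float:
    """Maximum-likelihood TwoNN estimate from ratios of 2nd- to 1st-NN distances."""
    dists, _ = NearestNeighbors(n_neighbors=3).fit(X).kneighbors(X)
    mu = dists[:, 2] / dists[:, 1]          # column 0 is the point itself
    mu = mu[np.isfinite(mu) & (mu > 1.0)]   # drop duplicates / ties
    return len(mu) / np.sum(np.log(mu))

def linear_dim(X: np.ndarray, var: float = 0.99) -> int:
    """Number of principal components explaining `var` of the variance."""
    ratios = PCA().fit(X).explained_variance_ratio_
    return int(np.searchsorted(np.cumsum(ratios), var) + 1)

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

sentences = [
    "The cat chased the mouse.",
    "The mouse chased the cat.",
    "A finite lexicon generates unbounded sentences.",
    "Few rules suffice to combine words productively.",
    "Geometry reflects how sentences are composed.",
]
reps = []
with torch.no_grad():
    for s in sentences:
        out = model(**tok(s, return_tensors="pt"), output_hidden_states=True)
        reps.append(out.hidden_states[-1][0, -1].numpy())  # last-token vector
X = np.stack(reps)

print("nonlinear ID (TwoNN):", twonn_id(X))
print("linear dim (99% PCA variance):", linear_dim(X))
```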
2023
Bridging Information-Theoretic and Geometric Compression in Language Models
Emily Cheng | Corentin Kervadec | Marco Baroni
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
For a language model (LM) to faithfully model human language, it must compress vast, potentially infinite information into relatively few dimensions. We propose analyzing compression in (pre-trained) LMs from two points of view: geometric and information-theoretic. We demonstrate that the two views are highly correlated, such that the intrinsic geometric dimension of linguistic data predicts their coding length under the LM. We then show that, in turn, high compression of a linguistic dataset predicts rapid adaptation to that dataset, confirming that being able to compress linguistic information is an important part of successful LM performance. As a practical byproduct of our analysis, we evaluate a battery of intrinsic dimension estimators for the first time on linguistic data, showing that only some encapsulate the relationship between information-theoretic compression, geometric compression, and ease-of-adaptation.
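For a sense of the information-theoretic side of this comparison, here is a rough sketch (gpt2 and the toy two-sentence "dataset" are assumptions, not the paper's setup) of a dataset's coding length under an LM, i.e. its total negative log2-likelihood in bits; the geometric side would be an intrinsic-dimension estimate of the same data's representations, as in the sketch above.

```python
# Rough sketch under assumed choices (gpt2 as the LM, a toy two-sentence
# "dataset"): coding length of text under an autoregressive LM, measured as
# total negative log2-likelihood in bits.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def coding_length_bits(text: str) -> float:
    """Bits needed to encode `text` under the LM, i.e. -log2 p(text)."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # HF returns mean cross-entropy in nats over the predicted tokens,
        # so multiply by the number of predictions to get a total.
        mean_nll_nats = lm(ids, labels=ids).loss.item()
    return mean_nll_nats * (ids.shape[1] - 1) / math.log(2)

dataset = ["Compression predicts rapid adaptation.", "The cat sat on the mat."]
total_bits = sum(coding_length_bits(s) for s in dataset)
print(f"coding length under the LM: {total_bits:.1f} bits")
```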
On the Correspondence between Compositionality and Imitation in Emergent Neural Communication
Emily Cheng | Mathieu Rita | Thierry Poibeau
Findings of the Association for Computational Linguistics: ACL 2023
Compositionality is a hallmark of human language that not only enables linguistic generalization, but also potentially facilitates acquisition. When simulating language emergence with neural networks, compositionality has been shown to improve communication performance; however, its impact on imitation learning has yet to be investigated. Our work explores the link between compositionality and imitation in a Lewis game played by deep neural agents. Our contributions are twofold: first, we show that the learning algorithm used to imitate is crucial: supervised learning tends to produce more average languages, while reinforcement learning introduces a selection pressure toward more compositional languages. Second, our study reveals that compositional languages are easier to imitate, which may induce the pressure toward compositional languages in RL imitation settings.
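Compositionality of emergent languages in this line of work is typically quantified with topographic similarity: the correlation between pairwise distances in meaning space and in message space. Below is a self-contained toy sketch (illustrative only; the meanings, messages, and distance choices are assumptions, not necessarily the paper's exact setup).

```python
# Toy sketch of topographic similarity, a standard compositionality proxy in
# emergent-communication work (illustrative assumptions: Hamming distance over
# attribute-value meanings, edit distance over messages).
from itertools import combinations
from scipy.spatial.distance import hamming
from scipy.stats import spearmanr

def levenshtein(a: str, b: str) -> int:
    """Classic DP edit distance between two symbol strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def topographic_similarity(meanings, messages) -> float:
    """Spearman correlation of meaning-space vs message-space pairwise distances."""
    pairs = list(combinations(range(len(meanings)), 2))
    m_dists = [hamming(meanings[i], meanings[j]) for i, j in pairs]
    s_dists = [levenshtein(messages[i], messages[j]) for i, j in pairs]
    rho, _ = spearmanr(m_dists, s_dists)
    return rho

# A perfectly compositional toy language: each symbol names one attribute value.
meanings = [(0, 0), (0, 1), (1, 0), (1, 1)]
messages = ["aa", "ab", "ba", "bb"]
print(topographic_similarity(meanings, messages))  # 1.0 for this language
```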