On the In-context Generation of Language Models
Zhongtao Jiang, Yuanzhe Zhang, Kun Luo, Xiaowei Yuan, Jun Zhao, Kang Liu
Abstract
Large language models (LLMs) exhibit the ability of in-context generation (ICG): when fed an in-context prompt that concatenates a few similar examples, they can implicitly recognize the shared pattern and complete the prompt in that pattern. ICG is puzzling, since language models are usually not explicitly trained on prompts of this form, and the distribution of examples in the prompt differs from that of sequences in the pretraining corpora. This paper provides a systematic study of the ICG ability of language models, covering its source and influential factors from both theoretical and empirical perspectives. Concretely, we first propose a plausible latent variable model for the distribution of the pretraining corpora, and then formalize ICG as a problem of next-topic prediction. Within this framework, we prove theoretically that the repetition of a few topics guarantees ICG on those topics. We then use this controllable pretraining distribution to generate several medium-scale synthetic datasets (token scale: 2.1B-3.9B) and experiment with different Transformer architectures (parameter scale: 4M-234M). Our experimental results further offer insights into how the data and model architecture influence ICG.
- Anthology ID: 2024.emnlp-main.568
- Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
- Month: November
- Year: 2024
- Address: Miami, Florida, USA
- Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 10169–10187
- URL: https://preview.aclanthology.org/ingest_wac_2008/2024.emnlp-main.568/
- DOI: 10.18653/v1/2024.emnlp-main.568
- Cite (ACL): Zhongtao Jiang, Yuanzhe Zhang, Kun Luo, Xiaowei Yuan, Jun Zhao, and Kang Liu. 2024. On the In-context Generation of Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 10169–10187, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal): On the In-context Generation of Language Models (Jiang et al., EMNLP 2024)
- PDF: https://preview.aclanthology.org/ingest_wac_2008/2024.emnlp-main.568.pdf
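The abstract's framing of ICG as next-topic prediction under a latent variable model with topic repetition can be illustrated with a toy sketch. This is a hypothetical construction for intuition only, not the paper's actual model: the topic set, the `repeat_prob` parameter, and the helper functions are all illustrative assumptions.

```python
import random

# Illustrative (not the paper's) latent-variable view: a document is a
# sequence of spans, each generated by a latent topic, and the next
# topic tends to repeat the previous one.
TOPICS = {
    "math": ["1", "+", "1", "=", "2"],
    "color": ["sky", "is", "blue"],
    "greet": ["hello", "world"],
}

def sample_document(n_spans, repeat_prob=0.7, rng=random):
    """Generate a document as a list of (topic, tokens) spans.

    With probability `repeat_prob` the next span repeats the previous
    topic; otherwise a fresh topic is drawn uniformly at random.
    """
    spans = []
    current = rng.choice(list(TOPICS))
    for _ in range(n_spans):
        spans.append((current, TOPICS[current]))
        if rng.random() >= repeat_prob:
            current = rng.choice(list(TOPICS))
    return spans

def next_topic_posterior(prefix_topics, repeat_prob=0.7):
    """Distribution over the next topic given the topics seen so far.

    Under the repetition prior above, the most probable next topic is
    the most recent one -- the toy analogue of formalizing ICG as
    next-topic prediction.
    """
    probs = {t: (1 - repeat_prob) / len(TOPICS) for t in TOPICS}
    probs[prefix_topics[-1]] += repeat_prob
    return probs

# An in-context prompt concatenating examples of one pattern corresponds
# to a prefix whose latent topics all agree, so the predicted next topic
# matches the prompt's pattern and the model completes in kind.
posterior = next_topic_posterior(["math", "math", "math"])
best = max(posterior, key=posterior.get)
print(best)  # the repeated topic dominates: math
```

The key point this mirrors from the abstract is that topic repetition in the pretraining distribution is what makes continuing the prompt's pattern the probability-maximizing prediction.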