Dialogue Is Not Enough to Make a Communicative BabyLM (But Neither Is Developmentally Inspired Reinforcement Learning)
Francesca Padovani, Bastian Bunzeck, Manar Ali, Omar Momen, Arianna Bisazza, Hendrik Buschmeier, Sina Zarrieß
Abstract
We investigate whether pre-training exclusively on dialogue data results in formally and functionally apt small language models. Based on this pre-trained llamalogue model, we employ a variety of fine-tuning strategies to enforce “more communicative” text generations by our models. Although our models underperform on most standard BabyLM benchmarks, they excel at dialogue continuation prediction in a minimal pair setting. While PPO fine-tuning has mixed to adversarial effects on our models, DPO fine-tuning further improves their performance on our custom dialogue benchmark.
- Anthology ID: 2025.babylm-main.29
- Volume: Proceedings of the First BabyLM Workshop
- Month: November
- Year: 2025
- Address: Suzhou, China
- Editors: Lucas Charpentier, Leshem Choshen, Ryan Cotterell, Mustafa Omer Gul, Michael Y. Hu, Jing Liu, Jaap Jumelet, Tal Linzen, Aaron Mueller, Candace Ross, Raj Sanjay Shah, Alex Warstadt, Ethan Gotlieb Wilcox, Adina Williams
- Venue: BabyLM
- Publisher: Association for Computational Linguistics
- Pages: 421–435
- URL: https://preview.aclanthology.org/ingest-emnlp/2025.babylm-main.29/
- Cite (ACL): Francesca Padovani, Bastian Bunzeck, Manar Ali, Omar Momen, Arianna Bisazza, Hendrik Buschmeier, and Sina Zarrieß. 2025. Dialogue Is Not Enough to Make a Communicative BabyLM (But Neither Is Developmentally Inspired Reinforcement Learning). In Proceedings of the First BabyLM Workshop, pages 421–435, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal): Dialogue Is Not Enough to Make a Communicative BabyLM (But Neither Is Developmentally Inspired Reinforcement Learning) (Padovani et al., BabyLM 2025)
- PDF: https://preview.aclanthology.org/ingest-emnlp/2025.babylm-main.29.pdf