Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning

Alex Warstadt, Aaron Mueller, Leshem Choshen, Ethan Wilcox, Chengxu Zhuang, Juan Ciro, Rafael Mosquera, Bhargavi Paranjape, Adina Williams, Tal Linzen, Ryan Cotterell (Editors)


Anthology ID: 2023.conll-babylm
Month: December
Year: 2023
Address: Singapore
Venue: CoNLL
Publisher: Association for Computational Linguistics
URL: https://aclanthology.org/2023.conll-babylm
PDF: https://preview.aclanthology.org/nschneid-patch-4/2023.conll-babylm.pdf

Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning
Alex Warstadt | Aaron Mueller | Leshem Choshen | Ethan Wilcox | Chengxu Zhuang | Juan Ciro | Rafael Mosquera | Bhargavi Paranjape | Adina Williams | Tal Linzen | Ryan Cotterell

Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Alex Warstadt | Aaron Mueller | Leshem Choshen | Ethan Wilcox | Chengxu Zhuang | Juan Ciro | Rafael Mosquera | Bhargavi Paranjape | Adina Williams | Tal Linzen | Ryan Cotterell

GPT-wee: How Small Can a Small Language Model Really Get?
Bastian Bunzeck | Sina Zarrieß

Tiny Language Models Enriched with Multimodal Knowledge from Multiplex Networks
Clayton Fields | Osama Natouf | Andrew McMains | Catherine Henry | Casey Kennington

Mini Minds: Exploring Bebeshka and Zlata Baby Models
Irina Proskurina | Guillaume Metzler | Julien Velcin

Grammar induction pretraining for language modeling in low resource contexts
Xuanda Chen | Eva Portelance

ChapGTP, ILLC’s Attempt at Raising a BabyLM: Improving Data Efficiency by Automatic Task Formation
Jaap Jumelet | Michael Hanna | Marianne de Heer Kloots | Anna Langedijk | Charlotte Pouw | Oskar van der Wal

Penn & BGU BabyBERTa+ for Strict-Small BabyLM Challenge
Yahan Yang | Elior Sulem | Insup Lee | Dan Roth

Too Much Information: Keeping Training Simple for BabyLMs
Lukas Edman | Lisa Bylinina

Can training neural language models on a curriculum with developmentally plausible data improve alignment with human reading behavior?
Aryaman Chobey | Oliver Smith | Anzi Wang | Grusha Prasad

CLIMB – Curriculum Learning for Infant-inspired Model Building
Richard Diehl Martinez | Hope McGovern | Zebulon Goriely | Christopher Davis | Andrew Caines | Paula Buttery | Lisa Beinborn

Acquiring Linguistic Knowledge from Multimodal Input
Theodor Amariucai | Alexander Scott Warstadt

Large GPT-like Models are Bad Babies: A Closer Look at the Relationship between Linguistic Competence and Psycholinguistic Measures
Julius Steuer | Marius Mosbach | Dietrich Klakow

Baby’s CoThought: Leveraging Large Language Models for Enhanced Reasoning in Compact Models
Zheyu Zhang | Han Yang | Bolei Ma | David Rügamer | Ercong Nie

ToddlerBERTa: Exploiting BabyBERTa for Grammar Learning and Language Understanding
Ömer Veysel Çağatan

CogMemLM: Human-Like Memory Mechanisms Improve Performance and Cognitive Plausibility of LLMs
Lukas Thoma | Ivonne Weyers | Erion Çano | Stefan Schweter | Jutta L Mueller | Benjamin Roth

BabyStories: Can Reinforcement Learning Teach Baby Language Models to Write Better Stories?
Xingmeng Zhao | Tongnian Wang | Sheri Osborn | Anthony Rios

Byte-ranked Curriculum Learning for BabyLM Strict-small Shared Task 2023
Justin DeBenedetto

McGill BabyLM Shared Task Submission: The Effects of Data Formatting and Structural Biases
Ziling Cheng | Rahul Aralikatte | Ian Porada | Cesare Spinoso-Di Piano | Jackie CK Cheung

Mean BERTs make erratic language teachers: the effectiveness of latent bootstrapping in low-resource settings
David Samuel

Not all layers are equally as important: Every Layer Counts BERT
Lucas Georges Gabriel Charpentier | David Samuel

WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words
Lukas Wolf | Klemen Kotar | Greta Tuckute | Eghbal Hosseini | Tamar I. Regev | Ethan Gotlieb Wilcox | Alexander Scott Warstadt

A surprisal oracle for active curriculum language modeling
Xudong Hong | Sharid Loáiciga | Asad Sayeed

Mmi01 at The BabyLM Challenge: Linguistically Motivated Curriculum Learning for Pretraining in Low-Resource Settings
Maggie Mi

Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty
Inar Timiryasov | Jean-Loup Tastet

BabyLM Challenge: Curriculum learning based on sentence complexity approximating language acquisition
Miyu Oba | Akari Haga | Akiyo Fukatsu | Yohei Oseki

Better Together: Jointly Using Masked Latent Semantic Modeling and Masked Language Modeling for Sample Efficient Pre-training
Gábor Berend

Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways
Venkata S Govindarajan | Juan Diego Rodriguez | Kaj Bostrom | Kyle Mahowald

Towards more Human-like Language Models based on Contextualizer Pretraining Strategy
Chenghao Xiao | G Thomas Hudson | Noura Al Moubayed

Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building
Omar Momen | David Arps | Laura Kallmeyer

Pre-training LLMs using human-like development data corpus
Khushi Bhardwaj | Raj Sanjay Shah | Sashank Varma

On the effect of curriculum learning with developmental data for grammar acquisition
Mattia Opper | J. Morrison | N. Siddharth

Optimizing GPT-2 Pretraining on BabyLM Corpus with Difficulty-based Sentence Reordering
Nasim Borazjanizadeh