@inproceedings{bhardwaj-etal-2023-pre,
    title = "Pre-training {LLM}s using human-like development data corpus",
    author = "Bhardwaj, Khushi  and
      Shah, Raj Sanjay  and
      Varma, Sashank",
    editor = "Warstadt, Alex  and
      Mueller, Aaron  and
      Choshen, Leshem  and
      Wilcox, Ethan  and
      Zhuang, Chengxu  and
      Ciro, Juan  and
      Mosquera, Rafael  and
      Paranjape, Bhargavi  and
      Williams, Adina  and
      Linzen, Tal  and
      Cotterell, Ryan",
    booktitle = "Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2023.conll-babylm.30/",
    doi = "10.18653/v1/2023.conll-babylm.30",
    pages = "339--345"
}

Markdown (Informal)
[Pre-training LLMs using human-like development data corpus](https://preview.aclanthology.org/ingest-emnlp/2023.conll-babylm.30/) (Bhardwaj et al., CoNLL-BabyLM 2023)
ACL
- Khushi Bhardwaj, Raj Sanjay Shah, and Sashank Varma. 2023. Pre-training LLMs using human-like development data corpus. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, pages 339–345, Singapore. Association for Computational Linguistics.