@inproceedings{agrawal-singh-2023-corpus,
    title = "Corpus Complexity Matters in Pretraining Language Models",
    author = "Agrawal, Ameeta  and
      Singh, Suresh",
    editor = "Sadat Moosavi, Nafise  and
      Gurevych, Iryna  and
      Hou, Yufang  and
      Kim, Gyuwan  and
      Kim, Young Jin  and
      Schuster, Tal  and
      Agrawal, Ameeta",
    booktitle = "Proceedings of the Fourth Workshop on Simple and Efficient Natural Language Processing (SustaiNLP)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada (Hybrid)",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2023.sustainlp-1.20/",
    doi = "10.18653/v1/2023.sustainlp-1.20",
    pages = "257--263"
}

Markdown (Informal)
[Corpus Complexity Matters in Pretraining Language Models](https://preview.aclanthology.org/ingest-emnlp/2023.sustainlp-1.20/) (Agrawal & Singh, SustaiNLP 2023)
ACL
- Ameeta Agrawal and Suresh Singh. 2023. Corpus Complexity Matters in Pretraining Language Models. In Proceedings of the Fourth Workshop on Simple and Efficient Natural Language Processing (SustaiNLP), pages 257–263, Toronto, Canada (Hybrid). Association for Computational Linguistics.