@inproceedings{khanuja-etal-2021-mergedistill,
    title = "{M}erge{D}istill: {M}erging Language Models using Pre-trained Distillation",
    author = "Khanuja, Simran  and
      Johnson, Melvin  and
      Talukdar, Partha",
    editor = "Zong, Chengqing  and
      Xia, Fei  and
      Li, Wenjie  and
      Navigli, Roberto",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2021.findings-acl.254/",
    doi = "10.18653/v1/2021.findings-acl.254",
    pages = "2874--2887"
}