WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words
Lukas Wolf, Klemen Kotar, Greta Tuckute, Eghbal Hosseini, Tamar I. Regev, Ethan Gotlieb Wilcox, Alexander Scott Warstadt
- Anthology ID: 2023.conll-babylm.21
- Volume: Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Alex Warstadt, Aaron Mueller, Leshem Choshen, Ethan Wilcox, Chengxu Zhuang, Juan Ciro, Rafael Mosquera, Bhargavi Paranjabe, Adina Williams, Tal Linzen, Ryan Cotterell
- Venue: CoNLL
- Publisher: Association for Computational Linguistics
- Pages: 253–258
- URL: https://aclanthology.org/2023.conll-babylm.21
- DOI: 10.18653/v1/2023.conll-babylm.21
- Cite (ACL): Lukas Wolf, Klemen Kotar, Greta Tuckute, Eghbal Hosseini, Tamar I. Regev, Ethan Gotlieb Wilcox, and Alexander Scott Warstadt. 2023. WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, pages 253–258, Singapore. Association for Computational Linguistics.
- Cite (Informal): WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words (Wolf et al., CoNLL 2023)
- PDF: https://preview.aclanthology.org/proper-vol2-ingestion/2023.conll-babylm.21.pdf