- Anthology ID: 2025.swisstext-1.4
- Volume: Proceedings of the 10th edition of the Swiss Text Analytics Conference
- Month: May
- Year: 2025
- Address: Winterthur, Switzerland
- Editors: Jonathan Gerber, Mark Cieliebak, Don Tuggener, Manuela Hürlimann
- Venue: SwissText
- SIG: SIGSEM
- Publisher: Association for Computational Linguistics
- Pages: 31–56
- URL: https://preview.aclanthology.org/more-markup/2025.swisstext-1.4/
- Cite (ACL): Bettina Messmer, Vinko Sabolčec, and Martin Jaggi. 2025. Enhancing Multilingual LLM Pretraining with Model-Based Data Selection. In Proceedings of the 10th edition of the Swiss Text Analytics Conference, pages 31–56, Winterthur, Switzerland. Association for Computational Linguistics.
- Cite (Informal): Enhancing Multilingual LLM Pretraining with Model-Based Data Selection (Messmer et al., SwissText 2025)
- PDF: https://preview.aclanthology.org/more-markup/2025.swisstext-1.4.pdf