Samy Bengio
2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li | Mohammadreza Armandpour | Seyed Iman Mirzadeh | Sachin Mehta | Vaishaal Shankar | Raviteja Vemulapalli | Samy Bengio | Oncel Tuzel | Mehrdad Farajtabar | Hadi Pouransari | Fartash Faghri
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) trained on historical web data inevitably become outdated. We investigate evaluation strategies and update methods for LLMs as new data becomes available. We introduce a web-scale dataset for time-continual pretraining of LLMs derived from 114 dumps of Common Crawl (CC) – orders of magnitude larger than previous continual language modeling benchmarks. We also design time-stratified evaluations across both general CC data and specific domains (Wikipedia, StackExchange, and code documentation) to assess how well various continual learning methods adapt to new data while retaining past knowledge. Our findings demonstrate that, on general CC data, autoregressive meta-schedules combined with a fixed-ratio replay of older data can achieve comparable held-out loss to re-training from scratch, while requiring significantly less computation (2.6x). However, the optimal balance between incorporating new data and replaying old data differs as replay is crucial to avoid forgetting on generic web data but less so on specific domains.
2018
Tensor2Tensor for Neural Machine Translation
Ashish Vaswani | Samy Bengio | Eugene Brevdo | Francois Chollet | Aidan Gomez | Stephan Gouws | Llion Jones | Łukasz Kaiser | Nal Kalchbrenner | Niki Parmar | Ryan Sepassi | Noam Shazeer | Jakob Uszkoreit
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
2016
Generating Sentences from a Continuous Space
Samuel R. Bowman | Luke Vilnis | Oriol Vinyals | Andrew Dai | Rafal Jozefowicz | Samy Bengio
Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning
2006
Co-authors
- Mohammadreza Armandpour 1
- Samuel Bowman 1
- Eugene Brevdo 1
- Francois Chollet 1
- Walter Daelemans 1
- Ido Dagan 1
- Andrew Dai 1
- Fartash Faghri 1
- Mehrdad Farajtabar 1
- Oren Glickman 1
- Aidan Gomez 1
- Stephan Gouws 1
- Llion Jones 1
- Rafal Jozefowicz 1
- Łukasz Kaiser 1
- Nal Kalchbrenner 1
- Mikaela Keller 1
- Jeffrey Li 1
- Sachin Mehta 1
- Seyed Iman Mirzadeh 1
- Niki Parmar 1
- Hadi Pouransari 1
- Ryan Sepassi 1
- Vaishaal Shankar 1
- Noam Shazeer 1
- Oncel Tuzel 1
- Jakob Uszkoreit 1
- Ashish Vaswani 1
- Raviteja Vemulapalli 1
- Luke Vilnis 1
- Oriol Vinyals 1