Dirk Groeneveld
2025
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens
Jiacheng Liu | Taylor Blanton | Yanai Elazar | Sewon Min | Yen-Sung Chen | Arnavi Chheda-Kothary | Huy Tran | Byron Bischoff | Eric Marsh | Michael Schmitz | Cassidy Trier | Aaron Sarnat | Jenna James | Jon Borchardt | Bailey Kuehl | Evie Yu-Yen Cheng | Karen Farley | Taira Anderson | David Albright | Carissa Schoenick | Luca Soldaini | Dirk Groeneveld | Rock Yuren Pang | Pang Wei Koh | Noah A. Smith | Sophie Lebrecht | Yejin Choi | Hannaneh Hajishirzi | Ali Farhadi | Jesse Dodge
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
2024
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Luca Soldaini | Rodney Kinney | Akshita Bhagia | Dustin Schwenk | David Atkinson | Russell Authur | Ben Bogin | Khyathi Chandu | Jennifer Dumas | Yanai Elazar | Valentin Hofmann | Ananya Jha | Sachin Kumar | Li Lucy | Xinxi Lyu | Nathan Lambert | Ian Magnusson | Jacob Morrison | Niklas Muennighoff | Aakanksha Naik | Crystal Nam | Matthew Peters | Abhilasha Ravichander | Kyle Richardson | Zejiang Shen | Emma Strubell | Nishant Subramani | Oyvind Tafjord | Evan Walsh | Luke Zettlemoyer | Noah Smith | Hannaneh Hajishirzi | Iz Beltagy | Dirk Groeneveld | Jesse Dodge | Kyle Lo
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld | Iz Beltagy | Evan Walsh | Akshita Bhagia | Rodney Kinney | Oyvind Tafjord | Ananya Jha | Hamish Ivison | Ian Magnusson | Yizhong Wang | Shane Arora | David Atkinson | Russell Authur | Khyathi Chandu | Arman Cohan | Jennifer Dumas | Yanai Elazar | Yuling Gu | Jack Hessel | Tushar Khot | William Merrill | Jacob Morrison | Niklas Muennighoff | Aakanksha Naik | Crystal Nam | Matthew Peters | Valentina Pyatkin | Abhilasha Ravichander | Dustin Schwenk | Saurabh Shah | William Smith | Emma Strubell | Nishant Subramani | Mitchell Wortsman | Pradeep Dasigi | Nathan Lambert | Kyle Richardson | Luke Zettlemoyer | Jesse Dodge | Kyle Lo | Luca Soldaini | Noah Smith | Hannaneh Hajishirzi
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2022
Continued Pretraining for Better Zero- and Few-Shot Promptability
Zhaofeng Wu | Robert L Logan IV | Pete Walsh | Akshita Bhagia | Dirk Groeneveld | Sameer Singh | Iz Beltagy
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
2021
Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus
Jesse Dodge | Maarten Sap | Ana Marasović | William Agnew | Gabriel Ilharco | Dirk Groeneveld | Margaret Mitchell | Matt Gardner
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
2020
A Simple Yet Strong Pipeline for HotpotQA
Dirk Groeneveld | Tushar Khot | Mausam | Ashish Sabharwal
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
2018
Construction of the Literature Graph in Semantic Scholar
Waleed Ammar | Dirk Groeneveld | Chandra Bhagavatula | Iz Beltagy | Miles Crawford | Doug Downey | Jason Dunkelberger | Ahmed Elgohary | Sergey Feldman | Vu Ha | Rodney Kinney | Sebastian Kohlmeier | Kyle Lo | Tyler Murray | Hsu-Han Ooi | Matthew Peters | Joanna Power | Sam Skjonsberg | Lucy Lu Wang | Chris Wilhelm | Zheng Yuan | Madeleine van Zuylen | Oren Etzioni
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)
2016
IKE - An Interactive Tool for Knowledge Extraction
Bhavana Dalvi | Sumithra Bhakthavatsalam | Chris Clark | Peter Clark | Oren Etzioni | Anthony Fader | Dirk Groeneveld
Proceedings of the 5th Workshop on Automated Knowledge Base Construction
Co-authors
- Iz Beltagy 4
- Jesse Dodge 4
- Akshita Bhagia 3
- Yanai Elazar 3
- Hannaneh Hajishirzi 3
- Rodney Kinney 3
- Kyle Lo 3
- Matthew E. Peters 3
- Noah A. Smith 3
- Luca Soldaini 3
- David Atkinson 2
- Russell Authur 2
- Khyathi Chandu 2
- Jennifer Dumas 2
- Oren Etzioni 2
- Ananya Jha 2
- Tushar Khot 2
- Nathan Lambert 2
- Ian Magnusson 2
- Jacob Morrison 2
- Niklas Muennighoff 2
- Aakanksha Naik 2
- Crystal Nam 2
- Abhilasha Ravichander 2
- Kyle Richardson 2
- Dustin Schwenk 2
- Emma Strubell 2
- Nishant Subramani 2
- Oyvind Tafjord 2
- Evan Walsh 2
- Luke Zettlemoyer 2
- Mausam 1
- William Agnew 1
- David Albright 1
- Waleed Ammar 1
- Taira Anderson 1
- Shane Arora 1
- Chandra Bhagavatula 1
- Sumithra Bhakthavatsalam 1
- Byron Bischoff 1
- Taylor Blanton 1
- Ben Bogin 1
- Jon Borchardt 1
- Yen-Sung Chen 1
- Evie Yu-Yen Cheng 1
- Arnavi Chheda-Kothary 1
- Yejin Choi 1
- Chris Clark 1
- Peter Clark 1
- Arman Cohan 1
- Miles Crawford 1
- Bhavana Dalvi 1
- Pradeep Dasigi 1
- Doug Downey 1
- Jason Dunkelberger 1
- Ahmed Elgohary 1
- Anthony Fader 1
- Ali Farhadi 1
- Karen Farley 1
- Sergey Feldman 1
- Matt Gardner 1
- Yuling Gu 1
- Vu Ha 1
- Jack Hessel 1
- Valentin Hofmann 1
- Gabriel Ilharco 1
- Hamish Ivison 1
- Jenna James 1
- Pang Wei Koh 1
- Sebastian Kohlmeier 1
- Bailey Kuehl 1
- Sachin Kumar 1
- Sophie Lebrecht 1
- Jiacheng Liu 1
- Robert L. Logan IV 1
- Li Lucy 1
- Xinxi Lyu 1
- Ana Marasović 1
- Eric Marsh 1
- William Merrill 1
- Sewon Min 1
- Margaret Mitchell 1
- Tyler Murray 1
- Hsu-Han Ooi 1
- Rock Yuren Pang 1
- Joanna Power 1
- Valentina Pyatkin 1
- Ashish Sabharwal 1
- Maarten Sap 1
- Aaron Sarnat 1
- Michael Schmitz 1
- Carissa Schoenick 1
- Saurabh Shah 1
- Zejiang Shen 1
- Sameer Singh 1
- Sam Skjonsberg 1
- William Smith 1
- Huy Tran 1
- Cassidy Trier 1
- Pete Walsh 1
- Yizhong Wang 1
- Lucy Lu Wang 1
- Chris Wilhelm 1
- Mitchell Wortsman 1
- Zhaofeng Wu 1
- Zheng Yuan 1
- Madeleine van Zuylen 1