Michael Schmitz
2025
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens
Jiacheng Liu | Taylor Blanton | Yanai Elazar | Sewon Min | Yen-Sung Chen | Arnavi Chheda-Kothary | Huy Tran | Byron Bischoff | Eric Marsh | Michael Schmitz | Cassidy Trier | Aaron Sarnat | Jenna James | Jon Borchardt | Bailey Kuehl | Evie Yu-Yen Cheng | Karen Farley | Taira Anderson | David Albright | Carissa Schoenick | Luca Soldaini | Dirk Groeneveld | Rock Yuren Pang | Pang Wei Koh | Noah A. Smith | Sophie Lebrecht | Yejin Choi | Hannaneh Hajishirzi | Ali Farhadi | Jesse Dodge
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
We present OLMoTrace, the first system that traces the outputs of language models back to their full, multi-trillion-token training data in real time. OLMoTrace finds and shows verbatim matches between segments of language model output and documents in the training text corpora. Powered by an extended version of infini-gram (Liu et al., 2024), our system returns tracing results within a few seconds. OLMoTrace can help users understand the behavior of language models through the lens of their training data. We showcase how it can be used to explore fact checking, hallucination, and the creativity of language models. OLMoTrace is publicly available and fully open-source.
2018
AllenNLP: A Deep Semantic Natural Language Processing Platform
Matt Gardner | Joel Grus | Mark Neumann | Oyvind Tafjord | Pradeep Dasigi | Nelson F. Liu | Matthew Peters | Michael Schmitz | Luke Zettlemoyer
Proceedings of Workshop for NLP Open Source Software (NLP-OSS)
Modern natural language processing (NLP) research requires writing code. Ideally this code would provide a precise definition of the approach, easy repeatability of results, and a basis for extending the research. However, many research codebases bury high-level parameters under implementation details, are challenging to run and debug, and are difficult enough to extend that they are more likely to be rewritten. This paper describes AllenNLP, a library for applying deep learning methods to NLP research that addresses these issues with easy-to-use command-line tools, declarative configuration-driven experiments, and modular NLP abstractions. AllenNLP has already increased the rate of research experimentation and the sharing of NLP components at the Allen Institute for Artificial Intelligence, and we are working to have the same impact across the field.
2012
Co-authors
- Mausam 1
- David Albright 1
- Taira Anderson 1
- Robert Bart 1
- Byron Bischoff 1
- Taylor Blanton 1
- Jon Borchardt 1
- Yen-Sung Chen 1
- Evie Yu-Yen Cheng 1
- Arnavi Chheda-Kothary 1
- Yejin Choi 1
- Pradeep Dasigi 1
- Jesse Dodge 1
- Yanai Elazar 1
- Oren Etzioni 1
- Ali Farhadi 1
- Karen Farley 1
- Matt Gardner 1
- Dirk Groeneveld 1
- Joel Grus 1
- Hannaneh Hajishirzi 1
- Jenna James 1
- Pang Wei Koh 1
- Bailey Kuehl 1
- Sophie Lebrecht 1
- Jiacheng Liu 1
- Nelson F. Liu 1
- Eric Marsh 1
- Sewon Min 1
- Mark Neumann 1
- Rock Yuren Pang 1
- Matthew E. Peters 1
- Aaron Sarnat 1
- Carissa Schoenick 1
- Noah A. Smith 1
- Stephen Soderland 1
- Luca Soldaini 1
- Oyvind Tafjord 1
- Huy Tran 1
- Cassidy Trier 1
- Luke Zettlemoyer 1