Breck Baldwin
2025
Non-Determinism of “Deterministic” LLM System Settings in Hosted Environments
Berk Atıl | Sarp Aykent | Alexa Chittams | Lisheng Fu | Rebecca J. Passonneau | Evan Radcliffe | Guru Rajan Rajagopal | Adam Sloan | Tomasz Tudrej | Ferhan Ture | Zhe Wu | Lixinyu Xu | Breck Baldwin
Proceedings of the 5th Workshop on Evaluation and Comparison of NLP Systems
LLM (large language model) users of hosted providers commonly notice that outputs can vary for the same inputs under settings expected to be deterministic. While it is difficult to get exact statistics, recent reports on specialty news sites and discussion boards suggest that among users in all communities, the majority of LLM usage today is through cloud-based APIs. Yet the questions of how pervasive non-determinism is, and how much it affects performance results, have not to our knowledge been systematically investigated. We apply five API-based LLMs configured to be deterministic to eight diverse tasks across 10 runs. Experiments reveal accuracy variations of up to 15% across runs, with a gap of up to 70% between best possible performance and worst possible performance. No LLM consistently delivers the same outputs or accuracies, regardless of task. We speculate about the sources of non-determinism such as input buffer packing across multiple jobs. To better quantify our observations, we introduce metrics focused on quantifying determinism, TARr@N for the total agreement rate at N runs over raw output, and TARa@N for total agreement rate of parsed-out answers. Our code and data will be publicly available at https://github.com/Anonymous.
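The two agreement metrics named in the abstract can be illustrated with a small sketch. The abstract does not give the formula, so this assumes that the total agreement rate is the fraction of test items for which all N runs produce identical results: compared as raw strings for TARr@N, and after answer extraction for TARa@N. The function names, the sample outputs, and the multiple-choice answer parser are hypothetical and are not taken from the authors' code.

```python
import re
from typing import Callable, Hashable, Optional, Sequence


def total_agreement_rate(
    runs: Sequence[Sequence[str]],
    normalize: Callable[[str], Hashable] = lambda s: s,
) -> float:
    """Fraction of items on which every run agrees (assumed definition).

    runs[i][j] is the output of run i on item j. `normalize` maps a raw
    output to the value that is compared across runs.
    """
    if not runs or not runs[0]:
        raise ValueError("need at least one run with at least one item")
    n_items = len(runs[0])
    if any(len(run) != n_items for run in runs):
        raise ValueError("every run must cover the same items")
    # zip(*runs) regroups the outputs item by item across runs.
    agreeing = sum(
        1
        for outputs in zip(*runs)
        if len({normalize(o) for o in outputs}) == 1
    )
    return agreeing / n_items


def parse_choice(raw: str) -> Optional[str]:
    """Hypothetical parser: last standalone letter A-D in the output."""
    matches = re.findall(r"\b([A-D])\b", raw)
    return matches[-1] if matches else None


# Three runs over four multiple-choice items (made-up outputs).
runs = [
    ["The answer is B.", "Answer: A", "Answer: C", "Answer: D"],
    ["Answer: B", "Answer: A", "Answer: C", "Answer: D"],
    ["Final answer: B", "Answer: A", "Answer:  C", "Answer: A"],
]

tar_raw = total_agreement_rate(runs)                  # TARr@3 under this assumption
tar_ans = total_agreement_rate(runs, parse_choice)    # TARa@3 under this assumption
print(f"TARr@3 = {tar_raw:.2f}, TARa@3 = {tar_ans:.2f}")
```

Under this reading, raw agreement can never exceed parsed-answer agreement, since outputs that match character for character also yield the same parsed answer. In the made-up data, differences in wording and whitespace break raw agreement on items where the chosen answer is the same in every run.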
2003
Alias-i Threat Trackers
Breck Baldwin | Bob Carpenter | Aaron Ross
Companion Volume of the Proceedings of HLT-NAACL 2003 - Demonstrations
1999
Cross-Document Event Coreference: Annotations, Experiments, and Observations
Amit Bagga | Breck Baldwin
Coreference and Its Applications
1998
Entity-Based Cross-Document Coreferencing Using the Vector Space Model
Amit Bagga | Breck Baldwin
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics
Description of the UPENN CAMP System as Used for Coreference
Breck Baldwin | Tom Morton | Amit Bagga | Jason Baldridge | Raman Chandraseker | Alexis Dimitriadis | Kieran Snyder | Magdalena Wolska
Seventh Message Understanding Conference (MUC-7): Proceedings of a Conference Held in Fairfax, Virginia, April 29 - May 1, 1998
Entity-Based Cross-Document Coreferencing Using the Vector Space Model
Amit Bagga | Breck Baldwin
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1
Coreference as the Foundations for Link Analysis over Free Text Databases
Breck Baldwin
Content Visualization and Intermedia Representations (CVIR’98)
Dynamic Coreference-Based Summarization
Breck Baldwin | Thomas S. Morton
Proceedings of the Third Conference on Empirical Methods for Natural Language Processing
Overview of the University of Pennsylvania’s TIPSTER Project
Breck Baldwin | Thomas S. Morton | Amit Bagga
TIPSTER TEXT PROGRAM PHASE III: Proceedings of a Workshop held at Baltimore, Maryland, October 13-15, 1998
1997
EAGLE: An Extensible Architecture for General Linguistic Engineering
Breck Baldwin | Christine Doran | Jeffrey C. Reynar | Michael Niv | B. Srinivas
Fifth Conference on Applied Natural Language Processing: Descriptions of System Demonstrations and Videos
CogNIAC: high precision coreference with limited knowledge and linguistic resources
Breck Baldwin
Operational Factors in Practical, Robust Anaphora Resolution for Unrestricted Texts
1995
University of Pennsylvania: Description of the University of Pennsylvania System Used for MUC-6
Breck Baldwin | Jeff Reynar | Mike Collins | Jason Eisner | Adwait Ratnaparkhi | Joseph Rosenzweig | Anoop Sarkar | Srinivas
Sixth Message Understanding Conference (MUC-6): Proceedings of a Conference Held in Columbia, Maryland, November 6-8, 1995
1992
Co-authors
- Amit Bagga 5
- Srinivas Bangalore 2
- Thomas S. Morton 2
- Berk Atıl 1
- Sarp Aykent 1
- Jason Baldridge 1
- Bob Carpenter 1
- Raman Chandrasekar 1
- Alexa Chittams 1
- Michael Collins 1
- Alexis Dimitriadis 1
- Christine Doran 1
- Jason Eisner 1
- Lisheng Fu 1
- Tom Morton 1
- Michael Niv 1
- Rebecca J. Passonneau 1
- Evan Radcliffe 1
- Guru Rajan Rajagopal 1
- Adwait Ratnaparkhi 1
- Jeffrey C. Reynar 1
- Jeff Reynar 1
- Joseph Rosenzweig 1
- Aaron Ross 1
- Anoop Sarkar 1
- Adam Sloan 1
- Kieran Snyder 1
- Tomasz Tudrej 1
- Ferhan Türe 1
- Bonnie Webber 1
- Magdalena Wolska 1
- Zhe Wu 1
- Lixinyu Xu 1