Mitash Ashish Shah


2025

TruthTorchLM: A Comprehensive Library for Predicting Truthfulness in LLM Outputs
Duygu Nur Yaldiz | Yavuz Faruk Bakman | Sungmin Kang | Alperen Öziş | Hayrettin Eren Yildiz | Mitash Ashish Shah | Zhiqi Huang | Anoop Kumar | Alfy Samuel | Daben Liu | Sai Praneeth Karimireddy | Salman Avestimehr
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Generative Large Language Models (LLMs) inevitably produce untruthful responses. Accurately predicting the truthfulness of these outputs is critical, especially in high-stakes settings. To accelerate research in this domain and make truthfulness prediction methods more accessible, we introduce TruthTorchLM, an open-source, comprehensive Python library featuring over 30 truthfulness prediction methods, which we refer to as Truth Methods. Unlike existing toolkits such as Guardrails, which focus solely on document-grounded verification, or LM-Polygraph, which is limited to uncertainty-based methods, TruthTorchLM offers a broad and extensible collection of techniques. These methods span diverse trade-offs in computational cost, access level (e.g., black-box vs. white-box), grounding document requirements, and supervision type (self-supervised or supervised). TruthTorchLM is seamlessly compatible with both HuggingFace and LiteLLM, enabling support for both locally hosted and API-based models. It also provides a unified interface for generation, evaluation, calibration, and long-form truthfulness prediction, along with a flexible framework for extending the library with new methods. We evaluate representative Truth Methods on three datasets: TriviaQA, GSM8K, and FactScore-Bio.
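To make the notion of a self-supervised, black-box Truth Method concrete, below is a minimal sketch of one such technique: sampling-based self-consistency, where a truth value is the fraction of sampled answers that agree with the greedy answer. This is not TruthTorchLM's API; the model name and all function names here are illustrative assumptions, and it uses only plain HuggingFace transformers calls.

```python
# Illustrative sketch of a self-supervised, black-box truth method
# (sampling-based self-consistency). NOT TruthTorchLM's API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-3.2-1B-Instruct"  # assumption: any chat model works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype="auto")

def answer(question: str, do_sample: bool, **gen_kwargs) -> str:
    """Generate a short answer to `question` using the model's chat template."""
    messages = [{"role": "user", "content": question}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    out = model.generate(inputs, max_new_tokens=32, do_sample=do_sample, **gen_kwargs)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0, inputs.shape[1]:], skip_special_tokens=True).strip()

def consistency_truth_value(question: str, n_samples: int = 5) -> float:
    """Truth value in [0, 1]: fraction of sampled answers consistent with
    the greedy answer. A crude stand-in for the richer consistency- and
    uncertainty-based methods catalogued in the paper."""
    reference = answer(question, do_sample=False)
    samples = [
        answer(question, do_sample=True, temperature=1.0) for _ in range(n_samples)
    ]
    matches = sum(
        reference.lower() in s.lower() or s.lower() in reference.lower()
        for s in samples
    )
    return matches / n_samples

print(consistency_truth_value("Which city hosted the 1992 Summer Olympics?"))
```

The substring match used here is a deliberately crude agreement check; published consistency methods typically replace it with NLI-based or semantic-clustering comparisons, which is the kind of variation a library like the one described would encapsulate behind a common interface.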