Statistical inference on black-box generative models in the data kernel perspective space

Hayden Helm, Aranyak Acharyya, Youngser Park, Brandon Duderstadt, Carey Priebe


Abstract
Generative models are capable of producing human-expert-level content across a variety of topics and domains. As the impact of generative models grows, it is necessary to develop statistical methods to understand collections of available models. These methods are particularly important in settings where the user may not have access to information about a model's pre-training data, weights, or other relevant model-level covariates. In this paper, we extend recent results on representations of black-box generative models to model-level statistical inference tasks. We demonstrate that these model-level representations are effective for multiple inference tasks.
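The abstract describes representing black-box models by their outputs alone. As a rough illustration of that idea (not the paper's exact construction; the averaging step, distance choice, and all names below are assumptions), one can embed each model's responses to a shared query set, summarize each model by a single vector, and project the models into a low-dimensional "perspective space" with classical multidimensional scaling:

```python
# Hypothetical sketch of a model-level representation pipeline in the spirit
# of the data kernel perspective space; steps are assumptions for illustration.
import numpy as np

def model_representations(response_embeddings):
    """Average each model's response embeddings over a shared query set.

    response_embeddings: array of shape (n_models, n_queries, d), where
    entry [i, j] is the embedding of model i's response to query j.
    Returns an (n_models, d) matrix of model-level vectors.
    """
    return response_embeddings.mean(axis=1)

def classical_mds(X, k=2):
    """Embed models into k dimensions via classical multidimensional
    scaling on pairwise Euclidean distances between model vectors."""
    D2 = np.square(np.linalg.norm(X[:, None] - X[None, :], axis=-1))
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ D2 @ J                 # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)        # ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]    # keep the top-k eigenpairs
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# Toy example: 4 black-box models, 10 queries, 8-dimensional embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 10, 8))
coords = classical_mds(model_representations(emb), k=2)
print(coords.shape)  # (4, 2)
```

Once the models live in this low-dimensional space, standard multivariate inference tools (hypothesis tests, clustering, regression on model-level covariates) can be applied to their coordinates.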
Anthology ID: 2025.findings-acl.204
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3955–3970
URL: https://preview.aclanthology.org/mtsummit-25-ingestion/2025.findings-acl.204/
DOI: 10.18653/v1/2025.findings-acl.204
Cite (ACL): Hayden Helm, Aranyak Acharyya, Youngser Park, Brandon Duderstadt, and Carey Priebe. 2025. Statistical inference on black-box generative models in the data kernel perspective space. In Findings of the Association for Computational Linguistics: ACL 2025, pages 3955–3970, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Statistical inference on black-box generative models in the data kernel perspective space (Helm et al., Findings 2025)
PDF: https://preview.aclanthology.org/mtsummit-25-ingestion/2025.findings-acl.204.pdf