Hyperion: Private Token Sampling with Homomorphic Encryption

Lawrence Lim, Jiaming Liu, Vikas Kalagi, Divyakant Agrawal, Amr El Abbadi


Abstract
A promising direction for enabling private queries to large language models (LLMs) is homomorphic encryption (HE). An open problem is performing token sampling under HE. In this paper, we introduce Hyperion, an efficient HE algorithm for inverse transform sampling that enables private token sampling with a comparison depth of 1, O(1) amortized comparisons, and O(log n) rotations. We implement our approach and demonstrate that it samples a token from a 32k-token vocabulary in 0.14 seconds (≈ 4.4 μs per token) on GPU, a 100× latency improvement over prior work.
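For readers unfamiliar with inverse transform sampling, the following is a minimal plaintext sketch in Python. It is not the Hyperion algorithm and involves no encryption; the function name `inverse_transform_sample` and the example distribution are illustrative assumptions. It only shows the idea of comparing every cumulative probability against a single uniform draw, where no comparison depends on the result of another, which is loosely analogous to a single layer of comparisons. The last lines reproduce the per-token figure from the abstract under the assumption that "32k" means 32,000.

```python
from itertools import accumulate


def inverse_transform_sample(probs, u):
    """Return the index of the first token whose cumulative probability exceeds u.

    probs: token probabilities summing to 1 (plaintext, illustrative only).
    u:     a uniform random draw in [0, 1).
    """
    # Running sum of the probabilities, i.e. the discrete CDF.
    cdf = list(accumulate(probs))
    # Each CDF entry is compared against u independently of the others.
    indicators = [1 if c <= u else 0 for c in cdf]
    # The count of entries at or below u is the sampled index.
    # The min() guards against floating-point round-off in the last CDF entry.
    return min(sum(indicators), len(probs) - 1)


# Hypothetical 4-token distribution; its CDF is roughly [0.1, 0.3, 0.6, 1.0].
example_probs = [0.1, 0.2, 0.3, 0.4]

# Amortized time per vocabulary entry, assuming "32k" means 32,000 tokens.
per_token_us = 0.14 / 32_000 * 1e6
```

With the example distribution, a draw of `u = 0.35` falls between the second and third CDF entries, so the sketch returns index 2. How the prefix sums, comparisons, and final selection are carried out on ciphertexts is the subject of the paper itself and is not captured here.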
Anthology ID:
2026.acl-long.644
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
14150–14159
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.644/
Cite (ACL):
Lawrence Lim, Jiaming Liu, Vikas Kalagi, Divyakant Agrawal, and Amr El Abbadi. 2026. Hyperion: Private Token Sampling with Homomorphic Encryption. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 14150–14159, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Hyperion: Private Token Sampling with Homomorphic Encryption (Lim et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.644.pdf
Checklist:
 2026.acl-long.644.checklist.pdf