Generative Semantic Hashing Enhanced via Boltzmann Machines

Lin Zheng, Qinliang Su, Dinghan Shen, Changyou Chen


Abstract
Generative semantic hashing is a promising technique for large-scale information retrieval thanks to its fast retrieval speed and small memory footprint. For the tractability of training, existing generative-hashing methods mostly assume a factorized form for the posterior distribution, enforcing independence among the bits of hash codes. From the perspectives of both model representation and code space size, independence is not always the best assumption. In this paper, to introduce correlations among the bits of hash codes, we propose to employ the distribution of a Boltzmann machine as the variational posterior. To address the intractability issue of training, we first develop an approximate method to reparameterize the distribution of a Boltzmann machine by augmenting it as a hierarchical concatenation of a Gaussian-like distribution and a Bernoulli distribution. Based on that, an asymptotically-exact lower bound is further derived for the evidence lower bound (ELBO). With these novel techniques, the entire model can be optimized efficiently. Extensive experimental results demonstrate that by effectively modeling correlations among different bits within a hash code, our model can achieve significant performance gains.
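To make the contrast between a factorized posterior and a Boltzmann-machine posterior concrete, the following is a minimal sketch, not the paper's method. It assumes a common parameterization p(s) ∝ exp(mu·s + 0.5 sᵀWs) over binary codes s, with W symmetric and zero on the diagonal; the names `boltzmann_probs`, `bit_covariance`, `mu`, and `W` are hypothetical and chosen for illustration. It enumerates all codes by brute force, which is only feasible for a handful of bits and is exactly the intractability that the reparameterization described in the abstract is meant to avoid.

```python
import itertools
import numpy as np


def boltzmann_probs(mu, W):
    """Exact probabilities of p(s) ∝ exp(mu·s + 0.5 s^T W s) for s in {0,1}^m.

    Assumed parameterization for illustration; W is symmetric with zero diagonal.
    """
    m = len(mu)
    # All 2^m binary codes, one per row.
    states = np.array(list(itertools.product([0, 1], repeat=m)), dtype=float)
    # Unnormalized log-probability of each code.
    log_w = states @ mu + 0.5 * np.einsum("ni,ij,nj->n", states, W, states)
    # Subtract the max before exponentiating for numerical stability.
    w = np.exp(log_w - log_w.max())
    return states, w / w.sum()


def bit_covariance(states, probs):
    """Covariance matrix of the bits under the given distribution."""
    mean = probs @ states
    second_moment = (states * probs[:, None]).T @ states
    return second_moment - np.outer(mean, mean)


mu = np.array([0.2, -0.1, 0.3])

# No couplings: the distribution factorizes into independent Bernoulli bits.
W_indep = np.zeros((3, 3))

# Couplings: bits 0 and 1 attract each other, bits 1 and 2 repel.
W_coupled = np.array([
    [0.0, 1.5, 0.0],
    [1.5, 0.0, -1.0],
    [0.0, -1.0, 0.0],
])

states, p_indep = boltzmann_probs(mu, W_indep)
_, p_coupled = boltzmann_probs(mu, W_coupled)

cov_indep = bit_covariance(states, p_indep)
cov_coupled = bit_covariance(states, p_coupled)

print("off-diagonal covariance, factorized:\n", cov_indep.round(4))
print("off-diagonal covariance, coupled:\n", cov_coupled.round(4))
```

With `W_indep`, the off-diagonal covariances vanish and each bit's marginal is the sigmoid of its entry in `mu`, which is the factorized case. With `W_coupled`, the pairwise terms induce nonzero covariances between bits, which is the kind of correlation a factorized posterior cannot express.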
Anthology ID:
2020.acl-main.71
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
777–788
URL:
https://aclanthology.org/2020.acl-main.71
DOI:
10.18653/v1/2020.acl-main.71
Cite (ACL):
Lin Zheng, Qinliang Su, Dinghan Shen, and Changyou Chen. 2020. Generative Semantic Hashing Enhanced via Boltzmann Machines. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 777–788, Online. Association for Computational Linguistics.
Cite (Informal):
Generative Semantic Hashing Enhanced via Boltzmann Machines (Zheng et al., ACL 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2020.acl-main.71.pdf
Video:
http://slideslive.com/38928915