Evaluating Lottery Tickets Under Distributional Shifts

Shrey Desai, Hongyuan Zhan, Ahmed Aly


Abstract
The Lottery Ticket Hypothesis suggests large, over-parameterized neural networks consist of small, sparse subnetworks that can be trained in isolation to reach a similar (or better) test accuracy. However, the initialization and generalizability of the obtained sparse subnetworks have recently been called into question. Our work focuses on evaluating the initialization of sparse subnetworks under distributional shifts. Specifically, we investigate the extent to which a sparse subnetwork obtained in a source domain can be re-trained in isolation in a dissimilar target domain. In addition, we examine the effects of different initialization strategies at transfer-time. Our experiments show that sparse subnetworks obtained through lottery ticket training do not simply overfit to particular domains, but rather reflect an inductive bias of deep neural networks that can be exploited in multiple domains.
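The transfer setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes magnitude-based pruning (the abstract does not name the pruning criterion), a single weight matrix in place of a full network, and two hypothetical transfer-time initialization strategies labeled `source_init` and `random_reinit`. The function names and the noise used as a stand-in for source-domain training are illustrative only.

```python
import numpy as np


def magnitude_mask(trained_weights, sparsity):
    """Return a 0/1 mask that removes the `sparsity` fraction of
    smallest-magnitude weights (assumed pruning criterion)."""
    flat = np.abs(trained_weights).ravel()
    n_prune = int(round(sparsity * flat.size))
    mask = np.ones(flat.size, dtype=trained_weights.dtype)
    if n_prune > 0:
        # argsort is ascending, so the first n_prune indices are the smallest.
        prune_idx = np.argsort(flat, kind="stable")[:n_prune]
        mask[prune_idx] = 0
    return mask.reshape(trained_weights.shape)


def ticket_for_target(mask, source_init, strategy, rng):
    """Build the sparse starting weights used to re-train in the target domain.

    strategy (hypothetical names):
      "source_init"   -> reuse the initial values from the source run
      "random_reinit" -> draw fresh random values, keep only the mask
    """
    if strategy == "source_init":
        weights = source_init.copy()
    elif strategy == "random_reinit":
        weights = rng.normal(0.0, 0.1, size=source_init.shape)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return weights * mask


rng = np.random.default_rng(0)
source_init = rng.normal(0.0, 0.1, size=(4, 4))
# Stand-in for weights after source-domain training (no real training here).
source_trained = source_init + rng.normal(0.0, 0.05, size=(4, 4))

mask = magnitude_mask(source_trained, sparsity=0.75)
target_start = ticket_for_target(mask, source_init, "source_init", rng)
```

In a full experiment, `target_start` would then be trained on target-domain data with the pruned positions held at zero, and the resulting accuracy compared across the initialization strategies.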
Anthology ID:
D19-6117
Volume:
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Colin Cherry, Greg Durrett, George Foster, Reza Haffari, Shahram Khadivi, Nanyun Peng, Xiang Ren, Swabha Swayamdipta
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
153–162
URL:
https://aclanthology.org/D19-6117
DOI:
10.18653/v1/D19-6117
Cite (ACL):
Shrey Desai, Hongyuan Zhan, and Ahmed Aly. 2019. Evaluating Lottery Tickets Under Distributional Shifts. In Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), pages 153–162, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Evaluating Lottery Tickets Under Distributional Shifts (Desai et al., 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-3/D19-6117.pdf