@inproceedings{mihaylova-etal-2020-understanding,
title = "Understanding the Mechanics of {SPIGOT}: Surrogate Gradients for Latent Structure Learning",
author = "Mihaylova, Tsvetomila and
Niculae, Vlad and
Martins, Andr{\'e} F. T.",
editor = "Webber, Bonnie and
Cohn, Trevor and
He, Yulan and
Liu, Yang",
booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
month = nov,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/jlcl-multiple-ingestion/2020.emnlp-main.171/",
doi = "10.18653/v1/2020.emnlp-main.171",
pages = "2186--2202",
abstract = "Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data. One challenge with end-to-end training of these models is the argmax operation, which has null gradient. In this paper, we focus on surrogate gradients, a popular strategy to deal with this problem. We explore latent structure learning through the angle of pulling back the downstream learning objective. In this paradigm, we discover a principled motivation for both the straight-through estimator (STE) as well as the recently-proposed SPIGOT {--} a variant of STE for structured models. Our perspective leads to new algorithms in the same family. We empirically compare the known and the novel pulled-back estimators against the popular alternatives, yielding new insight for practitioners and revealing intriguing failure cases."
}
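To connect the abstract's terminology to code, here is a minimal, illustrative PyTorch sketch of the straight-through estimator (STE) it mentions: the forward pass takes the discrete argmax (whose true gradient is null) and the backward pass treats that operation as the identity. This is an assumed sketch for orientation only, not the paper's implementation; as the abstract notes, SPIGOT is a variant of this idea for structured models.

```python
import torch
import torch.nn.functional as F

def ste_argmax(scores: torch.Tensor) -> torch.Tensor:
    """Straight-through argmax: discrete forward pass, identity backward pass.

    Illustrative sketch only; the function name and shapes are assumptions,
    not the paper's code.
    """
    # Hard one-hot prediction; on its own, argmax has a null gradient.
    hard = F.one_hot(scores.argmax(dim=-1), num_classes=scores.size(-1)).to(scores.dtype)
    # Forward value equals `hard`, but the gradient w.r.t. `scores` is the
    # identity, so the downstream loss gradient is pulled back to the scores.
    return scores + (hard - scores).detach()

# Usage: gradients reach the scores even though the output is discrete.
scores = torch.randn(3, 5, requires_grad=True)
loss = ste_argmax(scores).pow(2).sum()
loss.backward()
print(scores.grad.shape)  # torch.Size([3, 5])
```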