An Algebra for Feature Extraction

Vivek Srikumar


Abstract
Though feature extraction is a necessary first step in statistical NLP, it is often seen as a mere preprocessing step. Yet it can dominate computation time, both during training and especially at deployment. In this paper, we formalize feature extraction from an algebraic perspective. Our formalization allows us to define a message passing algorithm that can restructure feature templates to be more computationally efficient. We show via experiments on text chunking and relation extraction that this restructuring does indeed speed up feature extraction in practice by reducing redundant computation.
Anthology ID:
P17-1173
Volume:
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2017
Address:
Vancouver, Canada
Editors:
Regina Barzilay, Min-Yen Kan
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1891–1900
URL:
https://aclanthology.org/P17-1173
DOI:
10.18653/v1/P17-1173
Cite (ACL):
Vivek Srikumar. 2017. An Algebra for Feature Extraction. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1891–1900, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
An Algebra for Feature Extraction (Srikumar, ACL 2017)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/P17-1173.pdf