Distilling weighted finite automata from arbitrary probabilistic models
Ananda Theertha Suresh, Brian Roark, Michael Riley, Vlad Schogol
Abstract
Weighted finite automata (WFA) are often used to represent probabilistic models, such as n-gram language models, since they are efficient for recognition tasks in time and space. The probabilistic source to be represented as a WFA, however, may come in many forms. Given a generic probabilistic model over sequences, we propose an algorithm to approximate it as a weighted finite automaton such that the Kullback-Leibler divergence between the source model and the WFA target model is minimized. The proposed algorithm involves a counting step and a difference of convex optimization, both of which can be performed efficiently. We demonstrate the usefulness of our approach on some tasks including distilling n-gram models from neural models.
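To make the two-stage structure described in the abstract concrete, below is a minimal sketch of the counting stage only, under assumptions not taken from the paper: the source model is treated as a black box that can sample whole sequences (here a toy `sample_sequence()` stand-in), and expected n-gram counts are approximated by Monte Carlo sampling rather than computed exactly. Normalizing the sampled bigram counts gives the maximum-likelihood conditional probabilities that serve as WFA arc weights; the paper's second stage, fitting backoff weights via difference-of-convex optimization, is not shown.

```python
import random
from collections import defaultdict


def sample_sequence():
    """Hypothetical stand-in for drawing one sequence from the
    source model (e.g., ancestral sampling from a neural LM)."""
    length = random.randint(1, 10)
    return [random.choice("ab") for _ in range(length)]


def estimate_bigram_wfa(num_samples=10000, bos="<s>", eos="</s>"):
    """Counting stage (sketch): approximate expected bigram counts
    by sampling from the source, then normalize into conditional
    probabilities. Each WFA state is a one-symbol history; each arc
    carries P(next symbol | history)."""
    counts = defaultdict(lambda: defaultdict(int))
    for _ in range(num_samples):
        seq = [bos] + sample_sequence() + [eos]
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    wfa = {}
    for prev, nexts in counts.items():
        total = sum(nexts.values())
        wfa[prev] = {cur: c / total for cur, c in nexts.items()}
    return wfa


if __name__ == "__main__":
    random.seed(0)
    for state, arcs in sorted(estimate_bigram_wfa().items()):
        print(state, {sym: round(p, 3) for sym, p in arcs.items()})
```

In practice the arc weights would be stored as negative log probabilities (the log semiring) in a toolkit such as OpenFst, and for sources where expected counts can be computed in closed form, sampling would be unnecessary; this sketch only illustrates the shape of the counting step.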
- Anthology ID: W19-3112
- Volume: Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing
- Month: September
- Year: 2019
- Address: Dresden, Germany
- Editors: Heiko Vogler, Andreas Maletti
- Venue: FSMNLP
- SIG: SIGFSM
- Publisher: Association for Computational Linguistics
- Pages: 87–97
- URL: https://aclanthology.org/W19-3112
- DOI: 10.18653/v1/W19-3112
- Cite (ACL): Ananda Theertha Suresh, Brian Roark, Michael Riley, and Vlad Schogol. 2019. Distilling weighted finite automata from arbitrary probabilistic models. In Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing, pages 87–97, Dresden, Germany. Association for Computational Linguistics.
- Cite (Informal): Distilling weighted finite automata from arbitrary probabilistic models (Suresh et al., FSMNLP 2019)
- PDF: https://preview.aclanthology.org/naacl24-info/W19-3112.pdf