Algorithms for Acyclic Weighted Finite-State Automata with Failure Arcs
Anej Svete, Benjamin Dayan, Ryan Cotterell, Tim Vieira, Jason Eisner
Abstract
Weighted finite-state automata (WSFAs) arecommonly used in NLP. Failure transitions area useful extension for compactly representingbackoffs or interpolation in n-gram modelsand CRFs, which are special cases of WFSAs.Unfortunately, applying standard algorithmsfor computing the pathsum requires expand-ing these compact failure transitions. As aresult, na ̈ıve computation of the pathsum inacyclic WFSAs with failure transitions runs inO(|Q|2|Σ|) (O(|Q||Σ|) for deterministic WF-SAs) while the equivalent algorithm in normalWFSAs runs in O(|E|), where E representsthe set of transitions, Q the set of states, andΣ the alphabet. In this work, we present moreefficient algorithms for computing the pathsumin sparse acyclic WFSAs, i.e., WFSAs with av-erage out symbol fraction s ≪ 1. In those,backward runs in O(s|Q||Σ|). We proposean algorithm for semiring-weighted automatawhich runs in O(|E| + s|Σ||Q||Tmax| log |Σ|),where |Tmax| is the size of the largest con-nected component of failure transitions. Ad-ditionally, we propose faster algorithms fortwo specific cases. For ring-weighted WF-SAs we propose an algorithm with complex-ity O(|E| + s|Σ||Q||πmax|), where |πmax| de-notes the longest path length of failure transi-tions stemming from q and Σ(q) the set of sym-bols on the outgoing transitions from q. Forsemiring-weighted WFSAs whose failure tran-sition topology satisfies a condition exemplifiedby CRFs, we propose an algorithm with com-plexity O(|E| + s|Σ||Q| log |Σ|).- Anthology ID:
- 2022.emnlp-main.567
- Original:
- 2022.emnlp-main.567v1
- Version 2:
- 2022.emnlp-main.567v2
- Volume:
- Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 8289–8305
- Language:
- URL:
- https://aclanthology.org/2022.emnlp-main.567
- DOI:
- 10.18653/v1/2022.emnlp-main.567
- Cite (ACL):
- Anej Svete, Benjamin Dayan, Ryan Cotterell, Tim Vieira, and Jason Eisner. 2022. Algorithms for Acyclic Weighted Finite-State Automata with Failure Arcs. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 8289–8305, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Algorithms for Acyclic Weighted Finite-State Automata with Failure Arcs (Svete et al., EMNLP 2022)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2022.emnlp-main.567.pdf