Abstract
We present an incremental syntactic representation that consists of assigning a single discrete label to each word in a sentence, where the label is predicted using strictly incremental processing of a prefix of the sentence, and the sequence of labels for a sentence fully determines a parse tree. Our goal is to induce a syntactic representation that commits to syntactic choices only as they are incrementally revealed by the input, in contrast with standard representations that must make output choices such as attachments speculatively and later throw out conflicting analyses. Our learned representations achieve 93.72 F1 on the Penn Treebank with as few as 5 bits per word, and at 8 bits per word they achieve 94.97 F1, which is comparable with other state of the art parsing models when using the same pre-trained embeddings. We also provide an analysis of the representations learned by our system, investigating properties such as the interpretable syntactic features captured by the system and mechanisms for deferred resolution of syntactic ambiguities.- Anthology ID:
- 2022.acl-long.220
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- ACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 3086–3095
- Language:
- URL:
- https://aclanthology.org/2022.acl-long.220
- DOI:
- 10.18653/v1/2022.acl-long.220
- Award:
- Best Paper
- Cite (ACL):
- Nikita Kitaev, Thomas Lu, and Dan Klein. 2022. Learned Incremental Representations for Parsing. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3086–3095, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Learned Incremental Representations for Parsing (Kitaev et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2022.acl-long.220.pdf
- Code
- thomaslu2000/incremental-parsing-representations
- Data
- Penn Treebank