Abstract
In this paper, we study graph-based constituent parsing in the setting where binarization is not performed as a pre-processing step, so a constituent tree may contain nodes with more than two children. Previous graph-based methods in this setting typically generate hidden nodes with a dummy label inside the n-ary nodes in order to transform the tree into a binary tree for prediction. The limitation is that the hidden nodes break the sibling relations among the n-ary node’s children. Consequently, the dependencies between such sibling constituents may be modeled inaccurately or ignored. To address this limitation, we propose a novel graph-based framework called the “recursive semi-Markov model”. The main idea is to use a first-order semi-Markov model to predict the sequence of immediate children of a constituent candidate, which then recursively serves as a child candidate of its parent. In this manner, the dependencies between sibling constituents can be described by first-order transition features, which resolves the above limitation. In experiments, the proposed framework obtains F1 scores of 95.92% and 92.50% on PTB and CTB 5.1, respectively. In particular, the recursive semi-Markov model shows advantages in modeling nodes with more than two children, whose average F1 is improved by 0.3-1.1 points on PTB and 2.3-6.8 points on CTB 5.1.
- Anthology ID:
- 2021.acl-long.205
- Volume:
- Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Venues:
- ACL | IJCNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2631–2642
- URL:
- https://aclanthology.org/2021.acl-long.205
- DOI:
- 10.18653/v1/2021.acl-long.205
- Cite (ACL):
- Xin Xin, Jinlong Li, and Zeqi Tan. 2021. N-ary Constituent Tree Parsing with Recursive Semi-Markov Model. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 2631–2642, Online. Association for Computational Linguistics.
- Cite (Informal):
- N-ary Constituent Tree Parsing with Recursive Semi-Markov Model (Xin et al., ACL-IJCNLP 2021)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2021.acl-long.205.pdf
- Code:
- NP-NET-research/Recursive-Semi-Markov-Model
- Data:
- Penn Treebank
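
The main idea in the abstract, predicting a constituent's immediate children with a semi-Markov model and reusing that constituent as a child candidate of its parent, can be illustrated with a small dynamic program. This is a minimal sketch and not the authors' implementation: it is unlabeled, it uses exhaustive search over child boundaries, and the names `parse`, `toy_node_score`, and `toy_trans_score` as well as the toy scores are assumptions made up for illustration.

```python
from functools import lru_cache


def parse(n, node_score, trans_score):
    """Return (score, tree) for the best n-ary tree over words 0..n.

    A tree is a tuple (i, j, children). node_score(i, j) and
    trans_score(prev_span, cur_span) are hypothetical scoring functions;
    prev_span is None for the first child of a node.
    """

    @lru_cache(maxsize=None)
    def best(i, j):
        if j - i == 1:  # a single word has no children
            return node_score(i, j), (i, j, ())
        # alpha[(k, m)]: best (score, children) for splitting (i, m)
        # into a child sequence whose last child is the span (k, m).
        alpha = {}
        for m in range(i + 1, j + 1):
            for k in range(i, m):
                if k == i and m == j:
                    continue  # a node may not be its own only child
                child_score, child_tree = best(k, m)  # recursive child candidate
                if k == i:
                    alpha[(k, m)] = (child_score + trans_score(None, (k, m)),
                                     (child_tree,))
                else:
                    # First-order transition: depends only on the previous sibling.
                    prev_score, prev_children = max(
                        ((alpha[(l, k)][0] + trans_score((l, k), (k, m)),
                          alpha[(l, k)][1]) for l in range(i, k)),
                        key=lambda x: x[0])
                    alpha[(k, m)] = (child_score + prev_score,
                                     prev_children + (child_tree,))
        score, children = max((alpha[(k, j)] for k in range(i + 1, j)),
                              key=lambda x: x[0])
        return node_score(i, j) + score, (i, j, children)

    return best(0, n)


# Toy scores (assumed): reward single words and the full span only.
def toy_node_score(i, j):
    return 1.0 if j - i == 1 or (i, j) == (0, 3) else -1.0


def toy_trans_score(prev, cur):
    return 0.0


score, tree = parse(3, toy_node_score, toy_trans_score)
print(score, tree)
```

With these toy scores the root keeps all three words as direct siblings, with no hidden intermediate node. In this sketch, sibling dependencies would enter through `trans_score`, which sees each pair of adjacent children.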