Abstract
This paper reduces discontinuous parsing to sequence labeling. It first shows that existing reductions for constituent parsing as labeling do not support discontinuities. Second, it fills this gap and proposes to encode tree discontinuities as nearly ordered permutations of the input sequence. Third, it studies whether such discontinuous representations are learnable. The experiments show that despite the architectural simplicity, under the right representation, the models are fast and accurate.

- Anthology ID:
- 2020.emnlp-main.221
- Volume:
- Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2771–2785
- URL:
- https://aclanthology.org/2020.emnlp-main.221
- DOI:
- 10.18653/v1/2020.emnlp-main.221
- Cite (ACL):
- David Vilares and Carlos Gómez-Rodríguez. 2020. Discontinuous Constituent Parsing as Sequence Labeling. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2771–2785, Online. Association for Computational Linguistics.
- Cite (Informal):
- Discontinuous Constituent Parsing as Sequence Labeling (Vilares & Gómez-Rodríguez, EMNLP 2020)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2020.emnlp-main.221.pdf
- Code:
- aghie/disco2labels