Unsupervised Chunking as Syntactic Structure Induction with a Knowledge-Transfer Approach
Anup Anand Deshmukh, Qianqiu Zhang, Ming Li, Jimmy Lin, Lili Mou
Abstract
In this paper, we address unsupervised chunking as a new task of syntactic structure induction, which is helpful for understanding the linguistic structures of human languages as well as processing low-resource languages. We propose a knowledge-transfer approach that heuristically induces chunk labels from state-of-the-art unsupervised parsing models; a hierarchical recurrent neural network (HRNN) learns from such induced chunk labels to smooth out the noise of the heuristics. Experiments show that our approach largely bridges the gap between supervised and unsupervised chunking.
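The abstract only sketches the pipeline; the exact rules for turning induced parse trees into chunk labels are described in the paper itself. As a purely illustrative aid, the snippet below shows one plausible heuristic of this kind: treat the lowest phrases of a nested parse tree as chunks and emit B/I tags. The tree representation, the function name, and the heuristic itself are assumptions made here for illustration, not the authors' actual procedure.

```python
# Hypothetical sketch: induce B/I chunk tags from an (unsupervised) parse tree.
# A tree is a nested tuple of token strings; the "lowest-phrase" rule below is
# an illustrative assumption, not the heuristic used in the paper.

def induce_chunk_tags(tree):
    """Return one B/I tag per token, chunking the lowest phrases of `tree`."""
    tags = []

    def visit(node):
        if isinstance(node, str):
            # A bare token outside any lowest phrase forms a one-word chunk.
            tags.append("B")
            return
        if all(isinstance(child, str) for child in node):
            # Lowest phrase: all children are tokens, so it becomes one chunk.
            tags.extend(["B"] + ["I"] * (len(node) - 1))
            return
        for child in node:
            # Otherwise keep descending into sub-phrases.
            visit(child)

    visit(tree)
    return tags


if __name__ == "__main__":
    # "the cat | sat | on | the mat"
    tree = (("the", "cat"), ("sat", ("on", ("the", "mat"))))
    print(induce_chunk_tags(tree))  # ['B', 'I', 'B', 'B', 'B', 'I']
```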
- Anthology ID: 2021.findings-emnlp.307
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2021
- Month: November
- Year: 2021
- Address: Punta Cana, Dominican Republic
- Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue: Findings
- SIG: SIGDAT
- Publisher: Association for Computational Linguistics
- Pages: 3626–3634
- URL: https://preview.aclanthology.org/Author-page-Marten-During-lu/2021.findings-emnlp.307/
- DOI: 10.18653/v1/2021.findings-emnlp.307
- Cite (ACL): Anup Anand Deshmukh, Qianqiu Zhang, Ming Li, Jimmy Lin, and Lili Mou. 2021. Unsupervised Chunking as Syntactic Structure Induction with a Knowledge-Transfer Approach. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 3626–3634, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal): Unsupervised Chunking as Syntactic Structure Induction with a Knowledge-Transfer Approach (Deshmukh et al., Findings 2021)
- PDF: https://preview.aclanthology.org/Author-page-Marten-During-lu/2021.findings-emnlp.307.pdf
- Code: anup-deshmukh/unsupervised-chunking
- Data: English Web Treebank