Abstract
Grammar induction aims to discover syntactic structures from unannotated sentences. In this paper, we propose a framework in which the learning process of the grammar model of one language is influenced by knowledge from the model of another language. Unlike previous work on multilingual grammar induction, our approach does not rely on any external resource, such as parallel corpora, word alignments or linguistic phylogenetic trees. We propose three regularization methods that encourage similarity between model parameters, dependency edge scores, and parse trees respectively. We deploy our methods on a state-of-the-art unsupervised discriminative parser and evaluate it on both transfer grammar induction and bilingual grammar induction. Empirical results on multiple languages show that our methods outperform strong baselines.- Anthology ID:
- D19-1148
- Volume:
- Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Venues:
- EMNLP | IJCNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 1423–1428
- Language:
- URL:
- https://aclanthology.org/D19-1148
- DOI:
- 10.18653/v1/D19-1148
- Cite (ACL):
- Yong Jiang, Wenjuan Han, and Kewei Tu. 2019. A Regularization-based Framework for Bilingual Grammar Induction. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1423–1428, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- A Regularization-based Framework for Bilingual Grammar Induction (Jiang et al., EMNLP-IJCNLP 2019)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/D19-1148.pdf