A Regularization-based Framework for Bilingual Grammar Induction

Yong Jiang, Wenjuan Han, Kewei Tu


Abstract
Grammar induction aims to discover syntactic structures from unannotated sentences. In this paper, we propose a framework in which the learning process of the grammar model of one language is influenced by knowledge from the model of another language. Unlike previous work on multilingual grammar induction, our approach does not rely on any external resource, such as parallel corpora, word alignments or linguistic phylogenetic trees. We propose three regularization methods that encourage similarity between model parameters, dependency edge scores, and parse trees respectively. We deploy our methods on a state-of-the-art unsupervised discriminative parser and evaluate it on both transfer grammar induction and bilingual grammar induction. Empirical results on multiple languages show that our methods outperform strong baselines.
Anthology ID:
D19-1148
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1423–1428
URL:
https://aclanthology.org/D19-1148
DOI:
10.18653/v1/D19-1148
Cite (ACL):
Yong Jiang, Wenjuan Han, and Kewei Tu. 2019. A Regularization-based Framework for Bilingual Grammar Induction. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1423–1428, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
A Regularization-based Framework for Bilingual Grammar Induction (Jiang et al., EMNLP-IJCNLP 2019)
PDF:
https://preview.aclanthology.org/ingestion-script-update/D19-1148.pdf