Cross-Lingual Dependency Parsing Using Code-Mixed TreeBank

Meishan Zhang, Yue Zhang, Guohong Fu


Abstract
Treebank translation is a promising method for cross-lingual transfer of syntactic dependency knowledge. The basic idea is to map dependency arcs from a source treebank to its target translation according to word alignments. This method, however, can suffer from imperfect alignment between source and target words. To address this problem, we investigate syntactic transfer by code mixing, translating only the words in a source treebank that can be translated with confidence. Cross-lingual word embeddings are then leveraged to transfer syntactic knowledge from the resulting code-mixed treebank to the target language. Experiments on Universal Dependency Treebanks show that code-mixed treebanks are more effective than translated treebanks, giving highly competitive performance among cross-lingual parsing methods.
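The code-mixing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Token` class, the `code_mix` function, the per-word confidence scores, and the 0.9 threshold are all hypothetical, and the sketch assumes one-to-one word substitution with no reordering. It only shows that replacing confidently translated words leaves the dependency arcs of the source tree untouched.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Token:
    form: str           # surface word
    head: int           # 1-based index of the head token, 0 for the root
    deprel: str         # dependency label
    lang: str = "src"   # language of the surface form ("src" or "tgt")


def code_mix(
    sentence: List[Token],
    translations: Dict[int, Tuple[str, float]],
    threshold: float = 0.9,
) -> List[Token]:
    """Replace a source word with its translation only when the
    (hypothetical) confidence score reaches the threshold.

    `translations` maps a 1-based token position to a pair of
    (target word, confidence). Heads and labels are copied unchanged,
    so the tree structure of the source sentence is preserved.
    """
    mixed = []
    for position, token in enumerate(sentence, start=1):
        candidate = translations.get(position)
        if candidate is not None and candidate[1] >= threshold:
            mixed.append(Token(candidate[0], token.head, token.deprel, "tgt"))
        else:
            mixed.append(Token(token.form, token.head, token.deprel, token.lang))
    return mixed


# Toy English sentence with an illustrative Spanish word-level translation.
sentence = [
    Token("She", 2, "nsubj"),
    Token("reads", 0, "root"),
    Token("books", 2, "obj"),
]
translations = {1: ("Ella", 0.95), 2: ("lee", 0.60), 3: ("libros", 0.92)}

mixed = code_mix(sentence, translations)
print([(t.form, t.head, t.deprel, t.lang) for t in mixed])
```

In this toy example the low-confidence verb stays in the source language while the two confident words are replaced, giving a code-mixed sentence whose arcs are identical to those of the source tree.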
Anthology ID:
D19-1092
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
997–1006
URL:
https://aclanthology.org/D19-1092
DOI:
10.18653/v1/D19-1092
Cite (ACL):
Meishan Zhang, Yue Zhang, and Guohong Fu. 2019. Cross-Lingual Dependency Parsing Using Code-Mixed TreeBank. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 997–1006, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Cross-Lingual Dependency Parsing Using Code-Mixed TreeBank (Zhang et al., EMNLP-IJCNLP 2019)
PDF:
https://preview.aclanthology.org/ingestion-script-update/D19-1092.pdf