Cross-Lingual Dependency Parsing via Self-Training

Meishan Zhang, Yue Zhang


Abstract
Recent advances in multilingual word representations reduce input divergences across languages, making cross-lingual transfer similar to monolingual cross-domain and semi-supervised settings. Self-training, which is effective in those settings, could therefore also benefit the cross-lingual setting. This paper presents the first comprehensive study of self-training for cross-lingual dependency parsing. Three instance selection strategies are investigated: two are based on the baseline dependency parsing model, while the third adopts an auxiliary cross-lingual POS tagging model as evidence. We conduct experiments on Universal Dependencies treebanks for eleven languages. Results show that self-training can boost dependency parsing performance on the target languages. In addition, POS-tagger-assisted instance selection achieves further consistent improvements. Detailed analyses are conducted to examine the potential of self-training in depth.
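
For readers unfamiliar with the general procedure, the following is a minimal sketch of self-training with confidence-based instance selection, the overall pattern the abstract describes. A scikit-learn classifier stands in for the cross-lingual parser, and the names self_train, conf_threshold, and rounds are illustrative assumptions, not the paper's implementation.

    # Minimal sketch of confidence-based self-training (illustrative only).
    # A generic probabilistic classifier stands in for the cross-lingual parser;
    # in the paper, instances are selected by the parser itself or by an
    # auxiliary cross-lingual POS tagger.
    from sklearn.linear_model import LogisticRegression
    import numpy as np

    def self_train(X_src, y_src, X_tgt_unlabeled, conf_threshold=0.9, rounds=3):
        """Iteratively add confidently pseudo-labeled target instances to the training set."""
        X_train, y_train = np.array(X_src), np.array(y_src)
        pool = np.array(X_tgt_unlabeled)
        model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
        for _ in range(rounds):
            if len(pool) == 0:
                break
            probs = model.predict_proba(pool)
            conf = probs.max(axis=1)
            keep = conf >= conf_threshold              # instance selection by model confidence
            if not keep.any():
                break
            pseudo_y = model.classes_[probs[keep].argmax(axis=1)]
            X_train = np.vstack([X_train, pool[keep]])  # augment source data with pseudo-labels
            y_train = np.concatenate([y_train, pseudo_y])
            pool = pool[~keep]
            model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
        return model
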
Anthology ID:
2020.ccl-1.75
Volume:
Proceedings of the 19th Chinese National Conference on Computational Linguistics
Month:
October
Year:
2020
Address:
Haikou, China
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
807–819
Language:
English
URL:
https://aclanthology.org/2020.ccl-1.75
Cite (ACL):
Meishan Zhang and Yue Zhang. 2020. Cross-Lingual Dependency Parsing via Self-Training. In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 807–819, Haikou, China. Chinese Information Processing Society of China.
Cite (Informal):
Cross-Lingual Dependency Parsing via Self-Training (Zhang & Zhang, CCL 2020)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2020.ccl-1.75.pdf
Data
Universal Dependencies