Abstract
Recent advances in multilingual word representations reduce the input divergence across languages, making cross-lingual transfer similar to the monolingual cross-domain and semi-supervised settings. Thus self-training, which is effective in those settings, could be beneficial for cross-lingual transfer as well. This paper presents the first comprehensive study of self-training for cross-lingual dependency parsing. Three instance selection strategies are investigated: two are based on the baseline dependency parsing model, and the third adopts an auxiliary cross-lingual POS tagging model as evidence. We conduct experiments on Universal Dependencies for eleven languages. Results show that self-training can boost dependency parsing performance on the target languages. In addition, POS-tagger-assisted instance selection achieves consistent further improvements. Detailed analysis is conducted to examine the potential of self-training in depth.
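To make the general idea concrete, below is a minimal sketch of a self-training loop with parser-confidence instance selection. The trainer (`train_fn`), scorer (`parse_fn`), round count, and top-k selection heuristic are all illustrative assumptions, not the authors' actual implementation or their POS-tagger-assisted variant.

```python
# Hypothetical sketch: self-training with confidence-based instance selection.
# `train_fn` and `parse_fn` are illustrative stand-ins for a real dependency
# parser's training and decoding routines, not the paper's code.
from typing import Callable, List, Sequence, Tuple

Sentence = Sequence[str]
Tree = List[int]  # simplified representation: head index for each token


def self_train(
    labeled: List[Tuple[Sentence, Tree]],    # source-language treebank
    unlabeled: List[Sentence],               # raw target-language sentences
    train_fn: Callable[[List[Tuple[Sentence, Tree]]], object],
    parse_fn: Callable[[object, Sentence], Tuple[Tree, float]],
    rounds: int = 3,
    top_k: int = 100,
):
    """Repeatedly parse target-language text and add the top-k most
    confidently parsed sentences back to the training data as pseudo-gold."""
    model = train_fn(list(labeled))
    for _ in range(rounds):
        scored = []
        for sent in unlabeled:
            tree, confidence = parse_fn(model, sent)
            scored.append((confidence, sent, tree))
        # Instance selection: keep only the most confident auto-parsed trees.
        scored.sort(key=lambda item: item[0], reverse=True)
        pseudo_gold = [(sent, tree) for _, sent, tree in scored[:top_k]]
        model = train_fn(list(labeled) + pseudo_gold)
    return model
```

In the POS-tagger-assisted strategy mentioned in the abstract, the selection criterion would additionally incorporate evidence from an auxiliary cross-lingual POS tagging model rather than relying on parser confidence alone.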
- Anthology ID:
- 2020.ccl-1.75
- Volume:
- Proceedings of the 19th Chinese National Conference on Computational Linguistics
- Month:
- October
- Year:
- 2020
- Address:
- Haikou, China
- Editors:
- Maosong Sun (孙茂松), Sujian Li (李素建), Yue Zhang (张岳), Yang Liu (刘洋)
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 807–819
- Language:
- English
- URL:
- https://aclanthology.org/2020.ccl-1.75
- Cite (ACL):
- Meishan Zhang and Yue Zhang. 2020. Cross-Lingual Dependency Parsing via Self-Training. In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 807–819, Haikou, China. Chinese Information Processing Society of China.
- Cite (Informal):
- Cross-Lingual Dependency Parsing via Self-Training (Zhang & Zhang, CCL 2020)
- PDF:
- https://preview.aclanthology.org/landing_page/2020.ccl-1.75.pdf
- Data
- Universal Dependencies