Multilingual Universal Dependency Parsing from Raw Text with Low-Resource Language Enhancement

Yingting Wu; Hai Zhao; Jia-Jun Tong

doi:10.18653/v1/K18-2007

Abstract

This paper describes the system submitted by our team, Phoenix, to the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. Given the annotated gold-standard data in CoNLL-U format, we train the tokenizer, tagger, and parser separately for each treebank using the open-source pipeline tool UDPipe. Our system takes plain text as input, performs the pre-processing steps (tokenization, lemmatization, morphological analysis), and finally outputs the syntactic dependencies. For low-resource languages with no training data, we instead use cross-lingual techniques to build models from closely related languages. In the official evaluation, our system achieves macro-averaged scores of 65.61%, 52.26%, and 55.71% for LAS, MLAS, and BLEX, respectively.
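To make the reported metric concrete, the following is a minimal sketch of how a labeled attachment score (LAS) can be computed from CoNLL-U output and macro-averaged over treebanks. It is not the official shared task scorer: the function names `read_arcs`, `las`, and `macro_average` are hypothetical, and the sketch assumes that the predicted and gold files share the same tokenization, whereas the official evaluation must first align system tokens with gold tokens because parsing starts from raw text.

```python
from typing import Dict, List, Tuple

# One token's dependency annotation: (head index, dependency relation).
Arc = Tuple[int, str]


def read_arcs(conllu: str) -> List[Arc]:
    """Extract (HEAD, DEPREL) pairs from CoNLL-U text.

    Comment lines, blank lines, multiword-token ranges (IDs such as "1-2"),
    and empty nodes (IDs such as "1.1") are skipped.
    """
    arcs: List[Arc] = []
    for line in conllu.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue
        # CoNLL-U columns are 0-indexed here: 6 is HEAD, 7 is DEPREL.
        arcs.append((int(cols[6]), cols[7]))
    return arcs


def las(gold: List[Arc], pred: List[Arc]) -> float:
    """Simplified LAS: percentage of tokens with correct head and relation.

    Assumption: gold and predicted tokenization are identical.
    """
    if len(gold) != len(pred):
        raise ValueError("this sketch requires identical tokenization")
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, pred))
    return 100.0 * correct / len(gold)


def macro_average(scores: Dict[str, float]) -> float:
    """Unweighted mean of per-treebank scores."""
    return sum(scores.values()) / len(scores)
```

Because the macro average is unweighted, each treebank contributes equally regardless of its size, so results on low-resource treebanks affect the overall score as much as results on large ones.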

Anthology ID: K18-2007
Volume: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Month: October
Year: 2018
Address: Brussels, Belgium
Editors: Daniel Zeman, Jan Hajič
Venue: CoNLL
SIG: SIGNLL
Publisher: Association for Computational Linguistics
Pages: 74–80
URL: https://aclanthology.org/K18-2007
DOI: 10.18653/v1/K18-2007
Cite (ACL): Yingting Wu, Hai Zhao, and Jia-Jun Tong. 2018. Multilingual Universal Dependency Parsing from Raw Text with Low-Resource Language Enhancement. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 74–80, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Multilingual Universal Dependency Parsing from Raw Text with Low-Resource Language Enhancement (Wu et al., CoNLL 2018)
PDF: https://preview.aclanthology.org/nschneid-patch-5/K18-2007.pdf
Data: Universal Dependencies
