Delexicalized transfer parsing for low-resource languages using transformed and combined treebanks

Ayan Das, Affan Zaffar, Sudeshna Sarkar


Abstract
This paper describes our dependency parsing system for the CoNLL 2017 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. We focus primarily on the low-resource (surprise) languages. We have developed a framework that combines multiple treebanks to train parsers for low-resource languages using delexicalization. We transform the source-language treebanks based on syntactic features of the low-resource language to improve parser performance. In the official evaluation, our system achieves macro-averaged LAS scores of 67.61 and 37.16 on the entire blind test data and the surprise-language test data, respectively.
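As a rough illustration of what delexicalization and treebank combination can look like, the sketch below blanks the word form and lemma columns of CoNLL-U text and concatenates several treebanks into one training set. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `delexicalize_conllu` and `combine_treebanks` are hypothetical, comment lines are dropped because they may contain the raw sentence text, and the syntactic transformations mentioned in the abstract are not modeled.

```python
def delexicalize_conllu(text):
    """Replace FORM and LEMMA (columns 2 and 3) with '_' in CoNLL-U text.

    Hypothetical helper for illustration only. POS tags, features,
    heads, and dependency relations are left untouched.
    """
    out = []
    for line in text.splitlines():
        if line.startswith("#"):
            # Assumption: drop comment lines, which may carry raw sentence text.
            continue
        if not line:
            out.append(line)  # blank line = sentence boundary
            continue
        cols = line.split("\t")
        if len(cols) == 10:  # a well-formed CoNLL-U token line
            cols[1] = "_"  # FORM
            cols[2] = "_"  # LEMMA
        out.append("\t".join(cols))
    return "\n".join(out)


def combine_treebanks(treebanks):
    """Concatenate delexicalized treebanks into a single training text.

    Hypothetical helper: plain concatenation, with a blank line
    terminating each treebank's last sentence.
    """
    parts = [delexicalize_conllu(t).strip("\n") for t in treebanks]
    return "\n\n".join(parts) + "\n\n"
```

A parser trained on the combined output sees only non-lexical columns such as UPOS and DEPREL, which is what allows a model trained on source languages to be applied to a target language with little or no annotated data.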
Anthology ID:
K17-3019
Volume:
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Jan Hajič, Dan Zeman
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
182–190
URL:
https://aclanthology.org/K17-3019
DOI:
10.18653/v1/K17-3019
Cite (ACL):
Ayan Das, Affan Zaffar, and Sudeshna Sarkar. 2017. Delexicalized transfer parsing for low-resource languages using transformed and combined treebanks. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 182–190, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Delexicalized transfer parsing for low-resource languages using transformed and combined treebanks (Das et al., CoNLL 2017)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/K17-3019.pdf
Data
Universal Dependencies