L1-L2 Parallel Dependency Treebank as Learner Corpus

John S. Y. Lee; Keying Li; Herman Leung

Abstract

This opinion paper proposes the use of a parallel treebank as a learner corpus. We show how an L1-L2 parallel treebank (i.e., parse trees of non-native sentences, aligned to the parse trees of their target hypotheses) can facilitate retrieval of sentences with specific learner errors. We argue for its benefits, in terms of corpus re-use and interoperability, over a conventional learner corpus annotated with error tags. As a proof of concept, we conduct a case study on word-order errors made by learners of Chinese as a foreign language. We report precision and recall in retrieving a range of word-order error categories from L1-L2 tree pairs annotated in the Universal Dependencies framework.
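The abstract does not spell out the retrieval procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a word alignment between a learner sentence and its target hypothesis is already available as pairs of token indices, and it flags a possible word-order error wherever two alignment links cross. The function names `crossing_pairs` and `precision_recall` and the sentence IDs are hypothetical, and dependency relations, which a treebank-based query would presumably use to separate error categories, are ignored here.

```python
from typing import List, Set, Tuple

Link = Tuple[int, int]  # (learner token index, target-hypothesis token index)


def crossing_pairs(alignment: List[Link]) -> List[Tuple[Link, Link]]:
    """Return pairs of alignment links whose order is inverted between
    the learner sentence and its target hypothesis."""
    links = sorted(alignment)  # sort by learner token index
    crossings = []
    for i in range(len(links)):
        for j in range(i + 1, len(links)):
            (l2_a, l1_a), (l2_b, l1_b) = links[i], links[j]
            # a precedes b in the learner sentence but follows it in the target
            if l2_a < l2_b and l1_a > l1_b:
                crossings.append((links[i], links[j]))
    return crossings


def precision_recall(retrieved: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Precision and recall of retrieved sentence IDs against gold IDs."""
    true_positives = len(retrieved & gold)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical alignment: learner tokens 1 and 2 are swapped in the target.
alignment = [(0, 0), (1, 2), (2, 1), (3, 3)]
print(crossing_pairs(alignment))

# Hypothetical retrieval result evaluated against hypothetical gold labels.
print(precision_recall({"s1", "s2", "s3"}, {"s2", "s3", "s4", "s5"}))
```

Crossing links are a deliberately coarse signal: they indicate that some reordering occurred, but not which error category it belongs to.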

Anthology ID: W17-6306
Volume: Proceedings of the 15th International Conference on Parsing Technologies
Month: September
Year: 2017
Address: Pisa, Italy
Venue: IWPT
SIG: SIGPARSE
Publisher: Association for Computational Linguistics
Pages: 44–49
URL: https://aclanthology.org/W17-6306
Cite (ACL): John Lee, Keying Li, and Herman Leung. 2017. L1-L2 Parallel Dependency Treebank as Learner Corpus. In Proceedings of the 15th International Conference on Parsing Technologies, pages 44–49, Pisa, Italy. Association for Computational Linguistics.
Cite (Informal): L1-L2 Parallel Dependency Treebank as Learner Corpus (Lee et al., IWPT 2017)
PDF: https://preview.aclanthology.org/nodalida-main-page/W17-6306.pdf
Data: Universal Dependencies
