Boosting the creation of a treebank

Blanca Arias; Núria Bel; Mercè Lorente; Montserrat Marimon; Alba Milà; Jorge Vivaldi; Muntsa Padró; Marina Fomicheva; Imanol Larrea

Abstract

In this paper we present the results of an ongoing experiment in bootstrapping a treebank for Catalan using a dependency parser trained on Spanish sentences. To save time and cost, our approach exploits the typological similarities between Catalan and Spanish to create a first Catalan data set quickly by (i) automatically annotating Catalan sentences with a de-lexicalized Spanish parser, (ii) manually correcting the parses, and (iii) using the corrected Catalan sentences to train a Catalan parser. The results showed that about 1000 parsed sentences are required to train a Catalan parser, and that these were produced in 4 months by 2 annotators.
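
As a rough illustration of what de-lexicalization in step (i) can involve, the sketch below blanks the word-form and lemma columns of a CoNLL-X style sentence, so that a parser would see only part-of-speech and dependency information. The CoNLL-X column layout, the function name delexicalize, and the tags and labels in the example are assumptions made for illustration; the abstract does not specify the data format or feature set actually used.

```python
def delexicalize(conll_sentence: str) -> str:
    """Blank the FORM and LEMMA columns of a tab-separated CoNLL-X sentence.

    Assumed column order (CoNLL-X): ID, FORM, LEMMA, CPOSTAG, POSTAG,
    FEATS, HEAD, DEPREL, PHEAD, PDEPREL. Only columns 1 and 2 (0-based)
    are changed; POS tags, heads, and dependency labels are kept as-is.
    """
    out_lines = []
    for line in conll_sentence.strip().splitlines():
        cols = line.split("\t")
        if len(cols) < 8:
            raise ValueError(f"expected at least 8 tab-separated columns: {line!r}")
        cols[1] = "_"  # FORM: lexical item removed
        cols[2] = "_"  # LEMMA: lexical item removed
        out_lines.append("\t".join(cols))
    return "\n".join(out_lines)


# Hypothetical Catalan example ("El gat dorm"); tags and labels are illustrative.
example = (
    "1\tEl\tel\tDET\tDET\t_\t2\tspec\t_\t_\n"
    "2\tgat\tgat\tNOUN\tNOUN\t_\t3\tsuj\t_\t_\n"
    "3\tdorm\tdormir\tVERB\tVERB\t_\t0\tROOT\t_\t_"
)
delexicalized = delexicalize(example)
```

Because only language-independent columns remain, a model trained on Spanish input prepared this way could in principle be applied to Catalan input prepared the same way, provided both languages share a tagset.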

Anthology ID: L14-1218
Volume: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month: May
Year: 2014
Address: Reykjavik, Iceland
Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association (ELRA)
Pages: 775–781
URL: http://www.lrec-conf.org/proceedings/lrec2014/pdf/225_Paper.pdf
Cite (ACL): Blanca Arias, Núria Bel, Mercè Lorente, Montserrat Marimón, Alba Milà, Jorge Vivaldi, Muntsa Padró, Marina Fomicheva, and Imanol Larrea. 2014. Boosting the creation of a treebank. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 775–781, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal): Boosting the creation of a treebank (Arias et al., LREC 2014)
PDF: http://www.lrec-conf.org/proceedings/lrec2014/pdf/225_Paper.pdf
