Rafa Saiz

Also published as: R. Saiz


2006

Structure, Annotation and Tools in the Basque ZT Corpus
N. Areta | A. Gurrutxaga | I. Leturia | Z. Polin | R. Saiz | I. Alegria | X. Artola | A. Diaz de Ilarraza | N. Ezeiza | A. Sologaistoa | A. Soroa | A. Valverde
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The ZT corpus (Basque Corpus of Science and Technology) is a tagged collection of specialized texts in Basque that aims to be a major resource for research and development on written technical Basque: terminology, syntax and style. It will be the first written corpus in Basque to be distributed by ELDA (at the end of 2006), and it aims to serve as a methodological and functional reference for future projects (e.g. a national corpus for Basque). We also present the technology and the tools used to build this corpus. These tools, Corpusgile and Eulia, provide a flexible and extensible infrastructure for creating, visualizing and managing corpora and for consulting, visualizing and modifying annotations generated by linguistic tools.