Abstract
The Aranea Project aims to create a family of Gigaword web corpora for a dozen languages, for use in teaching language- and linguistics-related subjects at Slovak universities as well as in research across various areas of linguistics. All corpora are being built according to a standard methodology, using the same set of tools for processing and annotation, which, together with their standard size, also makes them a valuable resource for translators and contrastive studies. All our corpora are freely available, either via a web interface or in source form in an annotated vertical format.
- Anthology ID:
- L16-1672
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 4245–4248
- URL:
- https://aclanthology.org/L16-1672
- Cite (ACL):
- Vladimír Benko. 2016. Two Years of Aranea: Increasing Counts and Tuning the Pipeline. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 4245–4248, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- Two Years of Aranea: Increasing Counts and Tuning the Pipeline (Benko, LREC 2016)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/L16-1672.pdf