Abstract
TANL is a suite of tools for text analytics based on the software architecture paradigm of data-driven pipelines. The strategies for upgrading TANL to the use of Universal Dependencies (UD) range from a minimalistic approach, which consists of introducing pre/post-processing steps into the native pipeline, to revising the whole pipeline. We explore the issue in the context of the Italian Treebank, considering the effort involved, how to avoid losing linguistically relevant information, and the loss of accuracy in the process. In particular, we compare different strategies for parsing and discuss the implications of simplifying the pipeline when detailed part-of-speech and morphological annotations are not available, as is the case for less-resourced languages. The experiments concern the Italian linguistic pipeline, but the use of different parsers in our evaluations and the avoidance of language-specific tagging make the results general enough to help the transition to UD for other languages.
- Anthology ID:
- L16-1264
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 1672–1678
- URL:
- https://aclanthology.org/L16-1264
- Cite (ACL):
- Maria Simi and Giuseppe Attardi. 2016. Adapting the TANL tool suite to Universal Dependencies. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1672–1678, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- Adapting the TANL tool suite to Universal Dependencies (Simi & Attardi, LREC 2016)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/L16-1264.pdf