Abstract
There are a number of morphological analysers for Polish. Most of these, however, are non-free resources. What is more, different analysers employ different tagsets and tokenisation strategies. This situation calls for a simple and universal framework to join different sources of morphological information, including the existing resources as well as user-provided dictionaries. We present such a configurable framework, which lets users write simple configuration files that define tokenisation strategies and the behaviour of morphological analysers, including simple tagset conversion.
- Anthology ID:
- 2011.freeopmt-1.6
- Volume:
- Proceedings of the Second International Workshop on Free/Open-Source Rule-Based Machine Translation
- Month:
- January 20-21
- Year:
- 2011
- Address:
- Barcelona, Spain
- Editors:
- Felipe Sánchez-Martinez, Juan Antonio Pérez-Ortiz
- Venue:
- FreeOpMT
- Pages:
- 29–36
- URL:
- https://aclanthology.org/2011.freeopmt-1.6
- Cite (ACL):
- Adam Radziszewski and Tomasz Śniatowski. 2011. Maca – a configurable tool to integrate Polish morphological data. In Proceedings of the Second International Workshop on Free/Open-Source Rule-Based Machine Translation, pages 29–36, Barcelona, Spain.
- Cite (Informal):
- Maca – a configurable tool to integrate Polish morphological data (Radziszewski & Śniatowski, FreeOpMT 2011)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/2011.freeopmt-1.6.pdf