Abstract
In this paper, we describe experiments on the morphosyntactic annotation of historical language varieties, using the example of Middle Low German (MLG), the official language of the German Hanse during the Middle Ages and a dominant language around the Baltic Sea at the time. To the best of our knowledge, this is the first experiment in automatically producing morphosyntactic annotations for Middle Low German, and accordingly, no part-of-speech (POS) tagset is currently agreed upon. In our experiment, we illustrate how ontology-based specifications of projected annotations can be employed to circumvent this issue: instead of training and evaluating against a given tagset, we decompose it into independent features, each of which is predicted independently by a neural network. Using consistency constraints (axioms) from an ontology, the predicted feature probabilities are then decoded into a sound ontological representation. Using these representations, we can finally bootstrap a POS tagset capturing only morphosyntactic features which could be reliably predicted. In this way, our approach is capable of optimizing precision and recall of morphosyntactic annotations simultaneously with bootstrapping a tagset, rather than performing iterative cycles.
- Anthology ID:
- L16-1234
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 1471–1480
- URL:
- https://aclanthology.org/L16-1234
- Cite (ACL):
- Maria Sukhareva and Christian Chiarcos. 2016. Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1471–1480, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German (Sukhareva & Chiarcos, LREC 2016)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/L16-1234.pdf
- Data
- MULTEXT-East
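
The decoding step summarized in the abstract (independent feature probabilities turned into a consistent representation using ontology axioms) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the three toy axioms, and the exhaustive search are all assumptions chosen for illustration, and the bootstrapping of a tagset from reliably predicted features is left out.

```python
from itertools import product

# Hypothetical feature inventory (assumption); a real ontology is far richer.
FEATURES = ["Noun", "Verb", "Nominative", "PastTense"]


def is_consistent(assignment):
    """Toy axioms standing in for ontology consistency constraints (assumptions)."""
    if assignment["Noun"] and assignment["Verb"]:
        return False  # the two classes are treated as disjoint
    if assignment["Nominative"] and not assignment["Noun"]:
        return False  # case is only allowed on nominals here
    if assignment["PastTense"] and not assignment["Verb"]:
        return False  # tense is only allowed on verbs here
    return True


def decode(probs):
    """Return the consistent assignment with the highest joint probability,
    treating the per-feature predictions as independent."""
    best, best_score = None, -1.0
    for values in product([False, True], repeat=len(FEATURES)):
        assignment = dict(zip(FEATURES, values))
        if not is_consistent(assignment):
            continue
        score = 1.0
        for feature in FEATURES:
            p = probs[feature]
            score *= p if assignment[feature] else 1.0 - p
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score


def to_tag(assignment):
    """Join the active features into a tag-like label."""
    return "+".join(f for f in FEATURES if assignment[f]) or "Unspecified"


# Made-up probabilities, as if produced by independent per-feature classifiers.
probs = {"Noun": 0.7, "Verb": 0.6, "Nominative": 0.8, "PastTense": 0.3}
best, score = decode(probs)
print(to_tag(best))
```

With these made-up probabilities, thresholding each feature at 0.5 on its own would select both Noun and Verb, which the toy axioms forbid. Searching only over consistent assignments resolves the conflict in favor of the more probable reading. Exhaustive enumeration is exponential in the number of features, so it is practical only for a small inventory like this one.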