Abstract
We introduce a language-independent, graph-based probabilistic model of morphology, which uses transformation rules operating on whole words instead of the traditional morphological segmentation. The morphological analysis of a set of words is expressed through a graph having words as vertices and structural relationships between words as edges. We define a probability distribution over such graphs and develop a sampler based on the Metropolis-Hastings algorithm. The sampling is applied in order to determine the strength of morphological relationships between words, filter out accidental similarities and reduce the set of rules necessary to explain the data. The model is evaluated on the task of finding pairs of morphologically similar words, as well as generating new words. The results are compared to a state-of-the-art segmentation-based approach.
- Anthology ID:
- R17-1093
- Volume:
- Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
- Month:
- September
- Year:
- 2017
- Address:
- Varna, Bulgaria
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 723–732
- URL:
- https://doi.org/10.26615/978-954-452-049-6_093
- DOI:
- 10.26615/978-954-452-049-6_093
- Cite (ACL):
- Maciej Sumalvico. 2017. Unsupervised Learning of Morphology with Graph Sampling. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pages 723–732, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Unsupervised Learning of Morphology with Graph Sampling (Sumalvico, RANLP 2017)
- PDF:
- https://doi.org/10.26615/978-954-452-049-6_093