An Unsupervised Method for Weighting Finite-state Morphological Analyzers

Amr Keleg; Francis Tyers; Nick Howell; Tommi A. Pirinen

Abstract

Morphological analysis is a task that has been studied for years, and different techniques have been used to develop models for it. Models based on finite-state transducers have proved more suitable for languages with few available resources. In this paper, we develop a method for weighting a morphological analyzer built using finite-state transducers in order to disambiguate its results. The method is based on a word2vec model that is trained in a completely unsupervised way on raw untagged corpora and is able to capture the semantic meaning of words. Most methods for disambiguating the results of a morphological analyzer rely on tagged corpora that need to be built manually. Additionally, the method uses information about the token irrespective of its context, unlike most other techniques, which rely heavily on the word’s context to disambiguate its set of candidate analyses.

Anthology ID: 2020.lrec-1.474
Volume: Proceedings of the 12th Language Resources and Evaluation Conference
Month: May
Year: 2020
Address: Marseille, France
Venue: LREC
Publisher: European Language Resources Association
Pages: 3842–3850
Language: English
URL: https://aclanthology.org/2020.lrec-1.474
Cite (ACL): Amr Keleg, Francis Tyers, Nick Howell, and Tommi Pirinen. 2020. An Unsupervised Method for Weighting Finite-state Morphological Analyzers. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 3842–3850, Marseille, France. European Language Resources Association.
Cite (Informal): An Unsupervised Method for Weighting Finite-state Morphological Analyzers (Keleg et al., LREC 2020)
PDF: https://preview.aclanthology.org/update-css-js/2020.lrec-1.474.pdf