Abstract
This paper proposes a partially deterministic morphological analysis method for improved processing speed. Maximum matching is a fast deterministic method for morphological analysis. However, the method tends to decrease performance due to lack of consideration of contextual information. In order to use maximum matching safely, we propose the use of Context Independent Strings (CISs), which are strings that do not have ambiguity in terms of morphological analysis. Our method first identifies CISs in a sentence using maximum matching without contextual information, then analyzes the unprocessed part of the sentence using a bi-gram-based morphological analysis model. We evaluate the method on a Japanese morphological analysis task. The experimental results show a 30% reduction of running time while maintaining improved accuracy.

- Anthology ID:
- R19-1093
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
- Month:
- September
- Year:
- 2019
- Address:
- Varna, Bulgaria
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 804–809
- URL:
- https://aclanthology.org/R19-1093
- DOI:
- 10.26615/978-954-452-056-4_093
- Cite (ACL):
- Hajime Morita and Tomoya Iwakura. 2019. A Fast and Accurate Partially Deterministic Morphological Analysis. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 804–809, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- A Fast and Accurate Partially Deterministic Morphological Analysis (Morita & Iwakura, RANLP 2019)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/R19-1093.pdf
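
The two-stage procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `find_cis_spans`, `analyze`, `cis_dict`, and `fallback` are hypothetical, the CIS dictionary is assumed to be given, and the `fallback` callback merely stands in for the bi-gram-based model. As a further simplification, each unprocessed gap is analyzed in isolation, whereas a real bi-gram model would presumably also use the neighbouring morphemes as context.

```python
from typing import Callable, Dict, List, Tuple

Morpheme = Tuple[str, str]  # (surface form, tag); a toy representation


def find_cis_spans(sentence: str,
                   cis_dict: Dict[str, List[Morpheme]]) -> List[Tuple[int, int]]:
    """Left-to-right maximum (longest) matching against CIS entries only."""
    max_len = max(map(len, cis_dict), default=0)
    spans = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            if sentence[i:i + length] in cis_dict:
                spans.append((i, i + length))
                i += length
                break
        else:
            i += 1  # no CIS starts here; leave this character unprocessed
    return spans


def analyze(sentence: str,
            cis_dict: Dict[str, List[Morpheme]],
            fallback: Callable[[str], List[Morpheme]]) -> List[Morpheme]:
    """Fix CIS spans deterministically; send the gaps to the fallback model."""
    result: List[Morpheme] = []
    pos = 0
    for start, end in find_cis_spans(sentence, cis_dict):
        if pos < start:
            result.extend(fallback(sentence[pos:start]))
        result.extend(cis_dict[sentence[start:end]])
        pos = end
    if pos < len(sentence):
        result.extend(fallback(sentence[pos:]))
    return result


# Toy data (hypothetical): each CIS maps to its single stored analysis.
toy_cis = {
    "tokyo": [("tokyo", "NOUN")],
    "tokyotower": [("tokyo", "NOUN"), ("tower", "NOUN")],
}
toy_fallback = lambda text: [(text, "UNK")]  # placeholder for the bi-gram model
print(analyze("xtokyotowery", toy_cis, toy_fallback))
```

Under these assumptions, the speed-up comes from the fact that the expensive model only runs on the gaps between CIS spans, while the CIS spans are resolved with dictionary lookups alone.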