Incorporating Inner-word and Out-word Features for Mongolian Morphological Segmentation
Na Liu, Xiangdong Su, Haoran Zhang, Guanglai Gao, Feilong Bao
Abstract
Mongolian morphological segmentation is regarded as a crucial preprocessing step in many Mongolian-related NLP applications and has received extensive attention. Recently, end-to-end segmentation approaches based on long short-term memory (LSTM) networks have achieved excellent results. However, the inner-word features among the characters of a word and the out-word features from its context are not well utilized in the segmentation process. In this paper, we propose a neural network incorporating inner-word and out-word features for Mongolian morphological segmentation. The network consists of two encoders and one decoder. The inner-word encoder uses self-attention mechanisms to capture the inner-word features of the target word. The out-word encoder employs a two-layer BiLSTM network to extract out-word features from the sentence. The decoder then adopts a multi-head double attention layer to fuse the inner-word and out-word features and produces the segmentation result. The evaluation experiments compare the proposed network with the baselines and explore the effectiveness of the sub-modules.
- Anthology ID:
- 2020.coling-main.408
- Volume:
- Proceedings of the 28th International Conference on Computational Linguistics
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Editors:
- Donia Scott, Nuria Bel, Chengqing Zong
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 4638–4648
- URL:
- https://aclanthology.org/2020.coling-main.408
- DOI:
- 10.18653/v1/2020.coling-main.408
- Cite (ACL):
- Na Liu, Xiangdong Su, Haoran Zhang, Guanglai Gao, and Feilong Bao. 2020. Incorporating Inner-word and Out-word Features for Mongolian Morphological Segmentation. In Proceedings of the 28th International Conference on Computational Linguistics, pages 4638–4648, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal):
- Incorporating Inner-word and Out-word Features for Mongolian Morphological Segmentation (Liu et al., COLING 2020)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2020.coling-main.408.pdf