Abstract
Question classification is a crucial subtask in question answering systems. Mongolian is a low-resource language: it lacks a public labeled corpus, and the complex morphological structure of Mongolian vocabulary causes data sparsity. This paper proposes a classification model that combines a Bi-LSTM with a Multi-Head Attention mechanism. Multi-Head Attention extracts relevant information from different dimensions and representation subspaces. Based on the characteristics of Mongolian word formation, this paper introduces a Mongolian morpheme representation in the embedding layer; the morpheme vector captures the semantics of the Mongolian word. The character vector and morpheme vector are concatenated to form the word vector, which is fed to the Bi-LSTM to obtain a context representation. Finally, Multi-Head Attention gathers global information for classification. The model was evaluated on a Mongolian corpus, and experimental results show that it significantly outperforms baseline systems.
- Anthology ID:
- 2020.ccl-1.95
- Volume:
- Proceedings of the 19th Chinese National Conference on Computational Linguistics
- Month:
- October
- Year:
- 2020
- Address:
- Haikou, China
- Editors:
- Maosong Sun (孙茂松), Sujian Li (李素建), Yue Zhang (张岳), Yang Liu (刘洋)
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 1026–1034
- Language:
- English
- URL:
- https://aclanthology.org/2020.ccl-1.95
- Cite (ACL):
- Guangyi Wang, Feilong Bao, and Weihua Wang. 2020. Mongolian Questions Classification Based on Mulit-Head Attention. In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 1026–1034, Haikou, China. Chinese Information Processing Society of China.
- Cite (Informal):
- Mongolian Questions Classification Based on Mulit-Head Attention (Wang et al., CCL 2020)
- PDF:
- https://preview.aclanthology.org/landing_page/2020.ccl-1.95.pdf
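As a minimal sketch of the Multi-Head Attention step the abstract describes (attending in several representation subspaces over the Bi-LSTM outputs), the following NumPy code splits a sequence of hidden states into heads, applies scaled dot-product attention per head, and concatenates the results. The dimensions, random weights, and single-sequence shape are illustrative assumptions, not the authors' actual configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """Scaled dot-product attention split across `num_heads` heads.

    X: (seq_len, d_model), e.g. Bi-LSTM hidden states for one question.
    Returns (seq_len, d_model): a globally context-aware representation.
    """
    seq_len, d_model = X.shape
    d_k = d_model // num_heads

    Q, K, V = X @ Wq, X @ Wk, X @ Wv  # each (seq_len, d_model)

    # Reshape to (num_heads, seq_len, d_k) so each head attends
    # in its own representation subspace.
    def split(M):
        return M.reshape(seq_len, num_heads, d_k).transpose(1, 0, 2)

    Qh, Kh, Vh = split(Q), split(K), split(V)

    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_k)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)                  # rows sum to 1
    heads = weights @ Vh                                # (heads, seq, d_k)

    # Concatenate the heads back to (seq_len, d_model) and project.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo

# Toy dimensions (hypothetical, for illustration only).
rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 7, 16, 4
X = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) * 0.1
                  for _ in range(4))
out = multi_head_attention(X, Wq, Wk, Wv, Wo, num_heads)
print(out.shape)  # (7, 16)
```

In the paper's pipeline this layer would sit on top of the Bi-LSTM outputs, with the attended representation then passed to the final classifier.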