Chinese Spelling Check based on N-gram and String Matching Algorithm

Jui-Feng Yeh, Li-Ting Chang, Chan-Yi Liu, Tsung-Wei Hsu


Abstract
This paper presents a Chinese spelling check approach that combines language models with a string matching algorithm to address errors arising from the influence of the writer's Cantonese mother tongue. N-grams are first used to estimate the probability of the sentence constructed by the writer; a string matching algorithm, the Knuth-Morris-Pratt (KMP) algorithm, is then used to detect and correct the errors. According to the experimental results, the proposed approach can detect errors and provide the corresponding corrections.
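The abstract names two components, an N-gram language model and KMP string matching, without giving implementation details. The following is a minimal sketch of how the two could fit together, not the authors' implementation. It assumes a character-level bigram model with add-one smoothing, a toy training corpus, and a hypothetical confusion dictionary mapping known misspellings to corrections. The function names (`build_failure`, `kmp_find_all`, `train_bigrams`, `log_prob`, `correct`) are illustrative and do not come from the paper.

```python
import math
from collections import Counter


def build_failure(pattern):
    """KMP failure table: fail[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find_all(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    hits, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep scanning, allowing overlapping matches
    return hits


def train_bigrams(corpus):
    """Count character bigrams; "<s>" marks the sentence start."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sent in corpus:
        chars = ["<s>"] + list(sent)
        vocab.update(chars)
        contexts.update(chars[:-1])
        bigrams.update(zip(chars, chars[1:]))
    # +1 reserves probability mass for characters never seen in training
    return contexts, bigrams, len(vocab) + 1


def log_prob(sentence, model):
    """Add-one smoothed bigram log-probability (smoothing is an assumption)."""
    contexts, bigrams, vocab_size = model
    chars = ["<s>"] + list(sentence)
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(chars, chars[1:])
    )


def correct(sentence, confusions, model):
    """Apply a confusion-dictionary replacement at each KMP hit, keeping it
    only when the language model scores the result higher."""
    best = sentence
    for wrong, right in confusions.items():
        # Right-to-left so earlier hit positions stay valid after an edit.
        for start in reversed(kmp_find_all(best, wrong)):
            if best[start:start + len(wrong)] != wrong:
                continue  # an overlapping hit was already rewritten
            candidate = best[:start] + right + best[start + len(wrong):]
            if log_prob(candidate, model) > log_prob(best, model):
                best = candidate
    return best


# Toy data, invented for illustration only.
model = train_bigrams(["我今天很高興", "今天天氣很好", "他今天很高興"])
print(correct("我今天很高性", {"高性": "高興"}, model))
```

In this sketch, KMP only locates candidate error strings in linear time; whether a replacement is accepted is decided by the bigram score. A real system would need a large corpus and a curated confusion set.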
Anthology ID: W17-5906
Volume: Proceedings of the 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017)
Month: December
Year: 2017
Address: Taipei, Taiwan
Editors: Yuen-Hsien Tseng, Hsin-Hsi Chen, Lung-Hao Lee, Liang-Chih Yu
Venue: NLP-TEA
Publisher: Asian Federation of Natural Language Processing
Pages: 35–38
URL: https://aclanthology.org/W17-5906
Cite (ACL): Jui-Feng Yeh, Li-Ting Chang, Chan-Yi Liu, and Tsung-Wei Hsu. 2017. Chinese Spelling Check based on N-gram and String Matching Algorithm. In Proceedings of the 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017), pages 35–38, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Cite (Informal): Chinese Spelling Check based on N-gram and String Matching Algorithm (Yeh et al., NLP-TEA 2017)
PDF: https://preview.aclanthology.org/nschneid-patch-4/W17-5906.pdf