Bi-LSTM Neural Networks for Chinese Grammatical Error Diagnosis

Shen Huang, Houfeng Wang


Abstract
Grammatical error diagnosis for Chinese has long been a challenge for both foreign learners and NLP researchers because of the variety of Chinese grammar and the flexibility of expression. In this paper, we present a model based on Bidirectional Long Short-Term Memory (Bi-LSTM) neural networks that treats the task as a sequence labeling problem in order to detect Chinese grammatical errors, identify the error types, and locate the error positions. In the corpora of this year’s shared task, multiple errors can occur at a single offset of a sentence. To address this, we simultaneously train three Bi-LSTM models that share word embeddings and label Missing, Redundant, and Selection errors respectively. During training, we regard a word ordering error as a special, longer kind of word selection error; during testing, we separate the two by length. In the NLP-TEA 3 shared task for Chinese Grammatical Error Diagnosis (CGED), our system achieved relatively high F1 at all three levels in the Traditional Chinese track and at the detection level in the Simplified Chinese track.
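The decoding step described in the abstract (one tag sequence per error type, overlapping errors allowed, word ordering errors split from selection errors by length) can be illustrated with a minimal sketch. It is not the authors' implementation. The B/I/O tag scheme, the function names `spans_from_tags` and `diagnose`, and the `max_selection_len` threshold are all assumptions made for illustration; the abstract does not state how the length cut-off is chosen.

```python
from typing import Dict, List, Tuple

# Assumed tag scheme (not given in the abstract):
# "O" = no error, "B" = first token of an error span, "I" = continuation.


def spans_from_tags(tags: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) token spans, end inclusive, for one B/I/O sequence."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # A new span begins; close any span that is still open.
            if start is not None:
                spans.append((start, i - 1))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(tags) - 1))
    return spans


def diagnose(
    tags_by_type: Dict[str, List[str]],
    max_selection_len: int = 2,  # hypothetical threshold, chosen for illustration
) -> List[Tuple[int, int, str]]:
    """Merge the outputs of three independent taggers into one error list.

    Keys "M", "R", "S" stand for Missing, Redundant, and Selection. Because
    each type has its own tag sequence, several errors may cover the same
    offset. Selection spans longer than the threshold are relabeled "W"
    (word ordering).
    """
    errors: List[Tuple[int, int, str]] = []
    for err_type in ("M", "R", "S"):
        for start, end in spans_from_tags(tags_by_type[err_type]):
            label = err_type
            if err_type == "S" and end - start + 1 > max_selection_len:
                label = "W"
            errors.append((start, end, label))
    return sorted(errors)


example = {
    "M": ["O", "B", "O", "O", "O", "O"],
    "R": ["O", "B", "I", "O", "O", "O"],
    "S": ["O", "O", "O", "B", "I", "I"],
}
result = diagnose(example)
```

In this example the Missing and Redundant taggers both flag token 1, which a single shared tag sequence could not represent, and the three-token Selection span is reported as a word ordering error.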
Anthology ID:
W16-4919
Volume:
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Venue:
NLP-TEA
Publisher:
The COLING 2016 Organizing Committee
Pages:
148–154
URL:
https://aclanthology.org/W16-4919
Cite (ACL):
Shen Huang and Houfeng Wang. 2016. Bi-LSTM Neural Networks for Chinese Grammatical Error Diagnosis. In Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016), pages 148–154, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Bi-LSTM Neural Networks for Chinese Grammatical Error Diagnosis (Huang & Wang, NLP-TEA 2016)
PDF:
https://preview.aclanthology.org/starsem-semeval-split/W16-4919.pdf