Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet

Yafang Huang, Hai Zhao


Abstract
A Chinese pinyin input method engine (IME) converts pinyin into characters so that Chinese characters can be conveniently entered into a computer through a common keyboard. IMEs rely on their core component, pinyin-to-character conversion (P2C). Usually, Chinese IMEs simply predict a list of character sequences for the user to choose from, based only on the user's pinyin input at each turn. However, Chinese input is a multi-turn online procedure, which can be exploited to further improve the user experience. This paper thus introduces, for the first time, a sequence-to-sequence model with a gated-attention mechanism for the core task in IMEs. The proposed neural P2C model is learned by encoding the previous input utterance as extra context, enabling our IME to predict character sequences from incomplete pinyin input. Our model is evaluated on different benchmark datasets and shows a great improvement in user experience compared to traditional models, demonstrating the first engineering practice of building a Chinese aided IME.
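
To make the task concrete, the following is a minimal toy sketch of context-aided P2C with incomplete pinyin input. It is not the authors' neural model: the lexicon, the keyword-overlap scoring, and the names `LEXICON` and `candidates` are all assumptions made for illustration, standing in for the learned sequence-to-sequence model with gated attention that the abstract describes.

```python
from typing import List, Tuple

# Hypothetical toy lexicon (an assumption, not data from the paper):
# (word, full pinyin, context keywords that make the word more likely).
LEXICON: List[Tuple[str, str, Tuple[str, ...]]] = [
    ("北京", "beijing", ("首都", "中国")),
    ("背景", "beijing", ("图片", "颜色")),
    ("杯子", "beizi", ("喝水",)),
]


def candidates(pinyin_prefix: str, context: str) -> List[str]:
    """Rank words whose pinyin starts with a possibly incomplete prefix.

    The score is the number of a word's keywords found in the previous
    utterance. This crude overlap count is a stand-in for a learned
    context encoder; ties keep lexicon order.
    """
    scored = []
    for index, (word, pinyin, keywords) in enumerate(LEXICON):
        if pinyin.startswith(pinyin_prefix):
            score = sum(1 for keyword in keywords if keyword in context)
            # Negate the score so that sorting puts higher scores first.
            scored.append((-score, index, word))
    return [word for _, _, word in sorted(scored)]


if __name__ == "__main__":
    # The user has typed only "beij"; the previous utterance decides the order.
    print(candidates("beij", "中国的首都是"))
    print(candidates("beij", "换一张图片做"))
```

In this sketch the same incomplete prefix yields a different top candidate depending on the previous utterance, which is the behavior the abstract attributes to using context as an extra input.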
Anthology ID:
D18-1321
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2923–2929
URL:
https://aclanthology.org/D18-1321
DOI:
10.18653/v1/D18-1321
Cite (ACL):
Yafang Huang and Hai Zhao. 2018. Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2923–2929, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet (Huang & Zhao, EMNLP 2018)
PDF:
https://preview.aclanthology.org/ingestion-script-update/D18-1321.pdf
Code
 YvonneHuang/gaIME