Chinese Characters Mapping Table of Japanese, Traditional Chinese and Simplified Chinese

Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi


Abstract
Chinese characters are used in both Japanese and Chinese, where they are called Kanji and Hanzi, respectively. Because Chinese characters carry significant semantic information, a mapping table between Kanji and Hanzi can be very useful for many Japanese-Chinese bilingual applications, such as machine translation and cross-lingual information retrieval. Because Kanji originated in ancient China, most Kanji have corresponding characters in Hanzi. However, the relation between Kanji and Hanzi is quite complicated. In this paper, we propose a method for automatically constructing a Chinese character mapping table for Japanese, Traditional Chinese and Simplified Chinese using freely available resources. We define seven categories for Kanji based on the relation between Kanji and Hanzi, and classify mappings of Chinese characters into these categories. We use a resource from Wiktionary to evaluate the completeness of the resulting mapping table. A statistical comparison shows that the proposed method yields a more complete mapping table than the current version of Wiktionary.
Anthology ID:
L12-1140
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2149–2152
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/306_Paper.pdf
Cite (ACL):
Chenhui Chu, Toshiaki Nakazawa, and Sadao Kurohashi. 2012. Chinese Characters Mapping Table of Japanese, Traditional Chinese and Simplified Chinese. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 2149–2152, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Chinese Characters Mapping Table of Japanese, Traditional Chinese and Simplified Chinese (Chu et al., LREC 2012)