Automatic Construction of Japanese WordNet

Hiroyuki Kaji, Mariko Watanabe


Abstract
Although WordNets have been developed for a number of languages, no attempt to construct a Japanese WordNet has been reported. We therefore launched a project to automatically translate the Princeton WordNet into Japanese using unsupervised word-sense disambiguation based on bilingual comparable corpora. The proposed method aligns English word associations with Japanese ones and iteratively calculates a correlation matrix of the Japanese translations of an English word versus its associated words. It then determines the Japanese translation for the English word in a synset by scoring the translation candidates according to the correlation matrix and the associated words that appear in the gloss appended to the synset. On its own, this method is not robust, because a gloss contains only a few associated words. To overcome this difficulty, we extended the method so that it retrieves texts using the gloss as a query and uses the retrieved texts, together with the gloss, to calculate the scores for translation candidates. A preliminary experiment using Wall Street Journal and Nihon Keizai Shimbun corpora demonstrated that the proposed method is promising for constructing a Japanese WordNet.
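The scoring step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so summing correlations over matching associated words is an assumption, and the function names (`score_candidates`, `pick_translation`), the matrix layout, and all numeric values are hypothetical. The iterative estimation of the correlation matrix and the text retrieval step are not shown; retrieved texts are simply passed in as extra context words.

```python
from typing import Dict, Iterable

# Hypothetical layout: candidate Japanese translation ->
# {associated English word -> correlation}. Values are invented for illustration.
Correlation = Dict[str, Dict[str, float]]


def score_candidates(correlation: Correlation,
                     context_words: Iterable[str]) -> Dict[str, float]:
    """Score each candidate by summing its correlations with the
    associated words found in the context (assumed scoring rule)."""
    context = set(context_words)
    return {
        candidate: sum(c for word, c in row.items() if word in context)
        for candidate, row in correlation.items()
    }


def pick_translation(correlation: Correlation,
                     gloss_words: Iterable[str],
                     retrieved_words: Iterable[str] = ()) -> str:
    """Choose the best candidate from the gloss plus, optionally, words
    from texts retrieved with the gloss as a query."""
    context = list(gloss_words) + list(retrieved_words)
    scores = score_candidates(correlation, context)
    return max(scores, key=scores.get)


# Toy data for the English word "bank" (illustrative numbers only).
CORR: Correlation = {
    "銀行": {"money": 0.9, "deposit": 0.8, "water": 0.05},
    "土手": {"slope": 0.7, "water": 0.6, "money": 0.05},
}

# Tokens taken from a gloss for the river-bank sense.
GLOSS = ["sloping", "land", "slope", "beside", "body", "water"]

best = pick_translation(CORR, GLOSS)  # selects "土手" with this toy data
```

With only a handful of gloss tokens, a single missing associated word can flip the result, which is the robustness problem the abstract mentions; passing additional `retrieved_words` widens the context that the scores are summed over.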
Anthology ID:
L06-1259
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/439_pdf.pdf
Cite (ACL):
Hiroyuki Kaji and Mariko Watanabe. 2006. Automatic Construction of Japanese WordNet. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Automatic Construction of Japanese WordNet (Kaji & Watanabe, LREC 2006)