Constructing Word-Sense Association Networks from Bilingual Dictionary and Comparable Corpora

Hiroyuki Kaji, Osamu Imaichi


Abstract
A novel thesaurus named a gword-sense association networkh is proposed for the first time. It consists of nodes representing word senses, each of which is defined as a set consisting of a word and its translation equivalents, and edges connecting topically associated word senses. This word-sense association network is produced from a bilingual dictionary and comparable corpora by means of a newly developed fully automatic method. The feasibility and effectiveness of the method were demonstrated experimentally by using the EDR English-Japanese dictionary together with Wall Street Journal and Nihon Keizai Shimbun corpora. The word-sense association networks were applied to word-sense disambiguation as well as to a query interface for information retrieval.
Anthology ID:
L04-1231
Volume:
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
Month:
May
Year:
2004
Address:
Lisbon, Portugal
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2004/pdf/401.pdf
Cite (ACL):
Hiroyuki Kaji and Osamu Imaichi. 2004. Constructing Word-Sense Association Networks from Bilingual Dictionary and Comparable Corpora. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04), Lisbon, Portugal. European Language Resources Association (ELRA).
Cite (Informal):
Constructing Word-Sense Association Networks from Bilingual Dictionary and Comparable Corpora (Kaji & Imaichi, LREC 2004)