Synset Ranking of Hindi WordNet

Sudha Bhingardive, Rajita Shukla, Jaya Saraswati, Laxmi Kashyap, Dhirendra Singh, Pushpak Bhattacharyya



Abstract
Word Sense Disambiguation (WSD) is one of the open problems in natural language processing. Various supervised, unsupervised and knowledge-based approaches have been proposed for automatically determining the sense of a word in a particular context. Such approaches often find it difficult to beat the WordNet First Sense (WFS) baseline, which assigns a word its first-listed sense irrespective of context. In this paper, we present our work on creating the WFS baseline for the Hindi language by manually ranking the synsets of Hindi WordNet. A ranking tool was developed in which human experts can see the frequency of each word sense in the sense-tagged corpora; the experts were asked to rank the senses of a word using this information together with their own intuition. The accuracy of the WFS baseline was tested on several standard datasets. The F-score is 60%, 65% and 55% on the Health, Tourism and News datasets respectively. The created rankings can also be used in other NLP applications such as Machine Translation, Information Retrieval and Text Summarization.
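The first-sense baseline described in the abstract can be illustrated with a minimal sketch. This is not the authors' tool or evaluation code: here the ranking is derived purely from sense frequencies in a toy tagged corpus, whereas the paper's rankings were produced manually by experts who used frequency only as one input. The function names, the romanized words and the sense labels are all hypothetical.

```python
from collections import Counter


def rank_senses(tagged_tokens):
    """Rank each word's senses by descending frequency in a sense-tagged corpus.

    Assumption: frequency alone stands in for the manual expert ranking.
    """
    counts = {}
    for word, sense in tagged_tokens:
        counts.setdefault(word, Counter())[sense] += 1
    return {w: [s for s, _ in c.most_common()] for w, c in counts.items()}


def wfs_predict(ranking, word):
    """Return the top-ranked sense of a word, ignoring context (None if unranked)."""
    senses = ranking.get(word)
    return senses[0] if senses else None


def f_score(ranking, test_tokens):
    """F-score of the first-sense baseline on (word, gold_sense) pairs."""
    predictions = [(wfs_predict(ranking, w), gold) for w, gold in test_tokens]
    attempted = [(p, g) for p, g in predictions if p is not None]
    correct = sum(1 for p, g in attempted if p == g)
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(test_tokens) if test_tokens else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical words and sense labels, not taken from Hindi WordNet).
train = [
    ("sona", "sona#gold"), ("sona", "sona#gold"), ("sona", "sona#sleep"),
    ("kal", "kal#tomorrow"), ("kal", "kal#tomorrow"), ("kal", "kal#yesterday"),
]
test = [
    ("sona", "sona#gold"), ("sona", "sona#sleep"),
    ("kal", "kal#tomorrow"), ("phal", "phal#fruit"),  # "phal" is unranked
]

ranking = rank_senses(train)
score = f_score(ranking, test)
```

Because the prediction never looks at the surrounding sentence, every occurrence of a word receives the same sense; the baseline's quality therefore depends entirely on how well the ranking reflects real usage.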
Anthology ID:
L16-1485
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3039–3043
URL:
https://aclanthology.org/L16-1485
Cite (ACL):
Sudha Bhingardive, Rajita Shukla, Jaya Saraswati, Laxmi Kashyap, Dhirendra Singh, and Pushpak Bhattacharyya. 2016. Synset Ranking of Hindi WordNet. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3039–3043, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Synset Ranking of Hindi WordNet (Bhingardive et al., LREC 2016)
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/L16-1485.pdf