Building the Old Javanese Wordnet

David Moeljadi, Zakariya Pamuji Aminullah


Abstract
This paper discusses the construction and ongoing development of the Old Javanese Wordnet. The words were extracted from the digitized version of the Old Javanese–English Dictionary (Zoetmulder, 1982). The wordnet is built using the ‘expansion’ approach (Vossen, 1998), leveraging the Princeton Wordnet’s core synsets and semantic hierarchy, as well as scientific names. The main goal of our project was to produce a high-quality, human-curated resource. As of December 2019, the Old Javanese Wordnet contains 2,054 concepts (synsets) and 5,911 senses. It is released under a Creative Commons Attribution 4.0 International License (CC BY 4.0). We are still developing it and adding more synsets and senses. We believe that the lexical data made available by this wordnet will be useful for a variety of future applications, such as the development of a Modern Javanese Wordnet, language processing tasks, and linguistic research on Javanese.
Anthology ID:
2020.lrec-1.359
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2940–2946
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.359
Cite (ACL):
David Moeljadi and Zakariya Pamuji Aminullah. 2020. Building the Old Javanese Wordnet. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 2940–2946, Marseille, France. European Language Resources Association.
Cite (Informal):
Building the Old Javanese Wordnet (Moeljadi & Aminullah, LREC 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.lrec-1.359.pdf