Abstract
Detection of MultiWord Expressions (MWEs) is one of the fundamental problems in Natural Language Processing. In this paper, we focus on two categories of MWEs: Compound Nouns and Light Verb Constructions. These two categories can be tackled using knowledge bases rather than pure statistics. We investigate the usability of IndoWordNet for the detection of MWEs. Our IndoWordNet-based approach uses semantic and ontological features of words that can be extracted from IndoWordNet. The approach has been tested on seven Indian languages: Assamese, Bengali, Hindi, Konkani, Marathi, Odia and Punjabi. Results show that ontological features are very useful for the detection of light verb constructions, while the use of semantic properties for the detection of compound nouns gives satisfactory results. The approach can be easily adapted to other Indian languages. Detected MWEs can be interpolated into WordNets, as they help in representing semantic knowledge.
- Anthology ID:
- 2016.gwc-1.56
- Volume:
- Proceedings of the 8th Global WordNet Conference (GWC)
- Month:
- 27–30 January
- Year:
- 2016
- Address:
- Bucharest, Romania
- Editors:
- Christiane Fellbaum, Piek Vossen, Verginica Barbu Mititelu, Corina Forascu
- Venue:
- GWC
- SIG:
- SIGLEX
- Publisher:
- Global Wordnet Association
- Pages:
- 404–410
- URL:
- https://aclanthology.org/2016.gwc-1.56
- Cite (ACL):
- Dhirendra Singh, Sudha Bhingardive, and Pushpak Bhattacharyya. 2016. Detection of Compound Nouns and Light Verb Constructions using IndoWordNet. In Proceedings of the 8th Global WordNet Conference (GWC), pages 404–410, Bucharest, Romania. Global Wordnet Association.
- Cite (Informal):
- Detection of Compound Nouns and Light Verb Constructions using IndoWordNet (Singh et al., GWC 2016)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2016.gwc-1.56.pdf