Gaurav Mohanty


Annotated Corpus for Sentiment Analysis in Odia Language
Gaurav Mohanty | Pruthwik Mishra | Radhika Mamidi
Proceedings of the Twelfth Language Resources and Evaluation Conference

Given the lack of an annotated corpus of non-traditional Odia literature which serves as the standard when it comes sentiment analysis, we have created an annotated corpus of Odia sentences and made it publicly available to promote research in the field. Secondly, in order to test the usability of currently available Odia sentiment lexicon, we experimented with various classifiers by training and testing on the sentiment annotated corpus while using identified affective words from the same as features. Annotation and classification are done at sentence level as the usage of sentiment lexicon is best suited to sentiment analysis at this level. The created corpus contains 2045 Odia sentences from news domain annotated with sentiment labels using a well-defined annotation scheme. An inter-annotator agreement score of 0.79 is reported for the corpus.


Building a SentiWordNet for Odia
Gaurav Mohanty | Abishek Kannan | Radhika Mamidi
Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

As a discipline of Natural Language Processing, Sentiment Analysis is used to extract and analyze subjective information present in natural language data. The task of Sentiment Analysis has acquired wide commercial uses including social media monitoring tasks, survey responses, review systems, etc. Languages like English have several resources which aid in the task of Sentiment Analysis. SentiWordNet and Subjectivity WordList are examples of such tools and resources. With more data being available in native vernacular, language-specific SentiWordNet(s) have become essential. For resource poor languages, creating such SentiWordNet(s) is a difficult task to achieve. One solution is to use available resources in English and translate the final source lexicon to target lexicon via machine translation. Machine translation systems for the English-Odia language pair have not yet been developed. In this paper, we discuss a method to create a SentiWordNet for Odia, which is resource-poor, by only using resources which are currently available for Indian languages. The lexicon created, would serve as a tool for Sentiment Analysis related task specific to Odia data.


Towards Building a SentiWordNet for Tamil
Abishek Kannan | Gaurav Mohanty | Radhika Mamidi
Proceedings of the 13th International Conference on Natural Language Processing