Annotated Corpus for Sentiment Analysis in Odia Language

Gaurav Mohanty, Pruthwik Mishra, Radhika Mamidi


Abstract
Given the lack of an annotated corpus of non-traditional Odia literature which serves as the standard when it comes to sentiment analysis, we have created an annotated corpus of Odia sentences and made it publicly available to promote research in the field. In addition, to test the usability of the currently available Odia sentiment lexicon, we experimented with various classifiers by training and testing on the sentiment-annotated corpus while using identified affective words from the same as features. Annotation and classification are done at the sentence level, as the usage of a sentiment lexicon is best suited to sentiment analysis at this level. The created corpus contains 2045 Odia sentences from the news domain, annotated with sentiment labels using a well-defined annotation scheme. An inter-annotator agreement score of 0.79 is reported for the corpus.
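As a rough illustration of the lexicon-as-features idea described in the abstract, the following minimal sketch counts lexicon hits per polarity in a sentence. Everything in it is an assumption made for illustration: the toy lexicon uses English placeholder tokens rather than entries from the actual Odia sentiment lexicon, the names `TOY_LEXICON`, `lexicon_features`, and `majority_label` are hypothetical, and the majority-vote rule is a simple stand-in, not one of the classifiers evaluated in the paper.

```python
from collections import Counter

# Hypothetical toy lexicon with English placeholder tokens.
# These are NOT entries from the actual Odia sentiment lexicon.
TOY_LEXICON = {
    "good": "positive",
    "happy": "positive",
    "bad": "negative",
    "sad": "negative",
}


def lexicon_features(sentence, lexicon=TOY_LEXICON):
    """Count lexicon hits per polarity in a whitespace-tokenized sentence."""
    tokens = sentence.lower().split()
    counts = Counter(lexicon[tok] for tok in tokens if tok in lexicon)
    return {"positive": counts["positive"], "negative": counts["negative"]}


def majority_label(features):
    """Assumed stand-in rule: pick the majority polarity, 'neutral' on a tie."""
    if features["positive"] > features["negative"]:
        return "positive"
    if features["negative"] > features["positive"]:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    feats = lexicon_features("a good but sad and bad day")
    print(feats, majority_label(feats))
```

In practice, feature dictionaries like these would be passed to a trained classifier rather than to a fixed rule; the rule here only makes the sketch runnable end to end.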
Anthology ID:
2020.lrec-1.339
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2788–2795
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.339
Cite (ACL):
Gaurav Mohanty, Pruthwik Mishra, and Radhika Mamidi. 2020. Annotated Corpus for Sentiment Analysis in Odia Language. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 2788–2795, Marseille, France. European Language Resources Association.
Cite (Informal):
Annotated Corpus for Sentiment Analysis in Odia Language (Mohanty et al., LREC 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.lrec-1.339.pdf