Appidi Abhinav Reddy
2019
Corpus Creation and Analysis for Named Entity Recognition in Telugu-English Code-Mixed Social Media Data
Vamshi Krishna Srirangam | Appidi Abhinav Reddy | Vinay Singh | Manish Shrivastava
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Named Entity Recognition (NER) is an important task in Natural Language Processing (NLP) and a subtask of Information Extraction. In this paper we present our work on NER in Telugu-English code-mixed social media data. Code-mixing, a progeny of multilingualism, is a way in which multilingual people express themselves on social media by using linguistic units from different languages within a sentence or speech context. Entity extraction from social media data such as tweets (Twitter) is generally difficult because of its informal nature; code-mixed data further complicates the problem because its content is informal, unstructured and incomplete. We present a Telugu-English code-mixed corpus with the corresponding named entity tags. The named entities used to tag the data are Person (‘Per’), Organization (‘Org’) and Location (‘Loc’). We experimented with the machine learning models Conditional Random Fields (CRFs), Decision Trees and BiLSTMs on our corpus, which resulted in F1-scores of 0.96, 0.94 and 0.95 respectively.