Prasanna Kumar Kumaresan


2022

Thirumurai: A Large Dataset of Tamil Shaivite Poems and Classification of Tamil Pann
Shankar Mahadevan | Rahul Ponnusamy | Prasanna Kumar Kumaresan | Prabakaran Chandran | Ruba Priyadharshini | Sangeetha S | Bharathi Raja Chakravarthi
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Thirumurai, also known as Panniru Thirumurai, is a collection of Tamil Shaivite poems dating back to the Hindu revival period between the 6th and the 10th century. These poems are exemplary in both literary and musical terms. They were composed according to the ancient, now non-existent Tamil Pann system and can be set to music. We present a large dataset containing all the Thirumurai poems and attempt to classify the Pann and author of each poem using transformer-based architectures. Our work is the first of its kind in dealing with ancient Tamil text datasets, which are severely under-resourced. We explore several deep learning-based techniques for solving this challenge effectively and provide essential insights into the problem and how to address it.

2021

Findings of the Shared Task on Offensive Language Identification in Tamil, Malayalam, and Kannada
Bharathi Raja Chakravarthi | Ruba Priyadharshini | Navya Jose | Anand Kumar M | Thomas Mandl | Prasanna Kumar Kumaresan | Rahul Ponnusamy | Hariharan R L | John P. McCrae | Elizabeth Sherly
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages

Detecting offensive language in social media in local languages is critical for moderating user-generated content. Thus, the field of offensive language identification in the under-resourced Tamil, Malayalam and Kannada languages is essential. As user-generated content is increasingly code-mixed and not well studied for under-resourced languages, it is imperative to create resources and conduct benchmarking studies to encourage research in under-resourced Dravidian languages. We created a shared task on offensive language detection in Dravidian languages. We summarize here the dataset for this challenge, which is openly available at https://competitions.codalab.org/competitions/27654, and present an overview of the methods and the results of the competing systems.

IIITK@LT-EDI-EACL2021: Hope Speech Detection for Equality, Diversity, and Inclusion in Tamil, Malayalam and English
Nikhil Ghanghor | Rahul Ponnusamy | Prasanna Kumar Kumaresan | Ruba Priyadharshini | Sajeetha Thavareesan | Bharathi Raja Chakravarthi
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion

This paper describes the IIITK team's submissions to the shared task on hope speech detection for equality, diversity and inclusion in Dravidian languages, organized by the LT-EDI 2021 workshop at EACL 2021. Our best configurations for the shared task achieve weighted F1 scores of 0.60 for Tamil, 0.83 for Malayalam, and 0.93 for English. We ranked 4th, 3rd, and 2nd in Tamil, Malayalam, and English, respectively.