Thirumurai: A Large Dataset of Tamil Shaivite Poems and Classification of Tamil Pann
Shankar Mahadevan, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Prabakaran Chandran, Ruba Priyadharshini, Sangeetha S, Bharathi Raja Chakravarthi
Abstract
Thirumurai, also known as Panniru Thirumurai, is a collection of Tamil Shaivite poems dating back to the Hindu revival period between the 6th and the 10th century. These poems are outstanding in both literary and musical terms. They were composed according to the ancient, now-extinct Tamil Pann system and can be set to music. We present a large dataset containing all the Thirumurai poems and also attempt to classify the Pann and author of each poem using transformer-based architectures. Our work is the first of its kind in dealing with ancient Tamil text datasets, which are severely under-resourced. We explore several deep learning-based techniques for solving this challenge effectively and provide essential insights into the problem and how to address it.
- Anthology ID:
- 2022.lrec-1.704
- Volume:
- Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 6556–6562
- URL:
- https://aclanthology.org/2022.lrec-1.704
- Cite (ACL):
- Shankar Mahadevan, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Prabakaran Chandran, Ruba Priyadharshini, Sangeetha S, and Bharathi Raja Chakravarthi. 2022. Thirumurai: A Large Dataset of Tamil Shaivite Poems and Classification of Tamil Pann. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6556–6562, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Thirumurai: A Large Dataset of Tamil Shaivite Poems and Classification of Tamil Pann (Mahadevan et al., LREC 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.lrec-1.704.pdf