Anupama Ryali


Shaastra Maps: Enabling Conceptual Exploration of Indic Shaastra Texts
Sai Susarla | Suryanarayana Jammalamadaka | Vaishnavi Nishankar | Siva Panuganti | Anupama Ryali | S Sushrutha
Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference


lakṣyārtha (Indicated Meaning) of Śabdavyāpāra (Function of a Word) framework from kāvyaśāstra (The Science of Literary Studies) in Samskṛtam : Its application to Literary Machine Translation and other NLP tasks
Sripathi Sripada | Anupama Ryali | Raghuram Sheshadri
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

A key challenge in Literary Machine Translation is that the meaning of a sentence can be different from the sum of meanings of all the words it possesses. This poses the problem of requiring large amounts of consistently labelled training data across a variety of usages and languages. In this paper, we propose that we can economically train machine translation models to identify and paraphrase such sentences by leveraging the language independent framework of Śabdavyāpāra (Function of a Word), from Literary Sciences in Saṃskṛtam, and its definition of lakṣyārtha (‘Indicated’ meaning). An Indicated meaning exists where there is incompatibility among the literal meanings of the words in a sentence (irrespective of language). The framework defines seven categories of Indicated meaning and their characteristics. As a pilot, we identified 300 such sentences from literary and regular usage, labelled them and trained a 2d Convolutional Neural Network to categorise a sentence based on the category of Indicated meaning and finetuned a T5 to paraphrase them. We compared these paraphrased sentences with those paraphrased by a T5 finetuned on Quora Paraphrase dataset of 400,000 sentence pairs. The T5 finetuned on the Indicated meaning examples performed consistently better. Moreover, a Google Translate translates these paraphrased sentences accurately and consistently across languages