A Word-Splitting Approach to Kannada Sanskrit Sandhi Words Useful in Effective English Translation

Shanta Kallur, Basavaraj S. Anami


Abstract
Natural Language Processing is a branch of artificial intelligence that enables man-machine interaction through regional languages. In Kannada, there are two types of Sandhi: Kannada Sandhi and Sanskrit Sandhi. A Sandhi word is a morphophonemic formation created when two words or distinct morphemes are joined or combined; Sandhi word splitting reverses this process. Rules governing Sandhi exist across all the Dravidian languages. A rule-based method has been developed to split Sanskrit Sandhi (SS) words found in Kannada sentences into their components. Once an SS word is split, the type of Sandhi is also identified, which facilitates accurate translation of the word into English. This paper discusses seven types of SS words: SavarNadeergha, YaN, GuNa, Vruddhi, Jatva, Shchutva and Anunasika Sandhi. The identified split points adhere precisely to the Sandhi rules. A dataset of 4900 Sanskrit Sandhi words found in Kannada sentences was used to evaluate the proposed method, which achieved an accuracy of 90.03% for Sanskrit Sandhi identification and 85.87% for reliable English translation. This work has potential applications in other Dravidian languages.
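The abstract does not give the rule table or the algorithm, so the following is only a minimal sketch of what rule-based splitting with Sandhi-type identification could look like, not the authors' implementation. It assumes a Harvard-Kyoto-style ASCII transliteration (`A`, `I`, `U` for long vowels), a small hypothetical rule table covering only the four vowel Sandhi types (SavarNadeergha, GuNa, Vruddhi, YaN), and a toy lexicon used to validate candidate splits. The names `RULES` and `split_sandhi` are illustrative, and the consonant Sandhi types (Jatva, Shchutva, Anunasika) are left out.

```python
# Hypothetical, simplified rule table (an assumption, not taken from the paper).
# Each entry: (sandhi type, surface form in the joined word,
#              ending restored on the left part, beginning restored on the right part)
RULES = [
    ("SavarNadeergha", "A", "a", "a"),   # a + a -> A
    ("SavarNadeergha", "A", "a", "A"),   # a + A -> A
    ("SavarNadeergha", "I", "i", "i"),   # i + i -> I
    ("SavarNadeergha", "U", "u", "u"),   # u + u -> U
    ("GuNa", "e", "a", "i"),             # a + i -> e
    ("GuNa", "o", "a", "u"),             # a + u -> o
    ("Vruddhi", "ai", "a", "e"),         # a + e -> ai
    ("Vruddhi", "au", "a", "o"),         # a + o -> au
    ("YaN", "y", "i", ""),               # i + vowel -> y + vowel
    ("YaN", "v", "u", ""),               # u + vowel -> v + vowel
]


def split_sandhi(word, lexicon):
    """Return every (left, right, sandhi_type) split whose parts are both in the lexicon."""
    results = []
    for sandhi_type, surface, left_end, right_start in RULES:
        pos = word.find(surface)
        while pos != -1:
            # Undo the rule at this position to get candidate components.
            left = word[:pos] + left_end
            right = right_start + word[pos + len(surface):]
            if left in lexicon and right in lexicon:
                results.append((left, right, sandhi_type))
            pos = word.find(surface, pos + 1)
    return results


# Toy lexicon of known component words (illustrative only).
LEXICON = {"deva", "Alaya", "sura", "indra", "ati", "anta", "eka"}

for w in ["devAlaya", "surendra", "atyanta", "ekaika"]:
    print(w, split_sandhi(w, LEXICON))
```

Checking candidates against a lexicon is one simple way to discard split points that satisfy a rule's surface pattern but do not yield real words; a practical system would need a far larger vocabulary and a way to rank competing splits.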
Anthology ID:
2025.findings-ijcnlp.34
Volume:
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh
Venue:
Findings
Publisher:
The Asian Federation of Natural Language Processing and The Association for Computational Linguistics
Pages:
579–588
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.findings-ijcnlp.34/
Cite (ACL):
Shanta Kallur and Basavaraj S. Anami. 2025. A Word-Splitting Approach to Kannada Sanskrit Sandhi Words Useful in Effective English Translation. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 579–588, Mumbai, India. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics.
Cite (Informal):
A Word-Splitting Approach to Kannada Sanskrit Sandhi Words Useful in Effective English Translation (Kallur & Anami, Findings 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.findings-ijcnlp.34.pdf