Amba Kulkarni

2025

Computational Sanskrit and Digital Humanities - World Sanskrit Conference 2025
Amba Kulkarni | Oliver Hellwig
Computational Sanskrit and Digital Humanities - World Sanskrit Conference 2025

Itaretara Dvandva: A challenge for Dependency Tree semantics
Amba Kulkarni | Vasudha Neelamana
Computational Sanskrit and Digital Humanities - World Sanskrit Conference 2025

Compound Type Identification in Sanskrit
Sriram Krishnan | Pavankumar Satuluri | Amruta Barbadikar | T S Prasanna Venkatesh | Amba Kulkarni
Computational Sanskrit and Digital Humanities - World Sanskrit Conference 2025

Challenges in Processing Vedic Sanskrit: Towards creating a normalized dataset for the Ṛgveda-saṃhitā
Sriram Krishnan | Sepuri Gayathri | Amba Kulkarni
Computational Sanskrit and Digital Humanities - World Sanskrit Conference 2025

2024

Automatic Sanskrit Poetry Classification Based on Kāvyaguṇa
Amruta Barbadikar | Amba Kulkarni
Proceedings of the 21st International Conference on Natural Language Processing (ICON)

Kāvyaguṇa denotes the syntactic and phonetic attributes or qualities of Sanskrit poetry that enhance its artistic appeal, commonly classified into three categories: Mādhurya (Sweetness), Oja (Floridity), and Prasāda (Lucidity). This paper presents the Kāvyaguṇa Classifier, a machine learning module designed to classify Sanskrit literary texts into these three guṇas by employing a range of machine learning algorithms, including Random Forest, Gradient Boosting, XGBoost, Multi-Layer Perceptron, and Support Vector Machine. For vectorization, we employed two methods: the neural network-based Word2vec and a custom feature engineering approach grounded in the theoretical understanding of Kāvyaguṇas as described in Sanskrit poetics. The feature engineering model significantly outperformed the Word2vec-based one, achieving an accuracy of up to 90.6%.

Word Sense Alignment of Sanskrit Lexica
Dhaval K Patel | Amba Kulkarni
Proceedings of the 7th International Sanskrit Computational Linguistics Symposium

Inter Sentential Discourse Relations
Saee Vaze | Amba Kulkarni
Proceedings of the 7th International Sanskrit Computational Linguistics Symposium

Anuprāsa Identifier and Classifier: A computational tool to analyze Sanskrit figure of sound
Amruta Vilas Barbadikar | Amba Kulkarni
Proceedings of the 7th International Sanskrit Computational Linguistics Symposium

START: Sanskrit Teaching, Annotation, and Research Tool – Bridging Tradition and Technology in Scholarly Exploration
Anil Kumar | Amba Kulkarni | Nakka Shailaj
Proceedings of the 7th International Sanskrit Computational Linguistics Symposium

2023

Multi-component compounding is a prevalent phenomenon in Sanskrit, and understanding the implicit structure of a compound’s components is crucial for deciphering its meaning. Earlier approaches in Sanskrit have focused on binary compounds and neglected the multi-component compound setting. This work introduces the novel task of nested compound type identification (NeCTI), which aims to identify nested spans of a multi-component compound and decode the implicit semantic relations between them. To the best of our knowledge, this is the first attempt in the field of lexical semantics to propose this task. We present two newly annotated datasets for this task, including an out-of-domain dataset. We also benchmark these datasets by exploring the efficacy of standard problem formulations such as nested named entity recognition, constituency parsing, and seq2seq. We present a novel framework named DepNeCTI: Dependency-based Nested Compound Type Identifier that surpasses the performance of the best baseline with an average absolute improvement of 13.1 points F1-score in terms of Labeled Span Score (LSS) and a 5-fold enhancement in inference efficiency. In line with the previous findings in the binary Sanskrit compound identification task, context provides benefits for the NeCTI task. The codebase and datasets are publicly available at: https://github.com/yaswanth-iitkgp/DepNeCTI

Issues in the computational processing of Upamā alaṅkāra
Bhakti Jadhav | Amruta Barbadikar | Amba Kulkarni | Malhar Kulkarni
Proceedings of the 20th International Conference on Natural Language Processing (ICON)

Processing and understanding of figurative speech is a challenging task for computers as well as humans. In this paper, we present a case of Upamā alaṅkāra (simile). The verbal cognition of the Upamā alaṅkāra by a human is presented as a dependency tree, which involves the identification of various components such as upamāna (vehicle), upameya (topic), sādhāraṇa-dharma (common property) and upamādyotaka (word indicating similitude). This involves the repetition of elliptical elements. Further, we show how the same dependency tree may be represented without any loss of information, even without repetition of elliptical elements. Such a representation would be useful for the computational processing of the alaṅkāras.

Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference
Amba Kulkarni | Oliver Hellwig
Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference

Validation and Normalization of DCS corpus and Development of the Sanskrit Heritage Engine’s Segmenter
Sriram Krishnan | Amba Kulkarni | Gérard Huet
Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference

Disambiguation of Instrumental, Dative and Ablative Case suffixes in Sanskrit
Malay Maity | Sanjeev Panchal | Amba Kulkarni
Proceedings of the Computational Sanskrit & Digital Humanities: Selected papers presented at the 18th World Sanskrit Conference

2021

Parsing Subordinate Clauses in Telugu using Rule-based Dependency Parser
P Sangeetha | Parameswari Krishnamurthy | Amba Kulkarni
Proceedings of the First Workshop on Parsing and its Applications for Indian Languages

Parsing has been gaining popularity in recent years and has attracted the interest of NLP researchers around the world. It is challenging when the language under study is a free-word order language that allows ellipsis, like Telugu. In this paper, an attempt is made to parse subordinate clauses in Telugu, especially non-finite verb clauses and relative clauses, which are highly productive and constitute a large chunk in parsing tasks. This study adopts a knowledge-driven approach to parse subordinate structures using linguistic cues as rules. Challenges faced in parsing ambiguous structures are elaborated, and enhanced tags are provided to handle them. Results are encouraging and this parser proves to be efficient for Telugu.

2020

Free Word Order in Sanskrit and Well-nestedness
Sanal Vikram | Amba Kulkarni
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

The common wisdom about Sanskrit is that it is a free word order language. This word order poses challenges such as handling non-projectivity in parsing. Earlier works on the word order of Sanskrit have shown that there are syntactic structures in Sanskrit which cannot be covered even under non-planarity. In this paper, we study these structures further to investigate whether they fall under well-nestedness. A small manually tagged corpus of the verses of Śrīmad-Bhagavad-Gītā was considered for this study. It was noticed that there are as many well-nested trees as there are ill-nested ones. From the linguistic point of view, we could get a list of relations that are involved in the planarity violations. All these relations had one thing in common: they have unilateral expectancy. It was this loose binding, as against the mutual expectancy with certain other relations, that allowed them to cross the phrasal boundaries.

Dependency Relations for Sanskrit Parsing and Treebank
Amba Kulkarni | Pavankumar Satuluri | Sanjeev Panchal | Malay Maity | Amruta Malvade
Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories

2019

Sanskrit Segmentation revisited
Sriram Krishnan | Amba Kulkarni
Proceedings of the 16th International Conference on Natural Language Processing

Computationally analyzing Sanskrit texts requires proper segmentation in the initial stages. Various tools have been developed for Sanskrit text segmentation. Of these, Gérard Huet’s Reader in the Sanskrit Heritage Engine analyzes the input text and segments it based on the word parameters: phases (iic, ifc, Pr, Subst, etc.) and the sandhi (or transition) that takes place between the end of a word and the initial part of the next word. It lists all the possible solutions, differentiating them with the help of the phases. The phases and their analyses have their use in the domain of sentential parsers. In segmentation, though, they are not used beyond deciding whether the words formed with the phases are morphologically valid. This paper tries to modify the above segmenter by ignoring the phase details (except for a few cases), and also proposes a probability function to prioritize the list of solutions so that the most valid solutions appear at the top.

Sanskrit Sentence Generator
Amba Kulkarni | Madhusoodana Pai
Proceedings of the 6th International Sanskrit Computational Linguistics Symposium

Dependency Parser for Sanskrit Verses
Amba Kulkarni | Sanal Vikram | Sriram K
Proceedings of the 6th International Sanskrit Computational Linguistics Symposium

Pāṇinian Syntactico-Semantic Relation Labels
Amba Kulkarni | Dipti Sharma
Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, SyntaxFest 2019)