Sadhana Kumaravel


2024

pdf
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs
Kinjal Basu | Ibrahim Abdelaziz | Subhajit Chaudhury | Soham Dan | Maxwell Crouse | Asim Munawar | Vernon Austel | Sadhana Kumaravel | Vinod Muthusamy | Pavan Kapanipathi | Luis Lastras
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

There is a growing need for Large Language Models (LLMs) to effectively use tools and external Application Programming Interfaces (APIs) to plan and complete tasks. As such, there is tremendous interest in methods that can acquire sufficient quantities of train and test data that involve calls to tools / APIs. Two lines of research have emerged as the predominant strategies for addressing this challenge. The first has focused on synthetic data generation techniques, while the second has involved curating task-adjacent datasets which can be transformed into API / Tool-based tasks. In this paper, we focus on the task of identifying, curating, and transforming existing datasets and, in turn, introduce API-BLEND, a large corpora for training and systematic testing of tool-augmented LLMs. The datasets mimic real-world scenarios involving API-tasks such as API / tool detection, slot filling, and sequencing of the detected APIs. We demonstrate the utility of the API-BLEND dataset for both training and benchmarking purposes.

2022

pdf
DocAMR: Multi-Sentence AMR Representation and Evaluation
Tahira Naseem | Austin Blodgett | Sadhana Kumaravel | Tim O’Gorman | Young-Suk Lee | Jeffrey Flanigan | Ramón Astudillo | Radu Florian | Salim Roukos | Nathan Schneider
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Despite extensive research on parsing of English sentences into Abstract Meaning Representation (AMR) graphs, which are compared to gold graphs via the Smatch metric, full-document parsing into a unified graph representation lacks well-defined representation and evaluation. Taking advantage of a super-sentential level of coreference annotation from previous work, we introduce a simple algorithm for deriving a unified graph representation, avoiding the pitfalls of information loss from over-merging and lack of coherence from under merging. Next, we describe improvements to the Smatch metric to make it tractable for comparing document-level graphs and use it to re-evaluate the best published document-level AMR parser. We also present a pipeline approach combining the top-performing AMR parser and coreference resolution systems, providing a strong baseline for future research.

2016

pdf
Cross Sentence Inference for Process Knowledge
Samuel Louvan | Chetan Naik | Sadhana Kumaravel | Heeyoung Kwon | Niranjan Balasubramanian | Peter Clark
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing