2025
Agent Ideate: A Framework for Product Idea Generation from Patents Using Agentic AI
Gopichand Kanumolu | Ashok Urlana | Vinayak Kumar Charaka | Bala Mallikarjunarao Garlapati
Proceedings of the 2nd Workshop on Agent AI for Scenario Planning
No Size Fits All: The Perils and Pitfalls of Leveraging LLMs Vary with Company Size
Ashok Urlana | Charaka Vinayak Kumar | Bala Mallikarjunarao Garlapati | Ajeet Kumar Singh | Rahul Mishra
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Large language models (LLMs) are playing a pivotal role in deploying strategic use cases across a range of organizations, from large pan-continental companies to emerging startups. The issues and challenges involved in the successful utilization of LLMs can vary significantly depending on the size of the organization. It is important to study and discuss these pertinent issues of LLM adaptation with a focus on the scale of the industrial concerns and brainstorm possible solutions and prospective directions. Such a study has not been prominently featured in the current research literature. In this study, we adopt a threefold strategy: first, we conduct a case study with industry practitioners to formulate the key research questions; second, we examine existing industrial publications to address these questions; and finally, we provide a practical guide for industries to utilize LLMs more efficiently. We release the GitHub repository with the most recent papers in the field.
HalluCounter: Reference-free LLM Hallucination Detection in the Wild!
Ashok Urlana | Gopichand Kanumolu | Charaka Vinayak Kumar | Bala Mallikarjunarao Garlapati | Rahul Mishra
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Response consistency-based, reference-free hallucination detection (RFHD) methods do not depend on internal model states, such as generation probabilities or gradients, which grey-box models typically rely on but which are inaccessible in closed-source LLMs. However, their inability to capture query-response alignment patterns often results in lower detection accuracy. Additionally, the lack of large-scale benchmark datasets spanning diverse domains remains a challenge, as most existing datasets are limited in size and scope. To this end, we propose HalluCounter, a novel reference-free hallucination detection method that utilizes both response-response and query-response consistency and alignment patterns. This enables the training of a classifier that detects hallucinations and provides a confidence score and an optimal response for user queries. Furthermore, we introduce HalluCounterEval, a benchmark dataset comprising both synthetically generated and human-curated samples across multiple domains. Our method outperforms state-of-the-art approaches by a significant margin, achieving over 90% average confidence in hallucination detection across datasets.
Cyber for AI at SemEval-2025 Task 4: Forgotten but Not Lost: The Balancing Act of Selective Unlearning in Large Language Models
Dinesh Srivasthav P | Bala Mallikarjunarao Garlapati
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Large Language Models (LLMs) face significant challenges in maintaining privacy, ethics, and compliance when sensitive or obsolete data must be selectively removed. Retraining these models from scratch is computationally infeasible, necessitating efficient alternatives. As part of SemEval-2025 Task 4, this work focuses on the application of selective unlearning in LLMs to address this challenge. In this paper, we present our experiments and findings, primarily leveraging global weight modification to achieve an equilibrium between the effectiveness of unlearning, knowledge retention, and the target model’s post-unlearning utility. We also detail the task-specific evaluation mechanism, results, and challenges. Our algorithms achieved aggregate scores of 0.409 and 0.389 on the test set for the 7B and 1B target models, respectively, demonstrating promising results in verifiable LLM unlearning.
2024
TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques
Ashok Urlana | Aditya Saibewar | Bala Mallikarjunarao Garlapati | Charaka Vinayak Kumar | Ajeet Singh | Srinivasa Rao Chalamala
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Large Language Models (LLMs) exhibit a remarkable ability to generate fluent content across a wide spectrum of user queries. However, this capability has raised concerns regarding misinformation and personal information leakage. In this paper, we present our methods for SemEval-2024 Task 8, aiming to detect machine-generated text across various domains in both monolingual and multilingual contexts. Our study comprehensively analyzes various methods to detect machine-generated text, including statistical, neural, and pre-trained model approaches. We also detail our experimental setup and perform an in-depth error analysis to evaluate the effectiveness of these methods. Our methods obtain an accuracy of 86.9% on the test set of subtask-A mono and 83.7% for subtask-B. Furthermore, we highlight the challenges and essential factors for consideration in future studies.