Contemporary jailbreak attacks on Large Language Models (LLMs) employ sophisticated techniques with obfuscated content to bypass safety guardrails. Existing defenses either rely on computationally expensive LLM verification or require adversarial fine-tuning, leaving models vulnerable to advanced attacks. We introduce SafeQuant, a novel defense framework that leverages quantized gradient patterns to identify harmful prompts efficiently. Our key insight is that when generating an identical response such as “Sure”, LLMs exhibit distinctly different internal gradient patterns for safe versus harmful prompts, reflecting conflicts with their safety training. By capturing these patterns through selective gradient masking and quantization, SafeQuant significantly outperforms existing defenses across multiple benchmarks while maintaining model utility. The method is particularly effective against sophisticated attacks such as WordGame prompts and persuasive adversarial attacks, achieving an F1-score of 0.80 on the WordGame dataset and outperforming state-of-the-art (SoTA) methods such as GradSafe by an absolute margin of 57%.
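To make the core idea concrete, below is a minimal, hypothetical sketch of a gradient-signature check of the kind described above, assuming a HuggingFace-style causal LM (gpt2 is used only as a stand-in for a safety-tuned chat model). The parameter mask, sign quantization, reference prompts, and decision threshold are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch: flag harmful prompts from quantized gradient patterns.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; in practice a safety-tuned chat model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def gradient_signature(prompt: str, target: str = "Sure") -> torch.Tensor:
    """Sign-quantized gradient of the loss for forcing `target` as the response."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    target_ids = tokenizer(target, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, target_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # loss only on the forced response tokens

    model.zero_grad()
    model(input_ids=input_ids, labels=labels).loss.backward()

    # Selective masking: keep only a small parameter subset (here, 1-D parameters
    # such as norms and biases), then quantize the gradients to {-1, 0, +1}.
    grads = [p.grad.flatten() for p in model.parameters()
             if p.grad is not None and p.ndim == 1]
    return torch.sign(torch.cat(grads))

# Reference signature averaged over a few known-harmful prompts (placeholders).
unsafe_examples = ["<known harmful prompt 1>", "<known harmful prompt 2>"]
reference = torch.stack([gradient_signature(p) for p in unsafe_examples]).mean(dim=0)

def is_harmful(prompt: str, threshold: float = 0.25) -> bool:
    """Flag prompts whose quantized gradient pattern resembles the harmful reference."""
    sim = F.cosine_similarity(gradient_signature(prompt), reference, dim=0)
    return sim.item() > threshold  # threshold is illustrative, not tuned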
Given unstructured text, Large Language Models (LLMs) are adept at answering simple (single-hop) questions. However, as the complexity of the questions increases, the performance of LLMs degrades. We believe this is due to the overhead of understanding the complex question and then filtering and aggregating unstructured information from the raw text. Recent methods try to reduce this burden by integrating structured knowledge triples into the raw text, aiming to provide a structured overview that simplifies information processing. However, this simplistic approach is query-agnostic, and the extracted facts are ambiguous because they lack context. To address these drawbacks and to enable LLMs to answer complex (multi-hop) questions with ease, we propose to use a knowledge graph (KG) that is context-aware and distilled to contain query-relevant information. Using our compressed, distilled KG as input, our method requires up to 67% fewer tokens to represent the query-relevant information in the supporting documents than the state-of-the-art (SoTA) method. Our experiments show consistent improvements over the SoTA across several metrics (EM, F1, BERTScore, and Human Eval) on two popular benchmark datasets (HotpotQA and MuSiQue).
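As a rough illustration of the pipeline this abstract describes, the snippet below assumes that (subject, relation, object) triples have already been extracted from the supporting documents (e.g., by an OpenIE system or an LLM) and uses a simple embedding-similarity filter as a stand-in for the paper's query-aware distillation; the embedding model, the top-k cutoff, and the prompt format are placeholders.

# Hypothetical sketch: query-aware distillation of a KG used as compact LLM context.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder relevance scorer

def distill_kg(triples, query, top_k=20):
    """Keep only the triples most relevant to the query (embedding cosine similarity)."""
    texts = [f"{s} {r} {o}" for s, r, o in triples]
    scores = util.cos_sim(embedder.encode(query), embedder.encode(texts))[0]
    ranked = sorted(zip(triples, scores.tolist()), key=lambda x: -x[1])
    return [t for t, _ in ranked[:top_k]]

def build_prompt(triples, query):
    """Serialize the distilled, compressed KG instead of passing the full raw documents."""
    kept = distill_kg(triples, query)
    kg_text = "\n".join(f"({s}; {r}; {o})" for s, r, o in kept)
    return f"Knowledge graph:\n{kg_text}\n\nQuestion: {query}\nAnswer:"

# Example usage with toy triples.
triples = [
    ("Scott Derrickson", "nationality", "American"),
    ("Ed Wood", "nationality", "American"),
    ("Scott Derrickson", "occupation", "film director"),
]
print(build_prompt(triples, "Were Scott Derrickson and Ed Wood of the same nationality?"))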
The recent proliferation of knowledge graphs (KGs) coupled with incomplete or partial information, in the form of missing relations (links) between entities, has fueled substantial research on knowledge base completion (also known as relation prediction). Several recent works suggest that convolutional neural network (CNN) based models generate richer and more expressive feature embeddings and hence also perform well on relation prediction. However, we observe that these KG embeddings treat triples independently and thus fail to capture the complex and hidden information inherent in the local neighborhood surrounding a triple. To this end, we propose a novel attention-based feature embedding that captures both entity and relation features in any given entity’s neighborhood. We additionally encapsulate relation clusters and multi-hop relations in our model. Our empirical study offers insights into the efficacy of our attention-based model, and we show marked performance gains over state-of-the-art methods on all datasets.
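The following is a minimal, hypothetical sketch of the neighborhood-attention idea: features of each (head, relation, tail) triple are combined, scored, and softmax-normalized over the head entity's neighborhood before being aggregated back into the entity embedding. Dimensions, the scoring function, and module names are illustrative and do not reflect the paper's exact architecture.

# Hypothetical sketch: attention over an entity's neighborhood, combining entity and relation features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeighborhoodAttention(nn.Module):
    def __init__(self, ent_dim, rel_dim, out_dim):
        super().__init__()
        # Project concatenated (head, relation, tail) features of each triple.
        self.W = nn.Linear(2 * ent_dim + rel_dim, out_dim, bias=False)
        self.a = nn.Linear(out_dim, 1, bias=False)  # attention scoring

    def forward(self, ent_emb, rel_emb, triples):
        """ent_emb: [num_ent, ent_dim]; rel_emb: [num_rel, rel_dim];
        triples: LongTensor [num_triples, 3] of (head, relation, tail) ids."""
        h, r, t = triples[:, 0], triples[:, 1], triples[:, 2]
        c = self.W(torch.cat([ent_emb[h], rel_emb[r], ent_emb[t]], dim=-1))  # triple features
        logits = F.leaky_relu(self.a(c)).squeeze(-1)

        # Softmax over each head entity's neighborhood (grouped by head id).
        alpha = torch.zeros_like(logits)
        for e in h.unique():
            mask = h == e
            alpha[mask] = F.softmax(logits[mask], dim=0)

        # Aggregate attended triple features back into the head entities.
        out = torch.zeros(ent_emb.size(0), c.size(1))
        out.index_add_(0, h, alpha.unsqueeze(-1) * c)
        return F.elu(out)

# Example usage with random embeddings and a toy graph.
ent_emb = torch.randn(5, 16)
rel_emb = torch.randn(3, 8)
triples = torch.tensor([[0, 0, 1], [0, 1, 2], [3, 2, 4]])
layer = NeighborhoodAttention(16, 8, 32)
print(layer(ent_emb, rel_emb, triples).shape)  # torch.Size([5, 32])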