2025
STAR: Spectral Truncation and Rescale for Model Merging
Yu-Ang Lee | Ching-Yun Ko | Tejaswini Pedapati | I-Hsin Chung | Mi-Yen Yeh | Pin-Yu Chen
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)
Model merging is an efficient way of obtaining a multi-task model from several pretrained models without further fine-tuning, and it has gained attention in various domains, including natural language processing (NLP). Despite this efficiency, a key challenge in model merging is the seemingly inevitable decrease in task performance as the number of models increases. In this paper, we propose **S**pectral **T**runcation **A**nd **R**escale (STAR), which aims to mitigate “merging conflicts” by truncating small components in the respective spectral spaces, followed by an automatic parameter rescaling scheme that retains the nuclear norm of the original matrix. STAR requires no additional inference on original training data and is robust to hyperparameter choice. We demonstrate the effectiveness of STAR through extensive model merging cases on diverse NLP tasks. Specifically, STAR works robustly across varying model sizes, and can outperform baselines by 4.2% when merging 12 models on Flan-T5. Our code is publicly available at https://github.com/IBM/STAR.
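The truncate-then-rescale idea in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). The function names, the `keep_fraction` threshold used to decide which singular components count as "small", and the plain averaging of the processed task updates are all assumptions made for illustration; only the SVD truncation and the nuclear-norm-preserving rescale come from the abstract.

```python
import numpy as np


def truncate_and_rescale(delta, keep_fraction=0.4):
    """Drop small singular components of `delta`, then rescale the kept ones
    so the nuclear norm (sum of singular values) matches the original.

    `keep_fraction` is a hypothetical knob: the smallest set of leading
    singular values covering this share of the nuclear norm is kept.
    """
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    total = s.sum()
    if total == 0.0:
        return delta.copy()

    # Smallest rank whose leading singular values reach the target share.
    cumulative = np.cumsum(s)
    rank = int(np.searchsorted(cumulative, keep_fraction * total)) + 1
    rank = min(rank, len(s))

    kept = s[:rank]
    scale = total / kept.sum()  # restores the original nuclear norm
    return (u[:, :rank] * (kept * scale)) @ vt[:rank]


def merge(pretrained, finetuned_list, keep_fraction=0.4):
    """Assumed merge rule: average the processed per-task weight updates
    and add the result back to the pretrained weight matrix."""
    deltas = [
        truncate_and_rescale(w - pretrained, keep_fraction)
        for w in finetuned_list
    ]
    return pretrained + np.mean(deltas, axis=0)
```

In this sketch each weight matrix is processed independently, so a full model would call `merge` once per matrix. Rescaling by `total / kept.sum()` compensates for the magnitude lost when the trailing singular values are discarded.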
Granite Guardian: Comprehensive LLM Safeguarding
Inkit Padhi | Manish Nagireddy | Giandomenico Cornacchia | Subhajit Chaudhury | Tejaswini Pedapati | Pierre Dognin | Keerthiram Murugesan | Erik Miehling | Martín Santillán Cooper | Kieran Fraser | Giulio Zizzo | Muhammad Zaid Hameed | Mark Purcell | Michael Desmond | Qian Pan | Inge Vejsbjerg | Elizabeth M. Daly | Michael Hind | Werner Geyer | Ambrish Rawat | Kush R. Varshney | Prasanna Sattigeri
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)
The deployment of language models in real-world applications exposes users to various risks, including hallucinations and harmful or unethical content. These challenges highlight the urgent need for robust safeguards to ensure safe and responsible AI. To address this, we introduce Granite Guardian, a suite of advanced models designed to detect and mitigate risks associated with prompts and responses, enabling seamless integration with any large language model (LLM). Unlike existing open-source solutions, our Granite Guardian models provide comprehensive coverage across a wide range of risk dimensions, including social bias, profanity, violence, sexual content, unethical behavior, jailbreaking, and hallucination-related issues such as context relevance, groundedness, and answer accuracy in retrieval-augmented generation (RAG) scenarios. Trained on a unique dataset combining diverse human annotations and synthetic data, Granite Guardian excels in identifying risks often overlooked by traditional detection systems, particularly jailbreak attempts and RAG-specific challenges. https://github.com/ibm-granite/granite-guardian
2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar | Tejaswini Pedapati | Ronny Luss | Soham Dan | Aurelie Lozano | Payel Das | Georgios Kollias
Findings of the Association for Computational Linguistics: ACL 2024
Transformer-based Language Models have become ubiquitous in Natural Language Processing (NLP) due to their impressive performance on various tasks. However, expensive training as well as inference remains a significant impediment to their widespread applicability. While enforcing sparsity at various levels of the model architecture has shown promise in addressing scaling and efficiency issues, how sparsity affects network topology remains poorly understood. Inspired by brain neuronal networks, we explore sparsity approaches through the lens of network topology. Specifically, we exploit mechanisms seen in biological networks, such as preferential attachment and redundant synapse pruning, and show that principled, model-agnostic sparsity approaches are performant and efficient across diverse NLP tasks, spanning both classification (such as natural language inference) and generation (summarization, machine translation), even though optimizing performance is not our sole objective. NeuroPrune is competitive with (or sometimes superior to) baselines on performance and can be up to 10x faster in terms of training time for a given level of sparsity, while also exhibiting measurable improvements in inference time in many cases.