GRAID: Synthetic Data Generation with Geometric Constraints and Multi-Agentic Reflection for Harmful Content Detection

Melissa Kazemi Rad; Alberto Purpura; Himanshu Kumar; Emily Chen; Mohammad Shahed Sorower

Abstract

We address the problem of data scarcity in harmful text classification for guardrailing applications and introduce GRAID (Geometric and Reflective AI-Driven Data Augmentation), a novel pipeline that leverages Large Language Models (LLMs) for dataset augmentation. GRAID consists of two stages: (i) generation of geometrically controlled examples using a constrained LLM, and (ii) augmentation through a multi-agentic reflective process that promotes stylistic diversity and uncovers edge cases. This combination enables both reliable coverage of the input space and nuanced exploration of harmful content. Using two benchmark datasets, we demonstrate that augmenting a harmful text classification dataset with GRAID leads to significant improvements in downstream guardrail model performance.

Anthology ID: 2025.emnlp-main.1528
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 30047–30065
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1528/
Cite (ACL): Melissa Kazemi Rad, Alberto Purpura, Himanshu Kumar, Emily Chen, and Mohammad Shahed Sorower. 2025. GRAID: Synthetic Data Generation with Geometric Constraints and Multi-Agentic Reflection for Harmful Content Detection. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 30047–30065, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): GRAID: Synthetic Data Generation with Geometric Constraints and Multi-Agentic Reflection for Harmful Content Detection (Rad et al., EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1528.pdf
Checklist: 2025.emnlp-main.1528.checklist.pdf
