Shivangi Bithel


2025

pdf bib
BI-Bench : A Comprehensive Benchmark Dataset and Unsupervised Evaluation for BI Systems
Ankush Gupta | Aniya Aggarwal | Shivangi Bithel | Arvind Agarwal
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)

A comprehensive benchmark is crucial for evaluating automated Business Intelligence (BI) systems and their real-world effectiveness. We propose BI-Bench, a holistic, end-to-end benchmarking framework that assesses BI systems based on the quality, relevance, and depth of insights. It categorizes queries into descriptive, diagnostic, predictive, and prescriptive types, aligning with practical BI needs. Our fully automated approach enables custom benchmark generation tailored to specific datasets. Additionally, we introduce an automated evaluation mechanism within BI-Bench that removes reliance on strict ground truth, ensuring scalable and adaptable assessments. By addressing key limitations, it offers a flexible and robust, user-centered methodology for advancing next-generation BI systems.

pdf bib
Goal-Driven Data Story, Narrations and Explanations
Aniya Aggarwal | Ankush Gupta | Shivangi Bithel | Arvind Agarwal
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)

In this paper, we propose a system designed to process and interpret vague, open-ended, and multi-line complex natural language queries, transforming them into coherent, actionable data stories. Our system’s modular architecture comprises five components—Question Generation, Answer Generation, NLG/Chart Generation, Chart2Text, and Story Representation—each utilizing LLMs to transform data into human-readable narratives and visualizations. Unlike existing tools, our system uniquely addresses the ambiguity of vague, multi-line queries, setting a new benchmark in data storytelling by tackling complexities no existing system comprehensively handles. Our system is cost-effective, which uses open-source models without extra training and emphasizes transparency by showcasing end-to-end processing and intermediate outputs. This enhances explainability, builds user trust, and clarifies the data story generation process.