Sreyoshi Bhaduri

2026

Position: A Semiotic-Hermeneutic Approach to Evaluating Meaning in LLM Summaries via the Inductive Conceptual Rating Metric
Natalie Perez | Sreyoshi Bhaduri | Aman Chadha
Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)

Meaning in human language is relational and context-dependent, and it emerges, according to Saussure (1916), through a dynamic system of signs rather than fixed relationships between words and concepts. Insights from the study of semiotics and hermeneutics emphasize that meaning arises through interpretive processes shaped by context, which has historically posed challenges for computational processing and evaluation. Building on these perspectives, this article advances an interdisciplinary framework for evaluating meaning in machine-generated language and introduces the Inductive Conceptual Rating (ICR) metric, a qualitative approach grounded in inductive content analysis and reflective thematic analysis that assesses semantic accuracy and meaning alignment in generative artificial intelligence (GenAI) outputs beyond surface-level lexical and similarity metrics. The ICR metric is applied in an empirical study that compares thematic summaries generated by large language models (LLMs) with human-generated output across five datasets (N = 50-800). Results show that although models achieve high linguistic similarity scores, they consistently underperformed relative to human outputs in capturing recurring, contextually grounded meanings. This work concludes by discussing implications for meaning evaluation and future research.


IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding
Sankalp Jajee | Ashutosh Kumar | Nikunj Kotecha | Vinija Jain | Aman Chadha | Sreyoshi Bhaduri
Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)

Indic languages, spoken by over 1.5 billion people, pose unique challenges for NLP due to their cultural richness, linguistic diversity, and structural complexity. We present IndicMMLU-Pro, a comprehensive benchmark extending the MMLU-Pro framework to nine major Indic languages: Hindi, Bengali, Gujarati, Marathi, Kannada, Punjabi, Tamil, Telugu, and Urdu. Covering a wide range of tasks in comprehension, reasoning, and generation, IndicMMLU-Pro offers a standardized evaluation framework to advance AI model development in Indic contexts. This paper details the benchmark’s design, taxonomy, and data curation, and establishes baseline results using state-of-the-art multilingual models. As an open resource, IndicMMLU-Pro aims to accelerate progress in Indic language technologies and support inclusive research in multilingual NLP.

Co-authors

Natalie Perez 1

Venues

GEM 2
WS 2
