Natalie Perez


2026

Meaning in human language is relational and context-dependent: according to Saussure (1916), it emerges through a dynamic system of signs rather than through fixed relationships between words and concepts. Insights from semiotics and hermeneutics emphasize that meaning arises through interpretive processes shaped by context, a property that has historically posed challenges for computational processing and evaluation. Building on these perspectives, this article advances an interdisciplinary framework for evaluating meaning in machine-generated language and introduces the Inductive Conceptual Rating (ICR) metric, a qualitative approach grounded in inductive content analysis and reflective thematic analysis. The metric assesses semantic accuracy and meaning alignment in generative artificial intelligence (GenAI) outputs, going beyond surface-level lexical and similarity metrics. The ICR metric is applied in an empirical study that compares thematic summaries generated by large language models (LLMs) with human-generated summaries across five datasets (N = 50-800). Results show that although the models achieve high linguistic similarity scores, they consistently underperform relative to human outputs in capturing recurring, contextually grounded meanings. The article concludes by discussing implications for meaning evaluation and directions for future research.