Structured Moral Reasoning in Language Models: A Value-Grounded Evaluation Framework

Mohna Chakraborty; Lu Wang; David Jurgens

Abstract

Large language models (LLMs) are increasingly deployed in domains requiring moral understanding, yet their reasoning often remains shallow and misaligned with human reasoning. Unlike humans, whose moral reasoning integrates contextual trade-offs, value systems, and ethical theories, LLMs often rely on surface patterns, leading to biased decisions in morally and ethically complex scenarios. To address this gap, we present a value-grounded framework for evaluating and distilling structured moral reasoning in LLMs. We benchmark 12 open-source models across four moral datasets using a taxonomy of prompts grounded in value systems, ethical theories, and cognitive reasoning strategies. Our evaluation is guided by four questions: (1) Does reasoning improve LLM decision-making over direct prompting? (2) Which types of value/ethical frameworks most effectively guide LLM reasoning? (3) Which cognitive reasoning strategies lead to better moral performance? (4) Can small LLMs acquire moral competence through distillation? We find that prompting with explicit moral structure consistently improves accuracy and coherence, with first-principles reasoning and Schwartz’s + care-ethics scaffolds yielding the strongest gains. Furthermore, our supervised distillation approach transfers moral competence from large to small models without additional inference cost. Together, our results offer a scalable path toward interpretable and value-grounded models.

Anthology ID: 2025.emnlp-main.1541
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 30283–30311
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1541/
Cite (ACL): Mohna Chakraborty, Lu Wang, and David Jurgens. 2025. Structured Moral Reasoning in Language Models: A Value-Grounded Evaluation Framework. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 30283–30311, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Structured Moral Reasoning in Language Models: A Value-Grounded Evaluation Framework (Chakraborty et al., EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1541.pdf
Checklist: 2025.emnlp-main.1541.checklist.pdf
