EvalMORAAL: Interpretable Chain-of-Thought and LLM-as-Judge Evaluation for Moral Alignment in Large Language Models

Hadi Mohammadi; Anastasia Giachanou; Ayoub Bagheri

Abstract

We present EvalMORAAL, a transparent chain-of-thought (CoT) framework that uses two scoring methods (log-probabilities and direct ratings) plus a model-as-judge peer review to evaluate moral alignment in 20 large language models. We assess models on the World Values Survey (WVS; 55 countries, 19 topics) and the PEW Global Attitudes Survey (39 countries, 8 topics). With EvalMORAAL, top models align closely with survey responses (Pearson’s r ≈ 0.90 on WVS). Yet we find a persistent regional alignment gap: Western regions average r=0.82 while non-Western regions average r=0.61 (a 0.21 absolute gap). Our framework contributes three components: (1) two scoring methods applied to all models to enable fair comparison, (2) a structured CoT protocol with self-consistency checks, and (3) a model-as-judge peer review that flags 348 conflicts using a data-driven threshold. Peer agreement correlates with survey alignment on WVS (r=0.74, p<.001) but not significantly on PEW (r=0.39, n.s.), supporting automated quality checks. These results show real progress toward culture-aware AI while highlighting open challenges for use across regions.

Anthology ID: 2026.starsem-conference.34
Volume: Proceedings of the 15th Joint Conference on Lexical and Computational Semantics (*SEM 2026)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Saif M. Mohammad, Nedjma Ousidhoum
Venues: *SEM | WS
Publisher: Association for Computational Linguistics
Pages: 497–515
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.starsem-conference.34/
Cite (ACL): Hadi Mohammadi, Anastasia Giachanou, and Ayoub Bagheri. 2026. EvalMORAAL: Interpretable Chain-of-Thought and LLM-as-Judge Evaluation for Moral Alignment in Large Language Models. In Proceedings of the 15th Joint Conference on Lexical and Computational Semantics (*SEM 2026), pages 497–515, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): EvalMORAAL: Interpretable Chain-of-Thought and LLM-as-Judge Evaluation for Moral Alignment in Large Language Models (Mohammadi et al., *SEM 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.starsem-conference.34.pdf
