TriEx: A Game-based Tri-View Framework for Explaining Internal Reasoning in Multi-Agent LLMs

Ziyi Wang, Chen Zhang, Wenjun Peng, Qi Wu, Xinyu Wang


Abstract
Explainability for Large Language Model (LLM) agents is especially challenging in interactive, partially observable settings, where decisions depend on evolving beliefs and other agents. We present TriEx, a tri-view explainability framework that instruments sequential decision making with aligned artifacts: (i) structured first-person self-reasoning bound to an action, (ii) explicit second-person belief states about opponents updated over time, and (iii) third-person oracle audits grounded in environment-derived reference signals. This design turns explanations from free-form narratives into evidence-anchored objects that can be compared and checked across time and perspectives. Using imperfect-information strategic games as a controlled testbed, we show that TriEx enables scalable analysis of explanation faithfulness, belief dynamics, and evaluator reliability, revealing systematic mismatches between what agents say, what they believe, and what they do. Our results highlight explainability as an interaction-dependent property and motivate multi-view, evidence-grounded evaluation for LLM agents.
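As an illustration of the three aligned artifacts the abstract describes, here is a minimal sketch of how one decision step might be recorded so that what an agent says can be checked against what it does. The schema is an assumption made for illustration and is not taken from the paper: `TriViewRecord`, its field names, and `say_do_mismatches` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TriViewRecord:
    """One decision step carrying the three views (hypothetical schema)."""
    step: int
    action: str                          # action the agent actually took
    self_reasoning: str                  # first-person rationale bound to the action
    stated_action: str                   # action named inside that rationale
    opponent_beliefs: Dict[str, float]   # second-person belief per opponent
    oracle_reference: str                # third-person, environment-derived reference


def say_do_mismatches(trace: List[TriViewRecord]) -> List[int]:
    """Return the steps where the stated action differs from the action taken."""
    return [r.step for r in trace if r.stated_action != r.action]
```

Because each record binds the rationale, the beliefs, and the reference signal to the same step, comparisons across time and across views reduce to simple filters over the trace.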
Anthology ID:
2026.acl-long.292
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
6448–6479
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.292/
Cite (ACL):
Ziyi Wang, Chen Zhang, Wenjun Peng, Qi Wu, and Xinyu Wang. 2026. TriEx: A Game-based Tri-View Framework for Explaining Internal Reasoning in Multi-Agent LLMs. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6448–6479, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
TriEx: A Game-based Tri-View Framework for Explaining Internal Reasoning in Multi-Agent LLMs (Wang et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.292.pdf
Checklist:
2026.acl-long.292.checklist.pdf