A Graph Talks, But Who’s Listening? Rethinking Evaluations for Graph-Language Models
Soham Petkar, Hari Aakash K, Anirudh Vempati, Akshit Sinha, Ponnurangam Kumaraguru, Chirag Agarwal
Abstract
Recent research has extensively explored the graph-reasoning capabilities of Large Language Models (LLMs) through textual descriptions. However, benchmarks specifically designed for Graph-Language Models (GLMs), which integrate Graph Neural Networks (GNNs) with LLMs, remain significantly underdeveloped. In this work, we first demonstrate that existing GLM evaluations, largely repurposed from unimodal node- and edge-level tasks, fail to assess true multimodal integration. Our analysis reveals that strong performance on these benchmarks is achievable using textual or structural features in isolation, bypassing the need for joint reasoning. To bridge this gap, we introduce CLEGR (Compositional Language-Graph Reasoning), a benchmark explicitly designed to evaluate multimodal reasoning over graph topology and textual semantics. Evaluating representative GLMs on CLEGR shows that they exhibit significant performance degradation on its tasks and that unimodal soft-prompted LLMs perform on par with complex multimodal GLMs. These findings collectively highlight limitations in the graph reasoning capabilities of existing GLMs and provide a foundation for advancing the community toward explicit multimodal reasoning involving graph structure and language.
- Anthology ID: 2026.findings-acl.1624
- Volume: Findings of the Association for Computational Linguistics: ACL 2026
- Month: July
- Year: 2026
- Address: San Diego, California, United States
- Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 32441–32462
- URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.findings-acl.1624/
- Cite (ACL): Soham Petkar, Hari Aakash K, Anirudh Vempati, Akshit Sinha, Ponnurangam Kumaraguru, and Chirag Agarwal. 2026. A Graph Talks, But Who’s Listening? Rethinking Evaluations for Graph-Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 32441–32462, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal): A Graph Talks, But Who’s Listening? Rethinking Evaluations for Graph-Language Models (Petkar et al., Findings 2026)
- PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.findings-acl.1624.pdf