Decomposition Does Not Help: Evidence from Semantic Clustering in LLM-based Causal Graph Discovery

Nikolay Babakov, Alberto Bugarín-Diz


Abstract
Recent advances in large language models (LLMs) have enabled their application to non-traditional tasks such as causal graph construction, a key component of reasoning frameworks, including Bayesian Networks. The most effective existing approaches rely on direct prompting, where an LLM generates a complete graph from a full set of variables in a single step. However, the performance of such methods degrades as the number of graph nodes increases. To address this limitation, we explore a divide-and-conquer alternative based on semantic clustering. Node representations are first embedded and clustered, after which subgraphs are constructed independently for each cluster using LLM prompting. The resulting subgraphs are then merged pairwise into a global graph. Contrary to our expectations, this approach leads to a substantial degradation in performance compared to direct prompting baselines, as measured by Structural Hamming Distance (SHD). We attribute this to the misalignment between semantic similarity and causal structure, as well as error propagation during subgraph merging. We report these negative results to highlight the limitations of decomposition strategies in LLM-based causal graph construction.
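
For readers unfamiliar with the metric named above, the following is a minimal sketch of Structural Hamming Distance between a reference graph and a predicted graph. It is an illustration, not the authors' evaluation code. It assumes both graphs are DAGs given as sets of directed edges, and it assumes the common convention in which a reversed edge counts as one error; the abstract does not state which convention the paper uses. The function name `shd` and the toy variable names are hypothetical.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def shd(true_edges: Iterable[Edge], pred_edges: Iterable[Edge]) -> int:
    """Structural Hamming Distance between two directed graphs.

    Assumption: a reversed edge counts as a single error (some
    definitions count it as two). Inputs are assumed to be DAGs.
    """
    true_set: Set[Edge] = set(true_edges)
    pred_set: Set[Edge] = set(pred_edges)

    missing = true_set - pred_set  # edges the prediction lacks
    extra = pred_set - true_set    # edges the prediction invented

    # A reversed edge shows up once in `missing` as (u, v) and once in
    # `extra` as (v, u); subtract it once so it is not double counted.
    reversed_edges = {(u, v) for (u, v) in missing if (v, u) in extra}

    return len(missing) + len(extra) - len(reversed_edges)


# Toy example with hypothetical variables.
reference = {("smoking", "tar"), ("tar", "cancer")}
predicted = {("tar", "smoking"), ("smoking", "cancer")}

# One reversed edge, one missing edge, one extra edge.
distance = shd(reference, predicted)
```

Lower values are better, and a distance of 0 means the predicted edge set matches the reference exactly.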
Anthology ID:
2026.retroeval-main.1
Volume:
Proceedings of the 1st Symposium on Natural Language Generation Evaluations
Month:
June
Year:
2026
Address:
Aberdeen, United Kingdom
Editors:
Saad Mahamood, David M. Howcroft, Kees van Deemter, Simone Balloccu, Adarsa Sivaprasad, Barkavi Sundararajan, Alberto Bugarín Diz, Jose María Alonso-Moral
Venue:
RetroEval
Publisher:
Association for Computational Linguistics
Pages:
1–7
URL:
https://preview.aclanthology.org/ingest-retroeval/2026.retroeval-main.1/
Cite (ACL):
Nikolay Babakov and Alberto Bugarín-Diz. 2026. Decomposition Does Not Help: Evidence from Semantic Clustering in LLM-based Causal Graph Discovery. In Proceedings of the 1st Symposium on Natural Language Generation Evaluations, pages 1–7, Aberdeen, United Kingdom. Association for Computational Linguistics.
Cite (Informal):
Decomposition Does Not Help: Evidence from Semantic Clustering in LLM-based Causal Graph Discovery (Babakov & Bugarín-Diz, RetroEval 2026)
PDF:
https://preview.aclanthology.org/ingest-retroeval/2026.retroeval-main.1.pdf