Rishabh Baral
2026
Moneyball with LLMs: Analyzing Tabular Summarization in Sports Narratives
Ritam Upadhyay | Naman Ahuja | Rishabh Baral | Aparna Garimella | Vivek Gupta
Findings of the Association for Computational Linguistics: ACL 2026
Large language model (LLM) approaches to tabular summarization rely on extensive prompt engineering, decomposition pipelines, or entity-level intermediate representations to achieve strong performance. While effective, these strategies are computationally expensive and offer limited insight into how well models maintain state over long, evolving narratives. We introduce SporTabSet, a diagnostic benchmark for long-context tabular summarization across two complementary sports domains that require tracking multiple entities and aggregating statistics under domain-specific rules. Using SporTabSet, we systematically evaluate decomposition-based strategies across several long-context LLMs. Results show that although decomposition substantially improves accuracy and numerical fidelity, the gains stem mainly from dissecting multi-entity interference rather than from improved local arithmetic. Robustness experiments further reveal high sensitivity to surface-level cues, with structured failures including hallucination, omission, and role confusion. Together, these findings identify consistent multi-entity memory as a key bottleneck in long-context table generation, motivating diagnostic evaluation as a prerequisite for scalable, efficient, and reliable tabular summarization models.