Live-Aid: A Large-Scale Dialogue Dataset and Benchmark for Interleaved Multi-party Interactions in Live Streaming

Yiming Lei, Yize Fan, Zeming Liu, Jiaji Dong, Hui Qiu, Haitao Leng, Qingjie Liu, Kehai Chen, Tingting Gao, Yunhong Wang


Abstract
Recent Multimodal Large Language Models (MLLMs) have achieved significant success in understanding static, pre-recorded video (e.g., event-centric or narrative-driven content). However, because high-quality interleaved data is scarce, existing MLLMs are largely trained on datasets restricted to static content and therefore struggle with dynamic interactions. Unlike pre-recorded videos, live streaming is characterized by high-density, interleaved multimodal turns, where viewer comments (danmaku) are tightly coupled with real-time audio-visual evidence and evolving dialogue context. In such settings, purely textual annotations fail to capture fine-grained visual and temporal dependencies. To bridge this gap, we introduce **Live-Aid**, the first large-scale Chinese dataset of interleaved live interactions with **human-annotated**, temporally aligned video responses, spanning over **1,100 hours** and 80,037 dialogue turns across 8,053 video sessions. Building on this, we use these high-quality annotations within a novel multi-agent pipeline to construct evaluation tasks targeting core capabilities of live interactions. Extensive evaluations of strong Video-LLMs and Omni-LLMs reveal critical limitations in interleaved multi-turn interactions requiring temporal reasoning, highlighting the value of **Live-Aid** in advancing interleaved multimodal reasoning and the modeling of dynamic audio-visual dependencies.
Anthology ID:
2026.findings-acl.1193
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
23813–23850
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1193/
Cite (ACL):
Yiming Lei, Yize Fan, Zeming Liu, Jiaji Dong, Hui Qiu, Haitao Leng, Qingjie Liu, Kehai Chen, Tingting Gao, and Yunhong Wang. 2026. Live-Aid: A Large-Scale Dialogue Dataset and Benchmark for Interleaved Multi-party Interactions in Live Streaming. In Findings of the Association for Computational Linguistics: ACL 2026, pages 23813–23850, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Live-Aid: A Large-Scale Dialogue Dataset and Benchmark for Interleaved Multi-party Interactions in Live Streaming (Lei et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1193.pdf
Checklist:
 2026.findings-acl.1193.checklist.pdf