Sandro Pollastrini
2025
Detecting and Mitigating Challenges in Zero-Shot Video Summarization with Video LLMs
Luca Cagliero | Lorenzo Vaiani | Eliana Pastor | Alkis Koudounas | Elena Baralis | Vittorio Mazzia | Sandro Pollastrini | Thomas Gueudre | Manuel Giollo | Daniele Amberti | Yue Wu
Findings of the Association for Computational Linguistics: ACL 2025
Video summarization aims to generate a condensed textual version of an original video. Summaries may consist of either plain text or a shortlist of salient events, possibly including temporal or spatial references. Video Large Language Models (VLLMs) exhibit impressive zero-shot capabilities in video analysis. However, their performance varies significantly according to the LLM prompt, the characteristics of the video, and the properties of the training data and LLM architecture. In this work, we thoroughly evaluate the zero-shot summarization performance of four state-of-the-art open-source VLLMs specifically designed to address spatial and temporal reasoning. In light of the detected summarization issues, we propose different cost-effective mitigation strategies, based on Chain-of-Thought prompting, that involve the injection of knowledge extracted by external, lightweight models. To perform the VLLM evaluation, we design a new video summarization benchmark consisting of 100 videos with varying characteristics in terms of domain, duration, and spatio-temporal properties. Videos are manually annotated by three independent human experts with plain text, event-based, and spatio-temporal summaries. The experimental evaluation shows that VLLMs significantly benefit from prompts that include a list of recognized actions, whereas injecting automatically recognized objects and scene changes improves spatially contextualized and event-based summaries, respectively, in specific cases.
2023
Supervised Clustering Loss for Clustering-Friendly Sentence Embeddings: an Application to Intent Clustering
Giorgio Barnabò | Antonio Uva | Sandro Pollastrini | Chiara Rubagotti | Davide Bernardi
Findings of the Association for Computational Linguistics: IJCNLP-AACL 2023 (Findings)