Open-domain Video Commentary Generation

Edison Marrese-Taylor; Yumi Hamazono; Tatsuya Ishigaki; Goran Topić; Yusuke Miyao; Ichiro Kobayashi; Hiroya Takamura

doi:10.18653/v1/2022.emnlp-main.495

Abstract

Live commentary plays an important role in sports broadcasts and video games, making spectators more excited and immersed. In this context, though approaches for automatically generating such commentary have been proposed in the past, they have been generally concerned with specific fields, where it is possible to leverage domain-specific information. In light of this, we propose the task of generating video commentary in an open-domain fashion. We detail the construction of a new large-scale dataset of transcribed commentary aligned with videos containing various human actions in a variety of domains, and propose approaches based on well-known neural architectures to tackle the task. To understand the strengths and limitations of current approaches, we present an in-depth empirical study based on our data. Our results suggest clear trade-offs between textual and visual inputs for the models and highlight the importance of relying on external knowledge in this open-domain setting, resulting in a set of robust baselines for our task.

Anthology ID: 2022.emnlp-main.495
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 7326–7339
URL: https://aclanthology.org/2022.emnlp-main.495
DOI: 10.18653/v1/2022.emnlp-main.495
Cite (ACL): Edison Marrese-Taylor, Yumi Hamazono, Tatsuya Ishigaki, Goran Topić, Yusuke Miyao, Ichiro Kobayashi, and Hiroya Takamura. 2022. Open-domain Video Commentary Generation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 7326–7339, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Open-domain Video Commentary Generation (Marrese-Taylor et al., EMNLP 2022)
PDF: https://preview.aclanthology.org/nschneid-patch-4/2022.emnlp-main.495.pdf
