Open-domain Video Commentary Generation
Edison Marrese-Taylor, Yumi Hamazono, Tatsuya Ishigaki, Goran Topić, Yusuke Miyao, Ichiro Kobayashi, Hiroya Takamura
Abstract
Live commentary plays an important role in sports broadcasts and video games, making spectators more excited and immersed. In this context, though approaches for automatically generating such commentary have been proposed in the past, they have been generally concerned with specific fields, where it is possible to leverage domain-specific information. In light of this, we propose the task of generating video commentary in an open-domain fashion. We detail the construction of a new large-scale dataset of transcribed commentary aligned with videos containing various human actions in a variety of domains, and propose approaches based on well-known neural architectures to tackle the task. To understand the strengths and limitations of current approaches, we present an in-depth empirical study based on our data. Our results suggest clear trade-offs between textual and visual inputs for the models and highlight the importance of relying on external knowledge in this open-domain setting, resulting in a set of robust baselines for our task.
- Anthology ID:
- 2022.emnlp-main.495
- Volume:
- Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 7326–7339
- URL:
- https://aclanthology.org/2022.emnlp-main.495
- DOI:
- 10.18653/v1/2022.emnlp-main.495
- Cite (ACL):
- Edison Marrese-Taylor, Yumi Hamazono, Tatsuya Ishigaki, Goran Topić, Yusuke Miyao, Ichiro Kobayashi, and Hiroya Takamura. 2022. Open-domain Video Commentary Generation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 7326–7339, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Open-domain Video Commentary Generation (Marrese-Taylor et al., EMNLP 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2022.emnlp-main.495.pdf