Semantic Frames and Visual Scenes: Learning Semantic Role Inventories from Image and Video Descriptions

Ekaterina Shutova, Andreas Wundsam, Helen Yannakoudakis


Abstract
Frame-semantic parsing and semantic role labelling, which aim to automatically assign semantic roles to the arguments of verbs in a sentence, have become an active strand of research in NLP. However, to date these methods have relied on a predefined inventory of semantic roles. In this paper, we present a method to automatically learn argument role inventories for verbs from large corpora of text, images and videos. We evaluate the method against manually constructed role inventories in FrameNet and show that the visual model outperforms the language-only model and operates with high precision.
Anthology ID:
S17-1018
Volume:
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Nancy Ide, Aurélie Herbelot, Lluís Màrquez
Venue:
*SEM
SIGs:
SIGLEX | SIGSEM
Publisher:
Association for Computational Linguistics
Pages:
149–154
URL:
https://aclanthology.org/S17-1018
DOI:
10.18653/v1/S17-1018
Cite (ACL):
Ekaterina Shutova, Andreas Wundsam, and Helen Yannakoudakis. 2017. Semantic Frames and Visual Scenes: Learning Semantic Role Inventories from Image and Video Descriptions. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), pages 149–154, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Semantic Frames and Visual Scenes: Learning Semantic Role Inventories from Image and Video Descriptions (Shutova et al., *SEM 2017)
PDF:
https://preview.aclanthology.org/add_acl24_videos/S17-1018.pdf