Spatial Language Understanding with Multimodal Graphs using Declarative Learning based Programming

Parisa Kordjamshidi, Taher Rahgooy, Umar Manzoor


Abstract
This work addresses the previously formalized semantic evaluation task of spatial role labeling (SpRL), which aims at extracting formal spatial meaning from text. Here, we report the results of initial efforts toward exploiting visual information, in the form of images, to help spatial language understanding. We discuss how to design new models in the framework of declarative learning-based programming (DeLBP). The DeLBP framework facilitates combining modalities and representing various data in a unified graph. The learning and inference models exploit the structure of the unified graph, as well as global first-order domain constraints beyond the data, to predict the semantics, which forms a structured meaning representation of the spatial context. Continuous representations are used to relate the various elements of the graph originating from different modalities. We improve over the state-of-the-art results on SpRL.
Anthology ID:
W17-4306
Volume:
Proceedings of the 2nd Workshop on Structured Prediction for Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Kai-Wei Chang, Ming-Wei Chang, Vivek Srikumar, Alexander M. Rush
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
33–43
URL:
https://aclanthology.org/W17-4306
DOI:
10.18653/v1/W17-4306
Cite (ACL):
Parisa Kordjamshidi, Taher Rahgooy, and Umar Manzoor. 2017. Spatial Language Understanding with Multimodal Graphs using Declarative Learning based Programming. In Proceedings of the 2nd Workshop on Structured Prediction for Natural Language Processing, pages 33–43, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Spatial Language Understanding with Multimodal Graphs using Declarative Learning based Programming (Kordjamshidi et al., 2017)
PDF:
https://preview.aclanthology.org/naacl24-info/W17-4306.pdf
Data
Visual Question Answering