Modeling Referential Gaze in Task-oriented Settings of Varying Referential Complexity
Özge Alacam, Eugen Ruppert, Sina Zarrieß, Ganeshan Malhotra, Chris Biemann
Abstract
Referential gaze is a fundamental phenomenon for psycholinguistics and human-human communication. However, modeling referential gaze for real-world scenarios, e.g. for task-oriented communication, is lacking the well-deserved attention from the NLP community. In this paper, we address this challenging issue by proposing a novel multimodal NLP task, namely predicting when the gaze is referential. We further investigate how to model referential gaze and transfer gaze features to adapt to unseen situated settings that target different referential complexities than the training environment. We train (i) a sequential attention-based LSTM model and (ii) a multivariate transformer encoder architecture to predict whether the gaze is on a referent object. The models are evaluated on the three complexity datasets. The results indicate that the gaze features can be transferred not only among various similar tasks and scenes but also across various complexity levels. Taking the referential complexity of a scene into account is important for successful target prediction using gaze parameters, especially when there is not much data for fine-tuning.
- Anthology ID:
- 2022.findings-aacl.19
- Volume:
- Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
- Month:
- November
- Year:
- 2022
- Address:
- Online only
- Editors:
- Yulan He, Heng Ji, Sujian Li, Yang Liu, Chua-Hui Chang
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 197–210
- URL:
- https://preview.aclanthology.org/ingest_wac_2008/2022.findings-aacl.19/
- DOI:
- 10.18653/v1/2022.findings-aacl.19
- Cite (ACL):
- Özge Alacam, Eugen Ruppert, Sina Zarrieß, Ganeshan Malhotra, and Chris Biemann. 2022. Modeling Referential Gaze in Task-oriented Settings of Varying Referential Complexity. In Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, pages 197–210, Online only. Association for Computational Linguistics.
- Cite (Informal):
- Modeling Referential Gaze in Task-oriented Settings of Varying Referential Complexity (Alacam et al., Findings 2022)
- PDF:
- https://preview.aclanthology.org/ingest_wac_2008/2022.findings-aacl.19.pdf