Abstract
In the coming decade, there will be a considerable need for NLP models in situated settings, where the diversity of situations and of modalities, including eye movements, must be taken into account in order to grasp the user's intention. Language comprehension in situated settings cannot be handled in isolation: multimodal cues are inherently present and form essential parts of the situation. In this research proposal, we aim to quantify the influence of each modality in interaction with varying referential complexity. We propose to encode the referential complexity of the situated setting in the embeddings during pre-training, implicitly guiding the model toward the most plausible situation-specific deviations. We summarize the challenges of intention extraction and propose a methodological approach for investigating situation-specific feature adaptation to improve crossmodal mapping and meaning recovery in noisy communication settings.
- Anthology ID:
- 2021.hcinlp-1.13
- Volume:
- Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing
- Month:
- April
- Year:
- 2021
- Address:
- Online
- Editors:
- Su Lin Blodgett, Michael Madaio, Brendan O'Connor, Hanna Wallach, Qian Yang
- Venue:
- HCINLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 81–87
- URL:
- https://aclanthology.org/2021.hcinlp-1.13
- Cite (ACL):
- Özge Alacam. 2021. Situation-Specific Multimodal Feature Adaptation. In Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing, pages 81–87, Online. Association for Computational Linguistics.
- Cite (Informal):
- Situation-Specific Multimodal Feature Adaptation (Alacam, HCINLP 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2021.hcinlp-1.13.pdf
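The abstract's central idea, encoding a situation's referential complexity into the embeddings during pre-training, is only described at a high level. Purely as an illustration of one way such a signal could enter a model, the sketch below is hypothetical PyTorch code: the class name SituationAwareEncoder, the discrete complexity levels, and all hyperparameters are assumptions made for this sketch, not details from the paper. It adds a learned, situation-level complexity embedding to the token embeddings feeding a small masked-LM encoder.

```python
# Hedged sketch, not the paper's implementation: inject a coarse
# "referential complexity" label into the input embeddings of a
# masked-LM encoder during pre-training. All names are hypothetical.
import torch
import torch.nn as nn


class SituationAwareEncoder(nn.Module):
    """Transformer encoder whose input embeddings are biased by a
    situation-specific referential-complexity embedding."""

    def __init__(self, vocab_size=30522, d_model=256, num_complexity_levels=4,
                 nhead=4, num_layers=2, max_len=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        # One learned vector per coarse complexity level
        # (e.g. 0 = single unambiguous referent ... 3 = many competing referents).
        self.complexity_emb = nn.Embedding(num_complexity_levels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.mlm_head = nn.Linear(d_model, vocab_size)  # masked-LM pre-training head

    def forward(self, token_ids, complexity_level):
        # token_ids: (batch, seq_len); complexity_level: (batch,) integer labels
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.tok_emb(token_ids) + self.pos_emb(positions)
        # Broadcast the situation-level signal over every token position.
        x = x + self.complexity_emb(complexity_level).unsqueeze(1)
        return self.mlm_head(self.encoder(x))


# Toy usage: two utterances from situations of different referential complexity.
model = SituationAwareEncoder()
token_ids = torch.randint(0, 30522, (2, 16))
complexity = torch.tensor([0, 3])  # low vs. high referential complexity
logits = model(token_ids, complexity)  # shape: (2, 16, vocab_size)
```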