Dealing with Semantic Underspecification in Multimodal NLP

Sandro Pezzelle


Abstract
Intelligent systems that aim at mastering language as humans do must deal with its semantic underspecification, namely, the possibility for a linguistic signal to convey only part of the information needed for communication to succeed. Consider the usages of the pronoun they, which can leave the gender and number of its referent(s) underspecified. Semantic underspecification is not a bug but a crucial language feature that boosts its storage and processing efficiency. Indeed, human speakers can quickly and effortlessly integrate semantically-underspecified linguistic signals with a wide range of non-linguistic information, e.g., the multimodal context, social or cultural conventions, and shared knowledge. Standard NLP models have, in principle, no or limited access to such extra information, while multimodal systems grounding language into other modalities, such as vision, are naturally equipped to account for this phenomenon. However, we show that they struggle with it, which could negatively affect their performance and lead to harmful consequences when used for applications. In this position paper, we argue that our community should be aware of semantic underspecification if it aims to develop language technology that can successfully interact with human users. We discuss some applications where mastering it is crucial and outline a few directions toward achieving this goal.
Anthology ID:
2023.acl-long.675
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
12098–12112
URL:
https://aclanthology.org/2023.acl-long.675
DOI:
10.18653/v1/2023.acl-long.675
Cite (ACL):
Sandro Pezzelle. 2023. Dealing with Semantic Underspecification in Multimodal NLP. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 12098–12112, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Dealing with Semantic Underspecification in Multimodal NLP (Pezzelle, ACL 2023)
PDF:
https://preview.aclanthology.org/landing_page/2023.acl-long.675.pdf
Video:
https://preview.aclanthology.org/landing_page/2023.acl-long.675.mp4