Be Precise or Fuzzy: Learning the Meaning of Cardinals and Quantifiers from Vision

Sandro Pezzelle, Marco Marelli, Raffaella Bernardi


Abstract
People can refer to quantities in a visual scene by using either exact cardinals (e.g. one, two, three) or natural language quantifiers (e.g. few, most, all). In humans, these two processes rely on fairly different cognitive and neural mechanisms. Inspired by this evidence, the present study proposes two models for learning the objective meaning of cardinals and quantifiers from visual scenes containing multiple objects. We show that a model capitalizing on a ‘fuzzy’ measure of similarity is effective for learning quantifiers, whereas the learning of exact cardinals is better accomplished when information about number is provided.
Anthology ID:
E17-2054
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
337–342
URL:
https://aclanthology.org/E17-2054
Cite (ACL):
Sandro Pezzelle, Marco Marelli, and Raffaella Bernardi. 2017. Be Precise or Fuzzy: Learning the Meaning of Cardinals and Quantifiers from Vision. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 337–342, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Be Precise or Fuzzy: Learning the Meaning of Cardinals and Quantifiers from Vision (Pezzelle et al., EACL 2017)
PDF:
https://preview.aclanthology.org/add_acl24_videos/E17-2054.pdf
Data
ImageNet