A Bayesian Model of Grounded Color Semantics

Brian McMahan, Matthew Stone


Abstract
Natural language meanings allow speakers to encode important real-world distinctions, but corpora of grounded language use also reveal that speakers categorize the world in different ways and describe situations with different terminology. To learn meanings from data, we therefore need to link underlying representations of meaning to models of speaker judgment and speaker choice. This paper describes a new approach to this problem: we model variability through uncertainty in categorization boundaries and distributions over preferred vocabulary. We apply the approach to a large data set of color descriptions, where statistical evaluation documents its accuracy. The results are available as a Lexicon of Uncertain Color Standards (LUX), which supports future efforts in grounded language understanding and generation by probabilistically mapping 829 English color descriptions to potentially context-sensitive regions in HSV color space.
Anthology ID:
Q15-1008
Volume:
Transactions of the Association for Computational Linguistics, Volume 3
Year:
2015
Address:
Cambridge, MA
Venue:
TACL
Publisher:
MIT Press
Pages:
103–115
URL:
https://aclanthology.org/Q15-1008
DOI:
10.1162/tacl_a_00126
Cite (ACL):
Brian McMahan and Matthew Stone. 2015. A Bayesian Model of Grounded Color Semantics. Transactions of the Association for Computational Linguistics, 3:103–115.
Cite (Informal):
A Bayesian Model of Grounded Color Semantics (McMahan & Stone, TACL 2015)
PDF:
https://preview.aclanthology.org/auto-file-uploads/Q15-1008.pdf