In the paper, we use this dataset to show that our model can generalize from the synthetic language of the CLEVR dataset to questions posed in free-form natural language.
The dataset consists of:
Q: What shape is the object reflected in the blue cylinder?
Q: What number of cylinders share the same color?
Q: How many objects are not purple and not metallic?
Q: What color is the object partially blocked by the purple cylinder?