Multiple benchmarks have been developed to assess the alignment between deep neural networks (DNNs) and human vision. In almost all cases these benchmarks are observational, in the sense that they are composed of behavioural and brain responses to naturalistic images that have not been manipulated to test hypotheses regarding how DNNs or humans perceive and identify objects. Here we introduce the toolbox MindSet: Vision, consisting of a collection of image datasets and related scripts designed to test DNNs on 30 psychological findings. In all experimental conditions, the stimuli are systematically manipulated to test specific hypotheses regarding human visual perception and object recognition.
g., changes in degree of curvature). provided evidence that several DNNs are also more sensitive to image manipulations that alter NAPs, although the effects were most pronounced in later layers of the networks, whereas sensitivity to NAPs is thought to arise relatively early in human visual processing to support the encoding of object parts.
Datasets. We have included images of both 2D line segments based on , and 3D Geon stimuli originally used in and obtained from 3, to assess the degree to which DNNs are sensitive to NAP vs MP changes. In the case of the Geon stimuli, we have provided a version with shading (as in ), a
For each Geon or line segment, a feature dimension (such as the curvature of a Geon) is altered from a singular value (e.g. a straight contour with zero curvature) to two different non-singular values (e.g. slightly curved or very curved). The reference condition includes items with the intermediate feature value; in this example, the slightly curved Geon. The MP change condition consists of items with the greater non-singular value; in this case, the very curved Geon. Finally, the NAP change condition includes items with the singular value; in this example, the Geon with the straight contour. A human-like similarity judgment would correspond to higher similarity between the reference object and the MP variants than between the reference object and the NAP variants (that is,
NAP changes are easier to discriminate). provides a more detailed description of human performance through reaction times that can be directly compared to similarity judgments in DNNs (where longer reaction times correspond to lower similarity).
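The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the toolbox's own implementation: the function names, the use of cosine similarity as the similarity measure, and the toy activation vectors are all assumptions made for the example. In practice the vectors would be activations extracted from a chosen DNN layer for the reference, MP-change, and NAP-change images.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length activation vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nap_mp_comparison(reference, mp_variant, nap_variant):
    """Compare reference-to-MP and reference-to-NAP similarity.

    Hypothetical helper: a pattern is flagged as human-like when the
    reference is more similar to the MP variant than to the NAP variant.
    """
    sim_mp = cosine_similarity(reference, mp_variant)
    sim_nap = cosine_similarity(reference, nap_variant)
    return {"sim_mp": sim_mp, "sim_nap": sim_nap, "human_like": sim_mp > sim_nap}


# Toy activation vectors, invented purely for illustration.
reference = [1.0, 0.5, 0.0]    # e.g. slightly curved Geon
mp_variant = [1.0, 0.6, 0.0]   # e.g. very curved Geon (MP change)
nap_variant = [0.2, 0.5, 1.0]  # e.g. straight Geon (NAP change)

result = nap_mp_comparison(reference, mp_variant, nap_variant)
print(result["human_like"])
```

Running the same comparison separately for each layer of a network would show where in the hierarchy, if anywhere, the NAP advantage emerges. Other similarity measures (for example Euclidean distance) could be substituted without changing the logic of the test.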