Recognition of the Gist of the Scene from Spatial Envelope Properties

Aude Oliva
Department of Brain and Cognitive Sciences
Massachusetts Institute of Technology

Studies of complex image recognition have shown that observers identify the
category of a real-world scene in a single glance, forming a semantic gist
of the scene. In this talk, I provide a theoretical framework for scene
gist, as well as computational and experimental evidence that a complex
real-world scene can be identified in a feed-forward manner efficiently
enough to influence object detection. The gist model is based on Oliva and
Torralba's (2001) spatial envelope theory of scene understanding, which
postulates that volumetric properties of a scene image (e.g., its mean 
depth, openness, perspective) can provide access to the semantic category 
of a scene.