How do we learn to see?

Jozsef Fiser
Department of Psychology, Brandeis University

Based on our human experimental and modeling work, in this talk I will put forward two related points. First, I will show how humans develop internal visual representations through incremental learning of the co-occurrence and predictability of visual elements while passively viewing images with an unknown underlying structure. I will argue that this type of learning is the basis of our ability to understand and use our visual environment. Second, I will present evidence that, despite this statistics-based learning, humans do not encode the full second-order correlational structure of the scene. Rather, they learn a sufficient representation of the underlying independent causes generating the scene, and this strategy naturally leads to the emergence of chunking and of some basic Gestalt rules of perception. If time permits, I will show that these learning processes can be well captured within the framework of Bayesian model learning using sigmoid belief networks, and I will illustrate how this learning maps onto cortical hemispheres and how it is integrated with attentional processes and eye movements.
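To make the generative setup concrete, here is a minimal sketch (not the speaker's actual model) of a two-layer sigmoid belief network: a few independent binary causes each drive their own "chunk" of visual elements, and scenes sampled from it exhibit exactly the kind of co-occurrence statistics the abstract describes. The network size, weights, and biases are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical two-layer sigmoid belief network:
# 3 independent binary causes, 6 visual elements.
# Each active cause strongly drives its own pair of elements (a "chunk").
W = np.array([
    [4.0, 4.0, 0.0, 0.0, 0.0, 0.0],   # cause 0 -> elements 0, 1
    [0.0, 0.0, 4.0, 4.0, 0.0, 0.0],   # cause 1 -> elements 2, 3
    [0.0, 0.0, 0.0, 0.0, 4.0, 4.0],   # cause 2 -> elements 4, 5
])
bias = -2.0  # elements are rarely on without an active cause

def sample_scene():
    h = rng.random(3) < 0.5                  # independent latent causes
    p = sigmoid(h @ W + bias)                # element activation probabilities
    return (rng.random(6) < p).astype(int)

scenes = np.array([sample_scene() for _ in range(5000)])

# Elements within a chunk co-occur far more often than elements across
# chunks, even though no chunk boundary is ever labeled in the data.
within = np.mean(scenes[:, 0] & scenes[:, 1])   # same cause
across = np.mean(scenes[:, 0] & scenes[:, 2])   # different causes
print(within, across)
```

An observer tracking only these pairwise co-occurrences could recover the chunks, which is the intuition behind learning the independent causes rather than the full correlational structure of each scene.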