How do we learn to see?

Jozsef Fiser
Department of Psychology, Brandeis University

Based on our human experimental and modeling work, in this talk I will put forward two related points. First, I will show how humans develop internal visual representations through incremental learning of the co-occurrence and predictability of visual elements while passively viewing images with an unknown underlying structure. I will argue that this type of learning is the basis of our ability to understand and use our visual environment. Second, I will present evidence that, despite this statistics-based learning, humans do not encode the full second-order correlational structure of the scene. Rather, they learn a sufficient representation of the underlying independent causes generating the scene, and this strategy naturally leads to the emergence of chunking and of some basic Gestalt rules of perception. If time permits, I will show that these learning processes can be well captured within the framework of Bayesian model learning using sigmoid belief networks, and I will illustrate how this learning maps onto cortical hemispheres and how it is integrated with attentional processes and eye movements.
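To make the generative setup concrete, here is a minimal sketch (not the speaker's actual model) of a two-layer sigmoid belief network: a few independent binary causes each drive their own "chunk" of visual elements, and scenes sampled from it exhibit exactly the kind of co-occurrence statistics the abstract describes. The network size, weights, and biases are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical two-layer sigmoid belief network:
# 3 independent binary causes, 6 visual elements.
# Each active cause strongly drives its own pair of elements (a "chunk").
W = np.array([
    [4.0, 4.0, 0.0, 0.0, 0.0, 0.0],   # cause 0 -> elements 0, 1
    [0.0, 0.0, 4.0, 4.0, 0.0, 0.0],   # cause 1 -> elements 2, 3
    [0.0, 0.0, 0.0, 0.0, 4.0, 4.0],   # cause 2 -> elements 4, 5
])
bias = -2.0  # elements are rarely on without an active cause

def sample_scene():
    h = rng.random(3) < 0.5                  # independent latent causes
    p = sigmoid(h @ W + bias)                # element activation probabilities
    return (rng.random(6) < p).astype(int)

scenes = np.array([sample_scene() for _ in range(5000)])

# Elements within a chunk co-occur far more often than elements across
# chunks, even though no chunk boundary is ever labeled in the data.
within = np.mean(scenes[:, 0] & scenes[:, 1])   # same cause
across = np.mean(scenes[:, 0] & scenes[:, 2])   # different causes
print(within, across)
```

An observer tracking only these pairwise co-occurrences could recover the chunks, which is the intuition behind learning the independent causes rather than the full correlational structure of each scene.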