[Comp-neuro] linking object category learning, spatial and object attention, and eye movement search

Stephen Grossberg steve at cns.bu.edu
Tue May 13 01:24:15 CEST 2008

The following article is now available:

Fazl, A., Grossberg, S., & Mingolla, E.
View-invariant object category learning, recognition, and search:
How spatial and object attention are coordinated using surface-based 
attentional shrouds.
Cognitive Psychology, in press.


How does the brain learn to recognize an object from multiple 
viewpoints while scanning a scene with eye movements? How does the 
brain avoid the problem of erroneously classifying parts of different 
objects together? How are attention and eye movements intelligently 
coordinated to facilitate object learning? A neural model provides a 
unified mechanistic explanation of how spatial and object attention 
work together to search a scene and learn what is in it. The ARTSCAN 
model predicts how an object's surface representation generates a 
form-fitting distribution of spatial attention, or "attentional 
shroud." All surface representations dynamically compete for spatial 
attention to form a shroud. The winning shroud persists during active 
scanning of the object. The shroud maintains sustained activity of an 
emerging view-invariant category representation while multiple 
view-specific category representations are learned and are linked 
through associative learning to the view-invariant object category. 
The shroud also helps to restrict scanning eye movements to salient 
features on the attended object. Object attention plays a role in 
controlling and stabilizing the learning of view-specific object 
categories. Spatial attention hereby coordinates the deployment of 
object attention during object category learning. Shroud collapse 
releases a reset signal that inhibits the active view-invariant 
category in the What cortical processing stream. Then a new shroud, 
corresponding to a different object, forms in the Where cortical 
processing stream, and search using attention shifts and eye 
movements continues to learn new objects throughout a scene. The 
model mechanistically clarifies basic properties of attention shifts 
(engage, move, disengage) and inhibition of return. It simulates 
human reaction time data about object-based spatial attention shifts, 
and learns with 98.1% accuracy and a compression of 430 on a letter 
database whose letters vary in size, position, and orientation. The 
model provides a powerful framework for unifying a wide range of data 
about spatial and object attention, and their interactions during 
perception, cognition, and action.
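The attention/reset cycle described above can be caricatured in a few
lines of code. The sketch below is only an illustration of the control
flow (surfaces compete for a shroud, the winner persists while its
views are bound to one invariant category, shroud collapse resets and
search moves on); all function names, the toy salience measure, and the
simplified winner-take-all rule are my assumptions, not the article's
actual dynamical equations.

```python
# Toy sketch of the ARTSCAN scan cycle (illustrative only; the real
# model uses continuous neural dynamics, not discrete winner-take-all).

def shroud_competition(saliences):
    """Surface representations compete; the most salient wins."""
    return max(range(len(saliences)), key=lambda i: saliences[i])

def scan_scene(surface_views):
    """surface_views: one list of views per object surface.

    While a shroud is active, every fixated view is associated with
    the same emerging view-invariant category; shroud collapse sends
    a reset signal and a new surface wins the competition.
    """
    invariant_categories = []
    saliences = [len(views) for views in surface_views]  # toy salience
    remaining = set(range(len(surface_views)))
    while remaining:
        # Visited surfaces are suppressed (inhibition of return).
        winner = shroud_competition(
            [saliences[i] if i in remaining else -1
             for i in range(len(surface_views))])
        # The winning shroud persists while the object is scanned:
        # all of its views are linked to one invariant category.
        invariant_categories.append(
            {"object": winner, "views": list(surface_views[winner])})
        # Shroud collapse: reset, then search continues elsewhere.
        remaining.discard(winner)
    return invariant_categories
```

Under these assumptions, scanning two surfaces with two views and one
view respectively yields two invariant categories, visited in order of
decreasing salience.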

Keywords: category learning, view-based learning, object recognition, 
spatial attention, object attention, parietal cortex, inferotemporal 
cortex, saccadic eye movements, attentional shroud, Adaptive 
Resonance Theory, surface perception, V2, V3A, V4, PPC, LIP, basal 
