What does the phenomenal impression made by a scene consist of? The answer,
we conjecture, is scene structure: objects and their locations
("what+where"). If this is so, the impression must persist, even if for a
short time, at those stages of the visual pathway where units tuned both to
complex shapes and their locations are present (specifically, in areas V4
and TE). Studies of the neural correlates of visual awareness suggest that
information represented at this level should be available to conscious
access.
We study observers' ability to recall, over a few intervening scenes, spatially anchored information about scene components, while varying the number of objects and the statistics of their absolute and relative locations. Our results indicate that scene structure (the phenomenal "what+where") is psychologically real and is briefly available to conscious recall. Moreover, the representation of such structure is modifiable by statistical learning, which can produce insensitivity to scene changes that fall within the expected norm (see the "inside boundary" condition in the figure on the right) and heightened sensitivity to unusual changes, such as the translation of a familiar spatial arrangement to a new location (see objects 1 and 2 in the "outside boundary" condition).
Joint work with Claudia M. Hunter.
We are developing a computational model of structure representation, which
uses a common low-dimensional coarse code for shape and location, a notion
derived from our earlier work on shape recognition and categorization, and
supported by a body of recent neurobiological data, particularly the
reports of "what+where" neurons in inferotemporal and prefrontal
cortices. This effort explores the ability of a computational model of
unsupervised learning to mimic the detailed pattern of human performance in
the acquisition of composite structural-unit representations: the model
will be used as a "subject" in replicas of psychophysical experiments,
and as a testbed for new computational ideas and explanations.
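
To make the idea of a common coarse code for shape and location concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the model described above: shape is reduced to a single scalar in a hypothetical one-dimensional shape space, each unit has Gaussian tuning around a preferred (shape, x, y) triple, and the names `unit_response`, `coarse_code`, and the tuning widths are invented for this example.

```python
import math

def unit_response(stimulus, preferred, sigma_shape=1.0, sigma_loc=1.0):
    """Graded response of one broadly tuned "what+where" unit.

    Both arguments are (shape, x, y) triples. Shape is a scalar in a
    hypothetical 1-D shape space (an assumption made for illustration).
    """
    s, x, y = stimulus
    ps, px, py = preferred
    d_shape = (s - ps) ** 2 / (2 * sigma_shape ** 2)
    d_loc = ((x - px) ** 2 + (y - py) ** 2) / (2 * sigma_loc ** 2)
    # Joint Gaussian tuning: the response falls off with distance in
    # shape space and in retinal location.
    return math.exp(-(d_shape + d_loc))

def coarse_code(stimulus, units, **tuning):
    """Represent a stimulus by the responses of all units (the coarse code)."""
    return [unit_response(stimulus, u, **tuning) for u in units]

# Eight hypothetical units: two preferred shapes at each of four locations.
units = [(s, x, y) for s in (0.0, 1.0) for x in (0.0, 2.0) for y in (0.0, 2.0)]

# The same shape shown at two different locations yields two different
# low-dimensional activity patterns over the same set of units.
code_a = coarse_code((0.0, 0.0, 0.0), units)
code_b = coarse_code((0.0, 2.0, 2.0), units)
```

Because tuning is broad, every unit responds to some degree to every stimulus, so identity and position are carried by the pattern of activity across the ensemble rather than by any single unit.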
We expect this project to facilitate the development of a detailed and
explicit, hence explanatory, computational model of statistically driven
unsupervised learning of structural primitives for vision. The resulting
model will be biologically relevant, being based on findings from monkey
electrophysiology. Our research should also result in the development of
practical applications in computer vision, where the problem of dealing
with object structure is a major challenge. Finally, understanding the
computational basis of structure processing should prove useful in
cognitive domains other than vision, notably language, where new approaches
rooted in statistical concepts are emerging both in theoretical linguistics
and in the empirical field of natural language engineering.
Joint work with Nathan Intrator.
REPRESENTATIVE PUBLICATIONS: