Representation of visual structure
Intelligent processing of visual objects implies the ability to deal with
their structure. Understanding the ability of human observers to
perceive the arrangement of parts in a composite object, or the arrangement
of objects in a scene, is the central concern of the current theoretical
and experimental work in high-level vision. Theories of visual structure
processing necessarily analyze object representation in terms of
structural units that are, in a sense, smaller than the entire
object or scene. Common to all such theories is the need to explain the
origins of the structural units, and, in particular, the dependence of the
set of units used by a visual system on its experience with structured
stimuli. This dependence (specifically, the probabilistic processes that may
govern the acquisition of structural units by the human visual system) is the
focus of the vision research in my lab.
What does the phenomenal impression made by a scene consist of? The answer,
we conjecture, is scene structure: objects and their locations
("what+where"). If this is so, the impression must persist, even if for a
short time, at those stages of the visual pathway where units tuned both to
complex shapes and their locations are present (specifically, in areas V4
and TE). Studies of the neural correlates of visual awareness suggest that
information represented at this level should be available to conscious recall.
We study the ability of observers to recall, over a few intervening scenes,
spatially anchored information concerning scene components, varying the
number of objects, and the statistics of their absolute and relative
location. Our results indicate that scene structure (the phenomenal
"what+where") is psychologically real, and is briefly available to
conscious recall. Moreover, the representation of such structure is
modifiable by statistical learning, which can produce insensitivity to
scene changes that fall within the expected norm (see the "inside boundary"
condition in the figure on the right), and heightened sensitivity to
unusual changes, such as the translation of a familiar spatial arrangement
to a new location (see objects 1 and 2 in the "outside boundary" condition).
Joint work with Claudia M. Hunter.
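The statistical-learning account above can be illustrated with a toy sketch (not the lab's actual model; all numbers and names are hypothetical). An observer learns the mean and scatter of each object's location over familiarization scenes; a change within the learned scatter ("inside boundary") is then less salient than a translation of the whole arrangement to a new region ("outside boundary"):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical familiarization: each object's location varies around a mean
# with some scatter; the observer is assumed to learn this distribution.
n_scenes = 200
means = np.array([[2.0, 3.0], [6.0, 1.0]])    # two objects ("what")
scatter = 0.5                                 # within-norm positional jitter
scenes = means + rng.normal(0.0, scatter, size=(n_scenes, 2, 2))

# Learned model: per-object mean and standard deviation of location.
mu = scenes.mean(axis=0)
sigma = scenes.std(axis=0)

def change_salience(scene):
    """Largest z-score of any object coordinate under the learned norm;
    large values correspond to 'outside boundary' changes."""
    return np.abs((scene - mu) / sigma).max()

inside = means + rng.normal(0.0, scatter, size=(2, 2))  # within-norm change
outside = means + 3.0                                   # arrangement translated

# The translation lies far outside the learned scatter, so it is more salient.
print(change_salience(inside), change_salience(outside))
```

The point of the sketch is only that a simple distributional summary of "where" suffices to make within-norm changes inconspicuous and out-of-norm translations conspicuous, which is the qualitative pattern described above.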
Computational theory and modeling
We are developing a computational model of structure representation, which
uses a common low-dimensional coarse code for shape and location, a notion
derived from our earlier work on shape recognition and categorization, and
supported by a body of recent neurobiological data, particularly the
reports of "what+where" neurons in inferotemporal and prefrontal
cortices. This effort explores the ability of a computational model of
unsupervised learning to mimic the detailed pattern of human performance in
the acquisition of composite structural-unit representations: the model
will be used as a "subject" in replicas of psychophysical experiments,
and as a testbed for new computational ideas and explanations.
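A minimal sketch of the coarse "what+where" coding idea may help fix intuitions (this is an illustration only, not the model under development; the unit counts, tuning widths, and function names are all hypothetical). Each unit responds jointly to a preferred shape and a preferred location, with broad spatial tuning; a scene is then represented by the resulting population activity vector:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population: each unit is tuned jointly to a shape prototype
# ("what") and to a spatial position ("where"), with broad Gaussian tuning.
n_shapes, n_units = 3, 40
unit_shape = rng.integers(0, n_shapes, n_units)      # preferred shape per unit
unit_pos = rng.uniform(0, 10, size=(n_units, 2))     # preferred location
width = 2.0                                          # coarse (broad) tuning

def encode(scene):
    """Population response to a scene given as (shape_id, (x, y)) pairs."""
    resp = np.zeros(n_units)
    for shape, loc in scene:
        match = (unit_shape == shape).astype(float)           # shape selectivity
        dist2 = ((unit_pos - np.asarray(loc)) ** 2).sum(axis=1)
        resp += match * np.exp(-dist2 / (2 * width ** 2))     # spatial tuning
    return resp

scene = [(0, (2.0, 3.0)), (1, (6.0, 1.0))]
shifted = [(0, (5.0, 6.0)), (1, (9.0, 4.0))]  # same arrangement, translated

r0, r1 = encode(scene), encode(shifted)
# Translating the whole arrangement changes the population pattern even
# though the relative structure of the scene is preserved.
print(np.linalg.norm(r0 - r1))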
We expect this project to facilitate the development of a detailed and
explicit, hence explanatory, computational model of statistically driven
unsupervised learning of structural primitives for vision. The resulting
model will be biologically relevant, being based on findings from monkey
electrophysiology. Our research should also result in the development of
practical applications in computer vision, where the problem of dealing
with object structure is a major challenge. Moreover, understanding the
computational basis of structure processing should also be useful in
cognitive domains other than vision, notably language, where new approaches
rooted in statistical concepts are emerging both in theoretical linguistics
and in the empirical field of natural language engineering.
Joint work with Nathan Intrator.
Shimon Edelman, Constraining the neural representation of the visual world, Trends in Cognitive Sciences 6:125-131 (2002).
Shimon Edelman, Nathan Intrator and Judah S. Jacobson, Unsupervised learning of visual structure, in Lecture Notes in Computer Science, vol. 2025, H. H. Bülthoff, T. Poggio, S. W. Lee and C. Wallraven, eds., 629-643, Springer (2002).
Shimon Edelman and Nathan Intrator, Towards structural systematicity in distributed, statically bound visual representations, Cognitive Science 27:73-110 (2003). [See also our response to John Hummel's comments on this article.]
Claudia M. Hunter, Anne Warlaumont, and Shimon Edelman, A
behavioral handle on the phenomenology of scene perception,
Proc. Vision Sciences Society meeting, Sarasota, FL (May 2005).
Shimon Edelman <firstname.lastname@example.org>
Last modified on Thu Jun 16 12:34:09 2005