Psych/Cogst 4650/6650, Spring 2012: Reinforcement Learning:
Computational and Brain Aspects
Instructors: Shimon Edelman, Barbara Finlay
Time: Wednesdays 10:10-12:35. Place: 369 Uris Hall.
use it to post questions, suggestions for additional readings, etc.
Readings: a zipfile with all the PDFs is available on
Blackboard. For a week-by-week list, see below; for some annotations, see
I. Introduction: the idea of reinforcement learning (RL) in machine
learning and neuroscience
- Overview of reinforcement learning contrasted with other approaches (Edelman).
- General motor command structure in the brain (Finlay).
- Review topic assignments for the later dates.
Readings for Jan. 25 and Feb. 1, introductory material (required):
- Deeper into the basal ganglia (Finlay).
- Computational issues in action learning (Edelman).
- Choose presentation subjects and dates.
Background reading for basic brain structure review:
Woergoetter, F., and B. Porr (2007). Reinforcement learning. Scholarpedia,
Parent, A. and L.-N. Hazrati (1995). Functional anatomy of the basal
ganglia. I. The cortico-basal ganglia-thalamo-cortical loop. Brain Research
Wolpert, D. M., J. Diedrichsen, J. R. Flanagan (2011). Principles of
sensorimotor learning. Nature Reviews Neuroscience 12:739-752.
Atallah, H. E., Frank, M. J., & O'Reilly, R. C. (2004). Hippocampus,
cortex, and basal ganglia: Insights from computational models of
complementary learning systems. Neurobiology of Learning and Memory, 82(3),
Other non-required readings:
Chater, N. (2009). Rational and mechanistic perspectives on reinforcement
learning. Cognition 113:350-364.
Sutton, R. and A. G. Barto (1998). Reinforcement Learning (BOOK).
Purves et al. (2008). The Human Nervous System. Chapter 1 in
Principles of Cognitive Neuroscience, Sinauer. Clear overview of basic
structure and terminology, including the assumed background on neurons and
Solari, S. V. H., & Stone, R. A. (2011). Cognitive consilience:
primate non-primary circuits underlying cognition. Frontiers in
Neuroanatomy, 5, 65. doi: 10.3389/fnana.2011.00065
II. Brain Mechanisms and Models of Reinforcement Learning
Basic basal ganglia circuitry and models
Additional material, not required (note: Most of these review similar
material, but frame the content in the specific academic perspective
indicated, except Cohen and Frank, which is a different version of Frank
Aldridge, J. W., & Berridge, K. C. (1998). Coding of serial order
by neostriatal neurons: A "natural action" approach to movement
sequence. Journal of Neuroscience, 18(7), 2777-2787. Also, Aldridge and
Frank, M. J. (2011). Computational models of motivated action
selection in corticostriatal circuits. Current Opinion in Neurobiology,
Graybiel, A. M. (2008). Habits, rituals, and the evaluative
brain. Annual Review of Neuroscience, 31(1), 359-387.
Redgrave, P., Vautrelle, N., & Reynolds, J. N. J. (2011).
Functional properties of the basal ganglia's re-entrant loop architecture:
selection and reinforcement. Neuroscience, 198, 138-151.
Ashby, F. G., Turner, B. O., & Horvitz, J. C. (2010). Cortical and basal
ganglia contributions to habit learning and automaticity. Trends in
Cognitive Sciences, 14(5), 208-215.
Bar-Gad, I., Morris, G., & Bergman, H. (2003). Information processing,
dimensionality reduction and reinforcement learning in the basal
ganglia. Progress in Neurobiology, 71(6), 439-473.
Balleine, B. W., Liljeholm, M., & Ostlund, S. B. (2009). The integrative
function of the basal ganglia in instrumental conditioning. Behavioural
Brain Research, 199:43-52.
Botvinick, M. M., Niv, Y., & Barto, A. G. (2009). Hierarchically organized
behavior and its neural foundations: A reinforcement learning
perspective. Cognition, 113(3), 262-280.
Chakravarthy, V. S., D. Joseph, R. S. Bapi (2010). What do the basal
ganglia do? A modeling perspective. Biol Cybern 103:237-253.
Cohen, M. X., & Frank, M. J. (2009). Neurocomputational models of basal
ganglia function in learning, memory and choice. Behavioural Brain
The process of learning and unlearning as insight into models
Jin, X., & Costa, R. M. (2010). Start/stop signals emerge in
nigrostriatal circuits during sequence learning. Nature, 466(7305),
Thorn CA, Atallah H, Howe M, Graybiel AM (2010). Differential dynamics
of activity changes in dorsolateral and dorsomedial striatal loops during
learning. Neuron 66:781-795.
Howe, M. W., Atallah, H. E., McCool, A., Gibson, D. J., & Graybiel,
A. M. (2011). Habit learning is associated with major shifts in
frequencies of oscillatory activity and synchronized spike firing in
striatum. Proceedings of the National Academy of Sciences, 108(40),
Jin, D. Z., Fujii, N., & Graybiel, A. M. (2009). Neural representation of
time in cortico-basal ganglia circuits. Proceedings of the National Academy
of Sciences, 106(45), 19156-19161.
Chaining, embedding, interrupting and unlearning (1)
Charlesworth, J. D., Tumer, E. C., Warren, T. L., & Brainard,
M. S. (2011). Learning the microstructure of successful
behavior. Nature Neuroscience, 14(3), 373-380. doi: 10.1038/nn.2748
Bornstein, A. M., & Daw, N. D. (2011). Multiplicity of control in
the basal ganglia: computational roles of striatal subregions. Current
Opinion in Neurobiology, 21(3), 374-380
Amemori, K.-i., Gibb, L. G., & Graybiel, A. M. (2011). Shifting
responsibly: The importance of striatal modularity to reinforcement
learning in uncertain environments. Frontiers in Human Neuroscience, 5.
Ito, M., & Doya, K. (2011). Multiple representations and algorithms for
reinforcement learning in the cortico-basal ganglia circuit. Current
Opinion in Neurobiology, 21(3), 368-373.
Chaining, embedding, interrupting and unlearning (2)
Ding, J. B., Guzman, J. N., Peterson, J. D., Goldberg, J. A., &
Surmeier, D. J. (2010). Thalamic gating of corticostriatal signaling by
cholinergic interneurons. Neuron, 67(2), 294-307.
Isoda, M., & Hikosaka, O. (2011). Cortico-basal ganglia mechanisms
for overcoming innate, habitual and motivational
behaviors. European Journal of
Neuroscience, 33(11), 2058-2069. doi: 10.1111/j.1460-9568.2011.07698.x
Special topic: Michael Anderson, Colloquium Speaker in
Psychology this week. Anderson, M. (2010). Neural re-use as a
fundamental organizational principle of the brain. Behavioral and Brain
Individual differences in performance of models and subjects, disorders
Montague, P. R., Dolan, R. J., Friston, K. J., & Dayan, P. (2012).
Computational psychiatry. Trends in Cognitive Sciences, 16(1), 72-80.
Redgrave, P., Rodriguez, M., Smith, Y., Rodriguez-Oroz, M. C.,
Lehericy, S., Bergman, H., . . . Obeso, J. A. (2010). Goal-directed and
habitual control in the basal ganglia: implications for Parkinson's
disease. Nature Reviews Neuroscience, 11(11), 760-772. doi: 10.1038/nrn2915
Neiman, T. and Y. Loewenstein (2011). Reinforcement learning in
professional basketball players. Nature Communications 2:569.
Review Wolpert (under Jan. 25 readings).
Gallagher, S. (2012). Multiple aspects in the sense of agency.
New Ideas in Psychology 30:15-31.
Wegner, D. (2004). Precis of The illusion of conscious will. Behavioral and
Brain Sciences, 27:649-659 (not commentaries). Whoever signs up for
this should get the book and present some of the experiments reviewed, with
an eye to the statistical assignment of agency, not the consciousness
Whitham, E. M., Fitzgibbon, S. P., Lewis, T. W., Pope, K. J.,
DeLosAngeles, D., Clark, C. R., . . . Willoughby, J. O. (2011). Visual
experiences during paralysis. Frontiers in Human Neuroscience, 5. doi:
Mar 21: SPRING BREAK
Multiple types of reinforcers and gates: Opiates and oxytocin
An example, but not necessarily a "model", of research in this area using
primates and neuroimaging:
Humphries, M. D., & Prescott, T. J. (2010). The ventral basal
ganglia, a selection mechanism at the crossroads of space, strategy, and
reward. Progress in Neurobiology, 90(4), 385-417.
Ross, H. E., & Young, L. J. (2009). Oxytocin and the neural
mechanisms regulating social cognition and affiliative behavior. Frontiers
in Neuroendocrinology, 30(4), 534-547.
Depue, R. A., & Collins, P. F. (1999). Neurobiology of the structure of
personality: Dopamine, facilitation of incentive motivation, and
extraversion. Behavioral and Brain Sciences, 22, 491-569.
Chang, S. W. C., Barter, J. W., Ebitz, R. B., Watson, K. K., & Platt,
M. L. (2012). Inhaled oxytocin amplifies both vicarious reinforcement and
self reinforcement in rhesus macaques (Macaca mulatta). Proceedings of the
National Academy of Sciences, 109:959-964.
Hierarchical and other control architectures: computation
Parr, R. and S. Russell (1997). Reinforcement Learning with Hierarchies of
Machines. Proc. NIPS.
Botvinick, M., Y. Niv, A. G. Barto (2009). Hierarchically organized behavior
and its neural foundations: A reinforcement learning perspective. Cognition
Ribas-Fernandes, J. J. F., A. Solway, C. Diuk, J. T. McGuire, A. Barto,
Y. Niv, and M. Botvinick (2011). A Neural Signature of Hierarchical
Reinforcement Learning. Neuron 71:370-379.
Barto, A. G. and S. Mahadevan (2003). Recent Advances in Hierarchical
Reinforcement Learning. Discrete Event Dynamic Systems 14:41-77.
Vigorito, C. M. and A. G. Barto (2010). Intrinsically Motivated Hierarchical
Skill Learning in Structured Environments. IEEE Transactions on Autonomous
Mental Development 2:132-144.
Negative reinforcement and punishment
Matsumoto M, Hikosaka O (2009). Two types of dopamine neuron distinctly convey
positive and negative motivational signals. Nature 459:837-841.
LeDoux, J. E. (2003). The emotional brain, fear and the amygdala. Cellular
and Molecular Neurobiology, 23.
Leknes, S., & Tracey, I. (2008). A common neurobiology for pain and
pleasure. Nature Reviews Neuroscience, 9:314-320.
The special case of anxiety: resource allocation and control
Egner, T., Etkin, A., Gale, S., & Hirsch, J. (2008). Dissociable
neural systems resolve conflict from emotional versus nonemotional
distracters. Cerebral Cortex, 18(6), 1475-1484.
(Also Egner commentary).
Duncan, J. (2010). The multiple-demand (MD) system of the primate brain:
mental programs for intelligent behaviour. Trends in Cognitive Sciences,
Hurley, M. M., Dennett, D., and Adams, R. (2011). Inside Jokes: Using
humor to reverse-engineer the mind. MIT Press. Chapters 2 and 5 for a
sketch of the argument and an interesting twist on what a "reinforcement"
Integrating basal ganglia function into memory and decision-making
O'Reilly, R. C. and M. J. Frank (2006). Making Working Memory Work: A
Computational Model of Learning in the Prefrontal Cortex and Basal
Ganglia. Neural Computation 18:283-328.
van der Meer MA, Johnson A, Schmitzer-Torbert NC, Redish
AD (2010). Triple dissociation of information processing in dorsal
striatum, ventral striatum, and hippocampus on a learned spatial decision
task. Neuron 67:25-32.
McNab, F. and T. Klingberg (2008). Prefrontal cortex and basal ganglia
control access to working memory. Nature Neuroscience 11:103-108.
Ullman, M. T. (2006). Is Broca's area part of a basal ganglia
thalamocortical circuit? Cortex 42:480-485.
Shmuelof, L., & Krakauer, J. W. (2011). Are we ready for a natural
history of motor learning? Neuron, 72(3), 469-476.
Shimon Edelman <se37 at cornell.edu>
Last modified on Mon Feb 27 14:50:27 2012