Associative Learning and Reinforcement
Scope: Learning of stimulus-stimulus, stimulus-response, and action-outcome contingencies under reinforcement; Pavlovian and instrumental conditioning; extinction; reversal; model-based and model-free reinforcement learning; reward prediction error.
Out of scope: Structure learning without scalar reinforcement (that is Implicit and Statistical Learning).
This category contains 13 processes.
Associative learning
Process ID: hed_associative_learning
Learning of co-occurrence relations between stimuli or between stimuli and responses.
Tasks
The following tasks engage this process:
Recent references
Shanks (2010) Annual Review of Psychology 61:273–301
Extinction
Process ID: hed_extinction
Decrease in a previously reinforced response when reinforcement is withheld; a form of new inhibitory learning rather than erasure.
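The "new inhibitory learning rather than erasure" idea can be pictured with a toy model in which the original excitatory association is retained and extinction adds an inhibitory association tied to the extinction context. The sketch below is illustrative only: the function names, the learning rate, and the two-context setup are assumptions and are not part of this catalog.

```python
def response(state, context):
    """Net conditioned response: excitation minus context-bound inhibition."""
    return state["excitatory"] - state["inhibitory"].get(context, 0.0)


def train(trials, reinforced, context, state, lr=0.3):
    """Run a block of trials in one context (hypothetical toy model)."""
    for _ in range(trials):
        error = (1.0 if reinforced else 0.0) - response(state, context)
        if reinforced:
            # Acquisition strengthens the excitatory association.
            state["excitatory"] += lr * error
        else:
            # Extinction leaves excitation intact and builds inhibition
            # that is specific to the current context.
            inhibition = state["inhibitory"].get(context, 0.0)
            state["inhibitory"][context] = inhibition - lr * error
    return state


state = {"excitatory": 0.0, "inhibitory": {}}
train(30, reinforced=True, context="A", state=state)   # acquisition in context A
train(30, reinforced=False, context="B", state=state)  # extinction in context B

print(round(response(state, "B"), 2))  # responding in the extinction context
print(round(response(state, "A"), 2))  # responding back in the acquisition context
```

In this toy model the response is suppressed only where the inhibitory association was learned, so it returns when the cue is tested outside the extinction context. An erasure account would not predict that pattern.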
Tasks
The following tasks engage this process:
Fundamental references
Pavlov (1927) Conditioned Reflexes
Bouton (2004) Learning & Memory 11:485–494
Recent references
Dunsmoor, Niv, Daw & Phelps (2015) Neuron 88:47–63
Goal-directed behavior
Process ID: hed_goal_directed_behavior
Behavior that is sensitive to current outcome value, characteristic of action–outcome learning.
Tasks
The following tasks engage this process:
Fundamental references
Dickinson & Balleine (1994) Animal Learning & Behavior 22:1–18
Recent references
Balleine & O’Doherty (2010) Neuropsychopharmacology 35:48–69
Habit
Process ID: hed_habit
Behavior that is insensitive to the current value of its outcome, characteristic of stimulus–response learning.
Tasks
The following tasks engage this process:
Recent references
Balleine & O’Doherty (2010) Neuropsychopharmacology 35:48–69
Instrumental conditioning
Process ID: hed_instrumental_conditioning
Also known as: Operant conditioning — Skinnerian terminology; emphasizes the operant response and reinforcement schedules. Merged from separate entry 2026-04-18.
Learning that an action produces an outcome. Encompasses both goal-directed (action–outcome) and habitual (stimulus–response) control, studied via reinforcement schedules and outcome-devaluation procedures.
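The logic of an outcome-devaluation test can be sketched with two toy controllers: one that consults the current value of each action's outcome, and one that relies on response strengths cached during training. All action names, outcome names, and values below are hypothetical and chosen only for illustration.

```python
# Hypothetical training history: each action leads to one outcome.
action_outcome = {"press_lever": "food_pellet", "pull_chain": "sucrose"}
outcome_value = {"food_pellet": 1.0, "sucrose": 0.8}

# Habitual (stimulus-response) control: strengths cached at training time.
cached_strength = {a: outcome_value[o] for a, o in action_outcome.items()}


def goal_directed_choice(current_values):
    """Pick the action whose outcome is currently most valuable."""
    return max(action_outcome, key=lambda a: current_values[action_outcome[a]])


def habitual_choice():
    """Pick the action with the highest cached strength, ignoring current value."""
    return max(cached_strength, key=cached_strength.get)


# Devalue the food pellet after training (for example, by satiety).
devalued = dict(outcome_value, food_pellet=0.0)

print(goal_directed_choice(devalued))  # shifts to the still-valued outcome
print(habitual_choice())               # keeps the trained response
```

Sensitivity to the devaluation is what marks control as goal-directed in this sketch; the cached controller keeps responding as trained.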
Tasks
The following tasks engage this process:
Recent references
Staddon & Cerutti (2003) Annual Review of Psychology 54:115–144
Model-based learning
Process ID: hed_model_based_learning
Reinforcement learning that uses an internal model of the environment’s transition and reward structure to plan.
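As a minimal sketch of planning with an internal model, the code below evaluates actions by looking ahead through a known transition and reward structure instead of relying on past reward samples. The environment, its state names, and all probabilities are assumptions made up for this example.

```python
# Hypothetical model of the environment (all numbers are assumptions).
# transitions[state][action] -> list of (probability, next_state)
transitions = {
    "start": {"left": [(1.0, "poor")], "right": [(0.8, "rich"), (0.2, "poor")]},
    "poor": {},  # terminal state: no actions available
    "rich": {},  # terminal state: no actions available
}
rewards = {"start": 0.0, "poor": 0.1, "rich": 1.0}  # reward on entering a state


def plan(state, gamma=0.9, depth=3):
    """Return (value, best_action) by recursive look-ahead through the model."""
    actions = transitions[state]
    if depth == 0 or not actions:
        return 0.0, None
    best_value, best_action = float("-inf"), None
    for action, outcomes in actions.items():
        # Expected reward plus discounted value of each possible next state.
        value = sum(
            p * (rewards[nxt] + gamma * plan(nxt, gamma, depth - 1)[0])
            for p, nxt in outcomes
        )
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action


value, action = plan("start")
print(action, round(value, 2))
```

Because values are computed from the model at decision time, editing `rewards` or `transitions` changes the plan immediately, with no further experience required.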
Tasks
The following tasks engage this process:
Fundamental references
Daw, Niv & Dayan (2005) Nature Neuroscience 8:1704–1711
Recent references
Daw, Gershman, Seymour, Dayan & Dolan (2011) Neuron 69:1204–1215
Model-free learning
Process ID: hed_model_free_learning
Reinforcement learning from cached value estimates updated by prediction errors, without an explicit model of the environment.
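One common instance of this scheme is Q-learning, sketched below with a cached value table updated from a single experienced transition. The state and action names, the learning rate `alpha`, and the discount `gamma` are illustrative assumptions; the catalog entry does not commit to a particular algorithm.

```python
from collections import defaultdict

q = defaultdict(float)  # cached action values, keyed by (state, action)


def q_update(state, action, reward, next_state, next_actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update and return the prediction error."""
    # Value of the best action available next (0.0 if the episode ended).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    delta = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * delta
    return delta


# One hypothetical transition: only the sampled outcome is used, no model.
delta = q_update("start", "right", reward=1.0, next_state="end", next_actions=[])
print(delta, q[("start", "right")])
```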
Tasks
The following tasks engage this process:
Recent references
Daw, Niv & Dayan (2005) Nature Neuroscience 8:1704–1711
Pavlovian conditioning
Process ID: hed_pavlovian_conditioning
Learning that a neutral stimulus predicts a biologically significant outcome, leading to conditioned responding.
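Acquisition of a conditioned response is often formalized with the Rescorla-Wagner rule, which this entry does not name; it is used here only as a familiar example. The sketch assumes a single cue, a learning rate `alpha`, and an asymptote `lam`, all of which are illustrative parameters.

```python
def rescorla_wagner(trials, alpha=0.2, lam=1.0, v0=0.0):
    """Return associative strength after each trial.

    trials: iterable of booleans, True when the outcome follows the cue.
    alpha:  learning rate (assumed value).
    lam:    asymptote of learning supported by the outcome.
    """
    v, history = v0, []
    for outcome_present in trials:
        target = lam if outcome_present else 0.0
        v += alpha * (target - v)  # update in proportion to the surprise
        history.append(v)
    return history


strengths = rescorla_wagner([True] * 20)
print([round(v, 2) for v in strengths[:3]])
```

Associative strength rises quickly at first and then levels off as the outcome becomes well predicted.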
Tasks
The following tasks engage this process:
Fundamental references
Pavlov (1927) Conditioned Reflexes
Recent references
LeDoux (2014) PNAS 111:2871–2878
Policy learning
Process ID: hed_policy_learning
Direct learning of a mapping from states to actions without necessarily estimating values.
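A minimal sketch of learning a policy directly, assuming a one-state task with two actions and a REINFORCE-style update on softmax action preferences. No value table is kept. The reward probabilities, learning rate, step count, and seed are hypothetical choices for illustration.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into choice probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def learn_policy(reward_probs, steps=2000, lr=0.1, seed=0):
    """Adjust action preferences from sampled rewards in a one-state task."""
    rng = random.Random(seed)
    prefs = [0.0] * len(reward_probs)
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Gradient of log pi(action): 1 - p for the chosen action, -p otherwise.
        for a in range(len(prefs)):
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * reward * grad
    return softmax(prefs)


policy = learn_policy([0.2, 0.8])
print([round(p, 2) for p in policy])
```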
No tasks in the current catalog are linked to this process.
Reinforcement learning
Process ID: hed_reinforcement_learning
Learning from experience to select actions that maximize cumulative reward, with updates driven by reward prediction errors.
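"Cumulative reward" is commonly formalized as a discounted sum of future rewards. The sketch below assumes a discount factor `gamma`, which this entry does not specify; setting `gamma=1.0` gives the plain sum.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each discounted by gamma per time step."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


print(discounted_return([0.0, 0.0, 1.0]))             # delayed reward is worth less
print(discounted_return([0.0, 0.0, 1.0], gamma=1.0))  # undiscounted sum
```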
Tasks
The following tasks engage this process:
Fundamental references
Schultz, Dayan & Montague (1997) Science 275:1593–1599
Recent references
Niv (2009) Journal of Mathematical Psychology 53:139–154
Reversal learning
Process ID: hed_reversal_learning
Relearning after contingencies between stimuli (or responses) and outcomes are switched.
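A toy simulation can make the contingency switch concrete. The sketch below assumes a deterministic two-option task, a greedy learner with a delta-rule update, and neutral starting estimates of 0.5. These are simplifying assumptions for illustration, not a description of any specific task in the catalog.

```python
def run_reversal(trials_per_phase=20, lr=0.3):
    """Greedy delta-rule learner on a deterministic two-option reversal task."""
    values = {"A": 0.5, "B": 0.5}  # neutral starting estimates (assumed)
    choices = []
    for trial in range(2 * trials_per_phase):
        # Option A is rewarded in the first phase, option B after the switch.
        rewarded = "A" if trial < trials_per_phase else "B"
        choice = max(values, key=values.get)
        reward = 1.0 if choice == rewarded else 0.0
        values[choice] += lr * (reward - values[choice])
        choices.append(choice)
    return choices


choices = run_reversal()
first_b = choices.index("B")
print("perseverative choices after reversal:", first_b - 20)
```

The learner keeps choosing the previously rewarded option for a few trials after the switch, until its value estimate falls below that of the alternative.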
Tasks
The following tasks engage this process:
Fundamental references
Iversen & Mishkin (1970) Experimental Brain Research 11:376–386
Recent references
Izquierdo, Brigman, Radke, Rudebeck & Holmes (2017) Neuroscience 345:12–26
Reward prediction error
Process ID: hed_reward_prediction_error
Signed difference between received and expected reward, widely thought to be signaled by phasic firing of midbrain dopamine neurons.
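The definition can be written out directly. The function name and the example numbers are illustrative assumptions.

```python
def reward_prediction_error(received, expected):
    """Signed difference between received and expected reward."""
    return received - expected


print(reward_prediction_error(1.0, 0.25))   # better than expected: positive
print(reward_prediction_error(0.0, 0.25))   # omitted reward: negative
print(reward_prediction_error(0.25, 0.25))  # fully predicted: zero
```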
Tasks
The following tasks engage this process:
Fundamental references
Schultz, Dayan & Montague (1997) Science 275:1593–1599
Recent references
Glimcher (2011) PNAS 108(Suppl 3):15647–15654
Value learning
Process ID: hed_value_learning
Acquisition of the expected value of stimuli, actions, or states from experience with outcomes.
Tasks
The following tasks engage this process:
Recent references
Rangel, Camerer & Montague (2008) Nature Reviews Neuroscience 9:545–556