Associative Learning and Reinforcement

Scope: Learning of stimulus-stimulus, stimulus-response, and action-outcome contingencies under reinforcement; Pavlovian and instrumental conditioning; extinction; reversal; model-based and model-free reinforcement learning; reward prediction error.

Out of scope: Structure learning without scalar reinforcement (covered under Implicit and Statistical Learning).

This category contains 13 processes.

Associative learning

Process ID: hed_associative_learning

Learning of co-occurrence relations between stimuli or between stimuli and responses.

Tasks

The following tasks engage this process:

Recent references

Shanks (2010) Annual Review of Psychology 61:273–301

Extinction

Process ID: hed_extinction

Decrease in a previously reinforced response when reinforcement is withheld; thought to reflect new inhibitory learning rather than erasure of the original association.
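The contrast between inhibitory learning and erasure can be sketched with a toy model. This is only an illustration under stated assumptions: the context-gated inhibition rule, the context labels "A" and "B", and the learning rate are hypothetical choices, not a model taken from this catalog or from the cited references.

```python
# Toy sketch: extinction as context-gated inhibitory learning.
# The update rule and all parameters are illustrative assumptions.

def train(trials, context, reinforced, excitatory, inhibitory, lr=0.3):
    """Run conditioning trials and return the updated excitatory weight.

    `inhibitory` maps context name -> inhibitory strength (updated in place).
    """
    for _ in range(trials):
        prediction = excitatory - inhibitory.get(context, 0.0)
        error = (1.0 if reinforced else 0.0) - prediction
        if error > 0:
            # Under-prediction strengthens the excitatory association.
            excitatory += lr * error
        else:
            # Over-prediction builds inhibition tied to the current context;
            # the excitatory association itself is left intact.
            inhibitory[context] = inhibitory.get(context, 0.0) - lr * error
    return excitatory


def response(context, excitatory, inhibitory):
    """Conditioned response: excitation minus context-specific inhibition."""
    return max(0.0, excitatory - inhibitory.get(context, 0.0))


inhibitory = {}
w = train(20, "A", True, 0.0, inhibitory)    # acquisition in context A
w = train(20, "B", False, w, inhibitory)     # extinction in context B
extinguished = response("B", w, inhibitory)  # responding in the extinction context
renewed = response("A", w, inhibitory)       # responding back in the acquisition context
```

In this sketch the response is suppressed in the extinction context but returns in the acquisition context, because the original association was never unlearned.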

Tasks

The following tasks engage this process:

Pavlovian Fear Conditioning Task

Fundamental references

Pavlov (1927) Conditioned Reflexes
Bouton (2004) Learning & Memory 11:485–494

Recent references

Dunsmoor, Niv, Daw & Phelps (2015) Neuron 88:47–63

Goal-directed behavior

Process ID: hed_goal_directed_behavior

Behavior that is sensitive to current outcome value, characteristic of action–outcome learning.

Tasks

The following tasks engage this process:

Instrumental Conditioning Task

Fundamental references

Dickinson & Balleine (1994) Animal Learning & Behavior 22:1–18

Recent references

Balleine & O’Doherty (2010) Neuropsychopharmacology 35:48–69

Habit

Process ID: hed_habit

Behavior that is insensitive to the current value of its outcome, characteristic of stimulus–response learning.

Tasks

The following tasks engage this process:

Instrumental Conditioning Task

Recent references

Balleine & O’Doherty (2010) Neuropsychopharmacology 35:48–69

Instrumental conditioning

Process ID: hed_instrumental_conditioning

Also known as: Operant conditioning — Skinnerian terminology; emphasizes the operant response and reinforcement schedules. Merged from separate entry 2026-04-18.

Learning that an action produces an outcome; also called operant conditioning. Encompasses both goal-directed (action–outcome) and habitual (stimulus–response) control, studied via reinforcement schedules and outcome-devaluation procedures.
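The logic of an outcome-devaluation test can be sketched as follows. All action names, outcome names, and numbers are hypothetical, and the two controllers are deliberately simplified: the goal-directed one consults the current outcome value, the habitual one consults only cached response strengths.

```python
# Toy outcome-devaluation test (all names and numbers are illustrative assumptions).

# Action -> outcome contingencies learned during training.
action_outcome = {"press_lever": "food_pellet", "pull_chain": "sucrose"}

# Cached stimulus-response strengths acquired during training.
cached_strength = {"press_lever": 0.9, "pull_chain": 0.6}


def goal_directed_choice(outcome_value):
    """Pick the action whose outcome is currently most valuable."""
    return max(action_outcome, key=lambda a: outcome_value[action_outcome[a]])


def habitual_choice():
    """Pick the action with the strongest cached strength, ignoring outcome value."""
    return max(cached_strength, key=cached_strength.get)


value_before = {"food_pellet": 1.0, "sucrose": 0.7}
value_after = {"food_pellet": 0.0, "sucrose": 0.7}  # food pellet devalued

choices = {
    "goal_directed_before": goal_directed_choice(value_before),
    "goal_directed_after": goal_directed_choice(value_after),
    "habit_before": habitual_choice(),
    "habit_after": habitual_choice(),
}
```

Only the goal-directed controller changes its choice after devaluation, which is the behavioral signature the procedure is designed to detect.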

Tasks

The following tasks engage this process:

Instrumental Conditioning Task

Recent references

Staddon & Cerutti (2003) Annual Review of Psychology 54:115–144

Model-based learning

Process ID: hed_model_based_learning

Reinforcement learning that uses an internal model of the environment’s transition and reward structure to plan.
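A minimal sketch of planning with an internal model, loosely shaped like a two-stage task. The state names, action names, and probabilities are hypothetical, and one-step lookahead stands in for whatever planning procedure a real agent might use.

```python
# Toy model-based evaluation (structure and numbers are illustrative assumptions).

# Transition model: P(second-stage state | first-stage action).
transition = {
    "left": {"state_X": 0.7, "state_Y": 0.3},
    "right": {"state_X": 0.3, "state_Y": 0.7},
}

# Reward model: expected reward for each action in each second-stage state.
reward = {
    "state_X": {"a1": 0.2, "a2": 0.4},
    "state_Y": {"a1": 0.8, "a2": 0.1},
}


def plan(transition, reward):
    """Compute first-stage action values by one-step lookahead through the model."""
    state_value = {s: max(r.values()) for s, r in reward.items()}
    return {
        action: sum(p * state_value[s] for s, p in probs.items())
        for action, probs in transition.items()
    }


q_before = plan(transition, reward)
best_before = max(q_before, key=q_before.get)

# A change to the reward model is reflected as soon as the agent re-plans,
# without any further experience of the first-stage actions.
reward["state_Y"]["a1"] = 0.0
q_after = plan(transition, reward)
best_after = max(q_after, key=q_after.get)
```

The point of the sketch is that values are computed from the model on demand, so a change in the model alters choice immediately.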

Tasks

The following tasks engage this process:

Two-Stage Decision Task

Fundamental references

Daw, Niv & Dayan (2005) Nature Neuroscience 8:1704–1711

Recent references

Daw, Gershman, Seymour, Dayan & Dolan (2011) Neuron 69:1204–1215

Model-free learning

Process ID: hed_model_free_learning

Reinforcement learning from cached value estimates updated by prediction errors, without an explicit model of the environment.
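One common way to realize cached values updated by prediction errors is a temporal-difference rule such as Q-learning. The sketch below assumes that rule; the state names, learning rate `alpha`, and discount `gamma` are hypothetical.

```python
# Minimal Q-learning update (a sketch; parameters and state names are assumptions).

def td_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update in place and return the prediction error."""
    # States missing from the table (e.g. terminal states) contribute no future value.
    next_best = max(q[next_state].values()) if q.get(next_state) else 0.0
    delta = reward + gamma * next_best - q[state][action]
    q[state][action] += alpha * delta
    return delta


q = {"s0": {"left": 0.0, "right": 0.0}, "s1": {"stay": 0.0}}

# Reward arrives at s1; its cached value moves first.
delta_1 = td_update(q, "s1", "stay", reward=1.0, next_state="terminal")

# On a later visit, value propagates back to s0 through the cached s1 value.
delta_2 = td_update(q, "s0", "right", reward=0.0, next_state="s1")
```

No transition or reward model is stored here; the table `q` is the only thing the agent learns, which is why value spreads backward only through repeated experience.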

Tasks

The following tasks engage this process:

Two-Stage Decision Task

Recent references

Daw, Niv & Dayan (2005) Nature Neuroscience 8:1704–1711

Pavlovian conditioning

Process ID: hed_pavlovian_conditioning

Learning that a neutral stimulus predicts a biologically significant outcome, leading to conditioned responding.
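Acquisition of a conditioned response is often illustrated with the Rescorla-Wagner rule, which this entry does not itself cite. The sketch below assumes that rule for a single conditioned stimulus; `alpha` (learning rate) and `lam` (maximum associative strength supported by the outcome) are hypothetical values.

```python
# Rescorla-Wagner acquisition curve for one stimulus (a sketch under stated assumptions).

def rescorla_wagner(us_sequence, alpha=0.3, lam=1.0):
    """Return associative strength after each trial.

    `us_sequence` is a list of booleans: True if the outcome followed the stimulus.
    """
    strength = 0.0
    curve = []
    for us_present in us_sequence:
        target = lam if us_present else 0.0
        strength += alpha * (target - strength)  # update proportional to surprise
        curve.append(strength)
    return curve


curve = rescorla_wagner([True] * 10)
```

Associative strength rises quickly at first and then levels off as the stimulus comes to predict the outcome and each pairing becomes less surprising.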

Tasks

The following tasks engage this process:

Pavlovian Fear Conditioning Task

Fundamental references

Pavlov (1927) Conditioned Reflexes

Recent references

LeDoux (2014) PNAS 111:2871–2878

Policy learning

Process ID: hed_policy_learning

Direct learning of a mapping from states to actions without necessarily estimating values.
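As a rough illustration, a policy can be adjusted directly from reward without storing any value estimates. The sketch below assumes a softmax policy over action preferences and a REINFORCE-style update with no baseline; the two-action setup, reward vector, learning rate, and seed are all hypothetical.

```python
# Direct policy learning on a two-action problem (illustrative sketch).
import math
import random


def softmax(prefs):
    """Convert action preferences to choice probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def learn_policy(rewards, trials=200, lr=0.5, seed=0):
    """Adjust preferences along the reward-weighted log-probability gradient."""
    rng = random.Random(seed)
    prefs = [0.0] * len(rewards)
    for _ in range(trials):
        probs = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        r = rewards[action]
        for a in range(len(prefs)):
            indicator = 1.0 if a == action else 0.0
            # No value estimate is stored; only preferences change.
            prefs[a] += lr * r * (indicator - probs[a])
    return softmax(prefs)


policy = learn_policy(rewards=[0.0, 1.0])
```

After training, the policy assigns most of its probability to the rewarded action, even though the agent never represents how much reward either action is worth.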

No tasks in the current catalog are linked to this process.

Reinforcement learning

Process ID: hed_reinforcement_learning

Learning from experience to select actions that maximize cumulative reward, with updates driven by reward prediction errors.

Tasks

The following tasks engage this process:

Fundamental references

Schultz, Dayan & Montague (1997) Science 275:1593–1599

Recent references

Niv (2009) Journal of Mathematical Psychology 53:139–154

Reversal learning

Process ID: hed_reversal_learning

Relearning after contingencies between stimuli (or responses) and outcomes are switched.
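A toy simulation can show how an incremental learner behaves across a reversal. It makes several simplifying assumptions that are not part of this entry: outcomes are deterministic, the learner sees the outcome of both options on every trial, and values follow a simple delta rule with a hypothetical learning rate `alpha`.

```python
# Toy reversal-learning simulation (all settings are illustrative assumptions).

def run_reversal(trials_per_phase=20, alpha=0.2):
    """Return the preferred option after each trial; contingencies flip halfway."""
    values = {"A": 0.0, "B": 0.0}
    preferred = []
    for trial in range(2 * trials_per_phase):
        rewarded = "A" if trial < trials_per_phase else "B"
        for option in values:
            outcome = 1.0 if option == rewarded else 0.0
            values[option] += alpha * (outcome - values[option])
        preferred.append(max(values, key=values.get))
    return preferred


preferred = run_reversal()
last_pre_reversal = preferred[19]    # end of the first phase
first_post_reversal = preferred[20]  # first trial after the switch
final_choice = preferred[-1]         # end of the second phase
```

The learner keeps preferring the previously rewarded option for a few trials after the switch before its values cross over, a simple analogue of perseverative responding.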

Tasks

The following tasks engage this process:

Reversal Learning Task

Fundamental references

Iversen & Mishkin (1970) Experimental Brain Research 11:376–386

Recent references

Izquierdo, Brigman, Radke, Rudebeck & Holmes (2017) Neuroscience 345:12–26

Reward prediction error

Process ID: hed_reward_prediction_error

Signed difference between received and expected reward, thought to be signaled by phasic firing of midbrain dopamine neurons.
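In symbols, one conventional way to write this quantity is shown below. The notation is an assumption introduced here for illustration: r_t is the reward received at time t, V(s_t) is the reward expected in the current state, and gamma is a discount factor. The second line is the temporal-difference form, which also counts expected future reward.

```latex
% Simple prediction error: received minus expected reward
\delta_t = r_t - V(s_t)

% Temporal-difference form, with discount factor 0 \le \gamma \le 1
\delta_t = r_t + \gamma \, V(s_{t+1}) - V(s_t)
```

A positive value means the outcome was better than expected, a negative value means it was worse, and zero means it was fully predicted.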

Tasks

The following tasks engage this process:

Fundamental references

Schultz, Dayan & Montague (1997) Science 275:1593–1599

Recent references

Glimcher (2011) PNAS 108(Suppl 3):15647–15654

Value learning

Process ID: hed_value_learning

Acquisition of the expected value of stimuli, actions, or states from experience with outcomes.

Tasks

The following tasks engage this process:

Recent references

Rangel, Camerer & Montague (2008) Nature Reviews Neuroscience 9:545–556