Instrumental Conditioning Task

HED Task ID: hedtsk_instrumental_conditioning

Also known as: Operant Conditioning, Instrumental Learning, PIT

Actions are reinforced by contingent outcomes under defined schedules; response rate and choice probability across schedules index instrumental learning. Specific human instantiations include lever/button-press reward paradigms and free-operant tasks.

Description

In instrumental conditioning, voluntary actions become associated with rewarding or punishing consequences through repeated experience. Laboratory implementations present response options in which specific responses are followed by desirable outcomes (food, money, points) or undesirable outcomes (loss, punishment). Common reinforcement schedules are fixed-ratio, variable-ratio, fixed-interval, and variable-interval. Performance measures include response rates, choice patterns, and reaction times. The ventral striatum, dopamine system, and orbitofrontal cortex are thought to be critical for representing value predictions and learning from outcomes.
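The four schedules named above can be sketched in code. The following is a minimal illustration, not a reference implementation: the class names, the `respond(t)` interface, the uniform draw for the variable-ratio requirement, and the exponential draw for the variable-interval wait are all assumptions chosen for brevity.

```python
import random


class FixedRatio:
    """Reinforce every n-th response (FR-n)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a random number of responses with mean `mean_n`."""

    def __init__(self, mean_n, rng=None):
        self.mean_n = mean_n
        self.rng = rng or random.Random()
        self.count = 0
        self._draw()

    def _draw(self):
        # Assumption: uniform on 1..2*mean_n-1, which has mean mean_n.
        self.required = self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made at least `interval` time units
    after the previous reinforcer (the timer starts at t = 0)."""

    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0

    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but the required wait is redrawn after each reinforcer."""

    def __init__(self, mean_interval, rng=None):
        self.mean_interval = mean_interval
        self.rng = rng or random.Random()
        self.last = 0.0
        self._draw()

    def _draw(self):
        # Assumption: exponentially distributed waits with the given mean.
        self.wait = self.rng.expovariate(1.0 / self.mean_interval)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self._draw()
            return True
        return False


# Usage: six responses on an FR-3 schedule; every third one is reinforced.
fr = FixedRatio(3)
print([fr.respond(t) for t in range(1, 7)])
# -> [False, False, True, False, False, True]
```

Ratio schedules ignore the time argument and interval schedules ignore the response count, which is the structural difference between the two families.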

Inclusion test

Procedure	Participants perform actions (button presses, lever responses) that produce contingent outcomes (rewards or punishments) according to defined reinforcement schedules.
Manipulation	Reinforcement schedule (fixed ratio, variable ratio, fixed interval, variable interval); outcome valence; contingency degradation; Pavlovian-instrumental transfer.
Measurement	Response rate; choice probability; sensitivity to contingency and outcome devaluation; transfer effects between Pavlovian cues and instrumental actions.
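As a rough illustration of the measurement row, the sketch below computes response rate and choice probability from simple event logs, plus one possible devaluation contrast. The function names, the log formats, and the normalized-difference index are hypothetical choices made for this example; they are not prescribed by the task definition.

```python
from collections import Counter


def response_rate(response_times, session_duration):
    """Responses per unit time over a session of known duration."""
    if session_duration <= 0:
        raise ValueError("session_duration must be positive")
    return len(response_times) / session_duration


def choice_probabilities(choices):
    """Proportion of trials on which each option was chosen."""
    counts = Counter(choices)
    total = sum(counts.values())
    return {option: n / total for option, n in counts.items()}


def devaluation_index(rate_valued, rate_devalued):
    """Assumed illustrative contrast: normalized difference in response rate
    between the still-valued and the devalued outcome. Values near 0 suggest
    insensitivity to devaluation; values near 1 suggest strong sensitivity."""
    total = rate_valued + rate_devalued
    if total == 0:
        return 0.0
    return (rate_valued - rate_devalued) / total


# Usage with made-up data: 4 responses in an 8-second window.
print(response_rate([0.5, 1.2, 3.0, 4.4], 8.0))    # -> 0.5
print(choice_probabilities(["A", "A", "B", "A"]))  # -> {'A': 0.75, 'B': 0.25}
```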

Variations

Variation	Description	Justification
Fixed Ratio (FR)	Reinforcement after fixed number of responses.	Reward after fixed number of responses; canonical ratio schedule
Variable Ratio (VR)	Reinforcement after variable number of responses (average specified).	Reward after variable number of responses; different reinforcement statistics
Fixed Interval (FI)	Reinforcement for first response after fixed time period.	First response after fixed time rewarded; different temporal reinforcement structure
Variable Interval (VI)	Reinforcement for first response after variable time period.	Variable time intervals; different temporal unpredictability
Progressive Ratio	Ratio requirement increases progressively; breakpoint indexes motivation.	Ratio requirement escalates; measures motivational breakpoint
Concurrent Choice	Two or more response options with different reinforcement schedules; matching law studies.	Simultaneously available schedules; choice behavior reveals preference
Two-Stage Decision Task (Daw)	Sequential choice task dissociating model-based (goal-directed) from model-free (habitual) learning.	Two-step Markov decision; measures model-based vs. model-free learning
Devaluation Paradigm	Reward value changed after learning; goal-directed behavior adjusts, habitual does not.	Outcome devaluation tests goal-directed vs. habitual control; different post-training procedure
Contingency Degradation	Weakening action-outcome relationship; tests sensitivity to causal structure.	Action-outcome contingency degraded; tests sensitivity to the action-outcome contingency
Outcome-Specific Pavlovian-Instrumental Transfer (PIT)	Pavlovian cues bias instrumental responding.	Pavlovian CS influences instrumental responding; different multi-phase design
Avoidance Learning	Responses prevent aversive outcomes; safety signal learning.	Response prevents aversive outcome; different valence and contingency structure
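For the two-stage decision task listed above, a common descriptive analysis is the probability of repeating the first-stage choice as a function of the previous trial's reward and transition type. The sketch below assumes a hypothetical trial format (dicts with `choice`, `common`, and `rewarded` keys) and two simple contrasts; published analyses typically use regression or computational model fitting instead.

```python
def stay_probabilities(trials):
    """Return P(stay) keyed by (previous rewarded, previous common transition)."""
    counts = {}
    for prev, cur in zip(trials, trials[1:]):
        key = (prev["rewarded"], prev["common"])
        stays, total = counts.get(key, (0, 0))
        stayed = int(cur["choice"] == prev["choice"])
        counts[key] = (stays + stayed, total + 1)
    return {key: stays / total for key, (stays, total) in counts.items()}


def model_free_contrast(p):
    """Main effect of reward on staying (assumed illustrative contrast)."""
    rewarded = p[(True, True)] + p[(True, False)]
    unrewarded = p[(False, True)] + p[(False, False)]
    return (rewarded - unrewarded) / 2


def model_based_contrast(p):
    """Reward x transition interaction (assumed illustrative contrast)."""
    return (p[(True, True)] - p[(True, False)]) - (p[(False, True)] - p[(False, False)])


# Usage with a tiny made-up sequence that follows a purely model-based pattern:
# stay after rewarded-common and unrewarded-rare, switch otherwise.
trials = [
    {"choice": "A", "common": True, "rewarded": True},
    {"choice": "A", "common": False, "rewarded": True},
    {"choice": "B", "common": True, "rewarded": False},
    {"choice": "A", "common": False, "rewarded": False},
    {"choice": "A", "common": True, "rewarded": True},
]
p = stay_probabilities(trials)
print(model_free_contrast(p), model_based_contrast(p))  # -> 0.0 2.0
```

A reward main effect without an interaction is the pattern usually attributed to model-free learning, whereas a reward-by-transition interaction is the pattern attributed to model-based learning.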

Cognitive processes

This task engages the following cognitive processes:

Key references

Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and reward. Science, 275(5306), 1593–1599. https://doi.org/10.1126/science.275.5306.1593
Haber, S. N., & Knutson, B. (2010). The reward circuit: Linking primate anatomy and human imaging. Neuropsychopharmacology, 35(1), 4–26.

Recent references

Balleine, B. W., & O’Doherty, J. P. (2010). Human and rodent homologies in action control: Corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology, 35(1), 48–69.
Dolan, R. J., & Dayan, P. (2013). Goals and habits in the brain. Neuron, 80(2), 312–325. https://doi.org/10.1016/j.neuron.2013.09.007
Lee, S. W., Shimojo, S., & O’Doherty, J. P. (2014). Neural computations underlying arbitration between model-based and model-free learning. Neuron, 81(3), 687–699. https://doi.org/10.1016/j.neuron.2013.11.028
Gillan, C. M., Kosinski, M., Whelan, R., Phelps, E. A., & Daw, N. D. (2016). Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. eLife, 5, e11305. https://doi.org/10.7554/elife.11305

External links

Cognitive Atlas entry