Instrumental Conditioning Task¶
HED Task ID: hedtsk_instrumental_conditioning
Also known as: Operant Conditioning, Instrumental Learning, PIT
Actions are reinforced by contingent outcomes under defined schedules; response rate and choice probability across schedules index instrumental learning. Specific human instantiations include lever/button-press reward paradigms and free-operant tasks.
Description¶
In instrumental conditioning, voluntary actions are associated with rewarding or punishing consequences through repeated experience. Laboratory implementations present discrete choice options where specific responses are followed by desirable outcomes (food, money, points) or undesirable outcomes (loss, punishment). Common schedules include fixed-ratio, variable-ratio, fixed-interval, and variable-interval. Performance measures include response rates, choice patterns, and reaction times. The ventral striatum, dopamine system, and orbitofrontal cortex are critical for representing value predictions and learning from outcomes.
Inclusion test¶
Procedure |
Participants perform actions (button presses, lever responses) that produce contingent outcomes (rewards or punishments) according to defined reinforcement schedules. |
Manipulation |
Reinforcement schedule (fixed ratio, variable ratio, fixed interval, variable interval); outcome valence; contingency degradation; Pavlovian-instrumental transfer. |
Measurement |
Response rate; choice probability; sensitivity to contingency and outcome devaluation; transfer effects between Pavlovian cues and instrumental actions. |
Variations¶
Variation |
Description |
Justification |
|---|---|---|
Fixed Ratio (FR) |
Reinforcement after fixed number of responses. |
Reward after fixed number of responses; canonical ratio schedule |
Variable Ratio (VR) |
Reinforcement after variable number of responses (average specified). |
Reward after variable number of responses; different reinforcement statistics |
Fixed Interval (FI) |
Reinforcement for first response after fixed time period. |
First response after fixed time rewarded; different temporal reinforcement structure |
Variable Interval (VI) |
Reinforcement for first response after variable time period. |
Variable time intervals; different temporal unpredictability |
Progressive Ratio |
Ratio requirement increases progressively; breakpoint indexes motivation. |
Ratio requirement escalates; measures motivational breakpoint |
Concurrent Choice |
Multiple response options with different reinforcement schedules; matching law studies. |
Two simultaneously available schedules; choice behavior reveals preference |
Two-Stage Decision Task (Daw) |
Sequential choice task dissociating model-based (goal-directed) from model-free (habitual) learning. |
Two-step Markov decision; measures model-based vs. model-free learning |
Devaluation Paradigm |
Reward value changed after learning; goal-directed behavior adjusts, habitual does not. |
Outcome devaluation tests goal-directed vs. habitual control; different post-training procedure |
Contingency Degradation |
Weakening action-outcome relationship; tests sensitivity to causal structure. |
Action-outcome contingency degraded; tests action sensitivity |
Outcome-Specific Pavlovian-Instrumental Transfer (PIT) |
Pavlovian cues bias instrumental responding. |
Pavlovian CS influences instrumental responding; different multi-phase design |
Avoidance Learning |
Responses prevent aversive outcomes; safety signal learning. |
Response prevents aversive outcome; different valence and contingency structure |
Cognitive processes¶
This task engages the following cognitive processes:
Key references¶
{‘authors’: ‘Schultz, W., Dayan, P., & Montague, P. R.’, ‘year’: 1997, ‘title’: ‘A Neural Substrate of Prediction and Reward’, ‘venue’: ‘Science’, ‘venue_type’: ‘journal’, ‘journal’: ‘Science’, ‘volume’: ‘275’, ‘issue’: ‘5306’, ‘pages’: ‘1593-1599’, ‘doi’: ‘10.1126/science.275.5306.1593’, ‘openalex_id’: None, ‘pmid’: None, ‘citation_string’: ‘Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and reward. Science, 275(5306), 1593-1599.’, ‘url’: ‘https://doi.org/10.1126/science.275.5306.1593’, ‘source’: ‘crossref’, ‘confidence’: ‘high’, ‘verified_on’: ‘2026-04-20’}
{‘authors’: ‘Haber, S. N., & Knutson, B.’, ‘year’: 2010, ‘title’: ‘The reward circuit: Linking primate anatomy and human imaging’, ‘venue’: ‘Neuropsychopharmacology’, ‘venue_type’: ‘journal’, ‘journal’: ‘Neuropsychopharmacology’, ‘volume’: ‘35’, ‘issue’: ‘1’, ‘pages’: ‘4-26’, ‘doi’: None, ‘openalex_id’: None, ‘pmid’: None, ‘citation_string’: ‘Haber, S. N., & Knutson, B. (2010). The reward circuit: Linking primate anatomy and human imaging. Neuropsychopharmacology, 35(1), 4-26.’, ‘url’: None, ‘source’: ‘unresolved’, ‘confidence’: ‘none’, ‘verified_on’: ‘2026-04-20’}
Recent references¶
{‘authors’: “Balleine, B. W., & O’Doherty, J. P.”, ‘year’: 2010, ‘title’: ‘Human and rodent homologies in action control: Corticostriatal determinants of goal-directed and habitual action’, ‘venue’: ‘Neuropsychopharmacology’, ‘venue_type’: ‘journal’, ‘journal’: ‘Neuropsychopharmacology’, ‘volume’: ‘35’, ‘issue’: ‘1’, ‘pages’: ‘48–69’, ‘doi’: None, ‘openalex_id’: None, ‘pmid’: None, ‘citation_string’: “Balleine, B. W., & O’Doherty, J. P. (2010). Human and rodent homologies in action control: Corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology, 35(1), 48–69.”, ‘url’: None, ‘source’: ‘unresolved’, ‘confidence’: ‘none’, ‘verified_on’: ‘2026-04-20’}
{‘authors’: ‘Dolan, R. J., & Dayan, P.’, ‘year’: 2013, ‘title’: ‘Goals and Habits in the Brain’, ‘venue’: ‘Neuron’, ‘venue_type’: ‘journal’, ‘journal’: ‘Neuron’, ‘volume’: ‘80’, ‘issue’: ‘2’, ‘pages’: ‘312-325’, ‘doi’: ‘10.1016/j.neuron.2013.09.007’, ‘openalex_id’: None, ‘pmid’: None, ‘citation_string’: ‘Dolan, R. J., & Dayan, P. (2013). Goals and habits in the brain. Neuron, 80(2), 312–325.’, ‘url’: ‘https://doi.org/10.1016/j.neuron.2013.09.007’, ‘source’: ‘crossref’, ‘confidence’: ‘high’, ‘verified_on’: ‘2026-04-20’}
{‘authors’: ‘Lee, S. W., Shimojo, S., & O’Doherty, J. P.’, ‘year’: 2014, ‘title’: ‘Neural Computations Underlying Arbitration between Model-Based and Model-free Learning’, ‘venue’: ‘Neuron’, ‘venue_type’: ‘journal’, ‘journal’: ‘Neuron’, ‘volume’: ‘81’, ‘issue’: ‘3’, ‘pages’: ‘687-699’, ‘doi’: ‘10.1016/j.neuron.2013.11.028’, ‘openalex_id’: None, ‘pmid’: None, ‘citation_string’: “Lee, S. W., Shimojo, S., & O’Doherty, J. P. (2014). Neural computations underlying arbitration between model-based and model-free learning. Neuron, 81(3), 687–699.”, ‘url’: ‘https://doi.org/10.1016/j.neuron.2013.11.028’, ‘source’: ‘crossref’, ‘confidence’: ‘high’, ‘verified_on’: ‘2026-04-20’}
{‘authors’: ‘Gillan, C. M., Kosinski, M., Whelan, R., Phelps, E. A., & Daw, N. D.’, ‘year’: 2016, ‘title’: ‘Characterizing a psychiatric symptom dimension related to deficits in goal-directed control’, ‘venue’: ‘eLife’, ‘venue_type’: ‘journal’, ‘journal’: ‘eLife’, ‘volume’: ‘5’, ‘issue’: None, ‘pages’: None, ‘doi’: ‘10.7554/elife.11305’, ‘openalex_id’: None, ‘pmid’: None, ‘citation_string’: ‘Gillan, C. M., Kosinski, M., Whelan, R., Phelps, E. A., & Daw, N. D. (2016). Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. eLife, 5, e11305.’, ‘url’: ‘https://doi.org/10.7554/elife.11305’, ‘source’: ‘crossref’, ‘confidence’: ‘high’, ‘verified_on’: ‘2026-04-20’}