PSYC 1100 Section 2.5 Instrumental Conditioning - Associative Learning

2013-11-10

Sec 2.5
  1. Types of conditioning
    • Classical or Pavlovian conditioning
    • - associate a CS (conditioned stimulus) with a US (unconditioned stimulus)
    • early figures: Pavlov, Watson
    • - Watson: experience affects behavior
    • Instrumental or Operant conditioning
    • - associate responses with various types of stimuli
    • - i.e. learning and performance of responses is affected by the rewarding or punishing consequences of those responses
    • early figures: Thorndike, Skinner
    • - ex: parent punishing and rewarding children; training animals
    • - reward increases behavior, punishment decreases behavior
  2. Thorndike's Law of Effect
    • Thorndike: cat in the puzzle box; learning curves; also an influential figure in education
    • Law of Effect:
    • If Stimulus 1 (S1) => Response (R) => Stimulus 2 (S2; satisfier, e.g. food)
    • Then S1 => R increased (frequency or speed)
    • If S1 => R => S2 (annoyer, e.g. stress)
    • Then S1 => R decreased (frequency or speed)
  3. Skinner's Empirical Law of Effect
    • - operant chamber (Skinner box)
    • - Skinner thought Thorndike's wording needed to be changed

    • If R => S, and R probability, frequency, or speed has increased
    • then reinforcement has occurred; S is called "reinforcer"

    • If R => S, and R probability, frequency, or speed has decreased
    • then punishment has occurred; S is called "punisher"
  4. Four Basic Procedures of Instrumental Conditioning
    • Positive reinforcement - easily understood
    • - a response followed by presentation of a reinforcer increases in probability
    • Negative reinforcement
    • - taking away something aversive to increase behavior
    • - to reinforce means “to strengthen”
    • - response probability is increased
    • - ex: warnings, alarms, water kettle
    • Punishment - easily understood
    • - a response followed by presentation of a punisher decreases in probability
    • Omission (time out)
    • - a response is followed by removal of a positive reinforcer
    • - these procedures decrease responding
    • - omission can separate organisms from stimuli, or separate stimuli from organisms
    • - ex: taking away the ice-cream cone (separating stimulus from organism), sitting in the time-out zone (separating organism from stimulus)
    • Establishing technology to modify behavior

    • Important points:
    • - negative reinforcement and punishment are not the same thing
    • - punishment can have negative consequences; can increase aggression (usually from unintended reinforcement)
    • - positive reinforcement is effective for directing behavior to a particular endpoint
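The four procedures above differ on two questions: is the stimulus presented or removed after the response, and does responding go up or down? A minimal sketch of that 2 x 2 lookup follows. The function name and the string labels are hypothetical, chosen only for illustration.

```python
def classify_procedure(stimulus_change: str, response_change: str) -> str:
    """Name the procedure from what happens to the stimulus and to the response.

    stimulus_change: "presented" or "removed" (what follows the response)
    response_change: "increases" or "decreases" (effect on response probability)
    """
    table = {
        ("presented", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("presented", "decreases"): "punishment",
        ("removed", "decreases"): "omission",
    }
    return table[(stimulus_change, response_change)]


# A kettle's whistle stops when you switch it off, so switching it off becomes more likely.
print(classify_procedure("removed", "increases"))   # negative reinforcement
# The ice-cream cone is taken away, so the misbehavior becomes less likely.
print(classify_procedure("removed", "decreases"))   # omission
```

The lookup makes the first "important point" concrete: negative reinforcement and punishment sit in different cells because they have opposite effects on responding.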
  5. Schedules of Positive Reinforcement
    • A schedule of reinforcement is the relation between the response requirement and the delivery of the reinforcement
    • Continuous reinforcement: each response is reinforced
    • Intermittent reinforcement: not every response is reinforced; the relation between responding and reinforcement is more complex (2 classes: ratio and interval)
  6. Classes of Intermittent Schedules
    • Ratio - a specified number of responses must be emitted; when the ratio is completed, the reinforcer is delivered
    • Interval - a specified time interval must elapse; the first response after the interval is reinforced
    • There are both fixed (the ratio/time requirement stays the same) and variable (the ratio/time requirement changes) versions of interval and ratio schedules
  7. Four Basic Classes of Reinforcement Schedules
    • Two types of ratio schedules:
    • Fixed Ratio (FR): reinforcement occurs after a set number of responses are made. The number of responses required does not change
    • Variable Ratio (VR): reinforcement occurs after a number of responses that changes from one reinforcer to the next. The number needed to complete each ratio varies around an average value
    • - the organism usually doesn't know the number of responses needed for VR
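The fixed/variable and ratio/interval rules can be sketched as four small rule-makers. Each returns a function that is called once per response with the current time and answers whether that response is reinforced. This is an illustrative sketch, not a standard implementation: the function names are hypothetical, and the uniform distributions used for the variable schedules are an assumption (the notes only say the requirement varies around an average).

```python
import random


def make_fixed_ratio(n):
    """FR n: every n-th response is reinforced."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def make_variable_ratio(mean_n, rng):
    """VR mean_n: the requirement is redrawn after each reinforcer."""
    def draw():
        # Assumption: uniform over 1..(2*mean_n - 1), which averages mean_n.
        return rng.randint(1, 2 * mean_n - 1)

    count, required = 0, draw()

    def respond(t):
        nonlocal count, required
        count += 1
        if count >= required:
            count, required = 0, draw()
            return True
        return False

    return respond


def make_fixed_interval(seconds):
    """FI: the first response at least `seconds` after the last reinforcer is reinforced."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return respond


def make_variable_interval(mean_seconds, rng):
    """VI: like FI, but the required wait is redrawn after each reinforcer."""
    def draw():
        # Assumption: uniform over 0..2*mean_seconds, which averages mean_seconds.
        return rng.uniform(0.0, 2.0 * mean_seconds)

    last, wait = 0.0, draw()

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last, wait = t, draw()
            return True
        return False

    return respond


# One response per second, starting at t = 1.
fr3 = make_fixed_ratio(3)
print([fr3(t) for t in range(1, 7)])        # [False, False, True, False, False, True]
fi5 = make_fixed_interval(5.0)
print([t for t in range(1, 13) if fi5(t)])  # [5, 10]
```

On the ratio rules the time argument is ignored, and on the interval rules the response count is ignored. That is the whole difference between the two classes.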
  8. What is the Relation between schedule and...
    • response pattern?
    • response rate?
    • VI schedule: generates a steady, stable, consistent rate of responding with few pauses; there are no periods where the organism is not responding
    • - e.g. pop quizzes
    • FI schedule: animals take a post-reinforcement pause, which is longer with larger intervals; periods of responding alternate with periods of not responding. The organism learns that it won't be reinforced for a while after each reinforcer
    • - e.g. scheduled exams

    • Which schedules generate the highest response rates, and why?
    • - cumulative records (number of responses recorded over time)
    • Ratio schedules generate the highest response rates. Up to a point, as the ratio requirement gets higher, the animal responds faster. If the animal responds faster, it gets MORE reinforcement. The relation between response rate and reinforcement puts upward pressure on response rates
    • - the organism will learn to respond faster
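The "upward pressure" on ratio schedules can be seen with simple arithmetic: on a ratio schedule, doubling the response rate doubles the reinforcement rate, while on an interval schedule the reinforcement rate is capped by the clock. The sketch below is a rough idealization under stated assumptions: steady responding, a fixed requirement, and hypothetical function names.

```python
def reinforcers_per_minute_ratio(responses_per_minute, ratio):
    """Ratio schedule: one reinforcer per `ratio` responses, with no upper limit."""
    return responses_per_minute / ratio


def reinforcers_per_minute_interval(responses_per_minute, interval_seconds):
    """Interval schedule: at most one reinforcer per interval (idealized upper bound)."""
    return min(60.0 / interval_seconds, responses_per_minute)


# Doubling response rate on FR 10 doubles the payoff...
print(reinforcers_per_minute_ratio(30, 10), reinforcers_per_minute_ratio(60, 10))        # 3.0 6.0
# ...but on FI 20 s the payoff stays at three per minute.
print(reinforcers_per_minute_interval(30, 20), reinforcers_per_minute_interval(60, 20))  # 3.0 3.0
```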
  9. Ratio Strain
    • If the ratio is too high, the organism stops working for the reinforcement (the organism makes a cost-benefit decision)

    • Which pay schedules are similar to ratio schedules?
    • - piecework: people are paid by how many units they produce (factories)
    • - commission: people are paid by the number of units they sell (e.g. selling $1000 of goods earns you $100)

    • Progressive Ratio Schedules:
    • A progressive ratio schedule can be used to determine how much work an organism will do for reinforcement. With this schedule, the ratio value keeps getting higher and higher, until the animal stops responding (it's not worth it anymore)
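A minimal sketch of the progressive ratio idea: the requirement rises after each reinforcer until it exceeds what the organism is willing to do. The last ratio completed is often called the breakpoint. The starting ratio, the step size, and the single "willingness" number are hypothetical simplifications for illustration only.

```python
def find_breakpoint(max_responses_per_reinforcer, start=1, step=2):
    """Return the last ratio completed before the organism stops responding.

    max_responses_per_reinforcer: hypothetical limit on how many responses
    the organism will make for one reinforcer.
    """
    ratio = start
    last_completed = 0  # 0 means no ratio was ever completed
    while ratio <= max_responses_per_reinforcer:
        last_completed = ratio
        ratio += step  # the requirement keeps getting higher
    return last_completed


# Ratios 1, 3, 5, 7, 9 are completed; the organism quits when asked for 11.
print(find_breakpoint(10))  # 9
```

A higher breakpoint for one reinforcer than for another suggests the organism will work harder for it.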
  10. Behavioral Economics
    • Views operant behavior as operating according to economic principles
    • Reinforcers are "goods" or "commodities"
    • Responses are the "costs" that are paid
    • Ratio strain can be explained in economic terms
    • Aspects of demand can be studied in operant conditioning experiments
    • Demand curves, elasticity of demand, and cost-benefit analyses are used in psychology as well as economics
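As a sketch of how elasticity of demand might be computed from operant data, the example below treats the ratio requirement as the "price" and the number of reinforcers earned as the "quantity consumed". That mapping, the function name, and all of the numbers are assumptions made for illustration; they are not data from the lecture. The formula is the standard midpoint (arc) elasticity.

```python
def arc_elasticity(price_1, quantity_1, price_2, quantity_2):
    """Midpoint elasticity: percent change in quantity / percent change in price."""
    pct_quantity = (quantity_2 - quantity_1) / ((quantity_1 + quantity_2) / 2)
    pct_price = (price_2 - price_1) / ((price_1 + price_2) / 2)
    return pct_quantity / pct_price


# Hypothetical: raising the ratio from FR 10 to FR 20 barely reduces food earned.
food = arc_elasticity(10, 100, 20, 90)
# Hypothetical: the same price increase sharply reduces a less essential reinforcer.
treat = arc_elasticity(10, 100, 20, 40)

print(abs(food) < 1)   # True: inelastic demand
print(abs(treat) > 1)  # True: elastic demand
```

In these terms, ratio strain corresponds to the point where the price is so high that consumption drops to zero.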
  11. Instrumental Conditioning Phenomena
    • Extinction - response occurs but is no longer reinforced; responding declines over time
    • - classical-conditioning parallel: the CS is presented without the US
    • Discrimination - stimulus conditions signal which reinforcing or punishing contingencies are in place; organisms behave differently depending upon the condition (also known as "stimulus control" over behavior)
    • - rat responding to loud and soft buzzes differently

    • The signal is known as a DISCRIMINATIVE STIMULUS (e.g. the loud and soft buzzers)
    • - red (punishment) and green (no punishment) lights
    • - results in different behavior under different circumstances
  12. Types of Positive Reinforcers
    • Primary reinforcer: these stimuli are reinforcing in and of themselves (e.g. food, water, sex, sensory stimulation, social status)
    • Secondary reinforcers: these stimuli are reinforcing because they are paired with a specific primary reinforcer (e.g. red light paired with food can become a reinforcer)
    • - in many cases they are more convenient to deliver than primary reinforcers ("good boy" + doggy biscuit)
    • Generalized reinforcers: these stimuli are paired with, or signal access to multiple primary reinforcers (approval from your parents/boss, money, tokens or points that allow access to prizes or other reinforcers)
    • - no intrinsic reinforcement
    • - more efficient and convenient to administer
    • - society
  13. Similarities/Differences between Operant and Classical Conditioning
    • Both depend upon order
    • - CS => US (classical conditioning)
    • - Response => Reinforcer (operant conditioning)
    • Both depend upon temporal contiguity
    • Both can occur at the same time (e.g. an animal in an operant box can be salivating in anticipation of food delivery)
    • Classical and operant conditioning can interact; presentation of a classical CS associated with food can increase food-reinforced operant responding

    • However
    • - classical conditioning: stimuli are presented independently of behavior
    • - operant conditioning: delivery of a reinforcer or punisher depends upon behavior
    • Operant conditioning establishes an organism/environmental system
    • - behavior affects environment (pressing the lever to get food)
    • - environment affects behavior (work environment)