Positive reinforcement, using food rewards to increase the likelihood a dog will repeat a desirable behavior, is widely regarded as the most reliable method for teaching commands. While the basic concepts of reward-based training are easy to understand, people sometimes inadvertently inhibit progress by using too many, or too few, treats.
Let’s say you’re in Las Vegas playing a slot machine, but every time you deposit a quarter and pull the arm, you get your one quarter in return. This wouldn’t keep your attention for long, and you’d probably opt for a different machine.
Now, what if you started feeding your hard-earned quarters into the next machine, but for hours on end got none back? Chances are you’d become equally frustrated and end your short gambling career.
Applied to dog training, both of these extremes—continuous reinforcement or none at all—can lead to lower command compliance.
“My dog will only sit if I have a treat.” Over the years, I have heard this refrain many times, and it almost always indicates that the dog was rewarded with treats for sitting on cue too often and for too long. Essentially, the dog has learned that two things must be present before he complies: the sit cue and a treat. If either is missing, he’ll find something more interesting to do.
When initially teaching a new command, “Continuous Reinforcement” (CR in the geeky learning-theory world) is the most effective approach. For instance, when first teaching a puppy to sit, rewarding each successful completion (or “trial”) makes sense because your focus is on clearly pairing the verbal cue and hand gesture with the behavior: put the quarter in (your puppy sits on cue) and the reward appears (treat!).
But acting as your puppy’s loose slot machine for too long causes him to stop working so hard. Why bother sitting quickly, or at all, when a treat invariably appears? Staying on CR for too long also causes the dog to become dependent on the food reward: he will refuse to work unless food is presented. Before you get to that point (usually within a few days of teaching a new cue), it’s time to move to a less predictable reinforcement schedule.
Back to the gambling analogy. Once you’re sure your dog has a grasp on what you’re teaching him, it’s time to become a fair and honest slot machine, dispensing small food rewards less frequently for successful trials. (This is also a good time to find soft treats that won’t easily crumble to bits, and to always have a few hidden in your pocket.)
The psychology behind slots, which entices folks to pump coins into machines for hours on end, is that the probability of winning remains constant, even though the number of plays it takes to recoup your money, or better yet, hit the jackpot, changes. The unpredictability makes doing the same mundane activity, over and over, interesting and exciting. You can take advantage of this same psychology to train your dog faster.
When teaching your dog a new command, once you’ve determined that he knows what you’re expecting from him, begin randomly rewarding successful trials using “Variable Ratio” (VR) reinforcement. Start with a low ratio, rewarding roughly one out of every three trials, then gradually raise the number of trials per reward over the course of several training sessions.
For example, when teaching your puppy to sit, provide a small treat for (successful) trials 2, 7, 9, 15, 18, 19, 20, 23 and 25. Notice that during 25 trials, sometimes he gets three rewards in a row, but sometimes, there’s a longer lag between treats. The idea is to keep him guessing—and working!
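For readers who like to see the arithmetic, the pattern above can be sketched in a few lines of Python. This is only an illustration: the function name `variable_ratio_schedule` and its parameters are made up for this example, and drawing at random on each trial is just one simple way to approximate a Variable Ratio schedule, not a prescribed training protocol.

```python
import random


def variable_ratio_schedule(num_trials, mean_ratio, seed=None):
    """Return the 1-based trial numbers that earn a treat.

    Hypothetical helper for illustration. Each successful trial is
    rewarded with probability 1/mean_ratio, so treats average about one
    per `mean_ratio` trials but arrive unpredictably.
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    rng = random.Random(seed)
    return [
        trial
        for trial in range(1, num_trials + 1)
        if rng.random() < 1 / mean_ratio
    ]


# The example from the text: 9 treats across 25 trials.
article_example = [2, 7, 9, 15, 18, 19, 20, 23, 25]
print(len(article_example) / 25)  # 0.36, a little better than one in three

# A fresh, unpredictable plan for the next 25-trial session.
print(variable_ratio_schedule(25, mean_ratio=3))
```

Calling the function with a larger `mean_ratio` produces sparser treats, while `mean_ratio=1` rewards every trial, which is the same as Continuous Reinforcement.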
Over the course of twice-daily training sessions (two to five minutes each), increase the ratio until he is rewarded for roughly one out of every ten successful trials. The behavior should become a happy habit by then, although, to keep commands fresh, continue to occasionally reward your dog for life. In other words, don’t become the slot machine that never pays a jackpot!
There are other types of reinforcement schedules too involved for our purposes here, but one to take advantage of is “Differential Reinforcement of Excellent Behavior” or DRE. This is just a fancy way of saying “better performance earns bigger rewards.” Once you’ve worked through Continuous Reinforcement (treating every time to teach the command) and Variable Ratio (treating randomly to hone the behavior), you can polish the command by handsomely rewarding only the best trials.
Let’s think about DRE in terms of teaching recalls. Once your dog is largely responding to your “come” command, and you’ve worked through Variable Ratio reinforcement by sometimes treating and sometimes not, start rewarding with higher-value treats, or more of what you have, only when your dog immediately and enthusiastically answers your call. If he stops and smells the roses (or whatever that was) en route, no reward is given.
Advancing through these levels is not rigid, and you may combine aspects of more than one as you progress. Be ready to back up a step if you’ve moved too fast—your dog will let you know!