Positive reinforcement: turning the world into a treat
By Kiki Yablon, KPA CTP, CPDT-KA
Positive reinforcement training is often referred to as “treat training,” or, by those who want to cast aspersions, as “cookie pushing.” This is a drastic misunderstanding of what it means to use positive reinforcement.
Let’s revisit what those two words mean.
It’ll be basic to some of you, but new to others.
Almost all behavior is driven by outcomes, or consequences. Animals behave to change their environment. They get feedback from the environment about whether the behavior worked, and change their behavior accordingly.
Reinforcement is any consequence that increases the future probability of a behavior.
Punishment, which we’re not going to spend a lot of time on here, is any consequence that decreases the future probability of a behavior.
Both of them come in two flavors: positive and negative. These terms carry no value judgments in behavior science. They refer to adding or removing a stimulus.
Positive reinforcement is the addition of a stimulus that increases or maintains the behavior in the future. In negative reinforcement, a stimulus is removed, and that removal increases or maintains the behavior. (Negative reinforcement is often paired with positive punishment, the addition of a stimulus to decrease behavior, so that there is something the dog is motivated to work to remove.)
These processes are at work constantly as we move through life, whether we’re aware of them or not. If I’ve grown uncomfortable sitting in one position for a long time, and I recross my legs, that behavior is negatively reinforced by relief from discomfort. I’ll do it again next time I’m uncomfortable in the same way. If I’m hungry and I open the refrigerator, that behavior is positively reinforced by the sight of food.
When we are training, we are just rigging these natural processes to produce more of the behavior we want. To increase sitting using positive reinforcement, for instance, you need to ensure that the dog gets something he wants immediately after the sit. In basic training, this is often a piece of food.
I use a ton of food, and often real food, in training. Why food? First, food is a primary reinforcer—meaning animals don’t have to learn to work for it. They’re born to do it. Second, food is easy to divide into little pieces and deliver quickly and with good timing, so that you can arrange for lots of quick repetitions of the message that this particular behavior “works.” Repetition builds habits.
But while all animals are motivated by food under some conditions, not all food is equally reinforcing, and food that is reinforcing in one scenario might not be in another.
In a recent session with Sylvie, a shepherd mix, the goal was to teach her to make eye contact with her owner when the front door was opened—a simple behavior incompatible with dashing out the door.
This went pretty well when we worked on opening the interior door. We started by touching the door handle and waiting for eye contact. When we got it, we marked that behavior with a click, then delivered a treat: a piece of steak, a notch or two up the “real food” ladder from what Sylvie earned for behaviors she was already good at. The steak was delivered behind Sylvie, to condition her to anticipate the good stuff a few steps away from the door.
Sylvie began to offer eye contact as soon as her owner touched the handle, so we advanced to opening the interior door slightly. No problem. Then we moved on to opening the interior door all the way. Sylvie held eye contact, got her click, ate her steak.
Then we started over with the same process on the storm door. Sylvie’s owner touched the handle, waited for eye contact, got it, clicked, and dropped a piece of steak behind her.
Sylvie left the steak on the floor. STEAK. I don’t even think she looked back at it.
This is the point at which some people would say, “I tried positive reinforcement, and it didn’t work.”
But remember: reinforcement is a consequence that increases a behavior. By definition, reinforcement works. If the consequence you provide doesn’t work, it may be a “treat,” but it isn’t a reinforcer to that dog at that time.
Fortunately, when dogs are “disobedient,” they are often pointing, plain as the long, furry nose on their face, to what they find most reinforcing. Clearly for Sylvie, it was getting through that door.
So the owner put her hand on the handle, Sylvie offered eye contact, and the owner marked with her release cue and opened the door. No, Sylvie didn’t get to run free through the neighborhood, but she did get to step out onto the porch on leash, and that was reinforcing. How do we know? Well, we brought her back inside and repeated the sequence several times. Each time, she offered eye contact more and more quickly. As we raised our criteria incrementally, she would maintain eye contact until released even with the door cracked open.
The behavioral principle we were leveraging is called the Premack principle, which states that a higher-probability behavior can reinforce a lower-probability behavior. For Sylvie, going where she wanted to go, when she wanted to go there, was a higher-probability behavior than checking in with her owner.
When you put Premack to work, using a behavior that the dog really wants to do to reinforce a behavior she is not as inclined to perform, a funny thing can happen: the two behaviors can start to flip in value. That’s because reinforcement is reinforcement is reinforcement, and each time a behavior is reinforced, it becomes more probable.
In another recent session, Stella, a border collie mix who would tear out the front door of her apartment to go look out the glass front door of the apartment building, learned the same behavior: offer eye contact, get released to go out the door. By the end of her session, she was giving eye contact for several seconds, going into the stairwell halfheartedly, and then heading back inside to offer more eye contact.
Even behaviors trained with food for convenience can later be reinforced with what the dog wants in a given moment. With both Sylvie and Stella, we trained the eye contact and release cues initially with clicks and treats.
There’s an idea out there that you should eventually be able to fade out reinforcement for behaviors that have been “installed,” and that’s wrong. Behaviors that are no longer reinforced go through another process, called extinction, in which they weaken and eventually stop. Plus, why purposely refrain from thanking your dog if you bothered to ask them to do something? (Try that one on your spouse or kids and see if it increases their “respect” for you.)
But you can, when necessary, fade out treats in favor of other types of positive reinforcement—if you become a student of what else your dog truly finds reinforcing. The mistake most people make is replacing food with head pats or praise, which are weak sauce for a lot of dogs.
So the next time your dog is doing what she wants instead of what you want, take notes. What’s maintaining that behavior? Is there some way you can give that to her? If not, can you give her something like it, that scratches the same itch? It may turn out to be the best “treat” you could ever offer.