Why most investors are not conditioned to do well in value investing

In the study of psychology there is a phenomenon called Operant Conditioning.

Operant conditioning shapes behaviour, and the underlying principle is that behaviour that is positively reinforced tends to be repeated, while behaviour that is not reinforced tends to be extinguished.

Think of a child crying for attention. He wants to be picked up and cuddled. When the mother picks up the child, she is actually reinforcing the child's behaviour. The action (crying) leads to the desired outcome (being picked up). If a child is picked up every time he cries, he is being constantly reinforced and in no time will associate the desired outcome with the required action.

Operant conditioning was studied extensively by the psychologist B.F. Skinner. He experimented with rats in an apparatus called a Skinner Box. A hungry rat would be placed in a box with a lever by the side. As the rat moved about the box it would accidentally knock into the lever. When that happened, a food pellet would drop into a container next to the lever.

In no time the rats became conditioned and would go straight to the lever the moment they were placed in the box. Once that was established, Skinner explored schedules of reinforcement to see which were most effective. He studied four different schedules.

Schedules of Reinforcement 

Under a Fixed Ratio reinforcement schedule, the rat is given a pellet after a specific number of presses on the lever. A rat exposed to a three-press schedule would soon learn to make three quick presses of the lever in a row to obtain its food. The fixed ratio schedule is predictable, which makes it an effective way to condition a new behaviour.

A salesperson who is paid according to the number of products he sells is on a fixed ratio schedule. For every product sold or every quota met, a reward is promised. This conditions him to sell more. Another real world example is loyalty cards. Buy five cups of bubble tea and you get a free drink. Vendors are reinforcing our purchasing behaviour in order to sell more drinks.
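
For readers who think in code, the fixed ratio rule can be written as a few lines of Python. This is only an illustrative sketch: the function name and the `ratio` parameter are made up for this example and do not come from Skinner's work.

```python
def fixed_ratio_rewards(presses, ratio=3):
    """Return the press numbers (counting from 1) at which a reward is given.

    Hypothetical helper: a reward follows every `ratio`-th press.
    """
    return [n for n in range(1, presses + 1) if n % ratio == 0]


# Three-press schedule: a pellet after the 3rd, 6th and 9th press.
print(fixed_ratio_rewards(10, ratio=3))  # [3, 6, 9]
```

The same rule with `ratio=5` describes the loyalty card: every fifth purchase earns the free drink.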


A Variable Ratio schedule calls for reinforcement to be applied after a random number of responses. Instead of being fixed (e.g. a pellet after every two presses of the lever), the reinforcement might come after one press, then after four presses, then after three presses.

A classic everyday example is the jackpot machine. Our behaviour (feeding hard-earned money into the machine and pulling the handle) is reinforced by monetary payouts. Sometimes we need to pull the handle many times before getting a payout. If we are lucky, we get a payout after one or two pulls.
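
The unpredictability of a variable ratio schedule can be sketched in Python as follows. The assumptions are mine and purely illustrative: the gap between rewards is drawn uniformly from 1 to `2 * mean_ratio - 1` presses, so it averages `mean_ratio`. Real jackpot machines do not necessarily work this way.

```python
import random


def variable_ratio_rewards(presses, mean_ratio=3, seed=None):
    """Return the press numbers at which a reward is given.

    Hypothetical model: the number of presses between rewards is random,
    drawn uniformly from 1 .. 2 * mean_ratio - 1.
    """
    rng = random.Random(seed)
    max_gap = 2 * mean_ratio - 1
    rewards = []
    next_reward = rng.randint(1, max_gap)
    for n in range(1, presses + 1):
        if n == next_reward:
            rewards.append(n)
            # Decide how many more presses the next reward will take.
            next_reward = n + rng.randint(1, max_gap)
    return rewards


print(variable_ratio_rewards(20, mean_ratio=3, seed=1))
```

From the inside, the rat (or the gambler) cannot tell whether the next press will pay out, only that pressing more often eventually does.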

Under a Fixed Interval schedule, reinforcement is time-dependent. The pellet drops at a fixed time interval, regardless of the number of presses on the lever. As a result, the rats' lever-pressing behaviour tends to become weaker over time.
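
As a rough Python sketch of the fixed interval rule as described here, the reward times depend only on the clock. The function and parameter names are hypothetical, and times are in arbitrary units such as seconds.

```python
def fixed_interval_rewards(press_times, interval=60, duration=300):
    """Return the times at which a pellet drops during the session.

    Hypothetical helper: `press_times` is accepted but deliberately
    ignored, because under this rule the presses do not affect the reward.
    """
    drops = int(duration // interval)
    return [k * interval for k in range(1, drops + 1)]


# A busy rat and an idle rat receive exactly the same pellets.
print(fixed_interval_rewards([5, 12, 30, 31, 90]))  # [60, 120, 180, 240, 300]
print(fixed_interval_rewards([]))                   # [60, 120, 180, 240, 300]
```

Because the output is the same whatever the rat does, nothing in the schedule rewards extra pressing.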

