The Random Reinforcement Problem

In the rational, sane world, correct actions are met with rewards, and doing the wrong thing results in punishment. This is simple cause and effect, but unfortunately, this is not the way the market works. Imagine a completely crazy teacher in a classroom, who without any rhyme or reason randomly screams at some students, ignores some, rewards a few, and punishes others.

A student could hand in a perfect paper and get a failing grade, sometimes more than once, while a student who puts a big “X” in the middle of a single sheet of paper could receive a perfect score for what was supposed to be a 25-page essay.

It is not that the teacher is actively punishing the good students; there is no pattern at all to the teacher’s actions. Can you imagine trying to learn in such an environment? This is a problem for traders, because the market is like this teacher; it often rewards incorrect behaviors and punishes perfectly correct actions.

You can do exactly the right thing on a trade and lose money several times in a row, or you can make a serious mistake and make a lot of money. The statistical edges in our trading setups become valid only over a large sample size; on any one trial, anything can happen. Especially for developing traders, this random reinforcement, coupled with the extreme emotional charge of both winning and losing, creates one of the most challenging learning environments imaginable.

Random reinforcement is a profoundly powerful tool for behavior modification, and is frequently used to train animals. If you train dogs and reward them every time they obey, their good behavior will probably stop as soon as the rewards stop.

On the other hand, if you randomly reward their obedience by sometimes giving a treat and sometimes not, the modifications to their behavior will usually be permanent. (Again, do you see any parallels with slot machines?) It may be counterintuitive, but random reinforcement is actually a much more powerful tool to shape behavior than consistent reinforcement.

There is so much random noise in the market that even excellent trading systems have a large random component in their results. Over a small set of trades, random reinforcement of both good and bad behavior is normal for our interactions with the market. Excellent decisions are just about as likely to be met with good results as bad results, and poor decisions will also result in a number of winning trades.

Traders trying to be responsive to the feedback of the market and trying to learn from their interactions with the market are likely to be confused, frustrated, and eventually bewildered. The market’s reinforcement is not truly random; over a large number of trades, results do tend to trend toward the expected value, but it certainly can seem random to the struggling trader.
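
To put rough numbers on how results drift toward the expected value, here is a minimal sketch. It assumes a hypothetical system that wins 55% of the time, with independent trades that each win or lose the same fixed amount. None of these figures come from the text; the function name prob_net_loss is purely illustrative.

```python
from math import comb


def prob_net_loss(win_rate: float, n_trades: int) -> float:
    """Probability that a run of n_trades ends with a net loss.

    Assumption (hypothetical): trades are independent and each one wins
    or loses the same fixed amount, so a net loss simply means fewer
    wins than losses. This is an exact binomial sum, not a simulation.
    """
    total = 0.0
    for wins in range(n_trades + 1):
        if 2 * wins < n_trades:  # strictly fewer wins than losses
            total += (
                comb(n_trades, wins)
                * win_rate**wins
                * (1 - win_rate) ** (n_trades - wins)
            )
    return total


# Compare small and large samples for the assumed 55% win rate.
for n in (5, 30, 50, 500):
    print(f"{n:>4} trades: P(net loss) = {prob_net_loss(0.55, n):.3f}")
```

Under these assumptions, the system with a genuine edge still finishes a five-trade sample in the red about 41% of the time, and that probability shrinks steadily as the sample grows. Real trades have unequal wins and losses and may not be independent, so treat the output as an illustration of the principle rather than a forecast.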

The solution should not surprise you by now: evaluate your trading results over a large sample size, and use statistics to separate reality from your emotional perceptions. Learn from 20 or 30 trades, not one. Make decisions about changing your trading rules based on the results from 50 trades, not five. The market is a capricious teacher.
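
One rough way to apply this advice is to compare the average result per trade with its standard error, which shrinks as the number of trades grows. The sketch below is an assumption-laden illustration: the helper names, the sample data, and the two-standard-error rule of thumb are not from the text, and the check treats trades as independent.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_trades(results):
    """Return (average result, standard error) for a list of per-trade results."""
    n = len(results)
    if n < 2:
        raise ValueError("need at least two trades to estimate variability")
    avg = mean(results)
    std_err = stdev(results) / sqrt(n)  # uncertainty of the average
    return avg, std_err


def edge_is_distinguishable(results, z=2.0):
    """Rough heuristic: is the average more than z standard errors from zero?

    The threshold z=2.0 is an assumed rule of thumb, not a formal test.
    """
    avg, std_err = summarize_trades(results)
    return abs(avg) > z * std_err


# Hypothetical results in units of risk: three winners, two losers.
pattern = [1, -1, 1, 1, -1]
few_trades = pattern            # 5 trades
many_trades = pattern * 40      # 200 trades with the same win/loss mix

print(edge_is_distinguishable(few_trades))
print(edge_is_distinguishable(many_trades))
```

Both samples have the same average result per trade, but only the larger one pins that average down tightly enough to tell it apart from zero. That is the statistical reason not to rewrite trading rules after a handful of trades.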
