Streaky The Supercat

Personally, I think Billy Martin said it best when he said, “Hey! I can drive!”

Of course, this is not my opinion, but that of the guy who dreamt of being the starting center fielder for the Boston Red Sox. Me? I think Joe Maddon said it best when he started rookie John Jaso over Dioner Navarro last week: “It’s not very complicated. Jaso, he’s hot.”

I will leave it to other people to battle it out over whether the word “hot” is properly used when describing Eva Mendes or above-average results in baseball over a period of time. I will use the word and show what I came up with in my first crack at Retrosheet play-by-play data.

I realize (well, sort of, since I never read it) that Tango covers a similar topic in The Book. Apart from having a much cooler name, I am sure his analysis is done in a more scientific way and is more valuable. But, hey, I live on AN, he doesn’t, so this is what you get.

The difference between high-end and low-end performance in a year

Once we get the socially acceptable nomenclature out of the way, I think most everyone will agree that hot and cold streaks are as much a part of baseball as the seventh-inning stretch and the Pirates sucking at it. Over the course of a season every player will go through phases of great and not-so-great production. Every player?

Hell yeah. In 2009 Albert Pujols (1.108) just barely missed finishing the year with an OPS two times that of Willy Taveras (.564). Yet, if samples of their production were taken at the end of July and the beginning of May, respectively, hardly anyone could tell. Over those selected periods, 30 plate appearances each, Willy OPS-ed a cool 1.364, while Prince Albert managed only five measly singles for a very uncool OPS of .414.

Of course 30 PA a batter don’t make. No one is arguing that Taveras is better than Pujols – only that over a short period of time, also known as a streak, every player can seem great or shitty, and every player will have such periods.

I chose 30 PA, or roughly a week of production, because such periods seem to be mentioned most when considering someone hot or cold. Just how hot or how cold can Major League hitters get over 30 plate appearances? Ask the above-mentioned Willy or Luke Scott.


Ten hottest and coldest streaks of 2009. Orlando Cabrera just missed the cut with his .143 performance in April


Overall, an average Major Leaguer hit .096/.160/.118 (.278 OPS) when he went coldest and .464/.538/.870 (1.408 OPS) when hottest.
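For anyone who wants to play along at home, here is a minimal sketch of how such extremes could be pulled out of one batter's season. It is not necessarily how the numbers above were produced: it assumes rolling (overlapping) 30 PA windows, it uses made-up outcome codes ("1B", "HR", "BB", "OUT" and so on) rather than Retrosheet's own event format, and it ignores sacrifice bunts. The names `ops` and `coldest_and_hottest` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical outcome codes, one per plate appearance (not Retrosheet's format).
TOTAL_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}
REACHED_NO_AB = ("BB", "HBP")     # batter reaches base, no at-bat charged
NO_AT_BAT = ("BB", "HBP", "SF")   # plate appearances that are not at-bats


def ops(pas: List[str]) -> float:
    """OPS = OBP + SLG for a list of plate-appearance outcome codes."""
    if not pas:
        return 0.0
    at_bats = sum(1 for pa in pas if pa not in NO_AT_BAT)
    hits = sum(1 for pa in pas if pa in TOTAL_BASES)
    times_on_base = hits + sum(1 for pa in pas if pa in REACHED_NO_AB)
    total_bases = sum(TOTAL_BASES.get(pa, 0) for pa in pas)
    # With these codes, AB + BB + HBP + SF equals the number of PA.
    obp = times_on_base / len(pas)
    slg = total_bases / at_bats if at_bats else 0.0
    return obp + slg


def coldest_and_hottest(pas: List[str], size: int = 30) -> Tuple[float, float]:
    """Lowest and highest OPS over every rolling window of `size` PA."""
    if len(pas) < size:
        raise ValueError("need at least `size` plate appearances")
    window_ops = [ops(pas[i:i + size]) for i in range(len(pas) - size + 1)]
    return min(window_ops), max(window_ops)
```

Feed it a player's plate appearances in chronological order and it hands back his worst and best 30 PA stretch; averaging those pairs over all qualifying hitters would give a league line like the one above.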


Now, that is obviously a huge difference. We are talking about the same players here, remember. So MLB is full of Dr. Jekyll and Mr. Hyde types. Obviously, such extremes are unsustainable and cannot be a good representation of a player’s skills. Nobody can OPS 1.400 over a year (I know one player did it, yet I repeat – nobody can OPS 1.400 over a year), and still we have an average player doing it over his best 30 PA.

Also, nobody can amass 300 PA (the minimum I chose for this exercise) in a year while OPS-ing sub-.300, unless he is a DH for… ah, never mind.


The players are robots, or are they?

People often say that players bring their constant skill-set and that luck, or better said chance, determines the outcome within the skill-set boundaries. A good bounce here and there. Teeing off on a mop-up pitcher. A jet stream turning a lazy fly into a grand slam. For many, a batter is a coin, the same coin that can get flipped this way or that way, but one that has only a passive part in that equation. For many, batters are just like robots. And, hey, even robots can get lucky.


Some days are just better than others. Robby The Robot in the midst of a hot streak.


Hitting a baseball is a far more complicated process than flipping a coin or dealing out a deck of cards. Even if a player hits the baseball consistently well, the results of good hitting are not necessarily immediate. Even if a skill-set is truly constant, the results will vary. And, frankly, there is hardly a way to directly and objectively measure the skill of hitting a ball without looking at a result, which is influenced by many factors other than the skill itself.

Take track, for example. You want to know if a guy is fast? Let him run, take a stopwatch and measure it. What you have measured is more or less a direct result of his skill. He is either fast or not, and that’s about it. You know what you get.

In baseball you can either use subjective analysis, like scouts do when they watch the process of hitting, or objective analysis of results, which are sometimes very indirect results of the skill. The smarter these analyses are (big data samples, park factors, splits, etc.), the closer you get to isolating the part that is influenced by the batter himself. But even then, you will never get a constant out of it.

Back to our track runner. If you have him run the 100 dash on 600 different occasions over 6 months, his results will vary, even if you have him run it under exactly the same circumstances (indoors, for example). Is it luck that he is faster on some days and slower on others? Hardly. His physical abilities vary over time. Baseball is of course much less physical than track, and one can argue that the bigger part of baseball skill is mental, that it is knowing how to, instead of being able to. But can one even start to argue that there is a metric for a mental skill-set and that it is constant over time?

I guess what I’m trying to say is that just because we cannot objectively measure a specific skill-set on a given day, it doesn’t mean that we can exclude the possibility of it changing. Just because we know that luck plays a huge part in the final results, we have no basis for assuming that the player himself had nothing to do with it. But enough of my opinions, you are here for the numbers.


How streaky are the batters?

I’ve split the 2009 season into a series of 30 PA blocks for each player. For each such streak I then compared the OPS with the season OPS. If a streak was within +/- 30% of his seasonal production, I labeled it as average, the ones below and above as cold and hot.
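In code, the labeling could look roughly like the sketch below. Treat it as an illustration under assumptions rather than the actual script: the outcome codes are made up (not Retrosheet's format), sacrifice bunts are ignored, a leftover block shorter than 30 PA at the end of the year is simply dropped, and the names `ops`, `band` and `classify_blocks` are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical outcome codes, one per plate appearance (not Retrosheet's format).
TOTAL_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}
REACHED_NO_AB = ("BB", "HBP")
NO_AT_BAT = ("BB", "HBP", "SF")


def ops(pas: List[str]) -> float:
    """OPS = OBP + SLG for a list of plate-appearance outcome codes."""
    if not pas:
        return 0.0
    at_bats = sum(1 for pa in pas if pa not in NO_AT_BAT)
    hits = sum(1 for pa in pas if pa in TOTAL_BASES)
    times_on_base = hits + sum(1 for pa in pas if pa in REACHED_NO_AB)
    total_bases = sum(TOTAL_BASES.get(pa, 0) for pa in pas)
    slg = total_bases / at_bats if at_bats else 0.0
    return times_on_base / len(pas) + slg


def band(season_ops: float, tolerance: float = 0.30) -> Tuple[float, float]:
    """Lower and upper OPS bounds of the 'average' range."""
    return season_ops * (1 - tolerance), season_ops * (1 + tolerance)


def classify_blocks(pas: List[str], size: int = 30,
                    tolerance: float = 0.30) -> Counter:
    """Count cold / average / hot blocks among consecutive `size`-PA blocks."""
    low, high = band(ops(pas), tolerance)
    labels: Counter = Counter()
    # Non-overlapping blocks; an incomplete block at the end is dropped.
    for start in range(0, len(pas) - size + 1, size):
        block_ops = ops(pas[start:start + size])
        if block_ops < low:
            labels["cold"] += 1
        elif block_ops > high:
            labels["hot"] += 1
        else:
            labels["average"] += 1
    return labels
```

For a .750 OPS hitter, `band(0.750)` works out to roughly .525 and .975, so only blocks outside that range count as streaks.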

Now, mind you, a difference of +/- 30% is huge. Take a .750 OPS guy and what you are left with is a span of .525 – .975. One would expect batters to maintain their production within that span for the vast, vast majority of the year. Well, I would. And I would be wrong.

For roughly one third of the time batters perform significantly differently from their norm

So, an average hitter in MLB will not resemble himself for two months. My first reaction was to think that it must be the heavy HR hitters skewing the picture, by relying on the slugging part and hitting home runs in bunches. But it’s not really like that. Have a look at the 10 most constant and the 10 most spread-out performances of 2009:

A quick look at the streak distribution data seems to indicate that players with higher OBP are less prone to streaks, although a much more serious analysis would be needed to reach any such conclusion.


The predictive power of a streak

Now, I argued that, just because there is chance involved (and playing a prominent role, too), one should not dismiss the influence that a varying skill-set might have on what comes to be known as a hot or a cold streak. However, if you are not even sure that the skill-set actually does vary, and on top of that have no way to measure in which manner it changes if it does – well, then you shouldn’t really be betting that such a streak will continue.

When the managers say “he’s been swinging the bat well” they more often than not really mean “he’s been getting hits”. And often, they bet it will continue to be so.


“It’s not very complicated, he’s been swinging the hot bat lately”


John Jaso promptly went on a 1 for 10 tear after his manager spoke the words of wisdom. How did the 2009 batters do? How did the league react after hitting at the highest and lowest levels of the season? I checked the immediate production after the hottest and coldest streaks, choosing to evaluate 15 PA – more or less the next series.

Post coitum omne animal triste est. 

After their best impressions of Babe Ruth and Mario Mendoza, respectively, Major Leaguers didn’t cool off or heat up gradually, didn’t even go back to being their average selves. They swung the pendulum to the other side, as their production after the coldest streak of the year was significantly better than after the hottest. However, this data is cooked, as the performance after the hottest/coldest streak was taken into consideration when determining when such a streak occurred in the first place.

So I took another look and analyzed performance after the first hot or cold streak in a season (any streak of 30 PA where the batter under- or over-performed by at least 30%).
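A rough sketch of that second look for a single batter follows. The assumptions are mine: windows are scanned one PA at a time from Opening Day, the season OPS (which includes the follow-up PA) is the yardstick, the outcome codes are made up rather than Retrosheet's, sacrifice bunts are ignored, and `ops` and `after_first_streak` are hypothetical names.

```python
from typing import List, Optional, Tuple

# Hypothetical outcome codes, one per plate appearance (not Retrosheet's format).
TOTAL_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}
REACHED_NO_AB = ("BB", "HBP")
NO_AT_BAT = ("BB", "HBP", "SF")


def ops(pas: List[str]) -> float:
    """OPS = OBP + SLG for a list of plate-appearance outcome codes."""
    if not pas:
        return 0.0
    at_bats = sum(1 for pa in pas if pa not in NO_AT_BAT)
    hits = sum(1 for pa in pas if pa in TOTAL_BASES)
    times_on_base = hits + sum(1 for pa in pas if pa in REACHED_NO_AB)
    total_bases = sum(TOTAL_BASES.get(pa, 0) for pa in pas)
    slg = total_bases / at_bats if at_bats else 0.0
    return times_on_base / len(pas) + slg


def after_first_streak(pas: List[str], size: int = 30, follow: int = 15,
                       tolerance: float = 0.30) -> Optional[Tuple[str, float]]:
    """Label of the first hot/cold window and the OPS of the `follow` PA after it.

    Returns None if the batter never streaks, or if fewer than `follow`
    plate appearances remain after his first streak.
    """
    season = ops(pas)
    low, high = season * (1 - tolerance), season * (1 + tolerance)
    for start in range(len(pas) - size + 1):
        window = ops(pas[start:start + size])
        if low <= window <= high:
            continue  # ordinary stretch, keep scanning
        after = pas[start + size:start + size + follow]
        if len(after) < follow:
            return None
        label = "cold" if window < low else "hot"
        return label, ops(after)
    return None
```

Averaging the follow-up OPS separately for the "hot" and "cold" groups, and comparing each with the season norm, is one way to get at the comparison described above.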

Mirror, mirror on the wall…


So, I guess – let the managers do what they want. If you play fantasy, however, learn not to make any predictions based on the streaks, except perhaps – who will be playing and who will not.


When is a streak not a streak anymore?

There is a time when a player’s performance over a streak should be watched more closely, namely at the beginning of the year. A lot can happen over 6 months of off-season – players mature, grow old, get in shape, change work ethics and so on.

I looked at all the players that had at least 800 PA during 2006-2008 and at least 300 PA in 2009. I then looked at how they started the year and whether they had a better or a worse year than their 3yr average would indicate.

How to read this? The columns are the first N plate appearances to start the 2009 season, the rows are the comparison between the production over those PA and the 3yr average. The cells at the crossing indicate the percentage of players in that group that went on and finished the year below/above their 3yr average.

In the highlighted example, 76.2% of all players who started the year by batting 25% worse over 50 PA than their 3yr average would indicate never recovered and finished the year worse than their 3yr average. The longer such a streak is and the bigger the deviation from the average (further to the right and down in each table), the less chance there is that a player will bounce back to his normal production*.

*This is based only on 2009 vs. 2006-2008. Perhaps one day I’ll do or find something similar done over a century and get some seriously predictive value out of it.
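Each cell of such a table boils down to a simple share. A hedged sketch of one cell is below; it assumes OPS is the yardstick (the comparison above does not name the stat), and the `PlayerYear` record and `share_finishing_below` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlayerYear:
    baseline_ops: float  # 3yr average, e.g. 2006-2008
    start_ops: float     # production over the first N PA of the new season
    season_ops: float    # production over the whole new season


def share_finishing_below(players: List[PlayerYear],
                          slump: float = 0.25) -> Optional[float]:
    """Share of slow starters who also finished below their baseline.

    A slow starter is a player whose start was at least `slump`
    (0.25 = 25%) worse than his baseline. Returns None if nobody qualifies.
    """
    slow = [p for p in players
            if p.start_ops <= p.baseline_ops * (1 - slump)]
    if not slow:
        return None
    finished_below = sum(1 for p in slow if p.season_ops < p.baseline_ops)
    return finished_below / len(slow)
```

Running it once per column (first 50 PA, first 100 PA, and so on) and per row (different `slump` thresholds, plus the mirror image for hot starts) would fill in the whole grid.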


“Defense is like hitting, it is contagious.”

Well, let’s hope defense isn’t – otherwise the A’s would have one more reason not to have Cust anywhere close to Michael Taylor.


You’ve heard it before, for sure – “hitting is contagious”. In the last look at the streaks of 2009, I compared the production of the teammates of the “infected batters”. How did the others do while surrounded by greatness or awfulness?

I’ll save you from looking at another chart you’ve seen before and just give you the numbers.

That’s more than 50,000 PA – and basically identical data. So, if hitting is indeed contagious, there are some good vaccines around MLB.


* To all those who were tricked into reading this by thinking there would be some analysis of the proper usage of animals with supernatural powers when constructing an MLB lineup, I apologize. I’ll write about that soon.
