Apart from being the only result a Google search for the words “Crosby”, “2B” and “HR” will ever retrieve (feeling lucky or not), this FanPost will try to further analyze two very interesting ideas voiced recently. One is that doubles might be an undervalued commodity, and the other is that a lineup without home run hitters cannot survive one “black hole of suck” in it.
To analyze both HR vs 2B and the Crosby Effect, I needed some sort of game simulation tool, so I wrote one today. While it is certainly far from perfect, I was both very pleased with the results and entertained while testing. The play-by-play reporting in Single Game Mode (which I needed to make sure everything was happening within plausible boundaries) made me chuckle more than once. With Rajai Davis batting, “Rajai Davis steals second” was the first result of implementing base stealing. Talk about fast.
I did all the testing with something resembling one of the possible A’s lineups for the beginning of 2010. All the “ZOMG!! How can you have him batting in front of the other guy” comments are naturally welcome. OK, let’s jump in and get some boring technical details out of the way.
How does this Game Simulator work?
Well, basically I type the data into it and it gives me a number. The data it needs is the following:
The white stuff is what you can get from any decent projection system. I chose Chone for no particular reason; absolute accuracy was not a priority, so I don’t really care if one is better than the other. The peach-colored ones (unfortunately my wife is not around, so I am probably embarrassing myself in front of the color-aware AN audience) are historical data. The baserunning ones are somewhat regressed to the mean if the sample size was too small. They are, in order, the percentage of times a runner goes first to third on a single, second to home on a single, and first to home on a double, plus the percentage of outs recorded on the ground. The other-colored ones are rate stats calculated from that data.
Before the game itself can start, the odds for each event are calculated. The result looks something like this:
So, for example, when Daric Barton comes to bat he is most likely to fly out (29%) and least likely to be hit by a pitch or to hit a triple (1% each). If he is on first base when a single is hit, he has a 26% chance to advance to third. The two stolen base columns represent aggressiveness (SBA), calculated from projected stolen base attempts versus simplistically assumed opportunities (1B, BB and HBP, multiplied by a factor obtained through some empirical testing). The rest is basically rolling the electronic dice.
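For illustration, rolling the electronic dice can be sketched in Python roughly as follows. This is not the simulator's actual code. The 29% fly-out, 1% triple, 1% hit-by-pitch and 26% first-to-third figures come from the Barton example above; every other number, and the names `EVENT_ODDS`, `roll_event` and `advances_to_third`, are made up for the sketch.

```python
import random

# Hypothetical per-plate-appearance odds for one batter. Only the fly-out,
# triple and hit-by-pitch values come from the text; the rest are invented
# so that the table sums to 1.0.
EVENT_ODDS = {
    "fly_out": 0.29,
    "ground_out": 0.22,
    "strikeout": 0.16,
    "single": 0.15,
    "walk": 0.10,
    "double": 0.04,
    "home_run": 0.02,
    "triple": 0.01,
    "hit_by_pitch": 0.01,
}


def roll_event(odds, rng):
    """Pick one event by walking the cumulative odds with a single roll."""
    roll = rng.random()  # uniform in [0, 1)
    cumulative = 0.0
    last = None
    for event, probability in odds.items():
        cumulative += probability
        last = event
        if roll < cumulative:
            return event
    return last  # guard in case rounding leaves the sum just under 1.0


def advances_to_third(first_to_third_rate, rng):
    """Decide whether a runner on first takes third on a single."""
    return rng.random() < first_to_third_rate


rng = random.Random(2010)  # fixed seed so a run can be repeated
print(roll_event(EVENT_ODDS, rng), advances_to_third(0.26, rng))
```

One roll per plate appearance keeps the events mutually exclusive, and passing the random generator in makes a whole game reproducible from a single seed.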
Moving and accounting for the base runners took the longest time to implement, as it is full of code chunks like this:
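A minimal Python sketch of what one such chunk might look like, here for moving runners on a single. It is an illustration under assumptions, not the simulator's own code: the function name `advance_on_single`, the tuple layout of the bases, and the runner keys `first_to_third` and `second_to_home` are all hypothetical.

```python
import random


def advance_on_single(bases, batter, rng):
    """Return (new_bases, runs) after a single.

    bases is a (first, second, third) tuple holding a runner dict or None.
    Each runner dict carries the hypothetical keys "first_to_third" and
    "second_to_home" with that runner's historical advancement rates.
    """
    first, second, third = bases
    runs = 0
    new_second = None
    new_third = None

    if third is not None:
        runs += 1  # a runner on third always scores on a single
    if second is not None:
        if rng.random() < second["second_to_home"]:
            runs += 1  # scores from second
        else:
            new_third = second  # holds at third
    if first is not None:
        # The runner on first can take third only if it was left open.
        if new_third is None and rng.random() < first["first_to_third"]:
            new_third = first
        else:
            new_second = first
    return (batter, new_second, new_third), runs


runner = {"first_to_third": 0.26, "second_to_home": 0.60}
print(advance_on_single((runner, None, None), runner, random.Random(1)))
```

Handling the lead runner first is what keeps two runners from ending up on the same base, and it is the part that multiplies into many similar chunks once doubles, walks and outs get the same treatment.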
I of course had to simplify a lot, and I looked to cut corners in areas where the effects should be negligible. Here is a short list of what this simulation does not do:
– With one minor exception (SB3 attempts only with one out), there is no intelligence implemented. No bunts, either.
– No runners are ever thrown out advancing, except for double plays and force outs. No pick-offs, no outfield assists.
– Runners’ speed is not taken into account when grounding into a double play.
– Defensive errors are not implemented, either.
With the time I had on hand, I couldn’t implement more, nor do I think it would have improved the simulation to any statistically significant degree.
My favorite part was running the tests, and it is almost scary that I have already started developing feelings for “players” based on their ability to deliver, or lack thereof. The play-by-play mode looks something like this:
Zukes is clutch!!!
Once I got it to run decently, I started simulating whole seasons. And, boy, did the A’s have that one stretch in mid-May, scoring 7, 9, 11, 6 and 12. But then came June, and, well…
Three caveats when considering season numbers:
1. Because errors are not taken into account while simulating, total runs should be increased by some 8% (that’s the average R/ER ratio).
2. Because every game is played for a full nine innings, total runs should be decreased by some 1.5% (based on average innings per year / (9 × 162)).
3. Because runners thrown out on the bases are not taken into account, total runs should be decreased. I do not know by how much, so I am conveniently assuming it is a couple of percent, which would mean the simulated numbers are pretty accurate.
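The three caveats can be combined into one back-of-the-envelope adjustment, sketched below in Python. Treating the corrections as multiplicative is an assumption of this sketch, and so is the 2% default for runners thrown out, since that figure is unknown. The function name `adjust_runs` is hypothetical.

```python
def adjust_runs(simulated_runs, error_boost=0.08, short_game_cut=0.015,
                thrown_out_cut=0.02):
    """Scale a simulated run total by the three caveats.

    error_boost:    runs missing because errors are not simulated (+8%)
    short_game_cut: runs gained by always playing nine innings (-1.5%)
    thrown_out_cut: runs gained by never losing runners on the bases;
                    the 2% default is a placeholder, not a measured value
    """
    return (simulated_runs
            * (1 + error_boost)
            * (1 - short_game_cut)
            * (1 - thrown_out_cut))


# Example with a round, made-up season total.
print(round(adjust_runs(700), 1))
```

Changing `thrown_out_cut` shows how sensitive the conclusion is to the one correction that had to be guessed.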
I ran sets of 1,000 simulations for every scenario, and I will show you not only the average but also the distribution.
Here is one such simulation, which was averaged with 999 others to create the following charts. I analyzed single season outputs to see if the algorithms for double plays, sacrifice flies and stolen bases were functioning plausibly well.
The above lineup scored 730 runs a season on average:
Now this seems positively plausible. The A’s scored 759 runs last year, they did not necessarily get any better, and due to the restrictions mentioned above, I believe I am selling offenses 2-3% short on run output. Having a system I can more or less trust, I was finally able to tackle the question Nico posted: are doubles really worse than home runs?
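The batch idea can be sketched in Python as below: run many seasons, then report the mean together with a histogram of season totals. Here `toy_season` is a made-up stand-in with invented per-game run odds, not the simulator described here, and `run_batch` and `summarize` are hypothetical helper names.

```python
import random
import statistics
from collections import Counter


def toy_season(rng, games=162):
    """Stand-in for a real season simulation, using invented run odds."""
    runs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    weights = [6, 10, 13, 14, 13, 12, 10, 8, 6, 5, 3]
    return sum(rng.choices(runs, weights=weights, k=games))


def run_batch(simulate_season, n=1000, seed=2010):
    """Simulate n seasons with one seeded generator and collect the totals."""
    rng = random.Random(seed)
    return [simulate_season(rng) for _ in range(n)]


def summarize(totals, bin_width=20):
    """Return the average and a histogram with bins of bin_width runs."""
    bins = Counter((total // bin_width) * bin_width for total in totals)
    return statistics.mean(totals), dict(sorted(bins.items()))


average, histogram = summarize(run_batch(toy_season))
print(round(average, 1))
for bin_start, seasons in histogram.items():
    print(f"{bin_start}-{bin_start + 19}: {'#' * (seasons // 10)}")
```

Looking at the histogram rather than the mean alone is what shows whether a difference between two scenarios is bigger than the season-to-season spread.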
To answer this, I manually changed the stats for the entire lineup without changing the overall rates. Both OBP and SLG remained the same, as I exchanged hits at the rate 1 HR + 2 1B = 3 2B (three hits and six total bases on either side). For every A’s player, I stripped all of his home runs, stripped double that number of singles, and added triple that number of doubles. The A’s now looked just the way they feel to watch: zero home run power across the board.
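The exchange is easy to check in Python. The sketch below applies 1 HR + 2 1B = 3 2B to a made-up stat line and confirms that OBP and SLG do not move. The stat line, the dict keys and the function names are assumptions for illustration, and the guard for a player with too few singles is an addition of this sketch.

```python
def swap_homers_for_doubles(line):
    """Trade every home run plus two singles for three doubles."""
    home_runs = line["HR"]
    if line["1B"] < 2 * home_runs:
        raise ValueError("not enough singles to cover the exchange")
    swapped = dict(line)
    swapped["HR"] = 0
    swapped["1B"] = line["1B"] - 2 * home_runs
    swapped["2B"] = line["2B"] + 3 * home_runs
    return swapped


def hits(line):
    return line["1B"] + line["2B"] + line["3B"] + line["HR"]


def obp(line):
    """On-base percentage: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    reached = hits(line) + line["BB"] + line["HBP"]
    return reached / (line["AB"] + line["BB"] + line["HBP"] + line["SF"])


def slg(line):
    """Slugging percentage: total bases / AB."""
    total_bases = (line["1B"] + 2 * line["2B"]
                   + 3 * line["3B"] + 4 * line["HR"])
    return total_bases / line["AB"]


# Invented stat line, not a real projection.
before = {"AB": 550, "1B": 100, "2B": 30, "3B": 2, "HR": 15,
          "BB": 60, "HBP": 4, "SF": 5}
after = swap_homers_for_doubles(before)
print(after["1B"], after["2B"], after["HR"])
print(obp(before) == obp(after), slg(before) == slg(after))
```

Both sides of the exchange are worth three hits and six total bases, so hits, at-bats and total bases are untouched, and only the mix of hit types changes.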
The power has been redistributed to the unrealistic extreme, which should help answer the question – can you win without HR power?
A thousand simulations later (it is, by the way, beautiful to program with all that processing power, allowing you to run shitty code in milliseconds), this came out:
Double Duty Radcliffes did even better. Slightly better, probably within statistical noise, but still. I think this increase could be in line with what someone posted about how Tango weighs doubles vs home runs.
With the tool up and running, I decided to test another thing that intrigued me.
Ken wrote the following:
Then the problem becomes the problem we’ve seen on the A’s the last few years: that when you have a lineup without any home run power, you absolutely cannot have any holes. A Bobby Crosby and/or a Jason Kendall with a .600 OPS in a lineup filled with doubles hitters with a .750 OPS will kill the offense.
Emphasis mine. From the moment I read it, I wondered: would a lineup without HR power really be more affected by a hole?
In order to test that, I decided to use three lineups: the two mentioned above and a third one, improved by two imaginary mashers. For each I would measure the damage done by the Crosby Effect, a rare impediment that would turn Cliff Pennington into Cross Bobbington, a player so utterly bad not even his father could cheer for him.
To add some home run power, two new players found their way to the lineup:
The question is, would such a lineup endure one hitter as close to an automatic out as it gets more easily than lineups without .900 OPS mashers?
Would the Crosby Effect have less impact on this lineup?
The answer was:
Gold is normal, silver is with the Crosby Effect, or Outamatic Rule. The amount of damage doesn’t seem to be affected by the strength or HR propensity of a lineup.