Extra innings games vs. runs scored
Posted by Andy on September 2, 2010
This is a follow-up to my earlier post about the correlation between run-scoring and extra-inning games. In that earlier piece, I made a scatter plot where each data point represented one year. The fraction of games going to extra innings was plotted against the average runs scored per game that year.
Now I've done something a little bit different that I think makes things clearer. Click through for more.
What follows is data that combines games from all years we have box scores for (1920-present). First off, here is the total number of games played where the winning team scored a certain number of runs.
So this says that since 1920, teams have won about 21,800 games with exactly 5 runs, and of those games about 19,600 were decided without going to extra innings. That leaves about 2,200 extra-inning games, which means that 10.1% of games where the winning team scored 5 runs went to extra innings.
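For anyone who wants to reproduce this kind of calculation, here is a minimal Python sketch. It is not the code used for this post. The `(winner_runs, innings_played)` record format and the function name are assumptions made up for illustration, and the only real numbers in it are the rounded 5-run totals quoted above.

```python
from collections import Counter


def extra_inning_pct_by_winner_runs(games):
    """Percentage of games that went to extra innings, keyed by winner's runs.

    `games` is an iterable of (winner_runs, innings_played) tuples.
    This record format is hypothetical, chosen only for this sketch.
    """
    total = Counter()
    extras = Counter()
    for winner_runs, innings_played in games:
        total[winner_runs] += 1
        if innings_played > 9:  # anything past the 9th counts as extra innings
            extras[winner_runs] += 1
    return {runs: 100.0 * extras[runs] / total[runs] for runs in sorted(total)}


# Check of the 5-run bucket using the rounded totals quoted in the post.
total_5 = 21_800       # games won with exactly 5 runs
regulation_5 = 19_600  # of those, games decided without extra innings
pct_5 = 100.0 * (total_5 - regulation_5) / total_5
print(f"{pct_5:.1f}%")  # 10.1%

# Tiny made-up sample, only to show the shape of the output.
sample = [(5, 9), (5, 10), (3, 9), (3, 9)]
print(extra_inning_pct_by_winner_runs(sample))
```

Each game is bucketed by the winning team's final run total, so a game only ever counts once no matter how the scoring unfolded.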
By making this same calculation at each number of runs, we generate the following plot:
A few things to notice about this plot:
- Don't pay all that much attention to the specifics of the graph above 10 runs. From the first graph above, you can see that there are many fewer overall games with 10+ runs so the percentages have a lot of variability associated with them. Whether the numbers come out to 1%, 2%, or 3% often comes down to no more than 1 or 2 extra-inning games over 90 years (truly meaningless).
- There seems to be a bit of a discontinuity between 4 and 5 runs, in the sense that the extra-inning percentage runs a bit higher from 1 to 4 runs and then there's a bigger decrease when going to 5 runs. I don't really know why this is the case, but I can make a wild guess: when the team that eventually wins has 1, 2, 3 or 4 runs, the opposing manager has the chance to do something to push across 1 or 2 runs in order to tie the game. For example, he can bring a good pinch-hitter off the bench in a key situation. He also might use a fast pinch-runner to try to push across one run. These things are more likely to happen when not too many runs have been scored. By the time the winning team gets to 5 runs, these strategies might be used less because the opposing team often isn't within a few runs of tying the game. (Not sure about this theory...just a guess like I said.)
- Overall the graph is quite linear from 1 to 11 runs (R-squared of 0.9896 for those interested in such things). Turns out that over this range of runs, the percentage of games going to extra innings drops by just about exactly 1 percentage point for each additional run the winning team scores. In other words, going from X to X+1 runs means the percentage of games going to extra innings falls from Y% to (Y-1)%.
- Keep in mind that this graph ignores run-scoring environment. It's the case that there are more games going to extra innings tied at 5 these days than in the 1960s, when scoring 5 runs was a lot harder.
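For those interested in checking the linear fit mentioned above, here is a rough sketch of an ordinary least-squares fit with R-squared in plain Python. The input percentages below are placeholders that fall exactly 1 point per run, not the data behind the graph, and `linear_fit` is just a name picked for this example.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# PLACEHOLDER values, NOT the post's data: a line that drops exactly
# 1 percentage point per run, for winning-team totals of 1 through 11.
runs = list(range(1, 12))
pct_extra = [14.0 - r for r in runs]

slope, intercept, r_squared = linear_fit(runs, pct_extra)
print(f"slope={slope:.2f} points per run, R^2={r_squared:.4f}")
```

With real percentages in place of the placeholders, a slope near -1 would match the "1 percentage point per run" description.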
September 2nd, 2010 at 8:04 am
That top graph is simply BEAUTIFUL.
September 2nd, 2010 at 9:09 am
So the "runs scored by winning team" is truly what it says?
Meaning, if you have 2 Extra Inning games with the final scores of 5-4 and 5-1, they both go into the '5' bucket? I am just trying to make sure I understand what is being depicted (sometimes the concepts of these posts get by me). But that sounds different than the last bullet point where it is mentioned that there are more games that go to extra innings tied at five, because theoretically, the "tied at 5" games could end up in any bucket (higher than 5 of course).
Like you said, the drop between 4 and 5 is odd, but then it happens again between 6 and 7 before leveling out.
September 2nd, 2010 at 9:10 am
#2 everything you say is correct. It's true that a game that goes to extras tied at 5 doesn't show up on this graph at "5"--it will have to show up at 6 or higher.
September 2nd, 2010 at 10:13 am
Another theory for the drop from 4-5 and 6-7 (it's really more of a drop from 4-7 with 5-6 being the oddity):
In a game where the leading team has 4 or fewer runs it's possible to tie the game with a single swing. Once the lead becomes 5 or more the game is often considered to be "out of hand". Even if the bases are loaded with no outs, the pitcher feels secure that with a few ground balls he can get out of the inning. If the lead is 4 or less the pitcher is thinking "no home runs, no home runs".
September 2nd, 2010 at 11:19 am
Andy,
Given the information you have available, would it be difficult to produce a graph of the same thing, only this time using runs scored by the losing team? I'm not sure that this will tell us much of anything either.
I think what we are seeing above is that the lower the score of the game, the higher the chances that the game is close and thus the higher the probability that the game will end up tied at the end of 9 innings. This is connected to the concepts behind the data we have seen showing the results of 1-run games and extra-inning games being close to .500 independent of record. If having a score differential of +/-1 is fairly arbitrary then having a score differential of 0 should happen quite a bit as well.
Another interesting graph might be one where the X-axis is total runs scored in the game, but discarding the data points where total runs is an even number. In terms of looking at the impact of runs on producing extra inning games, I think a 5-4 game has more in common with an 8-1 game than it does with a 5-0 game.
September 3rd, 2010 at 3:48 pm
Andy, bear with my low IQ here, but in the second graph does the data point at (1,13.5%) mean the percentage of games that went into extra innings in a scoreless tie or does it include games that went into extra innings scoreless but where a team ended up winning say.... 2-1? Thx.
Either way, an excellent study that it will take me a while to wrap my head around.