All-time winning percentage at home
Posted by Andy on October 9, 2007
REMINDER: If you want to vote in the AL Cy Young race, just click here and leave a comment. Voting closes early tomorrow morning.
Hey, I'm back!
Using the Team Batting Game Finder, I have found all-time winning percentages for home teams. It's done simply by setting the appropriate bubbles to "Home" and "Win" (or "Loss"). I selected nothing for the actual game criteria and summed by "Years w/ team games."
First, for the regular season. Here's a graph of home team winning percentage:
Over the last 50 years, home team winning percentage has been around .539 on average. You can see that in 1978, it spiked at .573, whereas in 1968 it bottomed out at .511. I seem to recall that someone over at The Hardball Times wrote a great article about this, and if anybody knows the link, please put it in the comments.
Looking at the graph, it's hard to tell whether there's been more variation in one era than another. This surprised me a bit, since so many more games are played these days than in 1957. (This would be regression toward the mean at work, which basically means that the larger the sample size gets, e.g. more games in a season, the less likely it becomes for the overall outcome to vary significantly from the average. Think about it this way: if the 2007 Devil Rays played the 2007 Red Sox in a 3-game series, Tampa Bay might win the series. If they played 100 of these 3-game series, Tampa Bay would still likely win a bunch of them, maybe 40 or so. However, if they played one huge best-of-41 series, it's very unlikely that Tampa Bay would win it, since they'd need to win 21 out of 41 games, which would require them to win a lot more often than they usually do over an extended stretch.)
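To put rough numbers on the series example above, here's a minimal Python sketch. It assumes every game is independent and that the underdog wins any single game with the same fixed probability. The 0.40 figure is made up for illustration (it is not Tampa Bay's actual rate against Boston), and series_win_prob is just a helper name invented here.

```python
from math import comb

def series_win_prob(p_game, n_games):
    """Chance of winning a majority of n_games (n_games odd), given a
    fixed per-game win probability and independent games (assumptions)."""
    needed = n_games // 2 + 1  # e.g. 21 wins in a best-of-41
    return sum(
        comb(n_games, k) * p_game**k * (1 - p_game) ** (n_games - k)
        for k in range(needed, n_games + 1)
    )

UNDERDOG_P = 0.40  # hypothetical per-game win probability

for n in (3, 7, 41):
    print(f"best-of-{n}: {series_win_prob(UNDERDOG_P, n):.3f}")
```

Under these assumptions the underdog takes a 3-game series about 35% of the time, and the chance shrinks steadily as the series gets longer.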
Anyway, in Major League Baseball, home teams played a total of 1230 games in 1957 (winning 646 of them, a .525 winning percentage) and a total of 2431 games in 2007 (winning 1319 of them, a .543 winning percentage).
Now, on to the postseason. Note that the 2007 numbers are as of yesterday (probably not including last night's home loss by the Yankees).
As Scooter would have said, "Holy cow!" There is a massive amount of variation in the graph. This again has to do with the principles discussed above, as all of the data prior to 1969 is based on fewer than 10 games in each year. Even from 1969 to 1993, when there were between 11 and 20 games every year (except for the split season of 1981, with 32), there is a large amount of variation, although less than in the earlier era. When we say hello to the wild card era in 1995, however, things begin to settle down. We've had at least 30 playoff games in every year since 1995 (except for 2007 so far), and the year-to-year variation in home winning percentage is much smaller.
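The link between games played and year-to-year wobble can be sketched with the binomial standard error. This is only a rough model: it assumes every game is an independent coin flip with the same home-win probability (.539, the long-run figure quoted earlier), and the game counts 7, 15 and 35 are just representative picks for the three postseason eras described above, not actual totals.

```python
import math

def std_error(p, games):
    """Standard deviation of an observed winning percentage over `games`
    independent games with true win probability p (binomial model)."""
    return math.sqrt(p * (1 - p) / games)

HOME_PCT = 0.539  # long-run home winning percentage quoted in the post

# Representative (assumed) game counts for each era, plus the two
# regular-season totals given in the post for comparison.
samples = [
    ("pre-1969 postseason", 7),
    ("1969-1993 postseason", 15),
    ("wild card era postseason", 35),
    ("1957 regular season", 1230),
    ("2007 regular season", 2431),
]

for label, games in samples:
    print(f"{label:26s} {games:5d} games  +/- {std_error(HOME_PCT, games):.3f}")
```

Under this model a 7-game postseason can swing by nearly 190 points of winning percentage on chance alone, versus roughly 10 to 15 points for a full regular season.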
In fact, since 1995, there have been 216 home-team wins and 187 home-team losses in the playoffs, a .536 winning percentage. Across all the regular-season games in this data set, there have been 54,269 home-team wins and 46,472 home-team losses, a .539 winning percentage. So with the extra round of playoffs and more overall games, the winning percentage is, in fact, regressing toward the mean.
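For anyone who wants to check the arithmetic, here's a quick Python sketch using the win and loss totals quoted above. The helper name fmt_pct is made up for this example.

```python
def win_pct(wins, losses):
    """Winning percentage as a fraction of decided games."""
    return wins / (wins + losses)

def fmt_pct(wins, losses):
    """Format baseball-style: three decimals, no leading zero."""
    return f"{win_pct(wins, losses):.3f}".lstrip("0")

# Totals taken directly from the post.
print("Playoffs since 1995:", fmt_pct(216, 187))
print("Regular season:     ", fmt_pct(54269, 46472))
```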
October 9th, 2007 at 9:34 am
Just to be pedantic, I think you mean something more like "central limit theorem" than "regression to the mean." Regression to the mean would be something like: if you took the 2006 team which led the league in HW% - AW%, you'd expect their differential to be less in 2007 than it was in 2006.
October 9th, 2007 at 11:00 am
Fair enough.
I would be really interested to know if anyone has any info or theories on why the 1968 and 1978 seasons were so divergent. At a glance, I couldn't gather anything worthwhile from the standings or stats from those years.
October 9th, 2007 at 2:43 pm
It's a specific form of the central limit theorem, mainly in the sense that if it's a normal distribution, the variance will decrease with increased sample size, making deviation from the mean less likely.
October 9th, 2007 at 9:15 pm
You know that you can get very quick and convenient breakdowns of home and away game records by year, league and franchise, and combined in all sorts of ways, using B-R's Situational Records, outside the Play Index. For this particular sort of study I actually find it somewhat more convenient than PI, as long as you're looking for regular season games. For post-season games, though, PI is needed.