Home team winning percentage
Posted by Andy on February 11, 2011
We know that every year, just about every team plays better at home than on the road, or at least has a better home record. I decided to look into that a little bit more.
Here's a plot of home team winning percentage each season since 1920.
I figured this out by doing two really simple Team Batting Game Finder searches. Each was grouped by number of team games in a season. For the first one I clicked the Home bubble and the Team W bubble, and for the second one I changed Team W to Team L. Then I took the resulting lists of games in each season and put them into Excel to do the overall calculation and graph.
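The spreadsheet step amounts to dividing home wins by total home games for each season. Here is a minimal Python sketch of that arithmetic, assuming the two Game Finder result lists have already been reduced to per-season counts. The function name, the season labels, and the counts are placeholders made up for illustration, not real totals.

```python
def home_win_pct(home_wins, home_losses):
    """Return {season: home winning percentage} from per-season counts."""
    pct = {}
    for season, wins in home_wins.items():
        losses = home_losses[season]
        pct[season] = wins / (wins + losses)
    return pct


# Placeholder counts for illustration only (not actual league totals).
sample_wins = {"season_a": 54, "season_b": 55}
sample_losses = {"season_a": 46, "season_b": 45}

result = home_win_pct(sample_wins, sample_losses)
for season in sorted(result):
    print(season, round(result[season], 3))
```

Ties and suspended games are ignored here; how the Game Finder counts them would need checking before relying on the exact figures.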
So we see that, indeed, in every season since 1920, home teams have won more than they lost (at least league-wide--obviously every year there are plenty of teams that have losing records both at home and on the road).
To my eye, there also seems to be a correlation with run-scoring. It makes sense that in higher-scoring eras, the benefit of being the home team is lessened. In other words, being the home team might give a team, say, a half-run advantage. In an environment where teams score an average of 3 runs per game, that home team advantage is huge. But in a 6-run-per-game environment, the half-run edge does less to help you win.
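One rough way to put numbers on that intuition is Bill James's Pythagorean estimate of winning percentage, runs scored squared divided by the sum of runs scored squared and runs allowed squared. That formula is an assumption brought in here for illustration. It is not part of the Game Finder data, and the half-run edge is the same hypothetical figure used above.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean estimate of winning percentage (exponent 2 is the classic form)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


EDGE = 0.5  # hypothetical home edge in runs per game

for env in (3.0, 6.0):  # league average runs per team per game
    home = env + EDGE / 2  # home team scores a bit more than average
    away = env - EDGE / 2  # visiting team scores a bit less
    print(env, round(pythag_win_pct(home, away), 3))
```

Under these assumptions the same half-run edge is worth roughly a .583 winning percentage in the 3-run environment and roughly .542 in the 6-run environment. That only shows the theory is internally consistent; it says nothing about whether the real home edge is a fixed number of runs.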
But I'm not sure that's correct. Here are various observations:
- Scoring was very low in 1968, and yet the home winning percentage plummeted that year. By my theory above, I would have expected the opposite, where the advantage of being the home team would have been worth more in such a low-scoring environment. Then in 1969, when the mound was lowered, scoring went way up, and home winning percentage went way up too.
- In 1993, home winning percentage started to fall, and then it dropped sharply in 1994. This does go with my theory, in the sense that the higher scoring we saw starting then eradicated much of the home field advantage. The home team winning percentage bobbed around lower values through 2001, the peak of the recent high-offense era, and it has increased since as scoring has gone down.
One thing that cannot be overlooked is strategy. It matters whether managers played for a tie or for the win in a particular game. It matters whether they'd try to rally from a 2-run deficit by plating a single run in a middle inning. These tendencies of managers change over time and not necessarily in concert with how overall run-scoring levels change.
February 11th, 2011 at 11:36 am
In the book Scorecasting, they suggested that the homefield advantage in baseball really comes down to the umpires giving the home team "the call" on close ball/strike calls and on the bases. It was pretty interesting stuff.
February 11th, 2011 at 11:39 am
More on that point:
http://www.hardballtimes.com/main/blog_article/the-ump-in-the-home-field-advantage/
February 11th, 2011 at 11:40 am
Why don't you add run scoring to your spreadsheet and then figure out the correlation between run scoring and home winning percentage?
February 11th, 2011 at 12:09 pm
Actually I did that and the r-squared is just 0.02.
February 11th, 2011 at 12:15 pm
i don't see how you can predict a correlation between league run scoring and home field advantage. there is no reason to believe the overall number of runs increases whether home or away, regardless of any "plus runs per game over generic road team" to the home team. also you don't have the dead ball era on your graph which would provide the crucial evidence to your point.
February 11th, 2011 at 12:15 pm
I think Sean posted an article here over the summer where he concluded that home field advantage was primarily a function of familiarity with the ball park and that batting second was of little to no value. I don't think the data he was working with had the ability to see the impact of home bias on the part of officials so I wouldn't say his article is contradictory to the study Steve references above.
Taking Sean's conclusion (apologies if it was one of the other bloggers) to be true, we should also expect to see diminished home field advantage in high home run, high strikeout and high walk environments, because familiarity with the park in terms of base running and fielding is only relevant for balls in play.
February 11th, 2011 at 12:57 pm
I almost see the opposite correlation as far as high-run scoring and home field advantage goes.
I notice in the late 20's, save for 1928, the home teams had a higher winning percentage. And also, in 1930-31, the highest scoring period in the history of baseball, the home team had all time highs in winning percentages. Whereas historically low scoring seasons, such as 68, 71-72, the home team plummets in winning percentage. But there doesn't seem to be any real correlation at all and it seems to be totally random.
February 11th, 2011 at 12:58 pm
It's all about tailoring your lineup to your ballpark.
If you know you're playing 1/2 your games at Fenway, you build your lineup to suit. If a lot of runs are being scored in the league, that gives you an even bigger advantage, because your guys are going to mash it at their home park.
Conversely, with a cookie cutter stadium, your home advantage would seemingly be less because there's no feature like the wall to build a lineup around.
So why not take the teams with unique stadiums, like Boston with the wall and New York with the short porch, and compare their home records to teams with non unique stadiums?
It also means that the teams that have money to spend or a good farm system are more likely to have park suitable players, and therefore do much better at home.
I can see a minor umpire advantage, but not nearly enough to be the deciding factor.
February 11th, 2011 at 1:08 pm
Stat people will hate this because it cannot be graphed, but for some players, having 50,000 on their feet cheering for you does something to increase their focus and performance. It matters. Players have always said this in all sports and often mention supportive fans as some explanation behind their success, just like military generals are concerned with morale.
After big plays when the game seems over (before it's over, though) and the crowd is cheering, it can make the visitors feel like the cause is lost and take away from their comeback.
February 11th, 2011 at 1:22 pm
topper009, teams drawing 50 000 are going to be winning teams, so they'll seemingly have good home records. Teams drawing 8000 are going to be losing teams, and have bad home records. Seemingly.
Although could there be a correlation between the Atlanta Braves disappointments in the post season and the fact that they rarely sold out the stadium in the playoffs?
So, maybe someone could run a stat comparing home team record with high attendance vs home team record with low attendance.
February 11th, 2011 at 1:30 pm
About 10 years ago John Rickert did a presentation at the SABR con and concluded having extra fans does not help the home team. I don't know if that is online anywhere
February 11th, 2011 at 3:21 pm
It would probably be easier to search this using the Situational Reports
http://bbref.com/pi/shareit/Tr2Vj
February 11th, 2011 at 3:23 pm
It's not the number of fans, it's their involvement. Just because the Dodgers have high attendance doesn't matter when half the fans are only there between the 3rd and 6th innings.
February 11th, 2011 at 3:33 pm
Andy-
Ever since your post on Dec. 17th about players traded for each other who were most similar I have been trying to find one. Doesn't beat Rhoden for Drabek, but still good.
June 15, 1926-Baby Doll Jacobson even up for Bing Miller. Check it out.
February 11th, 2011 at 4:04 pm
Andy #4, did you eliminate bottom of the 9th or later situations? If you used runs/inning you would be biased because of partial walk off innings.
If you look at HFA for components, by far the largest is triples, which would seem to depend upon ballpark knowledge, both for the fielders and for the batter/baserunner.
The "Scorecasting" authors found the largest umpire bias to be in high leverage situations, and actually negative, (favors the visiting team in low leverage situations.) But if you look at HFA by inning, you will find the highest to be in the 1st, not a high leverage situation.
Attendance can be very ambigous; the Yankees draw very well on the road, but you would expect them to win more than the average visiting team. The Nationals draw well when the Phillies visit, because Philadelphia is close enough to come for a game and still get home by 1am or so. The quality of the Phillies, and the large % of Phillies fans in the stands are not likely to help the Nats win.
February 11th, 2011 at 4:15 pm
#14 Trooper
I'm gonna guess you have less time on this board than me.
When you add an intangible to the mix- quality of fans- you're veering quite far from the path of the real adherents here. Look at Andy in #4- he tells the reader he computed the r-squared.
I don't know what that means. Do you? These guys on this board- they're not just stat-heads. You and I are stat-heads. These guys- many of them- are statisticians. Quality of fan just can't be measured.
That being said, I'm not sure I agree with you. I agree that there is such a thing as quality of fan, I just don't agree it makes a difference in the game. The fan mix at my home park- Comerica Park in Detroit- has changed a lot in the last five years or so. A lot more families, a lot of teenagers, a lot of people not that much into the game. But to hear the Tiger players talk about it, they all say they have the best fans in the world. I think they're saying 40,000 fans is 40,000 fans.
February 11th, 2011 at 5:55 pm
Albert7 @10: "Teams drawing 50 000 are going to be winning teams.... Teams drawing 8000 are going to be losing teams...."
That's a bit too simplistic, no? The Tampa Bay Rays are just the latest team to win consistently but still draw poorly. Teams with new stadiums often draw very well for several years, regardless of their performance. There are a lot of exceptions to the general rule.
February 11th, 2011 at 7:35 pm
My main point is that there are some things you cannot quantify, and this is one of them. Home field advantage is one of the most repeatable events across ALL sports, yet there is no clear explanation why. Even this strike zone study says it only accounts for less than half of it. I'm not sure how saying umps favor the home team on the strike zone explains why football teams have home field advantages even when they don't have things like custom fields and batting last (although there are climate factors and domes, but not for all teams), unless the percentage by which football refs favor the home team is exactly the same as for baseball umps (and basketball refs and hockey refs). I'm not saying I have all the answers, but some players will perform a little better with the crowd at their back. Athletes mention this constantly, so it can't be ignored. Other things like clubhouse guys and veteran leadership DO matter; many here can probably think back to their high school teams. I played varsity when I was a sophomore and the seniors made sure I played hard, etc.
Also, even though you have 1.5M pitches, that may not be enough, something a stat person should understand. Having a million of something is not always enough to make a confident model. I would be curious to know if there is ALWAYS a home field advantage with regard to balls/strikes, on average, for EVERY year, similar to the home field advantage itself.
@17,
I have a pretty good background in stats. I actually think everyone who graduates from college should be forced to take a stats class because SO many people don't understand or don't believe in stats. Try to explain the law of large numbers to someone and try to convince them that tossing a coin, say, 1000 times and getting ALL heads is IMPOSSIBLE: not very, very, very, very unlikely, but in fact IMPOSSIBLE. Most people will not believe you even though it is a mathematical fact of the universe.
February 11th, 2011 at 7:36 pm
Conventional wisdom is meant to be debunked, after all.
February 11th, 2011 at 7:38 pm
Just thought of this, it would be very interesting to see if high school teams ALWAYS have a home field adv because that is an environment with significantly less fan interaction? Not sure how to find enough data for this, but it may be a way to try to test this particular theory. Also in HS the travel/fatigue elements are greatly reduced.
February 11th, 2011 at 9:14 pm
@johnautin
"That's a bit too simplistic, no? "
K.I.S.S.
If you'd read the rest of it, you'd see I make a point of saying "seemingly".
@topper
"It's not the number of fans, it's their involvement. Just because the Dodgers have high attendance doesn't matter when half the fans are only there between the 3rd and 6th innings."
Naturally there are going to be exceptions. That's what makes this and other sports impossible to completely understand using stats alone. It's not a closed door lab situation.
February 11th, 2011 at 9:50 pm
@19: I'd like for you to explain your coin flip example to me. I've never heard of that.
February 12th, 2011 at 7:01 am
Surprised I haven't seen this mentioned yet. The pitchers mound often accommodates the home pitcher. Greg Maddux has discussed this issue. There were some mounds that he hated, and it impacted his effectiveness. I think Arizona was one of them.
February 12th, 2011 at 9:41 am
I was waiting for the coin flip explanation too
February 12th, 2011 at 11:25 am
@23
It's just a semantic question of at what point "extremely remote" becomes "impossible". The odds of 1000 coin flips coming up all heads are 1 in 1.07 * 10^301. That's a 1 followed by 301 zeros. No number of things in all the history of the universe even remotely approaches a number that large. So for all intents and purposes you can call it "impossible". I don't know how useful this tangent is to the discussion, but that's basically the point of that anecdote.
In the real world, 1000 coin flips isn't even that much. Imagine atoms or molecules. So, the remoteness of all the air in the room randomly migrating to the far corner and suffocating you is even more extreme than the coin flip example above.
@17,25
r-squared measures correlation. It ranges between 0 and 1. A number near 1 means very correlated. A number near 0 means no correlation... not even anti-correlation. Basically random. The number Andy got was 0.02. Which means he didn't find a link between offense levels and home-field advantage.
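For anyone who wants to see what sits behind that number, here is a minimal Python sketch of r-squared as the square of the Pearson correlation between two lists. This is a generic textbook calculation, not necessarily the exact spreadsheet function Andy used, and the sample lists are placeholders rather than real run-scoring or winning-percentage data.

```python
from math import sqrt


def r_squared(xs, ys):
    """Square of the Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / sqrt(var_x * var_y)
    return r * r


# Placeholder series for illustration only.
rising = [1, 2, 3, 4]
print(r_squared(rising, [2, 4, 6, 8]))    # perfectly linear, rising
print(r_squared(rising, [8, 6, 4, 2]))    # perfectly linear, falling
print(r_squared(rising, [1, -1, -1, 1]))  # no linear relationship
```

The first two calls both give 1.0, which shows that r-squared hides the direction of the relationship, and the third gives 0.0. A value of 0.02 sits very close to that last case.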
February 12th, 2011 at 6:38 pm
Not to get into it but the strong law of large numbers says that the probability of 1000 heads in 1000 tosses = 0, not = 10^-301. It is not very, very, very unlikely for all intents and purposes, it is impossible.
February 12th, 2011 at 7:28 pm
@28
I understand what you are saying, but an 8th grader can calculate 0.5^1000 and it's about 10^-301. I think the point of the strong law is how fast this process is converging. A thousand flips is really not that many... I think perhaps it's not obvious how unfathomably small 10^-301 actually is.
But Barkie's a bit of an anti-intellectual, so the more we try to explain things the less he's going to listen. 🙂
February 13th, 2011 at 1:17 pm
Nobody here has discussed the emergence of mercenary free agents and their correlation to fans and fan base. A fifteen year old kid can remember the Yankees as the Tino Martinez or the Giambi or now the Teixeira era.
I think when teams virtually do not have a single player stay more than 7 years with their club, it is hard for fans to relate to a new group of faces every year. They are rooting for clubs, not players, where in the past, fans liked the uniform and the player in the uniform.
February 15th, 2011 at 8:47 am
I'm confused as to why Topper thinks 10^-301 is incorrect. I think we can all agree this number is ridiculously small and most calculators probably would give 0. However, surely we can agree upon the following:
1) Each sequence for a fair coin is equally likely, from TT...TT to TT...TH all the way to HH...HH. Not the total result (e.g. 500 H and 500 T), but the 1000-flip string. For the 3-flip example - TTT,TTH,THT,THH,HTT,HTH,HHT,HHH all have probability 1/8.
2) The total number of these strings is 2^1000 (approx 10^301)
3) The sum of the probabilities of all strings = 1
Therefore each string, including 1000 tails, has a probability of 2^-1000 (or about 10^-301)
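The three steps above can be checked with exact arithmetic. This is a minimal Python sketch using the standard library's `fractions` and `itertools` modules; the function name is made up for illustration.

```python
from fractions import Fraction
from itertools import product


def sequence_probability(n_flips):
    """Exact probability of one specific sequence of n fair-coin flips."""
    return Fraction(1, 2 ** n_flips)


# Every 3-flip string, from TTT to HHH.
three_flip = ["".join(s) for s in product("TH", repeat=3)]
print(three_flip)
print(sequence_probability(3))  # 1/8 for each string

# The eight probabilities sum to exactly 1.
print(sum(sequence_probability(3) for _ in three_flip))

# For 1000 flips the probability is tiny but still strictly positive.
p = sequence_probability(1000)
print(p > 0)
print(len(str(p.denominator)))  # number of digits in 2**1000
```

The last line prints 302, since 2^1000 is about 1.07 * 10^301. Floating point would also handle this particular value, but `Fraction` keeps the result exact and makes it plain that it is not zero.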
February 15th, 2011 at 12:44 pm
@19
"...tossing a coin say 1000 times ALL heads is IMPOSSIBLE, not very, very, very, very unlikely, but infact IMPOSSIBLE. Most people will not believe you even though it is a mathematical fact of the universe."
What is the exact number of coin flips at which the "Universe" deems an outcome of all heads to be "impossible" as "mathematical fact"?
It has to be somewhere between 1 flip and 1000 flips. Is it 10 flips, 20 flips, 100 flips? I am completely intrigued.
So for example, if I get to the cutoff number of flips (we will call it N) with all heads, then you are saying the next flip (N+1) has 0 probability of being heads, instead of .5?
I am truly interested in how that can be. Seriously. Even if it takes a lengthy explanation of the law of large numbers.