
Bloops: A method for determining the probability that a given team was the true best team in some particular year

Posted by Neil Paine on April 25, 2011

In January 2004, Tangotiger posted this at his site:

http://www.tangotiger.net/archives/stud0268.shtml

It linked to a very cool mathematical method for determining the probability that any team was the "true" best team in a season. Unfortunately, webpages tend to disappear over the course of seven years, and that's just what happened here -- the original link is now dead.

However, I contacted the creator of the methodology, Dr. Jesse Frey (Professor of Mathematical Science at Villanova), and he was gracious enough to re-upload the original study to his current site:

http://www19.homepage.villanova.edu/jesse.frey/BestTeam/forprimer.htm

Now sabermetricians can once again estimate the probability of any team truly being baseball's best. And for what it's worth, here are the current results from a very simplified version:

Rank Tm Lg W L WPct Stdev(WPct) BayesW% Stdev(BayesW%) p(best)
1 PHI NL 15 6 0.714 0.099 0.558 0.051 16.4%
2 COL NL 14 7 0.667 0.103 0.542 0.052 10.8%
3 TEX AL 14 7 0.667 0.103 0.542 0.052 9.9%
4 NYY AL 12 6 0.667 0.111 0.538 0.053 9.1%
5 FLA NL 13 7 0.650 0.107 0.536 0.052 7.9%
6 CLE AL 13 8 0.619 0.106 0.529 0.052 6.2%
7 LAA AL 12 10 0.545 0.106 0.511 0.052 3.5%
8 STL NL 12 10 0.545 0.106 0.511 0.052 3.2%
9 KCR AL 12 10 0.545 0.106 0.511 0.052 3.2%
10 DET AL 12 10 0.545 0.106 0.511 0.052 3.1%
11 MIL NL 11 10 0.524 0.109 0.506 0.053 2.6%
12 LAD NL 12 11 0.522 0.104 0.505 0.052 2.4%
13 OAK AL 11 11 0.500 0.107 0.500 0.052 2.3%
14 TBR AL 11 11 0.500 0.107 0.500 0.052 2.2%
15 WSN NL 10 10 0.500 0.112 0.500 0.053 2.1%
16 CIN NL 11 11 0.500 0.107 0.500 0.052 1.8%
16 SFG NL 10 11 0.476 0.109 0.494 0.053 1.8%
18 BOS AL 10 11 0.476 0.109 0.494 0.053 1.8%
19 ATL NL 11 12 0.478 0.104 0.495 0.052 1.7%
20 CHC NL 10 11 0.476 0.109 0.494 0.053 1.6%
21 TOR AL 9 12 0.429 0.108 0.483 0.052 1.0%
22 PIT NL 9 12 0.429 0.108 0.483 0.052 1.0%
23 MIN AL 9 12 0.429 0.108 0.483 0.052 0.8%
24 NYM NL 9 13 0.409 0.105 0.478 0.052 0.8%
25 BAL AL 8 12 0.400 0.110 0.477 0.053 0.8%
26 ARI NL 8 12 0.400 0.110 0.477 0.053 0.7%
27 HOU NL 8 14 0.364 0.103 0.465 0.052 0.5%
28 CHW AL 8 14 0.364 0.103 0.465 0.052 0.3%
28 SDP NL 8 14 0.364 0.103 0.465 0.052 0.3%
30 SEA AL 8 15 0.348 0.099 0.459 0.051 0.3%
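
For readers who want to experiment, here is a minimal Python sketch of one way to produce numbers of this shape. It is not Dr. Frey's method, and it is not necessarily the exact calculation behind the table above. It assumes a normal prior on true winning percentage centered at .500 with a standard deviation of 0.06 (`PRIOR_MEAN` and `PRIOR_SD` are assumed values), shrinks each observed WPct toward .500, and then runs a Monte Carlo that counts how often each team draws the highest true talent. The function names and the four-team subset are illustrative only.

```python
import math
import random

PRIOR_MEAN = 0.500  # assumption: league-average true talent
PRIOR_SD = 0.06     # assumption: spread of true talent across teams

def bayes_wpct(wins, losses):
    """Shrink an observed record toward PRIOR_MEAN (normal-normal update).

    Assumes at least one win and one loss, so the observed SD is nonzero.
    """
    n = wins + losses
    wpct = wins / n
    obs_sd = math.sqrt(wpct * (1 - wpct) / n)  # binomial SD of observed WPct
    w_obs = 1 / obs_sd ** 2                    # precision of the observation
    w_prior = 1 / PRIOR_SD ** 2                # precision of the prior
    post_mean = (w_obs * wpct + w_prior * PRIOR_MEAN) / (w_obs + w_prior)
    post_sd = math.sqrt(1 / (w_obs + w_prior))
    return post_mean, post_sd

def p_best(records, iterations=10_000, seed=0):
    """Estimate each team's chance of having the highest true WPct."""
    rng = random.Random(seed)
    posts = {tm: bayes_wpct(w, l) for tm, (w, l) in records.items()}
    counts = dict.fromkeys(records, 0)
    for _ in range(iterations):
        # Draw one plausible "true talent" per team, credit the highest draw.
        draws = {tm: rng.gauss(m, s) for tm, (m, s) in posts.items()}
        counts[max(draws, key=draws.get)] += 1
    return {tm: c / iterations for tm, c in counts.items()}

# Four records copied from the table above; a full run would use all 30 teams.
records = {"PHI": (15, 6), "COL": (14, 7), "NYY": (12, 6), "SEA": (8, 15)}
for tm, p in sorted(p_best(records).items(), key=lambda kv: -kv[1]):
    print(f"{tm} {p:.1%}")
```

With only four teams in the field, the printed probabilities will be much larger than the table's, since each team has fewer rivals to beat.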

18 Responses to “Bloops: A method for determining the probability that a given team was the true best team in some particular year”

  1. AlvaroEspinoza Says:

    Strength of schedule? Important with such a small sample size here, and with unbalanced schedules.

  2. Neil Paine Says:

    Right, my simple version didn't take that into account but Dr. Frey's definitely does.

  3. Voomo Zanzibar Says:

    This method conclusively demonstrates that the 11-12 Braves are in actuality not as good as the 10-11 Giants.

  4. John Autin Says:

    I think the odds that Seattle is the best team are the same as the odds of drawing a royal fizzbin: Mr. Spock has never calculated them.

  5. Neil Paine Says:

    #3 - Ha, well the difference there is actually noise in the Monte Carlo because I didn't run enough iterations. Given enough simulations, the Braves will be ranked higher because their Bayesian rating is higher with a smaller standard deviation. I just wanted to give everyone a feel for what kind of results you can get from a method like this.

    In fact, the ideal playoff system would involve running this method and eliminating all teams we're 95% certain aren't the best (since the playoffs should only include teams that have a plausible case for being the best).
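
The cutoff idea in comment #5 can be sketched in a few lines of Python. This is only an illustration: `playoff_field` and its `cutoff` parameter are hypothetical names, and the input is the top of the p(best) column from the table in the post. A team is dropped when its p(best) falls below the cutoff, meaning we are more than 95% certain it is not the best.

```python
def playoff_field(p_best, cutoff=0.05):
    """Keep teams whose p(best) is at least `cutoff`, best first."""
    keep = [(tm, p) for tm, p in p_best.items() if p >= cutoff]
    return [tm for tm, _ in sorted(keep, key=lambda kv: kv[1], reverse=True)]

# p(best) values copied from the table in the post (top eight teams only)
table = {"PHI": 0.164, "COL": 0.108, "TEX": 0.099, "NYY": 0.091,
         "FLA": 0.079, "CLE": 0.062, "LAA": 0.035, "STL": 0.032}

print(playoff_field(table))  # ['PHI', 'COL', 'TEX', 'NYY', 'FLA', 'CLE']
```

Because the field is whatever clears the cutoff, its size changes from season to season rather than being fixed in advance.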

  6. Neil L. Says:

    Neil,

    Very interesting!

    The results are only comparable within one year, aren't they? For example, Oakland's 54.3% in 1990 can't be compared to Atlanta's 45.2% in 1997?

  7. Jon Says:

    How many iterations did you run?

  8. Neil Paine Says:

    For the sake of time I ran 10,000 -- which sounds like a lot, but Dr. Frey's stabilized after 100,000.
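
To see why 10,000 iterations leaves visible noise, here is a rough Python check of the Monte Carlo standard error. It treats each iteration as an independent yes/no trial for "this team was the best", which is a simplifying assumption, and `mc_standard_error` is an illustrative name. For a team near p(best) = 1.7%, the standard error at 10,000 iterations is roughly 0.13 percentage points, which is larger than the 0.1-point gap between the Braves and Giants in the table.

```python
import math

def mc_standard_error(p, iterations):
    """Standard error of a simulated proportion p over `iterations` runs."""
    return math.sqrt(p * (1 - p) / iterations)

for n in (10_000, 100_000):
    se = mc_standard_error(0.017, n)
    print(f"{n:>7} iterations: SE = {se:.4%}")
```

Going from 10,000 to 100,000 iterations shrinks the standard error by a factor of about 3.2 (the square root of 10).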

  9. Neil Paine Says:

    #6 - Right, the probabilities aren't really intended to be compared across seasons -- although they do give an indication of how dominant a team's W-L record was relative to the other top teams of that season.

  10. tbone82 Says:

    #5- so in the ideal playoff system, the 2001 Mariners would have been handed the WS title after the regular season? not to be argumentative, but...

  11. Jon Says:

    I have to side with Tbone here. All depends on what you mean by ideal. It would appear, especially given the playoff formats in other sports, that the ideal for most people is a system where lots of teams have a chance to win the championship (including the team for which they root).

    But wouldn't that be a blast if something like that were implemented, where if you didn't make that 95% certainty cut you'd be out of the playoff picture, and the greatest Cubs team ever was calculated as a false negative, resulting in another century of futility?

    If we're going to institute a criterion where we eliminate things of which we're 95% sure that they aren't the best, let's start with the Hall of Fame and not the playoffs.

  12. Neil Paine Says:

    #10 - Well, no. Oakland had a 9.1% chance of being the best according to his 2001 sub-page:

    http://www19.homepage.villanova.edu/jesse.frey/BestTeam/post2001.htm

    On the main methodology page he listed all teams who either had a 10% chance or won the real-life WS. But the cutoff I mentioned earlier was 5% (and you could change that depending on which significance level you wanted to use).

    My point was that the postseason only exists to settle the "best team" question if there is doubt after the regular season. It shouldn't include teams that didn't take advantage of regular-season opportunities to make their case for #1.

    That said, it's going to be hard to find a scenario where you would need to hand the WS to a team without playing any playoff games, if a season like 2001 or 1998 can't produce that outcome.

  13. Neil Paine Says:

    #12 - And I realize this statement sounds a lot like something advocates of the BCS would say:

    "...the postseason only exists to settle the "best team" question if there is doubt after the regular season. It shouldn't include teams that didn't take advantage of regular-season opportunities to make their case for #1."

    The problem with the BCS is the rigidity of a 2-team format that doesn't allow for the possibility of more than 2 teams having a legitimate chance at being #1. If the BCS were to adopt a methodology like this, where the size of the playoff changed yearly depending on the probabilities of each team being the "true" #1, I wouldn't have any problem with it.

  14. Wine Curmudgeon Says:

    Where are the 1969 rankings? Let's lift this albatross from the Cubs' neck once and for all, and see if they were -- as all Cubs fans know in their hearts -- truly better than the hated Mets.

  15. Neil L. Says:

    @14
    Wine, 1969 was a looong time ago and the Miracle Mets are everybody's darling. Let's not let a Bloops calculation get in the way of history.

  16. John Autin Says:

    @14
    Wait -- what?!? The '69 Mets beat the Cubs by 8 games in the division race and by 10-8 in the season series. Then they blazed through the postseason at 7-1, flattening the 109-win Orioles.

    Run differential is a wonderful tool. But the fact that the Cubs' pythagorean win total was 1 more than the Mets' is scant support for your claim.

    If the Cubs were truly better than the Mets, they probably wouldn't have folded like a Mad Magazine cover in September. Great teams don't go 8-18 with the pennant on the line.

  17. Mike Felber Says:

    They were about the same, but are poor or even great records down the stretch necessarily the result of play under pressure? When statisticians look at individual examples of players' clutch play, it almost never exists. All here are likely to know about the vagaries of performance due to random variation & sample size.

    Why is it likely to be different for teams? We know that a team or individual with ANY sort of record will have times when they are much better or worse than their average performance, certainly over 162 games. If no team EVER performed (assume a perfect world where we magically knew the cause of everything) better or worse due to pressure, some would through the law of averages have records like 8-18 down the stretch.

    That does not even consider the other causes of this: some luck, close games (since whether runs are scored efficiently or bunched up is mostly random), and strength of schedule. And when things like injuries occur, any team may do badly by normal standards.
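
The random-variation argument in comment #17 can be put in rough numbers with a binomial tail. This Python sketch assumes a hypothetical team whose true winning percentage is .550 (`P_TRUE`, an assumed value) and treats games as independent, then asks how often such a team would win 8 or fewer of one fixed 26-game stretch. The chance of that happening in some 26-game window over a full season, or to some team in the league, would be higher than the figure this prints.

```python
from math import comb

P_TRUE = 0.550  # assumption: hypothetical true winning percentage

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): at most k wins in n games."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

print(f"P(8 or fewer wins in 26 games) = {prob_at_most(8, 26, P_TRUE):.3%}")
```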

  18. John Autin Says:

    @17, Mike Felber -- Yeah, OK, my last point @16 was just a gratuitous dig at a long-suffering Cubs fan. (Given the stretch collapses by my Mets in 2007-08, I don't mind focusing on someone else's pain once in a while.)

    I don't think there's a universally acknowledged scientific way to objectively compare two teams. For the '69 Mets and Cubs, their pythagorean win totals were almost the same. When I weigh that fact among the others I cited -- actual records, head-to-head records, and the fact that the Mets went 7-1 in the postseason -- I just can't see any reason to think that the Cubs were "truly" a better club. And isn't the burden of proof on the other side?