Pujols in perspective
Posted by Andy on September 10, 2009
The career numbers for Albert Pujols are getting truly ridiculous. Let's try to put his career so far in some perspective. Keep in mind that the numbers below don't even include his 2-HR game from yesterday.
All stats below are 1901-present and for the first 9 seasons of a player's career.
Most HR:
Cnt Player          HR From   To  Ages    G   PA   AB    R    H  2B 3B  RBI   BB IBB   SO HBP SH SF GDP  SB CS   BA  OBP  SLG   OPS Positions Teams
--- -------------- --- ---- ---- ----- ---- ---- ---- ---- ---- --- -- ---- ---- --- ---- --- -- -- --- --- -- ---- ---- ---- ----- --------- -----------
  1 Albert Pujols  364 2001 2009 21-29 1377 5983 5063 1061 1690 379 14 1098  800 194  562  67  1 52 178  59 30 .334 .427 .630 1.057 *37/59D64 STL
  2 Ralph Kiner    351 1946 1954 23-31 1359 5866 4884  915 1373 203 39  961  946   0  703  24  9  3 118  22  2 .281 .400 .554  .954 *7/83     PIT-TOT-CHC
  3 Eddie Mathews  338 1952 1960 20-28 1330 5810 4894  929 1373 200 49  901  837  52  791  14 30 35  62  43 19 .281 .385 .549  .934 *5/7      BSN-MLN
  4 Adam Dunn      313 2001 2009 21-29 1268 5328 4342  773 1087 227  8  764  899  95 1411  62  2 23  65  59 20 .250 .385 .523  .908 *739/D    CIN-TOT-WSN
  5 Alex Rodriguez 298 1994 2002 18-26 1114 4972 4382  885 1354 255 16  872  472  27  869  57 16 45  94 160 43 .309 .380 .579  .959 *6/D      SEA-TEX
  6 Hank Aaron     298 1954 1962 20-28 1350 5868 5309  956 1697 292 73  991  463 106  515  20 19 57 145  72 32 .320 .373 .571  .944 *987/453  MLN
  7 Ernie Banks    298 1953 1961 22-30 1216 5205 4670  751 1355 210 59  858  452 116  577  29 10 44 103  37 39 .290 .353 .552  .905 *6/573    CHC
  8 Ken Griffey    294 1989 1997 19-27 1214 5262 4593  820 1389 261 24  872  580 142  755  33  6 50  87 123 48 .302 .381 .562  .943 *8/D379   SEA
  9 Ted Williams   293 1939 1950 20-31 1273 5764 4555 1164 1594 338 57 1135 1183   0  397  21  5  0 111  19 13 .350 .486 .642 1.128 *79/1     BOS
 10 Frank Robinson 291 1956 1964 20-28 1346 5735 4945  934 1501 285 45  896  628 111  689 100 13 49 122 148 48 .304 .390 .556  .946 *7938/5   CIN
Pujols is already #1 here. Wow. While Adam Dunn gets on here at #4, note that Pujols has batted more than 80 points higher over the same 9-year period and had an OPS more than 100 points higher.
Here's batting average, minimum 4000 plate appearances:
Cnt Player           BA   PA From   To  Ages    G   AB    R    H  2B  3B  HR  RBI   BB IBB  SO HBP  SH SF GDP  SB CS  OBP  SLG   OPS Positions Teams
--- -------------- ---- ---- ---- ---- ----- ---- ---- ---- ---- --- --- --- ---- ---- --- --- --- --- -- --- --- -- ---- ---- ----- --------- ---------------
  1 Ty Cobb        .368 4844 1905 1913 18-26 1143 4345  808 1600 248 125  47  751  344   0  31  41 114  0   0 449  0 .420 .515  .935 *89/74    DET
  2 Al Simmons     .358 5470 1924 1932 22-30 1240 5019  960 1796 343  98 208 1156  339   0 403  14  98  0   0  65 47 .400 .590  .990 *78/9     PHA
  3 George Sisler  .353 5258 1915 1924 22-31 1198 4791  826 1692 269 110  69  686  304   0 209  33 130  0   0 301 99 .396 .498  .894 *3/198745 SLB
  4 Rogers Hornsby .351 4768 1915 1923 19-27 1119 4231  730 1486 243 114 116  721  415   0 364  32  90  0   0 104 49 .413 .545  .958 *465/3798 STL
  5 Ted Williams   .350 5764 1939 1950 20-31 1273 4555 1164 1594 338  57 293 1135 1183   0 397  21   5  0 111  19 13 .486 .642 1.128 *79/1     BOS
  6 Paul Waner     .348 6106 1926 1934 23-31 1351 5352 1023 1860 369 144  81  785  606   0 180  29 119  0  25  85  0 .417 .516  .933 *9/387    PIT
  7 Wade Boggs     .346 6084 1982 1990 24-32 1338 5153  912 1784 358  41  70  586  841 106 407  18  23 49 137  14 22 .436 .472  .908 *5/3D7    BOS
  8 Stan Musial    .346 5392 1941 1950 20-29 1218 4688  920 1624 343 115 174  815  652   0 235  24  28  0  93  49  0 .429 .580 1.009 9378      STL
  9 Lou Gehrig     .342 4762 1923 1931 20-28 1076 3946  937 1350 279 104 233  995  698   0 470  18 100  0   0  59 57 .443 .643 1.086 *3/97     NYY
 10 Bill Terry     .342 4321 1923 1931 24-32 1067 3883  692 1328 239  77 104  717  345   0 282   6  87  0   0  42  6 .397 .524  .921 *3/97     NYG
 11 Chuck Klein    .341 5333 1928 1936 23-31 1203 4837  950 1651 322  63 257  984  434   0 406  11  51  0  27  64  0 .397 .593  .990 *97/8     PHI-CHC-TOT
 12 Eddie Collins  .338 4294 1906 1914 19-27 1013 3616  702 1221 157  84  15  474  472   0  68  43 163  0   0 370 30 .420 .440  .860 *4/69875  PHA
 13 Todd Helton    .337 5424 1997 2005 23-31 1279 4560  924 1535 373  24 271  915  773 131 622  40   3 48 111  33 23 .433 .607 1.040 *3/79     COL
 14 Tris Speaker   .337 4551 1907 1915 19-27 1065 3935  704 1327 241 106  39  542  459   0  61  55 102  0   0 267 54 .414 .482  .896 *8/9137   BOS
 15 Heinie Manush  .335 4977 1923 1931 21-29 1194 4481  784 1502 306  95  72  698  298   0 218  52 146  0   0  84 46 .383 .494  .877 *78/93    DET-SLB-TOT-WSH
 16 Albert Pujols  .334 5983 2001 2009 21-29 1377 5063 1061 1690 379  14 364 1098  800 194 562  67   1 52 178  59 30 .427 .630 1.057 *37/59D64 STL
 17 Ichiro Suzuki  .333 6503 2001 2009 27-35 1403 6004  962 2000 224  67  81  506  405 140 583  42  24 28  43 339 78 .378 .433  .811 *98/D     SEA
 18 Joe Medwick    .333 5322 1932 1940 20-28 1227 5001  854 1667 383  93 162  959  273   0 412  21  27  0 125  30  0 .370 .544  .914 *7/89     STL-TOT
 19 Joe DiMaggio   .332 5585 1936 1947 21-32 1252 5015 1036 1663 294 100 264 1122  527   0 252  29  14  0  69  29  7 .398 .588  .986 *8/79     NYY
 20 Earle Combs    .330 5409 1924 1932 25-33 1181 4780 1006 1577 267 129  48  508  547   0 240  16  66  0   0  86 63 .401 .470  .871 *87/9     NYY
There's Albert at #16, ahead of Joe DiMaggio and a bit behind Eddie Collins and Tris Speaker. That's hard to believe.
Finally here's OPS+, again minimum 4000 PA's:
Cnt Player         OPS+   PA From   To  Ages    G   AB    R    H  2B  3B  HR  RBI   BB IBB  SO HBP  SH SF GDP  SB CS   BA  OBP  SLG   OPS Positions Teams
--- -------------- ---- ---- ---- ---- ----- ---- ---- ---- ---- --- --- --- ---- ---- --- --- --- --- -- --- --- -- ---- ---- ---- ----- --------- -------
  1 Ted Williams    193 5764 1939 1950 20-31 1273 4555 1164 1594 338  57 293 1135 1183   0 397  21   5  0 111  19 13 .350 .486 .642 1.128 *79/1     BOS
  2 Lou Gehrig      182 4762 1923 1931 20-28 1076 3946  937 1350 279 104 233  995  698   0 470  18 100  0   0  59 57 .342 .443 .643 1.086 *3/97     NYY
  3 Ty Cobb         181 4844 1905 1913 18-26 1143 4345  808 1600 248 125  47  751  344   0  31  41 114  0   0 449  0 .368 .420 .515  .935 *89/74    DET
  4 Frank Thomas    174 5501 1990 1998 22-30 1236 4406  894 1416 281  10 286  963  989 120 675  32   0 74 137  25 15 .321 .443 .584 1.027 *3D       CHW
  5 Rogers Hornsby  174 4768 1915 1923 19-27 1119 4231  730 1486 243 114 116  721  415   0 364  32  90  0   0 104 49 .351 .413 .545  .958 *465/3798 STL
  6 Mickey Mantle   173 5409 1951 1959 19-27 1246 4478  994 1392 208  54 280  841  892  54 899   9  12 18  44  98 25 .311 .425 .569  .994 *89/645   NYY
  7 Albert Pujols   172 5983 2001 2009 21-29 1377 5063 1061 1690 379  14 364 1098  800 194 562  67   1 52 178  59 30 .334 .427 .630 1.057 *37/59D64 STL
  8 Stan Musial     171 5392 1941 1950 20-29 1218 4688  920 1624 343 115 174  815  652   0 235  24  28  0  93  49  0 .346 .429 .580 1.009 9378      STL
  9 Johnny Mize     169 5298 1936 1947 23-34 1251 4625  850 1517 287  78 257  971  620   0 386  34  19  0  69  22  0 .328 .411 .590 1.001 *3/9      STL-NYG
 10 Tris Speaker    166 4551 1907 1915 19-27 1065 3935  704 1327 241 106  39  542  459   0  61  55 102  0   0 267 54 .337 .414 .482  .896 *8/9137   BOS
Pujols has the 7th highest OPS+ since 1901 for the first 9 seasons of a career. That's pretty amazing, especially considering that he's done it during an era of offensive explosion. Remember that when we think back on the careers of these top 10 guys, we regard them as top extra-base hitters of their time, clearly well ahead of the pack. These days, it's so tough to be ahead of the pack because home runs are being hit at such a high rate. Nevertheless, Pujols has really separated himself from his contemporaries. Frank Thomas is the only other recent player on here and even he had the benefit of playing a few years prior to The Steroids Era.
September 10th, 2009 at 9:13 am
Can't help but notice that Adam Dunn has 99 fewer IBBs than Pujols, but Dunn has also drawn 99 more walks than Pujols in total... yet Dunn's OBP is 42 points lower. I think that shows more of what strikeouts do for Dunn: his high K rate means pitchers often figure they have a better shot at striking him out, rather than thinking "he might slug one, so I'd better just give him one base instead of four". The K's do far more damage than just the PA when he makes the out.
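For anyone who wants to check the walk and OBP comparison, here is a minimal sketch that recomputes it from the first-nine-season totals in the HR table above. It assumes the standard OBP formula, (H + BB + HBP) / (AB + BB + HBP + SF); the function name and dictionary layout are just for illustration.

```python
def obp(h, bb, hbp, ab, sf):
    """On-base percentage: times on base over AB + BB + HBP + SF."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)

# First-nine-season totals copied from the HR table in the post.
pujols = {"h": 1690, "bb": 800, "ibb": 194, "hbp": 67, "ab": 5063, "sf": 52}
dunn = {"h": 1087, "bb": 899, "ibb": 95, "hbp": 62, "ab": 4342, "sf": 23}

for name, s in (("Pujols", pujols), ("Dunn", dunn)):
    unintentional_bb = s["bb"] - s["ibb"]  # walks the pitcher did not hand over on purpose
    value = obp(s["h"], s["bb"], s["hbp"], s["ab"], s["sf"])
    print(f"{name}: OBP {value:.3f}, unintentional BB {unintentional_bb}")
```

Taking the intentional walks out, Dunn has 804 unintentional walks to 606 for Pujols, so the gap in walks that pitchers did not intend to give up is even larger than the raw totals suggest.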
September 10th, 2009 at 9:21 am
There's Griffey, #8 on the HR list. What did he do in seasons 10, 11 and 12? 56, 48 and 40 HRs. 438 HRs by age 30. 715 seemed like a lock, as did the RBI record. Frank Thomas, whose numbers by season nine were being compared to Gehrig's, never led the league in a single category after his eighth season.
Both Kiner and Mathews received mid-career speculation about reaching 715.
Can Pujols stay healthy? Can he put together a post age-29 career the likes of Aaron or Mays or Reggie or even Thome?
Pujols has gotten his lifetime Slugging Percentage up to .631. Only five players have ended a season with a career Slugging Percentage of .631 or higher:
*Babe Ruth (1920-1935, peaking at .712 in 1924)
*Lou Gehrig (1928, 1930-1936, peaking at .643 in 1931, 1934, 1936, 1937)
*Chuck Klein (1932-1933, peaking at .639 (although he got as high as .657 in 1930, but was 500 PAs short of qualifying for BR's Age-Based Leaderboards))
*Jimmie Foxx (1933-1936, 1938-1940, peaking at .640 in 1934, 1935, 1939)
*Ted Williams (1941-1960, peaking at .647 in 1946)
No player (other than Ruth) has ever ended a season with a career slugging percentage higher than Ruth's lifetime mark of .690.
Thinking about Cobb's lifetime batting average of .366. As we see above, at the end of his ninth season, it was .368. He got as high as .373 at the end of his 18th season (1922, age 35) before leveling off at .366. Only two other players in the history of MLB ended a season (mid-career) with a career batting average higher than .366. Can you name them? One got as high as .393 (at the end of his 5th season). The other got as high as .385 (at the end of his 6th and 7th seasons).
September 10th, 2009 at 10:14 am
I'm not sure if you're looking before 1900. How about Hugh Duffy and Billy Hamilton?
September 10th, 2009 at 10:16 am
I mentioned near the top of the post that everything here is 1901-present.
September 10th, 2009 at 10:16 am
And on checking...I'm not even close. They had too many seasons of merely mortal batting averages before the mound was pushed back in 1893.
One more try.... Joe Jackson and Harry Heilmann?
September 10th, 2009 at 10:17 am
I don't know if Kingturtle was following the same rules.
September 10th, 2009 at 10:19 am
Two higher than .366 at some point? Hmmm... I'm thinking Rogers Hornsby is the .393 hitter, and the other is probably Ted Williams.
September 10th, 2009 at 10:43 am
Pujols certainly deserves all the accolades that he is getting; anyone with a career OPS+ of 173 is one of the all-time greats.
That said, the use of "season count" is dropping some big names from the lists above. It's quite common for guys to get called up briefly before their rookie seasons. A-Rod is the most obvious example. His 9 seasons are really 7, as the Mariners gave him a look in both his age-18 and age-19 seasons. Jimmie Foxx and Mel Ott are other multi-season examples. Harmon Killebrew was a bonus baby who went back to the minors after the restrictions were lifted; he had five MLB years under his belt by the time he was called up from the minors for good, and then proceeded to hit 369 HR in the nine seasons after that.
Even just a single September call-up can make a big difference in ranking lists like these, for guys like Frank Thomas or even Andruw Jones. A season-count list heavily favors guys whose debut year in MLB was already an all-star caliber season: guys like Ted Williams, Frank Robinson and Albert Pujols.
I like the age lists. Checking the age 29 list:
http://www.baseball-reference.com/leaders/leaders_29_bat.shtml
Pujols is in the top ten in almost every major category and still has a couple of weeks to climb the list.
September 10th, 2009 at 10:47 am
David, thanks for that suggestion. The season counter has always been flawed in the way you describe, and the age lists are certainly better in that regard.
September 10th, 2009 at 10:56 am
Trivia answers:
The .393 is Joe Jackson. He hit .408/.395/.373 in his first three seasons in the mini-live-ball era of 1911-13. And he had three brief call-ups in 1908-1910 ages 18-20 to pad his season count. 🙂
The .385 is Willie Keeler. He first became a regular in 1894, right after the mound was moved back, and he didn't have any off-seasons batting-average-wise to weigh down his career numbers. His career average didn't dip below .366 until 1903.
September 10th, 2009 at 11:13 am
Four decade players! There's a good future topic. We'll have a whole new set of those come next year. Can the play index tell you which active players were active in 1989? I can find the oldest players (Moyer, Johnson, Wakefield, Vizquel, Smoltz) and cross-reference the youngest guys from 1989 (Griffey, Sheffield). I don't think that catches everyone though. And those players still have to make it to next season for the fourth decade.
Plus it's also fun to look at the league's youngest players and see if any of them will be around in 2030 to pick up their fourth decade. Youngest guys are Bumgarner, Porcello, Martinez, Andrus, Feliz. Most of the youngest players won't make it to 2030, but history tells us a few of them will.
September 10th, 2009 at 11:17 am
2030. Geez.
I looked up active major-leaguers from 1989 recently and I believe the list included only Johnson, Moyer, Sheffield, Smoltz, Griffey, and Vizquel. If you include guys who haven't formally retired, the list is a lot longer and includes the likes of Glavine, Palmeiro, and some guy named Barry.
September 10th, 2009 at 12:12 pm
I'd love to see some team slip Rickey Henderson a 1 game contract next year for a promotional stunt or something.. he'd end up being a 5 decade guy. I'm sure he wouldn't turn down the opportunity.
September 10th, 2009 at 3:45 pm
Minnie Minoso played 5 decades...barely. He managed 20 plate appearances as a 23-year old in 1949, then didn't play again until 1951. The White Sox brought him back for 8 PAs in 1976 and 2 in 1980.
When you think about it, it really wasn't that long ago.
Henderson had 398 or more plate appearances in 4 separate decades. Anybody know if that's the record?
September 10th, 2009 at 4:38 pm
Yeah, it only takes 21 years to play in four decades.
By my eyeballing, Henderson looks to have the record with 398. Ted Williams is very close with 390. No one else comes close. Williams has 29 HR or more in four different decades which is extremely impressive.
On the pitching side, Jack Quinn's 118.2 IP in four decades looks like most.
September 10th, 2009 at 7:08 pm
Regarding post #2; was there really mid-career speculation about Kiner reaching 715? He started at age 23, and doesn't appear on any of the top 10 by age lists.
Regarding #16, I guess it actually takes a bit less than 21 years to play in 4 decades. If you debut 1 October 1989, and you're still in the lineup on 31 March 2010, you've got your 4 decades in just under 20 years, 6 months. On the other hand, it definitely takes (parts of) 22 seasons.
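The arithmetic in that last paragraph can be sketched as below. This assumes a "decade" means the years sharing their first three digits (1980-1989, 1990-1999, and so on), which is only one of the definitions floated in this thread; the function names are made up for illustration.

```python
from datetime import date

def decades_spanned(first_year, last_year):
    """Count calendar decades (1980s, 1990s, ...) touched between two seasons."""
    return last_year // 10 - first_year // 10 + 1

def seasons_spanned(first_year, last_year):
    """Count seasons from first_year through last_year, inclusive."""
    return last_year - first_year + 1

# Hypothetical career from the comment: debut 1 October 1989, still playing 31 March 2010.
debut = date(1989, 10, 1)
final = date(2010, 3, 31)
elapsed_days = (final - debut).days

print(decades_spanned(debut.year, final.year))  # decades touched
print(seasons_spanned(debut.year, final.year))  # seasons touched, inclusive
print(elapsed_days / 365.25)                    # elapsed time in years (approximate)
```

Under those assumptions the career touches four decades and parts of 22 seasons while covering a little less than 20.5 years of calendar time.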
September 10th, 2009 at 7:20 pm
Elmer Valo will probably never appear on official lists of 4-decade players, but it is said that Connie Mack put him into a game late in the 1939 season, then asked the official scorer to attribute Valo's plate appearance to another player, since Valo was unsigned at the time. See, for example, http://www.baseballlibrary.com/ballplayers/player.php?name=Elmer_Valo_1921&page=chronology
September 10th, 2009 at 7:35 pm
Post #16:
Nolan Ryan threw 226 2/3 innings in 4 decades
September 10th, 2009 at 7:43 pm
Pujols truly is something special. One of those truly rare players that just seems to have it all together. (And if I hear another damn word about steroids...grumble grumble...)
I sat down and looked at batting stats today...comparing Pujols to the other mere mortals in the National League:
Batting Average: 2nd at .331
Runs: 1st at 116--a mere 20% better than second place.
Hits: 4th at 162; all the more impressive when you consider his ABs are low (though even with that metric, Ramirez is still lights out).
Doubles: 6th at 37
Home Runs: 1st at 47 (and on a bit of a tear lately)
RBIs: 2nd at 124 (and I think that second place will last about another week, tops)
BBs: 1st at 104 (I always just think he has more, but there are occasional pitchers that opt to pitch to him)
IBB: 1st at 40 (merely double second place)
OBP: 1st at .450 (6% ahead of second)
SLG: 1st at .693 (and now we start separating from the pack. Pujols is currently .105 ahead of second. He's as far ahead of Prince Fielder as Fielder is of Lance Berkman, who's in 23rd)
OPS: 1st at 1.148 (and this time the separation -- .145 takes you from 2nd to just past 23rd, David Wright of the Mets).
OPS+: 1st at 199. 164 earns Adrian Gonzalez (having a stellar year in his own right) 2nd place.
Total Bases: 1st at 342. Prince Fielder again at 2nd with 300.
Stolen Bases: Ok...he's no great shakes there. But he is first on the Cardinals with 14.
Adj Batting Runs: 1st at 76. There's Gonzalez again. At 49.
Adj. Batting Wins: 1st at 7.1. Gonzalez? Yeah...he's there at 2nd again. 4.6
ISO: 1st at .367. Mark Reynolds rounding out 2nd at .303
And one of my favorite stats BB/K: 1st at 1.86. 2nd place? Teammate Yadier Molina
I realize I'm not exactly breaking new ground with the "Wow...that Pujols guy is pretty good." notions. I just thought it was worth sharing. You can now resume your four decade conversation.
PS: The millennium, century, and decade that we're currently in started in 2001, based upon the calendar. I would also argue that the 1990's includes only the years that begin with 199-. You can define a decade as any ten year period. So...um...everyone's right. Free beer!
September 10th, 2009 at 7:52 pm
DavidRF, nicely done!
gerry, I wasn't alive during Kiner's career. I can only go by what I've read. And I can no longer remember where I read such a claim. I did find one example online: according to http://books.google.com/books?id=B6SVNZBAHX0C&pg=PA349&dq=%22ralph+kiner%22&lr=#v=onepage&q=%22ralph%20kiner%22&f=false Kiner was "touted" as late as 1950 as someone to challenge 714. I am not sure what William Marshall's sources were, but he lists them in the end of his book.
There was also speculation in that era that Kiner would break the 60 HR barrier:
*http://books.google.com/books?id=FS4DAAAAMBAJ&pg=PA60&dq=kiner+ruth&lr=&as_pt=MAGAZINES#v=onepage&q=kiner%20ruth&f=false
*http://books.google.com/books?id=Ti4DAAAAMBAJ&pg=PA54&dq=kiner&as_pt=MAGAZINES#v=onepage&q=kiner&f=false
September 10th, 2009 at 7:55 pm
That's it--I'm retiring. Kingturtle's links above are hands-down the best source supporting an argument I've ever seen. Extremely well done, man!
September 10th, 2009 at 9:40 pm
Thanks to #18 for the correction on Ryan. I totally blacked out on that one. Checked everyone but him.
September 10th, 2009 at 11:02 pm
okay, Pujols clinched the 2009 MVP award as far back as August 1st. But is that stopping him? Is he just going to put up normal Pujols numbers in each of the closing weeks of the season? Apparently not. Have you seen his numbers for September? He's batting .471 with 6 home runs and 14 runs scored in 9 games. His OPS is 1.638. Oh, and he's struck out once this month in 34 ABs.
So he's turned a dominating offensive season into a historic offensive season.
It could possibly be one of the most dominating National League offensive outputs in history. My rating system (I've described it before) compares an offensive player with other offensive players in the same league in the same year. Using BA, R, TB, RBI, BB and SB, players are judged on their top ten placement. (10 pts for 1st place, 9 for 2nd, etc. etc.)
Here are the top six most offensively dominating seasons in NL history:
*Magee in 1910. 1st in BA, R, TB, RBI; 3rd in BB; 4th in SB = 55 points
*Aaron in 1963. tied for 3rd in BA, 1st in R, TB, RBI; 3rd in BB; 2nd in SB = 54.5
*Klein in 1932. 3rd in BA; 1st in R, TB; 2nd in RBI; tied for 5th in BB; 1st in SB = 52.5
*Klein in 1933. 1st in BA; tied for 2nd in R; 1st in TB, RBI; tied for 6th in BB; 7th in SB = 50
*Wagner in 1908. 1st in BA; 2nd in R; 1st in TB, RBI; tied for 10th in BB; 1st in SB = 49.5
*Pujols in 2009. 2nd in BA; 1st in R, TB; 2nd in RBI; 1st in BB = 48
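A minimal sketch of the scoring described above, for anyone who wants to replay it. The tie handling is an assumption inferred from the half-point totals in the list (tied players split the points of the places they occupy); the rating system's author does not spell it out here. Function names and the (place, number tied) input format are invented for illustration.

```python
def placement_points(place, tied=1):
    """Points for a top-ten finish: 10 for 1st down to 1 for 10th, 0 below that.

    Assumption: `tied` players sharing a place split the points of the
    places they jointly occupy (e.g. two players tied for 3rd get 7.5 each).
    """
    places = range(place, place + tied)
    return sum(max(0, 11 - p) for p in places) / tied

def season_score(placements):
    """Sum the points over a list of (place, number_tied) pairs, one per category."""
    return sum(placement_points(place, tied) for place, tied in placements)

# Placements as listed above, in the order BA, R, TB, RBI, BB, SB.
# Two-way ties are assumed wherever the list says "tied".
aaron_1963 = [(3, 2), (1, 1), (1, 1), (1, 1), (3, 1), (2, 1)]
wagner_1908 = [(1, 1), (2, 1), (1, 1), (1, 1), (10, 2), (1, 1)]
pujols_2009 = [(2, 1), (1, 1), (1, 1), (2, 1), (1, 1)]  # no SB placement listed

for label, season in (("Aaron 1963", aaron_1963),
                      ("Wagner 1908", wagner_1908),
                      ("Pujols 2009", pujols_2009)):
    print(label, season_score(season))
```

With two-way ties assumed, this reproduces the 54.5, 49.5 and 48 totals listed for Aaron, Wagner and Pujols; other lines may depend on tie sizes that are not stated.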
September 10th, 2009 at 11:49 pm
Kingturtle, it's not entirely clear to me whether Marshall is saying Kiner was touted to challenge 714 or touted to challenge 60.
For what it's worth, I'm old enough to remember reading an article about Mantle and Mathews which concluded that Mathews had the better chance of reaching 714. The article didn't even mention Aaron. I'm not quite old enough to remember Kiner as a player.
I like the rating system for offensive output. I think I'd like it more if the categories were on-base percentage, slugging average, times-on-base, total bases, and steals. Wagner 1908 was 1st in all 5 of these categories.
September 11th, 2009 at 12:59 am
Gerry, agreed -- I was about to say that I take the first link to mean he was seen as a challenger to the 60-HR record (though it could be read either way). Kiner was the only guy hitting 45+ HR a season since the '30s. Still, in 1950 he was 500 career HR behind Ruth, and not that young.
Win Shares rates Wagner's 1908 as the best season ever. Essentially the equivalent of Ozzie Smith with 200 RBI.
September 11th, 2009 at 2:28 am
Wagner's 1908 is all the more remarkable when you consider he didn't have spring training, claiming he was going to retire. Oh, and he wasn't exactly a spring chicken. He was a 34-year-old shortstop who led the league in putouts. Take all that into account, and it's this year's Jeter with 200 RBI. Or A-Rod's 2007 or Jim Rice's 1978 while 34, at shortstop, with no time to warm up.
Let's face it, nobody has ever duplicated what Wagner did that year. But Pujols' typical performance isn't that far off from Wagnerian or Ruthian accomplishments.
September 11th, 2009 at 7:55 am
gerry, you're right. that article i cited may have been talking about the single-season record, which all of the other articles i found were discussing. maybe i just created a false memory of reading somewhere that kiner was discussed with 715.
September 11th, 2009 at 8:03 am
To put Magee's 55 points in 1910 or Pujols' 48 points in 2009 into some more perspective, in all NL seasons from 1876 to 2008, the average result of the best player each season is 38. The median is 37, the mode is 35 and the range is 28.5.
And the most dominant season in the AL computes to be:
*Cobb in 1915. 1st in BA, R, TB; 3rd in RBIs; Three-way-tied for 2nd in BB; 1st in SB = 56
September 11th, 2009 at 10:17 am
Kingturtle, you should note that competition for the top 10 in any category increases as the league expands. There are twice as many players in the NL now as there were from 1900 to 1961.
September 11th, 2009 at 11:00 am
Kingturtle, I really like Sherry Magee and think he's an underrated and unfairly forgotten player, but I think the weights of your domination-metric could use some tweaking if it's rating Magee & Klein seasons higher than Musial-48, Hornsby-22 or Bonds-93.
I guess a lot of it has to do with throwing stolen bases into the mix (who knew Chuck Klein could run?), but it doesn't measure the magnitude of the domination in each category, and there's a bit of eclecticness to the numbers as well. I mean, how can you really fault Musial for not walking enough when he leads the league in OBP by 27 points?
September 11th, 2009 at 1:15 pm
The idea is to think in terms of overall production (run production *and* baserunning *and* long hits) while comparing players within the same league during the same season. I realize the simple 10-9-8-7-6-5-4-3-2-1 scale leaves out sheer domination, like leading a category by 50%. I suppose I could prorate levels of achievement, the way they do in world-class decathlons, although I have never been able to discern exactly how they establish the scoring systems in decathlons.
The results in this basic system I've created correspond 37% of the time with NL and AL MVPs of each year (excluding MVP winners that were pitchers), including a strange anomaly of eight out of nine National Leaguers between 1975 and 1983. Of course, MVP takes other things into account than just raw numbers, such as defense, leadership, team success and player personality. And of course, sometimes the MVP winner is just wrong.
To get back to Pujols, he was the most dominant offensive player in the NL in '04, '05, '06 and '08. '09 will make it five times. The players who topped their seasons most often are: Wagner 9 times, Cobb 8, Gehrig 8, Ted Williams 8 (could have been 11 if it weren't for WW2), Musial 8, Hornsby 7, Mays 7, Mantle 6, Ruth 5 and Frank Thomas 5, Yastrzemski 4, Carew 4, Brouthers 4, Bonds 4, Speaker 3, Foxx 3, Jim Rice 3, Rickey Henderson 3, Klein 3, Medwick 3, Aaron 3, Schmidt 3, Dale Murphy 3, Lajoie 2, Sisler 2, Simmons 2, Al Rosen 2, Giambi 2, Alex Rodriguez 2, Ortiz 2, Frank Robinson 2, Anson 2, King Kelly 2, Ed Delahanty 2, Joe Kelley 2, Cravath 2, Magee 2, George Burns 2, Mize 2, Snider 2, Billy Williams 2, Joe Morgan 2, Keith Hernandez 2, Dale Murphy 2, Raines 2, Will Clark 2, McGwire 2, Helton 2.
September 11th, 2009 at 5:12 pm
Kingturtle,
One way it 'could' be done, at least for most stats, is to normalize the stat based upon whoever came in first that season. That player gets one point, and everyone from second on down gets a number less than one, equal to their stat divided by the first-place total.
And since that sentence doesn't parse very well, I'll offer an example:
Prince Fielder is currently first in RBIs at 125. 125/125 = 1.000 Albert Pujols has 124. If the season ended today, he'd get .992 points.
This allows for a measure of dominance over the remainder of the league. I started playing around with this notion about a week ago, looking at a variety of offensive categories: R, H, HRs, RBIs, BB/SO, BA, OPS+, TB. In retrospect, I neglected SBs, and should not have included HRs, but I was just experimenting. Using data at the time (I think this was the first week in September), out of a maximum possible 8 points (if you lead all 8 categories), Pujols had 7.682, leading five of the eight categories. The top five worked out as:
Pujols (5) 7.682
Fielder (1) 6.553
Braun (0) 6.153
Utley (0) 6.127
Howard (0) 6.063
Ramirez came in 8th, leading in two categories (H, BA), with 5.975
Not a perfect system, but what it does show is not only the effect of leading in several areas, but utter dominance in those areas. Being ahead by a full point, out of a possible eight is rather significant.
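The leader-normalized scoring described in this comment could be sketched roughly as follows. This is not the commenter's spreadsheet, just an illustration: the function name and data layout are invented, and it assumes every category is "higher is better" with a positive league-best value. The two-player sample uses only numbers quoted in this thread (RBI 125 vs. 124, total bases 300 vs. 342), which were taken a day apart, so treat the output as a toy example rather than a real ranking.

```python
def leader_normalized_scores(stats):
    """Score each player as the sum, over categories, of value / league-best value.

    stats maps player name -> {category: value}. Assumes every player has the
    same categories, that bigger is better, and that each leader's value is > 0.
    """
    categories = list(next(iter(stats.values())).keys())
    best = {c: max(player[c] for player in stats.values()) for c in categories}
    return {
        name: sum(player[c] / best[c] for c in categories)
        for name, player in stats.items()
    }

# Toy "league" of two players, using figures quoted in the comments.
sample = {
    "Fielder": {"RBI": 125, "TB": 300},
    "Pujols": {"RBI": 124, "TB": 342},
}

scores = leader_normalized_scores(sample)
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

A category leader contributes exactly 1.0, so the maximum possible score equals the number of categories, and the shortfall from that maximum shows how far behind the leaders a player sits.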
September 11th, 2009 at 5:15 pm
One thing I like about Slothbaby's system above (#32) is that it differentiates between a big gap and a small gap between the 1st and 2nd players (and even 2nd and 3rd, etc). I remember in 1987 when the Phillies tried to encourage fans to come to games by pointing out that they'd finished 2nd in their division the previous season. Of course, they finished 21.5 games out of first place...
http://www.baseball-reference.com/leagues/NL/1986.shtml
September 11th, 2009 at 5:21 pm
If someone wants to pick a year/league, and pick the stats they'd like rolled into it...I'll run the numbers this way, and see what it generates. Some highly contentious MVP year or somesuch. I've got a free evening...go go EXCEL!
September 11th, 2009 at 6:01 pm
Slothbaby, that's very smart! i spent weeks a few years ago using my remedial system by hand to determine each league's top ten offensive players for each season. but with your idea and with the advancement of data access in BR i may re-do it. this is very interesting.
for fun, can you determine the top ten National Leaguers in 1963 using BA, R, TB, RBI, BB and SB? Will Batting Average be too difficult?
Using my old system the results are: Hank Aaron, MLN (54.5), Willie Mays, SFG (34), Vada Pinson, CIN (24), Bill White, STL (23.5), Orlando Cepeda, SFG (19), Frank Robinson, CIN (17), Willie McCovey, SFG (16), Ken Boyer, STL (16), Willie Davis, LAD (16), Curt Flood, STL (9)
September 11th, 2009 at 6:09 pm
I'm on it.
September 11th, 2009 at 6:50 pm
Okay.
1963 National League. Run for BA, R, TB, RBI, BB, and SB
Data will be shown as Rank, Name, Number of categories they were first (in parentheses), and the value out of six possible points:
1. Aaron (3) 5.3826
2. Pinson (0) 4.4396
3. Mays (0) 4.3760
4. White (0) 4.2458
5. Robinson (0) 4.0263
6. Cepeda (0) 3.9214
7. Mathews (1) 3.8759
8. Williams (0) 3.8721
9. Flood (0) 3.8219
10. McCovey (0) 3.7851
10. Boyer (0) 3.7851 (yes...tied to the 4th decimal place)
Rounding out your info, and other category leaders:
12 Wills (1) 3.7260
16. Tommy Davis (1) 3.5245
26. Willie Davis 3.0437
September 11th, 2009 at 7:05 pm
Awesome. How about 1988 NL?
September 11th, 2009 at 7:52 pm
1988 National League.
Run for BA, R, TB, RBI, BB, and SB
Scary Close:
1. Clark (2) 4.8350
2. Strawberry (0) 4.8203
Runs and total bases nearly identical. Will Clark led the league in walks, had a slight edge in RBIs and a decent edge in BA, but 20 fewer stolen bases. In the end, the stolen bases weren't enough. Had Strawberry had two more stolen bases or a few more hits, though, he would have come out on top, at least on this metric.
3. Van Slyke (0) 4.6074
4. Gibson (0) 4.5047 (and your MVP)
5. Butler (1) 4.4995
6. Bonilla (0) 4.3233
7. Daniels (0) 4.2879
8. Galarraga (1) 4.2685
9. Davis (0) 4.2541
10. Smith (0) 4.0938
Vince Coleman, SB leader, comes in at 13th with 4.0105
Tony Gwynn, BA leader, comes in at 21st with 3.7169.
In short, not nearly so dominating as Aaron in 1963, or Pujols this year.
Your MVP Vote that year, among non-pitchers:
Gibson -- 272 (13 1st place votes)
Strawberry -- 236 (7)
McReynolds -- 162 (4) (he's 12th on the above metric)
Van Slyke -- 160 (0)
Clark -- 135 (0)
Galarraga -- 105 (0)
Davis -- 72 (0)
Gwynn -- 29 (0)
September 11th, 2009 at 7:55 pm
I'm loving it! You, or someone, should write a script that allows this calculation for all seasons. I had a feeling that 1988 was going to be weird, and that Gibby wasn't going to be at the top.
September 11th, 2009 at 8:12 pm
I'm just running this through excel...grabbing the batting information. I can set up a macro to automate the calculations, but this would be easier done through the database itself. It's not substantively different from normalizing to 162 games...you're just normalizing to the league leaders. The other question is whether BA, R, TB, RBI, BB, and SB are the right factors. (As a sidenote, this could also be very easily done for pitching.)
No matter what though...I think the voters were a bit kind to Gibson in 1988, not that he didn't have a great year. But, compared to the top 5:
He was 2nd in Runs (but first through fifth was an 8 run spread)
He was 4th in RBIs (but 30% fewer than Will Clark)
He was 2nd in SBs (but 2nd through 4th was only a 2 SB spread)
He was 4th in BBs (and 25% fewer than Will Clark)
He was 1st in BA (but .008 separated 1st through 4th)
He was 4th in TBs (and 10% behind either Clark, Strawberry, or Van Slyke).
In other words, where he ranked high among the other choices, it was slight. And where he ranked low, he was far behind.
September 11th, 2009 at 8:18 pm
You can't really look at this stuff seriously to try re-picking the MVPs. It's like one of Bill James's old "junk stats"; it's fun and interesting, but it doesn't really mean that much to throw a bunch of categories into a soup. It also doesn't adjust for park or position, obviously. I'm not sure how one can call Pujols the most dominant offensive player in the NL in 2004 when Barry Bonds is there.
September 11th, 2009 at 8:34 pm
I'm not suggesting that this should be done to repick the MVPs. To be honest, in 1988, I wouldn't have voted for Kirk Gibson...I'd just argue that something like this helps to back up the claim as to why he shouldn't have been voted for when there were better choices. I don't need an excel spreadsheet, 6 factors, and hindsight to make that claim. Nor would I claim that Pujols was the most dominant in 2004 when, compared to Bonds:
They had roughly the same number of Runs and Stolen Bases.
Pujols had an edge in RBIs, but only about 20% better. The edge on total bases was bigger; around 33%.
Bonds had an edge in BA (10%). But then there's the walks. 232 to 84. So..yeah...that's probably worth noting. And of course this ignores the fact that Bonds did everything he did...in 225 fewer ABs (See also, that huge walk number.) Roughly 150 more walks means roughly 90 more times on base, when you take into account Pujols' 60 more hits. I'm not a Bonds fan, but that's pretty compelling evidence...please claim your trophy at the front desk.
September 11th, 2009 at 8:52 pm
could you email me a copy of the spreadsheet so I can see how you're setting it up? this is awesome. oliver@kingturtle.com
September 11th, 2009 at 9:08 pm
MVP involves leadership, timing, personality, defensive skills, etc. this measure is not meant to determine MVP. it is just a way to take a variety of different important raw stats and pile them into one number that compares players with their peers. yes, brett butler was actually statistically notable three or four times in his career. at least when considering BA, R, TB, RBI, BB and SB.
September 11th, 2009 at 9:21 pm
Because I'm the curious sort, I looked at the 2006 NL hitting:
1. Ryan Howard (2) 4.6681
2. Albert Pujols (0) 4.6582
3. Jose Reyes (1) 4.6427
Wow.
That being said, I still think BB/SO should be relevant here. 181 vs 50. Just sayin'.
September 12th, 2009 at 7:19 am
JT, I don't think anybody is suggesting using this system for MVP--but I do think it would be one useful component in determining MVP (one out of perhaps 10 things that need to be considered.) When you take a player like Pujols, who in some categories is so far beyond anybody else, it means he's having a massively huge impact on games--not necessarily in just his own stats but in how the opposing team approaches the lineup and how Pujols' presence affects the hitters ahead of and behind him.
Of course, stuff like this can be overrated. We know how good Manny is offensively, and yet when he was replaced with Juan Pierre for 50 games, the Dodgers did even better.
So often, if a guy finishes first in HR and RBI in his league, he wins the MVP regardless of how his team did or how good the player's offensive performance really was. I like Slothbaby's system just as a single data point reality check on how good the season was.
September 12th, 2009 at 8:23 am
So...the two questions that come to my mind are:
1. What would be the appropriate stats for a pitching version of this?
W? ERA? ERA+? WHIP? SO? SO/BB? 10 other stats?
2. Are there any hitting stats that should be added to the hitting version?
BB/SO? OPS+? (honestly...I like the 6 in there)
September 12th, 2009 at 8:36 am
Oh... I think the objection is not in the mechanism. The new scalings do make the metric pretty cool. It's sort of a scaled single-season black ink test.
What I think makes it more of a fun/junk measure and not something serious is the stats selected. They seem a bit arbitrary. Total bases and walks tend to compete against each other, and no one thinks stolen bases should be weighted that high. It's kinda like metrics for ranking players in a roto-league: they correlate with a player's real offensive value quite a bit, but there are some "interesting differences".
Sabermetricians have been working on offense metrics for decades and in terms of run estimators, they've gotten quite good at it. We're all stats geeks here, we know how involved those estimators can be. A fun metric like this one doesn't really compare to those.
I don't want to spoil the fun here because these black ink measures are cool, but let's not get too carried away about how much they should be used in MVP discussions.
September 13th, 2009 at 3:31 am
Slothbaby, thanks for the spreadsheet tips! That MAX(L:L) function is handy!
I use my old system at various points through an active season to see which players come up that aren't in popular conversation that year. It was especially interesting last year when there was no single stand out in the American League and my system began singling out Pedroia before the media did. Pedroia wound up winning last year within the constructs of my system. (He also won the MVP).
This year the AL is again unresolved (MVP/best-season-of-the-year-wise) at this point in the season, while the NL has been wrapped up for some time (Pujols). A particular name continues to crop up lately (and leading the pack lately) in my old system that no one in the media is discussing for 2009 AL MVP rights. I was interested to see where this particular name ranked with this new system. Indeed, the same name currently leads the list using the new system too. Has anyone heard Chone Figgins' name crop up in the media as a contender for MVP this season? Have any of you considered him? Close behind in 2nd on the updated list is Abreu, who wasn't even in the top ten of my old system's list. Anyone have Abreu on their list of MVP candidates? Certainly, the Angels players are not getting their due respect in the mainstream media.
Using the new system, the top ten 2009 AL leaders are: Figgins 4.641, Abreu 4.635, Teixeira 4.492, Bay 4.480, Crawford 4.440, Jeter 4.400, BRoberts 4.325, CPena 4.289, Longoria 4.235, Ellsbury 4.205. Statistically, it's still a dead heat within these constructs. In terms of leadership, team result, popularity, media attention and intangibles, Jeter probably has a lock on it. For what it's worth, Figgins leads AL 3rd-basemen in Assists and is 3rd in DPs.
The system rewards quantity: how much raw data can a player amass? But what happens when we divide the raw number by plate appearances? Limiting it to those with 440 PAs or more, the leaders are: ARodriguez (29th on the raw list), Mauer (17th), Youkilis (14th), Bay, Zobrist (20th), Bartlett (30th), Nelson Cruz (34th), Abreu, CPena and Crawford. If you drop the PA limit to 300, Rajai Davis and Torii Hunter top the list. This really highlights what interesting years Zobrist, Cruz, Davis and Hunter are having.
Figgins, Abreu and Hunter are dynamic players. The Angels are fun to watch and the team will get their deserved media attention in October.
September 13th, 2009 at 5:04 pm
The three names I've heard as MVP candidates are Mauer, Teixeira, and Jeter (caveat: I live in NYC). I don't see Figgins as a legit candidate to win, but I would definitely consider him somewhere on my imaginary 10-man ballot. A point in his favor which doesn't show up in the traditional numbers: http://www.baseball-reference.com/leagues/AL/2009-baserunning-batting.shtml#players_baserunning_batting::22 Of players with notable playing time, he leads the league by taking the extra base on a hit behind him 68% of the time, against a league avg of 39%. (I thought there was a way to hide players who don't have enough PAs to qualify but it doesn't seem to be there now.)
As for the Angels not getting their due, I was very surprised a few weeks ago during a Yankee telecast which polled the audience on who was Teixeira's main competition for MVP. 4 choices were given; I don't remember them all but Mauer was one. The poll winner was Kendry Morales. I was shocked because he's having a very good season but not an overwhelming one and I didn't think the average fan would be that aware of him.
September 13th, 2009 at 5:11 pm
Here are the top 10 in the AL at taking the extra base, w/ at least 300 PA.
http://www.baseball-reference.com/pi/shareit/kbuC5
(Gerald Laird?)
September 14th, 2009 at 9:43 pm
Kingturtle,
I'm sitting here pondering your decision to divide by PAs. On the one hand, I'm intrigued, but I'm thinking that statistically, it might be suspect. To this point, we've been dividing the stats accumulated by a given player by the total of the player who was #1 in that particular stat, obtaining a ratio between 0 and 1, repeating it for several stats, and then adding those ratios together. This represents, in a somewhat arbitrary (but still fairly grounded) way, the comparison of that player with the dominant players in the league. It gives no weight to one stat over another, and summarizes the season output of that player.
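For anyone who wants to try this outside a spreadsheet, here is a minimal sketch of the calculation as described above, not the actual spreadsheet. The function name and the two sample stat lines are made up for illustration; only the six categories and the "divide by the league leader, then add" step come from this thread.

```python
# Sketch of the leader-ratio score: for each stat, divide the player's
# total by the league-leading total, then add the ratios together.
# The player names and numbers below are hypothetical, not real seasons.

STATS = ("BA", "R", "TB", "RBI", "BB", "SB")


def leader_ratio_scores(players, stats=STATS):
    """Return {player: sum over stats of (player value / league-leading value)}."""
    # League-leading value per stat (the spreadsheet's MAX over a column).
    leaders = {s: max(line[s] for line in players.values()) for s in stats}
    return {
        name: sum(line[s] / leaders[s] for s in stats if leaders[s] > 0)
        for name, line in players.items()
    }


players = {
    "Player A": {"BA": 0.300, "R": 100, "TB": 300, "RBI": 100, "BB": 80, "SB": 40},
    "Player B": {"BA": 0.270, "R": 80, "TB": 240, "RBI": 120, "BB": 40, "SB": 10},
}

scores = leader_ratio_scores(players)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.4f}")
```

With six categories the score tops out at 6.0000, which only a player who led the league in all six would reach.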
You seem to want to evaluate the amount of 'stats' a player accumulates, on average, per PA, using this method, and then compare. To obtain the seasonal stats above, and then to simply divide those values by PA...causes a problem, at least in my opinion, as you've not taken into account the PAs of the top player in that stat. If, for example, Player X gets 100 runs in 500 PAs to lead the league, and Player Y gets 80 runs in 300 PAs, the raw number for Player X would be 1.0000 (leading the league) and Player Y would be .8000 (80/100). In other words, Player Y did 80% as well as Player X on that stat. If you then divide .8000 by 300, you get .00267. But does that really mean anything? Player X is supposed to be the best in that stat, but if you did the same calculation for him, you'd get (1.0000/500) or .00200. So...it would seem that Player X is not the top player, on a per PA basis.
In my opinion, if you wish to evaluate players, not on their season numbers but on a value adjusted for PA, then you need to divide the stats by PA BEFORE you start. SB, BB, R, RBIs, TB can all be divided by PA to get a SB/PA, BB/PA, R/PA, RBI/PA and TB/PA, which is literally "what should I have expected, on average, from a player, in that PA." If a player gives you, say 100 R on 500 PAs, then the value would be .20000 R/PA. Calculate this for all players. Then find the top value and repeat the calculations as before. This would, in my opinion, be closer to what you want.
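A rough sketch of the two orderings, using the Player X and Player Y numbers from the example above. The function names are invented for illustration, and only runs are shown; BA is already a rate, so presumably it would be left alone in a per-PA version.

```python
# Compare "ratio first, then divide by PA" with "divide by PA first, then ratio".
# Player X and Player Y are the hypothetical players from the example above.

players = {
    "Player X": {"R": 100, "PA": 500},
    "Player Y": {"R": 80, "PA": 300},
}


def ratio_then_per_pa(players, stat):
    """Season total / league-leading total, then divided by the player's PA."""
    leader = max(line[stat] for line in players.values())
    return {name: (line[stat] / leader) / line["PA"] for name, line in players.items()}


def per_pa_then_ratio(players, stat):
    """Per-PA rate first, then divided by the best per-PA rate in the league."""
    rates = {name: line[stat] / line["PA"] for name, line in players.items()}
    best = max(rates.values())
    return {name: rate / best for name, rate in rates.items()}


after = ratio_then_per_pa(players, "R")   # Player X -> .00200, Player Y -> .00267
before = per_pa_then_ratio(players, "R")  # Player X -> .7500, Player Y -> 1.0000

for name in players:
    print(f"{name}: ratio/PA = {after[name]:.5f}, rate ratio = {before[name]:.4f}")
```

Dividing by PA first keeps every category on the same 0-to-1 scale, with the per-PA leader at 1.0000, so the six categories can still be added together as before.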
It comes down to the question you want to answer (total season vs. expected per PA) and goes back to JohnnyTwisto's comment about not accounting for park or position. These ratio stats account for none of that. They are simply a way of evaluating a single player against the top players in that league for those stats. A mini black ink/grey ink evaluator. Quick and dirty. Quite obviously a defensive stud, speed demon, heart of the dugout catcher that gives you 4.2340 out of 6.0000 points is more important than an all bat, no legs clubhouse cancer that DHs, but who happens to get 4.2500.
What it does show is pure, unadulterated dominance. Coming in first in a category is impressive. Coming in first by 30% over second place is way more impressive.
September 15th, 2009 at 10:07 am
Well, my thinking is this...i'm looking for well-rounded total season offensive output, so giving TB dominance and SB dominance equal weight is an attempt at locating well-roundedness. Sure you can hit dingers and doubles, but can you steal bases too? R, RBI, TB, BB, SB and BA are all different aspects of a player's potential well-roundedness. Some players may shine in one or two; lead off hitters may shine in some while clean up hitters shine in others. But who shines across the board more than others? This is only to look at a player's total season offensive well-roundedness. This is not looking at their leadership, fielding, intangibles, etc.
As for dividing by PA, that was simply to find the players who put up good numbers but for various reasons had not played the full season...for curiosity's sake, not as a way to rank players.
Your solution is interesting because it sets a finite maximum (6) and minimum (0) to each total result. I experimented with dividing each player's raw numbers by "LgAvg per 600 PA" (40/7 for SBs in the case of Wills in 1963) instead of "largest number of x" (max(B:B)). In that system players like Wills, Taylor, Clendenon and Gilliam rank much higher because of their dominance in SBs that season compared to the weak league average of 7.
So instead of Aaron 5.38, Pinson 4.44, Mays 4.38, White 4.25, FRobinson 4.08, Cepeda 3.92, Mathews 3.88, BWilliams 3.87, Flood 3.82, McCovey 3.79 - you get Aaron 13.62, Pinson 11.08, Wills 10.82, FRobinson 10.55, Mays 9.35, White 9.28, Taylor 9.10, Flood 8.90, Clendenon 8.43, and Gilliam 8.34.
I am curious to hear your thoughts regarding dividing by LgAvg per 600 PA rather than (max(B:B)).
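A small sketch of the league-average scaling, for comparison with the max-based version. It assumes "LgAvg per 600 PA" means league total divided by league PA, times 600; that definition, the function names, and the league totals below are assumptions for illustration. Only the 40 SB and the league average of 7 come from the comment above.

```python
# Scale a player's raw total by the league average per 600 PA instead of
# by the league-leading total. The league totals here are hypothetical and
# chosen only so that the average works out to 7 SB per 600 PA.


def league_avg_per_600(league_total, league_pa, per=600):
    """League-wide rate of a stat, expressed per `per` plate appearances."""
    return league_total / league_pa * per


def scaled_by_league_avg(player_value, lg_avg):
    """Player's raw total as a multiple of the league average."""
    return player_value / lg_avg


lg_avg_sb = league_avg_per_600(league_total=700, league_pa=60_000)  # about 7
wills_sb_ratio = scaled_by_league_avg(40, 7)                        # 40/7

print(f"LgAvg SB per 600 PA: {lg_avg_sb:.2f}")
print(f"40 SB against a league average of 7: {wills_sb_ratio:.2f}")
```

Unlike the max-based ratio, this one has no ceiling of 1 per category, so a single lopsided category such as stolen bases can carry a player's total, which fits totals like 13.62 showing up in place of 5.38.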