
Contest: most dissimilar player

Posted by Andy on November 27, 2009

Most Baseball-Reference.com users are aware of the site's inclusion of Similarity Scores for each player.

By way of example, here are the players to whom Mark Teixeira is currently most similar:

  1. Kevin Mitchell (913)
  2. Miguel Cabrera (905)
  3. Tony Clark (883)
  4. Dick Stuart (868)
  5. Geoff Jenkins (861)
  6. Gus Zernial (856)
  7. Aubrey Huff (855)
  8. Richie Sexson (853)
  9. Richie Zisk (853)
  10. Ripper Collins (853)

This is the similar batter list for career totals. (Each player's page also lists similar players through the current age of the player as well as similar players at past ages for the player.)

So at this point in time, Mark Teixeira's career totals are most similar to Kevin Mitchell's career totals, which is not bad considering that Tex will just be turning 30 around the beginning of the 2010 season. For an explanation of how similarity scores are calculated, see here. I really like the system although I admit I'd prefer if it didn't consider the defensive position of each player so that we could compare based on offensive performance alone.

Anyway, I'd like to try to identify the players who are least similar to any other players.

Here's what I mean. If you look at Teixeira's list above, his top similarity score is 913. However, there are other players whose stats are so unusual that their top similarity score is much lower. Barry Bonds, for example, has Willie Mays as his most similar player but with a score of just 762. By comparison, the guy most similar to Mays himself is Frank Robinson with a score of 830.

I want to find the player with the lowest #1 similarity score. I already know of one star player with such a score much lower than Bonds' but I'll let you, the readers, figure it out.

Let's also create a few categories: lowest similarity score for 1) retired players with at least 1000 games played, 2) retired players with under 1000 games played, 3) active players with at least 1000 games played, and 4) active players with under 1000 games played. I'm talking about position players only here, not pitchers (or pitchers' similarity scores as batters).
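For anyone who wants to track their findings, the search described above can be sketched in a few lines of Python. This is only an illustration, not how the site computes anything: the `TopMatch` record and the `lowest_top_score` and `contest_category` helpers are hypothetical names, and the sample data is just the three top scores quoted in this post.

```python
from typing import NamedTuple


class TopMatch(NamedTuple):
    """A player's #1 most-similar player and the score for that pairing."""
    player: str
    most_similar: str
    score: int


def lowest_top_score(matches):
    """Return the entry whose #1 similarity score is the lowest."""
    return min(matches, key=lambda m: m.score)


def contest_category(games_played: int, active: bool) -> int:
    """Map a position player to contest category 1-4 as defined in the post."""
    if not active:
        return 1 if games_played >= 1000 else 2
    return 3 if games_played >= 1000 else 4


# Top scores quoted in this post (career totals).
matches = [
    TopMatch("Mark Teixeira", "Kevin Mitchell", 913),
    TopMatch("Barry Bonds", "Willie Mays", 762),
    TopMatch("Willie Mays", "Frank Robinson", 830),
]

winner = lowest_top_score(matches)
print(f"{winner.player}: {winner.score} (most similar: {winner.most_similar})")
```

Among these three, the sketch picks Bonds at 762. A full entry would run the same comparison separately within each of the four categories.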

Go ahead and post whatever you find in the comments. I'll check back on this post at the end of the year (Dec 31) and see who posted the earliest comments with the best answers. Comment as many times as you like.

What are the prizes? As of now, there are none beyond bragging rights. However, I am going to add some next week, so stay tuned.

20 Responses to “Contest: most dissimilar player”

  1. dgreds Says:

    Rickey Henderson's most similar player is Craig Biggio at 713.

  2. alkeiper Says:

    Pete Rose's most similar player is Paul Molitor at 678. I'd be stunned if there's a lower score.

  3. Imsdal Says:

    Pete Rose, almost certainly, for 1).

  4. Imsdal Says:

    See, this is what you get for doing research. I knew it was Pete Rose, but decided to double check and lost out by seconds. From now on, I should do research Fox News style, i.e. not at all.

  5. Imsdal Says:

    For 3), I'm going with ARod at 799.

  6. Imsdal Says:

    For active players under 1000 games, I'm guessing Hanley Ramirez at 866.

    For retired players under 1000 games, I'm guessing Dave Orr at 878.

  7. BunnyWrangler Says:

    I would just like to say that, in searching for this, I saw that the hitter most similar to Russell Martin is Johnny Estrada (954). I say this not to criticize the system but to bring up one of its strangest results.

    Martin is one of the fastest catchers, at the very least the best basestealing backstop of recent years; Estrada didn't try to steal a base even once in his career. Martin has walked about 65 times per season; Estrada rarely took a base on balls. I thought that Martin would have a low similarity score because his talents are unique for a catcher, but I was wrong. Instead of finding another relatively fast catcher, though, it brought up one of the slowest ones I can remember watching. Then again, Martin's most similar player through his age (26) is Thurman Munson, who seems like a much better comparison.

  8. dgreds Says:

    Most unique players: http://www.baseball-reference.com/leaders/similarity.shtml

    So I guess Rose was right and Rickey was wrong. It doesn't say active but I'm going with ARod for that.

  9. Tiger_fan Says:

    Yep I was slow too! Found #2 Cy Young and saw the link for unique players.
    BunnyWrangler: The similarities are stats alone. So position, size, quickness do not matter. The first one I thought of was Randy Johnson and the shortest player, then I read "For an explanation of how similarity scores are calculated, see here." (above)

  10. Tiger_fan Says:

    At an individual age, nobody is under 700 except Cy Young at 40 years old: 679.8 with Pete Alexander.

  11. Andy Says:

    Pete Rose was the guy I'd found with the lowest score, but that doesn't mean he's necessarily the answer...let's see if anybody can find someone even lower.

    As for comments about Martin and Estrada, I totally agree that the sim score system breaks down when the guys haven't played too many games. Sim scores are calculated strictly on a points system, not a rate system, meaning that two players who played 1500 games and have a sim score of 900 are actually much more similar than two players who played 500 games and also have a sim score of 900. In other words, either way there is 100 points of variation, but in the first case it might be spread across 10 years while in the second case it might be spread across just 3-5 years.

  12. BunnyWrangler Says:

    Tiger_Fan:
    I wasn't saying that the system whiffed or anything, and I know that it measures only statistics (although it does factor in position). My comment was more about how odd I found it that the career statistics - at least the ones measured by similarity score - of two very dissimilar catchers, Martin and Estrada, were actually pretty close.

  13. Tiger_fan Says:

    I give Ty Cobb a shot. Not sure how the numbers differ on your list and this one: http://www.baseball-reference.com/leaders/similarity.shtml

  14. cubbies Says:

    wow. waking up at noon has its consequences. when reading this article, i knew it had to be pete rose because i coincidentally was looking at the most dissimilar players leaderboard. also, most unique "active" is barry bonds at 762. http://sports.yahoo.com/mlb/news?slug=ti-uggla111309 - at the bottom of this article it says that bonds is still active. So because of the leaderboard i know that if you want to consider him active, he is the least similar; if he isn't active, it would have to be a-rod with 771 through age 33.

  15. cubbies Says:

    p.s. wouldnt it be cool if there was a thing like the oracle except for similarity scores?

  16. Andy Says:

    As you might imagine, I did not know the dissimilar leaderboard existed!

  17. SpastikMooss Says:

    This was really cool to mess with, though I missed the boat by a lot.

    Now I'm trying to find some pair with a 1.000 similarity rating. The closest I have so far is a .991 at age 27 by John Foster and Rodrigo Lopez. I know there's gotta be a 1.000 out there somewhere (at least for a young age with little MLB time), but I'm looking for two players with like a .994 over two ten year careers. How wacky would that be?

  18. SpastikMooss Says:

    Also, still messing around here. Has anyone seen Gary Sheffield's similarity by age?
    You've got Gary Clark, Ryan Zimmerman, Scott Rolen, Dale Murphy, Jack Clark, Chipper Jones, Duke Snider, Jeff Bagwell, Fred McGriff, and Reggie Jackson in there. And similarity overall he pulls Mel Ott, Reggie Jackson, Ken Griffey, Fred McGriff, and Mickey Mantle as his top five (three hall of famers and two who probably will be).

    Impressive comparison. But would any of us consider Sheff a HOFer? He did win a world series, and he finished in the top 3 in mvp three times.

  19. Andy Says:

    I don't think you'll find a 1000 score because they aren't given unless a player has at least a minimum playing time and the odds are stacked quite high against any two players with, say, 3 years of experience having identical totals across all categories. I'd say it's a million-to-1 shot.

    Sheffield compares to those guys because he's played many years and racked up high totals. He's a very good player but falls short of HOF in my eyes mainly because he was not a particularly dominant player for any significant stretch of his career. He put up big numbers alongside a bunch of other guys.

  20. JohnnyTwisto Says:

    I think Sheffield will have difficulty making the HOF soon because of his bouncing around from team to team, multiple injury-shortened seasons earlier in his career, and the various attitude/character questions. But he may very well deserve it as he was a tremendous offensive player for many years. He could be the type of guy who gets in 50 years down the road, when people mostly just have the numbers to go on and can't believe this type of hitter was never inducted. (On the other hand, Dick Allen doesn't seem to be particularly close to getting in...)

    Here's an impressive list of hitters with similar career numbers: http://bbref.com/pi/shareit/S8kYz
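Andy's point in comment 11, that a score of 900 means something different over 1500 games than over 500, comes down to simple arithmetic. The sketch below assumes a perfect match scores 1000 (as comment 19 implies) and spreads the missing points evenly over games played. The `points_lost_per_game` helper is a hypothetical name, and the per-game rate is only a rough illustration, not part of the actual similarity score method.

```python
PERFECT_SCORE = 1000  # assumption: an identical pair of players would score 1000


def points_lost_per_game(score: float, games: int) -> float:
    """Spread the gap between a sim score and a perfect score over games played."""
    return (PERFECT_SCORE - score) / games


# Two pairs of players, each pair with a similarity score of 900.
long_career = points_lost_per_game(900, 1500)   # both played about 1500 games
short_career = points_lost_per_game(900, 500)   # both played about 500 games

print(f"1500-game pair: {long_career:.3f} points of difference per game")
print(f" 500-game pair: {short_career:.3f} points of difference per game")
print(f"ratio: {short_career / long_career:.1f}x")
```

Under these assumptions the 500-game pair accumulates its 100 points of difference three times as fast, which is why a 900 between two short careers signals a looser match than a 900 between two long ones.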