RetroSheet and the 1920’s
Posted by Sean Forman on May 11, 2009
A Retro-tour of the Early 1920s.
Retrosheet has posted the box score files for the 1920's and is working on the 1930's as you read this. One question I've always had is whether I should include seasons that are well set off from the main chunk of seasons we have on hand, 1954-2009. Now there is the decade of the 1920's, some seasons in the teens, and some in the 1800's. I'm a little leery of incorporating those into the site because I'm not sure that doing a search for all 5-hit games by catchers will be that meaningful for 1920-1929 and 1954-2009. And there is also the issue of getting all of the new ballparks etc. in.
Thoughts?
May 11th, 2009 at 2:27 pm
I would not add them to the PI as it would make the results very confusing. Right now, we have an awesome tool for analyzing the game 1954-present. We do all these searches knowing that we are not touching anything game-by-game prior to 1954.
May 11th, 2009 at 2:49 pm
I'm of the opinion that more info is better. While it may not be useful for a clean stat-of-the-day post, why not include an option to search earlier games? Of course, if the question is one of time and resource allocation, and the options are an improved PI for the post-1954 games or the current PI with the 1920 games included, I would prefer the former.
May 11th, 2009 at 3:16 pm
Stats are good no matter what!
Sure there will be huge breaks in time but that doesn't take away from the stat, does it?
A catcher that had a 4-hit game in 1927 will still have that game forever. Add the stats...
Besides, the only true issue will be of player/team streak value. Individual single game stats should not be a problem.
May 11th, 2009 at 3:19 pm
I also think it's better to have the additional info. Who wouldn't like to see how Babe Ruth fared against Walter Johnson, or whether he ever batted behind Gehrig? Just provide clear disclaimers so anyone running a search knows what results he is getting, and what the limitations are.
May 11th, 2009 at 3:21 pm
Also Sean, I don't know if you're monitoring all the comments, but there are some problems with the Year-by-Year Per-Game Batting Stats you linked to a few days ago.
May 11th, 2009 at 4:24 pm
Johnny, is it still an issue? I made a change this morning. The MLB ones were incorrect.
May 11th, 2009 at 5:23 pm
At a quick glance it looks good to me. Thanks.
May 12th, 2009 at 4:38 pm
Hell yes you should include them. If you have the technology, damn straight you should add them in. I'm amazed it's even a question.
Will there be problems? OK, so there will be problems and oddities. Perfect shouldn't be the enemy of good, though. I'd love to see who Ruth drew the most walks from or who Dazzy Vance owned or how many doubles Dud Lee hit off of Cleveland. Really, whatever you can add is an improvement.
May 12th, 2009 at 11:37 pm
Hey Sean:
If you're worried about having big gaps in stats, just remember that you have "pitch count" for just the most recent years and not the early years.
You didn't hesitate to put in the pitch counts (even though they are not listed for the 1950s-1980s), so the same should be done for these 1920s stats.
Include them and don't worry about the stat gaps.
May 12th, 2009 at 11:52 pm
I think you should include the stats because everybody has interest in the stats. But as long as there's that gap before 1954, you should have the option for searching games of "1954-2009 only", like you have in so many other areas for AL Era, Expansion Era, or small breakdowns for streaks. That way, people who wanna look just at a group of time without interruption can do it easily with one click.