Citi and New Yankee Park Factors
Posted by Sean Forman on September 22, 2009
I ran the numbers through our park factor calculator today. Basically, the new parks come out as neutral. Citi gets a 98/99 (batting/pitching) park factor and New Yankee gets a 98/97 park factor. Hyperventilation aside, they are both very slight pitcher's parks. Whenever possible we use 3-year park factors, which are an average of year N-1 through year N+1. When only two years are available we use year N-1 and year N, and in the case of New Yankee and Citi we just use the current year. I also only use intra-league games, since there is a home and home with every team in the league. Interestingly, if we include the inter-league games in our totals, we have only a percentage point change here or there.
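As a rough illustration of the multi-year rule described above, here is a minimal Python sketch. The function name, the dictionary layout, and the sample numbers are all made up for the example, and it assumes a plain unweighted average of whichever seasons in the N-1 to N+1 window exist; the site's actual calculator may weight or handle edge cases differently.

```python
def multi_year_park_factor(single_year_pf, year):
    """Average the single-season park factors available around `year`.

    single_year_pf: dict mapping season -> one-year park factor for ONE park
                    (100 = neutral). Hypothetical input format.
    A season missing from the dict is treated as unavailable, for example
    because the park did not exist yet or the season has not been played.
    """
    if year not in single_year_pf:
        raise KeyError(f"no single-season factor for {year}")
    window = (year - 1, year, year + 1)
    available = [single_year_pf[y] for y in window if y in single_year_pf]
    # Assumption: simple unweighted mean of the available seasons.
    return sum(available) / len(available)


# Brand-new park: only the current season exists, so it is used as-is.
new_park = multi_year_park_factor({2009: 98}, 2009)

# Two seasons available (N-1 and N): average of the two.
two_year = multi_year_park_factor({2008: 104, 2009: 102}, 2009)

# Full window available: average of N-1, N and N+1.
three_year = multi_year_park_factor({2007: 101, 2008: 103, 2009: 105}, 2008)
```

Because the window includes year N+1, a season's factor can shift once the following season's data arrives.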
I think we need to seriously rethink park factors and I've done a little bit of that, but don't have anything ready to use.
For now, I'm not going to tweak the park factors in the db. You should assume that Yankees and Mets hitters are slightly better than shown and the pitchers slightly worse, since I've been using 100/100 for the two parks.
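To see which way that moves the numbers for a pitcher, here is a small sketch using a simplified ERA+ of the form 100 * (league ERA * PPF/100) / ERA. The simplified formula and the ERA figures are assumptions chosen for illustration, not necessarily the exact calculation in the db.

```python
def era_plus(era, league_era, ppf=100.0):
    """Simplified ERA+: league ERA scaled by the pitching park factor,
    divided by the pitcher's ERA. 100 is league average; higher is better."""
    return 100.0 * league_era * (ppf / 100.0) / era


# Hypothetical pitcher: 4.00 ERA in a league with a 4.40 ERA.
shown = era_plus(4.00, 4.40, ppf=100)    # what a neutral 100 factor gives
adjusted = era_plus(4.00, 4.40, ppf=97)  # with a 97 pitching factor instead

print(round(shown, 1), round(adjusted, 1))
```

With a factor below 100 the pitcher's adjusted figure drops, and the same logic run in the other direction raises the hitters' figures.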
Here are the 2009 3-year park factors if the season ended last night.
team_ID  BPF_tot  PPF_tot
ARI      109.000  109.000
COL      109.000  109.000
CHC      108.000  106.000
BOS      106.000  104.000
SFG      104.000  104.000
TEX      104.000  104.000
DET      103.000  104.000
PHI      103.000  101.000
CHW      103.000  103.000
CIN      102.000  103.000
TBR      102.000  102.000
LAA      101.000  100.000
FLA      101.000  101.000
BAL       99.000  101.000
HOU       99.000   99.000
MIN       99.000   98.000
ATL       99.000   99.000
STL       98.000   97.000
TOR       98.000   98.000
OAK       98.000   98.000
NYY       98.000   97.000
KCR       98.000   99.000
WSN       98.000   99.000
NYM       98.000   99.000
MIL       97.000   97.000
SEA       96.000   97.000
LAD       96.000   94.000
PIT       96.000   97.000
CLE       94.000   95.000
SDP       88.000   89.000
September 22nd, 2009 at 2:20 pm
Are there park factors split by handedness or batted ball type? I could never understand applying the same number to players who hit to different parts of the park.
September 22nd, 2009 at 2:34 pm
There has been some work on that, but you run into sample size issues when you start breaking it down like that.
September 22nd, 2009 at 3:25 pm
Raphy, it's the value vs ability dichotomy. If we just want to know how many runs/wins a player was worth, I think we HAVE to apply the same park factor to everyone. If it is 10% easier to score in Stadium X, then Player Y's offensive value is 10% less than it appears, even if for some reason he did not hit well there. But if we are trying to assess his actual ability, then yes, we should consider whether he individually is affected by the park in a specific way.
September 22nd, 2009 at 4:16 pm
Sean, does this mean the 2008 PF's get a slight tweak in the offseason as well because there is now data for year N+1?
If so, a tight race like the 2008 NL ERA+ title could change winners a year later. Though eyeballing the numbers above, 2009 data would increase SFG's 2008 PF a tiny bit, which would widen Lincecum's lead over Santana instead of narrowing it.
I agree with JohnnyTwisto. Park Factors are best applied broadly to set the offensive context. If there is something unique about a player's interaction with his park that helps/hurts him more than everyone else, then that actually helps/hurts his team in the same way. But of course, issues like that need to be taken into account in a player's *projection* if said player were to be traded away from that park.
September 23rd, 2009 at 8:32 am
You CANNOT apply a single PF to the statistics of all the players who play there. That constitutes sabermetric malpractice.
September 23rd, 2009 at 9:26 am
Please, save the melodrama for another site.
We use single park factors for a couple of reasons.
1) We are focussed on what happened rather than prognostication, see comment #3
2) Single PF's are all we have for nearly half of major league history.
To me it seems to make the most sense to use a consistent methodology across all seasons rather than change it mid-stream.
I'm not saying we shouldn't explore more precise park factors and I expect to add much more along those lines this off-season, but I think your dismissive attitude is rude and not constructive.
September 23rd, 2009 at 1:06 pm
Sean,
This is one of my favorite sites and you do a fantastic job. I'm sorry you found my comment melodramatic or rude, but that was not my intent. (It certainly didn't rise to the level of dismissiveness of some comments from last week concerning when the decade ends. I stayed out of that debate :))
It is my strong opinion that PF's should not be used to adjust individuals' objective statistics. There is a school here that seems to think that "OPS+ and ERA+ are improvements on OPS and ERA, because they at least try to account for Park Factors." That is wrong. OPS+ and ERA+ are terrible statistics because they take objective, verifiable statistics and turn them subjective. There is no objective way of "measuring" how Todd Helton and Brad Hawpe would have performed in a perfectly neutral park, so OPS+ tries to take a guess, but only using the roughest of summary data. The fact is, Coors doesn't help Hawpe much, but it helps Helton a hell of a lot.
(Furthermore, "righty-lefty PF" are still going to carry the same intrinsic problems, although they could reduce the error by up to 50%. However, we know that RH power hitters have huge BPF's in Fenway, but RH contact hitters do not. So if you take OPS points away from all RH hitters, your stats will be doing RH contact hitters a disservice.)
Ideally you would look at each batter and how the various parks help his unique hitting style. But as you point out there is a sample size problem and I frankly don't see any way around that...
I have no problem with OPS+ and ERA+ as long as everyone keeps in mind the PF's are nothing but statistical means, and as such, the players' actual data points will fall all over the distribution curve. I haven't calculated the standard deviations but having eyeballed a number of Player Splits, I might guess that 90% of the actual PF's for batters and pitchers at Nationals Park would fall between -7 and +5 and that's quite a spread.
Old ideas die hard so I don't expect anyone to actually make any changes I suggest (nor do I want to run this site, nor would I ever have the ability to do so, nor could I do as good a job as its current custodians, so let's get that out of the way), but if I did have a vote, it would be to relegate subjective stats like OPS+ and ERA+ to the "More Stats" page and leave the "Standard" page to stats that describe and illuminate observable facts.
Kelly
September 23rd, 2009 at 3:11 pm
Kelly's point is fairly important: What actually happened matters. Yes, you can't ignore whether or not a stat is tied to Coors Field, circa 1995, or Dodger Stadium, circa 1968, but you can't pretend that a .230 hitter with 17 HR in 1968 would have hit .285 with 40 HR playing for the Phillies today, either.
This is why I think the smart method is to remember Aristotle and go for the mean. I love OPS+ and ERA+, because if it is a stadium or year I'm not familiar with, it's nice to know 4.53 really is closer to 3.53, or vice versa. Chances are a 130 OPS+ or ERA+, even if the raw numbers aren't impressive, means a player was great that year. But the fact is that they had to play under the real conditions, not averaged-out ones. If Jim Rice put up great numbers at Fenway because he was at Fenway, should he be penalized for that? Heck, no! It was smart for him to take advantage of his ballpark, or for Jeter to poke homers to right, or Bobby Thomson to swing at the 250-foot left field line. The same stadium will affect a righty who hits line drives to all fields differently than a righty power pull-hitter. And it won't affect Willie Keeler, Eddie Collins or Phil Rizzuto much. Are Brad Fullmer's doubles at Olympic Stadium any less valid? Or Jack Morris' or Lefty Gomez's wins because they knew their batters would come through? That's why I still place a fairly decent value on raw numbers, and numbers tied to the real result--runs, RBI, wins and even average (because as much as I love walks, they won't help you against a guy who only throws strikes!).
Park factor, however it is figured, is valuable, but not a crutch.
Oh, and I agree--the site's "custodians" are the best. In fact, if they ever want to find a job for me...
September 23rd, 2009 at 5:26 pm
It seems like you guys are somewhat missing the point of OPS+ and ERA+. They are not meant to show how Helton would hit in a neutral park, or to penalize or reward players, or to predict how a 1968 hitter would perform today. They are simply to help measure value by comparing a player's performance to the average performance in parks where he played. Maybe Eddie Collins's numbers would be unaffected no matter when or where he played, because of his style of play. But it is important to know that his .900 OPS helps win more games in 1909 than in 2009. I don't see that as a "penalty," it's just a description. If you think a .900 OPS means the same thing whether attained in 2009 or 1968 or Little League or Mile High Stadium or Petco Park, then just ignore OPS+, because you are not interested in what it is attempting to measure.
I think you believe OPS+ and ERA+ are trying to do more than they do. In an odd way, they've probably become more powerful than they should because of the growth of this great website. There are certainly people who seem to over-rely on them and perhaps you are reacting to that.
Anyway, we had this argument already here: http://www.baseball-reference.com/blog/archives/2164#comments Jkesq, I think we are just talking past each other because you are trying to figure out something different than OPS+ and ERA+ are meant to answer.
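To make the "comparison to average" idea above concrete, here is a rough Python sketch. It uses the commonly cited form OPS+ = 100 * (OBP/lgOBP + SLG/lgSLG - 1) and, as a simplifying assumption, applies the batting park factor by dividing the result by BPF/100. The league lines below are invented to represent a low-scoring and a high-scoring environment; they are not real 1909 or 2009 figures.

```python
def ops_plus(obp, slg, lg_obp, lg_slg, bpf=100.0):
    """Simplified OPS+: the player's OBP and SLG relative to league average,
    scaled so 100 is average, then divided by the batting park factor."""
    raw = 100.0 * (obp / lg_obp + slg / lg_slg - 1.0)
    return raw / (bpf / 100.0)


# The same .900 OPS (.400 OBP + .500 SLG) in two hypothetical leagues.
low_scoring = ops_plus(0.400, 0.500, lg_obp=0.300, lg_slg=0.320)
high_scoring = ops_plus(0.400, 0.500, lg_obp=0.335, lg_slg=0.420)

print(round(low_scoring), round(high_scoring))
```

The raw line is identical in both cases, but it stands much further above average in the low-scoring league, which is the only thing the index is trying to describe.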