I'm not lying when I say that I lay awake at night thinking about pitching BABIP. It is because of the variability in pitching BABIP that Jeremy Hellickson has become my #Unicorn.
Explaining (or more importantly predicting) year-to-year BABIP for pitchers is nearly impossible.
Derek Carty of Baseball Prospectus (building off earlier work done by Pizza Cutter) showed that it takes eight seasons for BABIP to become reliable (r = .5). In a different piece at BP, Matt Swartz showed that for pitchers who throw at least 150 innings in a season, on average 75 percent of BABIP is simply random variation.
Given these pieces of evidence, here are two brief (made-up) examples of how many writers/analysts tend to discuss or project BABIP for individuals during the offseason:
Braves starter Johnny Goodfortune had a .250 BABIP in 160 innings in 2012. This number is well below the league average for starters (.293) and thus I expect his BABIP to regress a good deal in 2013.
Royals starter Carl Cantcatchabreak had a .330 BABIP in 175 innings in 2012. Because his BABIP is well above the league average for starters, his results are due to improve in 2013.
Assumptions or predictions like these all sound great and make intuitive sense.
Since we know one season of BABIP for pitchers (even for those who throw a lot of innings) is not reliable, it seems best to assume regression toward the league-average BABIP for individuals.
The goal of this piece is to search for evidence that either supports or refutes the prevailing assumption about year-to-year BABIP: should we always expect an individual's BABIP to regress toward the mean?
I'm sure almost everyone reading this will say the answer to that question is quite obviously yes. But for a second, consider what we know about the variation in a single season of BABIP.
If a pitcher yields a .270 BABIP in Year 1 (against a .290 league-average BABIP), why must we assume that his BABIP in Year 2 will be closer to .290 than to .250?
If 75 percent of BABIP is noise in each season, then it may be just as plausible for an individual's BABIP to move further away from the mean in Year 2 as it is for it to move toward it.
This idea is exactly what I would like to test.
A good starting point is to figure out how much weight we should give to an individual's BABIP and how much weight the league average BABIP should get in a simple year-to-year projection.
To find this number, I took a sample of starting pitchers who threw at least 100 innings in both Year 1 and Year 2 (n = 774) for the years 2004 to 2012.
Each individual pitcher's BABIP in Year 1 was regressed against his BABIP in Year 2, which resulted in this scatter plot:
As expected, the correlation was very weak and the data points are noticeably scattered.
The r-squared indicates that only ~4.3 percent of the variation in BABIP in Year 2 is explained by BABIP in Year 1, which is very low.
Typically within the study of baseball statistics, the correlation coefficient is used in a very specific way when regressing a statistic (like BABIP) to the mean.
In this case, the correlation coefficient indicates that when predicting BABIP for the pitchers in this sample, we should use 21 percent of the individual's BABIP and 79 percent of the league average BABIP (or league average for similar pitchers).
For example: In 2012, Cardinals starter Kyle Lohse had a .262 BABIP, and the league average for all starting pitchers was .293. Thus, based on this study, we come up with this equation for a projected BABIP in 2013:
2013 BABIP = .21 (.262) + .79 (.293) = .286.
This equation calls for serious regression toward the mean for Lohse's BABIP in 2013, which many who are versed in sabermetrics would expect (or agree with).
The linear regression equation (see on chart) for this sample illustrates this idea even more beautifully than my example with Lohse:
BABIP in Year 2 = .214 * (BABIP in Year 1) + .2316
Does that look familiar? (Hint: .2316 ≈ .79 * .293)
It's essentially the same equation I used to project Lohse's BABIP in the example, except that it uses .293, the average across the entire sample, rather than the league-average BABIP for starters in Year 1 specifically.
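A quick arithmetic check, using only the fitted coefficients above, confirms the regression line is just the 21/79 blend in disguise:

```python
slope, intercept = 0.214, 0.2316  # coefficients from the fitted line

weight_on_mean = 1 - slope                 # share of the projection given to the mean
implied_mean = intercept / weight_on_mean  # league average baked into the intercept

print(round(weight_on_mean, 3))  # 0.786, i.e. ~79 percent on the mean
print(round(implied_mean, 3))    # 0.295, close to the .293 league average
```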
My next thought was that I could improve the regression's strength by using the actual league average for starters in Year 1, instead of using .293 as a crude average for the entire sample.
Quite interestingly, this change did not improve the predictive value of the regression; the overall r-squared (.038) was actually slightly lower than when simply using .293.
Predicting year-to-year BABIP for pitchers is nearly impossible.
However, sometimes we are tasked with the impossible, and based on this sample, it seems the best way to predict year-to-year BABIP is to use ~20 percent of the individual's number and ~80 percent of the league average.
This agrees with the intuitive idea that a pitcher with a higher-than-average BABIP will improve (move toward the mean) and that a pitcher with a lower-than-average BABIP will regress (again, toward the mean) in the subsequent season.
So far we've seen that the strongest model for projecting BABIP from just the previous season's data includes a large amount of regression toward the mean.
At the same time, the correlation was still relatively weak, which may make some a little wary about taking this heavy regression to the mean at face value. Thus, I ran one final test.
For the independent variable (x-axis), I used the gap between an individual's BABIP and the league average. The dependent variable (y-axis) for this test was the difference between a pitcher's BABIP in Year 1 and 2.
To make my exact process clearer I'll use Lohse as an example again.
I found the gap between Lohse's BABIP and the league average in Year 1 (2012) by subtracting his BABIP from the league average (.293 - .262 = .031).
Then I regressed this number against the difference between his BABIPs in Year 2 and Year 1 (2013 BABIP - .262).
We would expect a positive relationship between the two numbers, as his BABIP in Year 2 should be higher than Year 1, based on the fact that his BABIP in Year 1 was lower than the league average.
The same idea works for the other end of the spectrum.
If a pitcher's BABIP was higher than the league average, the predictor would be negative, which makes sense, as we'd expect the individual's BABIP to fall in Year 2.
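Here is a small Python sketch of how the variables for this second regression are built. The BABIP pairs are invented for illustration (not real pitchers); only the construction of x and y follows the process described above.

```python
league_avg = 0.293

# Invented Year 1 / Year 2 BABIP pairs, for illustration only
pairs = [(0.262, 0.290), (0.330, 0.301), (0.250, 0.285),
         (0.310, 0.295), (0.275, 0.288), (0.305, 0.299)]

# x: gap between the league average and the individual's Year 1 BABIP
# y: change in the individual's BABIP from Year 1 to Year 2
x = [league_avg - y1 for y1, _ in pairs]
y = [y2 - y1 for y1, y2 in pairs]

# Least-squares slope: cov(x, y) / var(x)
mx, my = sum(x) / len(x), sum(y) / len(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
var = sum((a - mx) ** 2 for a in x)
slope = cov / var

print(slope > 0)  # True: below-average Year 1 BABIPs tend to rise in Year 2
```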
Was there a relationship?
The relationship was relatively strong and, as expected, positive, though there was clearly still some serious scatter.
Our main focus from this regression is on the linear equation:
BABIP in Year 2 - BABIP in Year 1 = .7892 * (League BABIP - BABIP in Year 1) - .0008**
**Note**--The intercept is essentially zero, so we'll ignore it and focus on what the slope's interaction with the x and y variables means.
Essentially all this equation is saying is that, on average, the difference in year-to-year BABIP is equal to the league/individual gap multiplied by about .79.
This idea should sound rather familiar. Multiplying the league/individual BABIP gap by .79 is the same as what we did earlier, when the regression used just 21 percent of the individual's BABIP and put the other 79 percent on the league average.
These projections are almost exactly the same, as indicated by our test dummy, Mr. Lohse:
Regression 1 projection: 2013 BABIP = .21 (.262) + .79 (.293) = .286.
Regression 2 projection: 2013 BABIP - 2012 BABIP = .79 * (.293 - .262) = .024.
.024 + .262 = .286
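The agreement between the two models can be checked in a couple of lines of Python; the functions below simply encode the two regression equations derived above.

```python
LEAGUE_AVG = 0.293

def projection_1(babip):
    # Regression 1: 21 percent individual, 79 percent league average
    return 0.21 * babip + 0.79 * LEAGUE_AVG

def projection_2(babip):
    # Regression 2: add back ~79 percent of the gap to the league average
    return babip + 0.7892 * (LEAGUE_AVG - babip) - 0.0008

lohse = 0.262
print(round(projection_1(lohse), 3))  # 0.286
print(round(projection_2(lohse), 3))  # 0.286
```

Algebraically, the two are nearly the same formula rearranged, so they agree for any input, not just for Lohse.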
I think the evidence pretty clearly supports the idea that regression toward the mean is very real. Both tests resulted in a predictive model that put almost 80 percent of the weight on the mean.
I'll conclude with two pieces of advice.
All data comes courtesy of FanGraphs.
You can follow Glenn on Twitter @Glenn_DuPaul