In recent weeks, I've written a fair amount about batting average on balls in play (BABIP). A different statistic, fielding independent pitching (FIP), uses the three true outcomes (Ks, BBs, HRs) to describe a pitching performance, while assuming a league average BABIP. This average BABIP assumption allows the statistic to be regressed back nicely towards ERA, creating a defense-independent version of ERA.
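For readers who want the arithmetic spelled out, here is a minimal sketch of FIP as it is commonly published. The 13/3/2 weights are the usual ones; the hit-by-pitch term and the additive league constant are part of that common formula but are not mentioned in the description above, and the stat line and the 3.10 constant below are made up for illustration.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """Fielding independent pitching for one pitcher-season.

    Uses the commonly published weights: 13 per home run, 3 per walk or
    hit batter, minus 2 per strikeout, divided by innings pitched.
    `constant` is the league-specific term that puts FIP on the ERA scale.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line and constant, not a real pitcher's season.
example = fip(hr=20, bb=50, hbp=5, k=180, ip=200.0, constant=3.10)
print(f"FIP: {example:.3f}")
```

Note that innings written in box-score style (such as 40.2 for 40 and two-thirds) would need to be converted to a true decimal before being passed in as `ip`.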
BABIP is not stable; thus (and for other reasons), not every pitcher finishes each single season with a league average BABIP. A gap between an individual's BABIP and the league average is one of the factors (along with sequencing or strand rate) that leads to a gap between a pitcher's FIP and ERA.
At the moment, Jeremy Hellickson is baseball's poster child not only for the gap between FIP and ERA, but also for the idea that a pitcher could have a plan that revolves around inducing softer contact and, in turn, yielding a lower than average BABIP.
Among qualified starters in both 2011 and 2012, Hellickson led baseball with the largest (absolute) gap between his ERA and FIP. The difference in 2012 was an incredible 1.5 runs (1.49 runs in 2011), while Hellickson's BABIP (.261) was 32 points below the league average.
My focus for this article really has nothing to do with BABIP, though. Instead, while writing the THT piece I referred to, I began wondering how Hellickson's gap between FIP and ERA stacked up historically.
Theoretically, the gap between FIP and ERA should decrease as innings pitched increases.
For that reason, looking at the historical significance of the 3.35 FIP/ERA gap that Josh Outman had in 40.2 innings last season would be rather foolish.
My goal instead was to see if Hellickson's 1.5 FIP/ERA gap had any historical significance among pitchers who threw enough innings to qualify for the ERA title. I was looking to answer questions such as whether a qualified starter has ever had a two-run difference between his ERA and FIP.
So, I looked at the ERA/FIP gap for every qualified starter dating back to 1950, because pre-World War II baseball was a different game and FIP for a pitcher in the 20s and 30s really isn't too relevant.
I didn't concern myself with whether or not the pitcher outperformed his FIP (ERA much lower than FIP) or if the pitcher underperformed his FIP (ERA much higher than FIP), but instead looked at the absolute difference between the two statistics, to find the top-10 largest gaps in ERA and FIP since 1950.
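The ranking step described above can be sketched in a few lines. This is only an illustration of sorting by absolute difference, not the actual query I ran; the `Season` type, the `largest_gaps` helper, and every row in `sample` are hypothetical.

```python
from typing import NamedTuple


class Season(NamedTuple):
    pitcher: str
    year: int
    era: float
    fip: float


def largest_gaps(seasons, n=10):
    """Return the n seasons with the largest |ERA - FIP|, biggest first.

    The sign is ignored, so outperforming and underperforming FIP
    are treated the same way.
    """
    return sorted(seasons, key=lambda s: abs(s.era - s.fip), reverse=True)[:n]


# Made-up rows for illustration only; these are not real pitcher seasons.
sample = [
    Season("Pitcher A", 1960, 5.20, 3.40),  # ERA well above FIP
    Season("Pitcher B", 2005, 3.10, 4.60),  # ERA well below FIP
    Season("Pitcher C", 1985, 3.90, 3.80),  # nearly identical
]

for s in largest_gaps(sample, n=2):
    print(s.pitcher, s.year, f"{abs(s.era - s.fip):.2f}")
```

Keeping the signed difference alongside the absolute one would make it easy to split the list later into pitchers who beat their FIP and pitchers who fell short of it.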
In 1987, Chris Bosio, a pitcher for the Milwaukee Brewers, had the largest FIP/ERA gap of any qualified pitcher in baseball since 1950. '87 was Bosio's first full season in the majors, but he would go on to have a serviceable major league career. Both his career FIP and ERA finished under four as he racked up almost 30 wins above replacement (FanGraphs).
The most recent member of this list happened to be Ricky Nolasco's 2009 season, in which his ERA was over 1.70 points (runs) higher than his FIP would suggest. In his career, Nolasco has had only one season in which he induced a BABIP lower than .300, but his 2009 season was most affected by his league-worst 61 percent strand rate.
Neither of the two Hellickson seasons that sent me down this path made the top-10, but this list makes it clear that his 1.50 difference between FIP and ERA was nothing to bat an eye at.
I'm thinking of possibly separating ERA greater than FIP and FIP higher than ERA in a future study, seeing as eight of the ten pitchers on this list had an ERA much higher than their FIP, rather than the other way around.
The aspect of this list that most popped out to me though, had nothing to do with the names of the ten pitchers, but instead had everything to do with the calendar years in which these seasons occurred.
This sample consists of over 60 seasons, yet five of the ten seasons occurred within the last twenty years (1993-2012), while the other five came from the first forty-three (1950-1992).
This reminded me of an article written on this very site by James Gentile just a month ago. In that piece, James discussed the rise of the three true outcomes in baseball, and made the argument that maybe FIP is only really useful from the start of the 1990s on.
The combination of James's piece and results of this list brought me to this hypothesis:
The rise in the percentage of plays that result in one of the three true outcomes has produced the gap we often see between FIP and ERA, and this gap is really only a recent (1990s onward) phenomenon in baseball.
James's piece and this top-ten list are clearly not conclusive enough evidence to back that hypothesis; thus, I decided to come up with another test.
What I would have liked to do is test whether there was an increasing trend in the gap between FIP and ERA within a single season for the entire population of pitchers, beginning in 1950.
The problem with that idea though is that there is never a difference between ERA and FIP among the population of pitchers.
The reason for this comes from the way FIP is calculated. FIP applies its given weights to produce a number for each pitcher; that number is then regressed against ERA in that given season, which scales the statistic up to ERA and puts the two statistics on a comparable footing.
Luckily though, when we eliminate all relievers and isolate the gap between starters' ERA and FIP, the gap is hardly ever zero.
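To see why the league-wide gap vanishes while the starters-only gap does not, here is a small sketch. It assumes the scaling is done with a single additive league constant, as in the commonly published FIP formula, which is a simplification of the regression described above. The aggregate stat lines for `starters` and `relievers` are invented round numbers, not real league totals.

```python
def raw_fip(hr, bb, hbp, k, ip):
    """FIP components before the league constant is added."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip


def era(er, ip):
    """Earned run average: earned runs per nine innings."""
    return 9 * er / ip


# Hypothetical aggregate lines for one season (invented numbers).
starters = dict(hr=3000, bb=9000, hbp=900, k=19000, ip=29000.0, er=13500)
relievers = dict(hr=1400, bb=5500, hbp=600, k=13000, ip=14500.0, er=5900)
league = {name: starters[name] + relievers[name] for name in starters}


def raw_and_era(line):
    """Return (raw FIP, ERA) for an aggregate stat line."""
    components = {name: value for name, value in line.items() if name != "er"}
    return raw_fip(**components), era(line["er"], line["ip"])


# Assumption: the constant is chosen so that league FIP equals league ERA.
league_raw, league_era = raw_and_era(league)
constant = league_era - league_raw


def gap(line):
    """ERA minus FIP for an aggregate line, using the league constant."""
    raw, line_era = raw_and_era(line)
    return line_era - (raw + constant)


for label, line in [("league", league), ("starters", starters), ("relievers", relievers)]:
    print(f"{label:>9}: ERA - FIP = {gap(line):+.3f}")
```

Because the constant is fit to the whole league, the innings-weighted gaps of the two groups have to cancel: if starters as a group have an ERA above their FIP, relievers as a group must have an ERA below theirs.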
Below, I plotted the (absolute) difference between ERA and FIP for starters from 1950-2012:
This data clearly shows an increasing trend for the difference between ERA and FIP from the starting point (1950) to the ending point (2012).
However, that trend does not begin as I expected in 1990, but instead around 1973.
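One simple way to put a number on a trend like this is an ordinary least-squares slope of the yearly gap against the calendar year. The sketch below is only an illustration of that check; `ols_slope` is a hypothetical helper and the values in `gap_by_year` are invented, not the figures behind my plot.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Invented values: absolute starters' ERA-FIP gap for a handful of seasons.
gap_by_year = {
    1950: 0.02, 1960: 0.01, 1970: 0.03, 1980: 0.05,
    1990: 0.06, 2000: 0.08, 2010: 0.07,
}

years = sorted(gap_by_year)
gaps = [gap_by_year[year] for year in years]
slope = ols_slope(years, gaps)
print(f"change in gap per season: {slope:+.5f} runs")
```

Fitting the slope separately on the seasons before and after a candidate breakpoint (1973 or 1990, say) would be one rough way to compare where the rise actually begins.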
Another interesting thing I should point out is that, while I tested for any difference between the two statistics, the only seasons in which starting pitchers as a group had an average FIP below their average ERA came before 1973.
Thus, I came to this final question:
Also, a secondary question that needs to be considered along with this question: why do relievers have higher FIPs than ERAs?
I'll be the first to admit that I truly don't know; however, I think I might have a few ideas, or starting points that we can work from.
While James made the argument that FIP is really more useful in the last twenty years, it is clear from his research that the three true outcomes have been rising since 1920, and have for the most part risen each season from 1950 to 2012.
Here's the breakdown of the FIP components in 1950 versus 2012, for starters:
There is a stark difference between the two eras.
Starting pitchers in 1950 actually walked more batters than they struck out, which is nowhere close to today's average. Also, starting pitchers in 1950 had a better average ERA than relievers, which is also nowhere close to being true today (4.19 ERA for starters versus 3.67 ERA for relievers in 2012).
That difference needs to be taken with a grain of salt, however, as relievers were used sparingly in 1950, as opposed to receiving one-third of the innings in 2012. In a chapter of BP's Extra Innings, Colin Wyers goes into great detail about baseball's increasing use of pitchers out of the bullpen.
Many have postulated that the increase in hard-throwing specialty relievers has been one of the main reasons (if not the main reason) for the increase in the three true outcomes.
I think this fact definitely needs to be considered in answering my question of why there is such a large gap between ERA and FIP for starters, but I'm honestly not positive on how it is affecting things.
My best guess is that the current gap between ERA for starters and relievers is so large that FIP is trying to compensate for it when regressing the FIP components for both starters and relievers back to the league average ERA. This results in the average FIP for starters and for relievers each residing somewhere between that group's own average ERA and the average of every pitcher in baseball, regardless of type.
My final idea goes back to the increase in the three true outcomes. Because plays in today's game end in one of the three true outcomes more often, individual pitchers' true outcome measures are more spread out, which leads to an in-season regression to ERA that is not as tight as it was (in 1950, for example) when individual (Ks, BBs, HRs) metrics were less spread out.
I may have just opened up a can of worms and come up with no real conclusions; however, that can often end up being a good thing.
I'm going to leave this study with the community.
What do you think?
Does anyone have any better idea for why we're seeing this gap between ERA and FIP?
All statistics come courtesy of FanGraphs
You can follow Glenn on Twitter @Glenn_DuPaul