Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions.
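For the code-inclined, here is a minimal sketch of that sorting procedure. The player names and numbers are invented for illustration, and the helper names (`split_groups`, `pct_more`) are mine, not part of any real tool. The point is only that once the metric is chosen, the groups fall out of the ranking mechanically.

```python
# Toy illustration only: these players and numbers are made up.
from statistics import mean

def split_groups(players, metric, group_size):
    """Rank players by `metric` (highest first); return (group_a, group_b)."""
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:group_size], ranked[-group_size:]

def pct_more(a, b):
    """How much larger a is than b, as a percentage."""
    return (a / b - 1) * 100

players = [
    {"name": "Back 1", "ypc": 6.8, "yards_per_game": 90.0},
    {"name": "Back 2", "ypc": 6.2, "yards_per_game": 85.0},
    {"name": "Back 3", "ypc": 5.4, "yards_per_game": 80.0},
    {"name": "Back 4", "ypc": 4.4, "yards_per_game": 75.0},
    {"name": "Back 5", "ypc": 4.2, "yards_per_game": 70.0},
    {"name": "Back 6", "ypc": 3.4, "yards_per_game": 60.0},
]

# The metric decides the groups; nobody gets hand-picked.
group_a, group_b = split_groups(players, "ypc", group_size=3)

# Verify that Group A has outproduced Group B to this point.
a_avg = mean(p["yards_per_game"] for p in group_a)
b_avg = mean(p["yards_per_game"] for p in group_b)
print(f"Group A had {pct_more(a_avg, b_avg):.0f}% more rushing yards per game")
```

The prediction is then simply that the same comparison, rerun on the weeks after the cutoff, will come out in Group B's favor.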
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of all my predictions from last year and how they fared.
THE SCORECARD
In Week 2, I laid out our guiding principles for Regression Alert. No specific prediction was made.
In Week 3, I discussed why yards per carry is the least useful statistic and predicted that the rushers with the lowest yards-per-carry average to that point would outrush the rushers with the highest yards-per-carry average going forward.
In Week 4, I explained why touchdowns follow yards (but yards don't follow back) and predicted that the players with the fewest touchdowns per yard gained would outscore the players with the most touchdowns per yard gained going forward.
In Week 5, I talked about how preseason expectations still held as much predictive power as performance through four weeks. No specific prediction was made.
In Week 6, I looked at how much yards per target is influenced by a receiver's role, how some receivers' per-target averages deviated from what we'd expect according to their role, and predicted that the receivers with the fewest yards per target would gain more receiving yards than the receivers with the most yards per target going forward.
In Week 7, I demonstrated how randomness could reign over smaller samples, but regression dominates over larger ones. No specific prediction was made.
In Week 8, I discussed how even something like average career length could be largely determined by regression-prone fluctuations in incoming talent. No specific prediction was made.
In Week 9, I looked at running backs scoring touchdowns at an unsustainable rate and posited that even Todd Gurley must return to earth.
In Week 10, I delved into the purpose of Regression Alert and the proper takeaways. No specific prediction was made.
In Week 11, I explained an easy way to find statistics that were more prone to regression and picked on yards per carry one more time.
In Week 12, I went into the difference between regression to the mean (the idea that production will probably improve or decline going forward) and the gambler's fallacy (the idea that production is "due" to improve or decline going forward). No specific prediction was made.
In Week 13, I badmouthed interception rate for a bit and then predicted that the most interception-prone quarterbacks to that point would throw fewer picks than the least interception-prone quarterbacks going forward.
In Week 14, I delved into the various biases that permeate this column and how regression to the mean works even in less spectacular ways than the ones I choose to highlight here. No specific prediction was made.
| Statistic For Regression | Performance Before Prediction | Performance Since Prediction | Weeks Remaining |
|---|---|---|---|
| Yards per Carry | Group A had 24% more rushing yards per game | Group B has 4% more rushing yards per game | SUCCESS! |
| Yards:Touchdown Ratio | Group A had 28% more fantasy points per game | Group B has 23% more fantasy points per game | SUCCESS! |
| Yards per Target | Group A had 16% more receiving yards per game | Group A has 13% more receiving yards per game | Failure |
| Yards:Touchdown Ratio | Group A had 26% more fantasy points per game | Group B has 4% more fantasy points per game | SUCCESS! |
| Yards per Carry | Group A had 9% more rushing yards per game | Group B has 23% more rushing yards per game | SUCCESS! |
| Total Interceptions | Group A had 83% as many total interceptions | Group B has 48% as many total interceptions | 2 |
In the interest of truth in advertising, I'm giving serious consideration to renaming this column next year to "let's continuously dunk on yards per carry". In our "high-ypc" sample, Kerryon Johnson and Matt Breida were both injured before getting even 20 carries. Of the remaining backs, Aaron Jones saw his yards per carry fall from 6.8 pre-prediction to 4.0 post-prediction. Nick Chubb fell from 6.2 to 3.6. Melvin Gordon fell from 5.4 to 4.6. Marlon Mack fell from 5.3 to 3.9. And Phillip Lindsay, our last back... well, Lindsay saw his yards per carry increase from 5.3 to 6.5, because yards per carry is random and that's how randomness works sometimes.
From our "low-ypc" sample, Saquon Barkley increased his yards per carry from 4.5 to 6.9. Alvin Kamara rose from 4.4 to 4.9. Lamar Miller rose from 4.2 to 5.9. David Johnson and Jordan Howard increased from 3.4 to 4.1 and 4.2, respectively. And Adrian Peterson and Sony Michel held steady at 4.3 across both samples. 60% of the healthy "high-ypc" backs averaged 4.0 yards per carry or fewer, while 0% of the "low-ypc" backs did the same. In total, the "low-ypc" backs averaged more carries per game (16.1 to 14.0), more yards per game (81.2 to 65.9), and more yards per carry (5.0 vs. 4.7). And if I tracked it for four more weeks, I think that ypc average would be as likely as not to flip back, because yards per carry simply isn't a thing, at least not in the kind of minuscule sample sizes we're dealing with over half of an NFL season.
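If you want to check my math on those group totals, here is a quick sketch using only the per-game figures quoted above. It assumes the carries-per-game and yards-per-game averages cover the same set of games, so dividing one by the other recovers yards per carry; the function names are just for illustration.

```python
# Post-prediction per-game figures quoted in the paragraph above.
low_ypc = {"carries": 16.1, "yards": 81.2}
high_ypc = {"carries": 14.0, "yards": 65.9}

def yards_per_carry(group):
    """Yards per game divided by carries per game."""
    return group["yards"] / group["carries"]

def pct_more(a, b):
    """How much larger a is than b, as a percentage."""
    return (a / b - 1) * 100

print(round(yards_per_carry(low_ypc), 1))    # 5.0
print(round(yards_per_carry(high_ypc), 1))   # 4.7
print(round(pct_more(low_ypc["yards"], high_ypc["yards"])))  # 23
```

That last number is the "23% more rushing yards per game" listed on the scorecard.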
As for the interceptions... when I set out the groups, the assumption was that the high-interception quarterbacks would throw more interceptions per game, but because there were more total low-interception quarterbacks, they'd throw more interceptions total. That's... not what has happened so far. Not only have the high-interception quarterbacks (Group B) thrown fewer interceptions overall, but they've also thrown fewer interceptions per game (0.61 vs. 0.82). And it's not a few terrible games from Group A skewing the average, either; Group A quarterbacks have gone without an interception in 46% of their games. Group B quarterbacks have managed it in 50%.
As things stand now, if quarterbacks in Group B avoid interceptions in the next two weeks as well as they did in the first two weeks, Group A passers could throw zero interceptions and they'd still lose the prediction.
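The arithmetic behind that claim is short enough to sketch. This uses the scorecard's 48% figure and reads "as well as they did in the first two weeks" as Group B simply repeating its interception total, which is my simplifying assumption. Group A's total through two weeks is normalized to 1.0.

```python
# From the scorecard: two weeks in, Group B has thrown 48% as many
# interceptions as Group A.
b_share_so_far = 0.48

# Normalize Group A's total through two weeks to 1.0.
a_so_far = 1.0
b_so_far = b_share_so_far * a_so_far

# Hypothetical finish: Group B repeats its first two weeks exactly,
# while Group A throws no interceptions at all.
b_final = b_so_far * 2
a_final = a_so_far + 0.0

print(f"Group B would finish with {b_final:.2f}x Group A's total")
print("Group B still has fewer interceptions:", b_final < a_final)
```

Doubling 48% only gets Group B to 96% of Group A's current total, so even a spotless finish from Group A would not be enough.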
Regression is a Cruel Mistress (If You're Lucky)
It feels like the season has hardly begun and yet we already find ourselves with just three weeks remaining together. In Week 17 we will wrap up the season with a year-long retrospective, revisiting each of this season's predictions and showing how they've done in the weeks since they've closed. In Week 16, we'll have a special one-week championship game prediction.
Week 15, however, leaves us in limbo a bit; it's too late to start another standard prediction but too early to begin the end-of-year festivities. So I wanted to take this opportunity to be a little bit of a killjoy to all of the great teams out there cruising to a title.
Drew Brees was the #4 fantasy quarterback over the first twelve weeks of the season. Over the last two, he's QB23. Yes, Brees' full-season performance predicts how he'll do in Weeks 15 and 16 better than his performance over the last two weeks... but that's cold comfort for fantasy players who were eliminated in Week 14 after his poor showing. Ditto that for those with Jared Goff, who was fifth among quarterbacks over the first 13 weeks and 32nd in Week 14. (As a quick reminder: there are only 32 starting quarterbacks in any given week.)
In the fantasy regular season, Todd Gurley outscored the second-best running back by more than 58 points in standard scoring. In Week 14, he finished tied for 38th behind guys like James Develin, Zach Line, and Trenton Cannon. Alvin Kamara, the #3 fantasy back in the regular season, finished 29th. James Conner and Melvin Gordon, the #5 and #9 fantasy backs, didn't play. James White, who was 10th in the regular season, was 57th in the first week of the fantasy playoffs.
Antonio Brown, the #3 fantasy receiver coming in, was WR64 last week. Zach Ertz, the #2 tight end, finished 16th, which is a lot worse than it sounds because tight ends as a whole produce so little; Ertz's 3.8 points were 0.1 more than Austin Hooper, Vance McDonald, Dalton Schultz, and C.J. Uzomah produced.
This, of course, is how regression works. A player has to overperform his true performance level to rise to the top of the leaderboard in the first place. Players who are overperforming their true performance level tend to come crashing back to earth. Fantasy football is a weekly game, and small, single-week samples wreak havoc on the standings. The players who won games to this point in the season are probably not the players who will win games going forward.
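A small simulation makes the mechanism concrete. This is a toy model under loud assumptions, not a model of real NFL scoring: every player gets a fixed "true" level plus independent luck in each period, and all of the spreads and sizes below are invented. The leaderboard is picked using only the first period.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

N_PLAYERS, TOP_N = 200, 20
SKILL_SD, LUCK_SD = 2.0, 4.0  # made-up spreads; luck is noisier than skill

# Each player has a fixed "true" level plus fresh luck in each period.
skill = [random.gauss(10, SKILL_SD) for _ in range(N_PLAYERS)]
before = [s + random.gauss(0, LUCK_SD) for s in skill]
after = [s + random.gauss(0, LUCK_SD) for s in skill]

# Build the leaderboard from the "before" period only.
top = sorted(range(N_PLAYERS), key=lambda i: before[i], reverse=True)[:TOP_N]

top_before = mean(before[i] for i in top)
top_skill = mean(skill[i] for i in top)
top_after = mean(after[i] for i in top)

print("top group, before:    ", round(top_before, 1))
print("top group, true level:", round(top_skill, 1))
print("top group, after:     ", round(top_after, 1))
print("league average, after:", round(mean(after), 1))
```

In this setup the top group's "before" average sits well above its own true level, because landing on the leaderboard selects for good luck as well as skill. The luck does not carry over, so the "after" average falls back toward the true level.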
I know there are a lot of #1 seeds out there feeling pretty good about their fantasy teams right now; after all, the fact that they're the #1 seed means they scored a lot of points and won a lot of games. Unfortunately, almost every single team still alive in the fantasy playoffs today is more likely to lose in the next two weeks than it is to win the championship, and regression to the mean is a huge part of that.
I mostly play in dynasty leagues where savvy owners are able to build true juggernauts the likes of which you'd never see in redraft. My prized team currently features the QB1, RB2, RB6, RB10, WR4, WR5, WR6, WR12, TE2, TE8, and TE9, as well as defenses hand-picked weeks in advance specifically for their matchups in the playoffs. That team has a less than 50% chance of winning the title this year. (Part of the problem is that you can build better teams in dynasty, but you also tend to face stiffer competition in the playoffs.)
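The arithmetic behind those sub-50% title odds is worth a quick sketch. It assumes two playoff rounds remain (Weeks 15 and 16), that a team has the same chance of winning each one, and that the two games are independent; those are simplifications of mine, and `title_odds` is just an illustrative name.

```python
from math import sqrt

ROUNDS_LEFT = 2  # Weeks 15 and 16

def title_odds(p_win_each_week, rounds=ROUNDS_LEFT):
    """Chance of winning every remaining round, assuming independent games."""
    return p_win_each_week ** rounds

for p in (0.50, 0.60, 0.70, 0.75):
    print(f"{p:.0%} weekly favorite -> {title_odds(p):.0%} to win the title")

# Weekly win probability needed before the title is better than a coin flip.
break_even = sqrt(0.5)
print(f"break-even weekly win probability: {break_even:.1%}")
```

Under those assumptions a team has to be roughly a 71% favorite in both remaining games before it is more likely than not to win the title, and very few fantasy matchups are that lopsided.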
Perhaps this is a cruel way to kick off the holiday season, by pointing out that your favorite team is probably going to lose. But fantasy football is by definition zero-sum, which means every team's loss is another team's win, so that means there's good news to be shared, as well. To every 3, 4, 5, or 6 seed out there who struggled mightily just to scrape into the playoffs and who is surprised to find yourself still alive: you're probably not going to win the championship, either. Almost nobody is ever the odds-on favorite to win it all!
But you at least have regression operating as a tailwind instead of a headwind. When your team isn't that great, there's more room for it to pleasantly surprise you and less room for it to disappoint. Some of you will hit the positive side of regression for the first time this season and find glorious upsets in your future, much like GMs with Amari Cooper and Derrick Henry stole surprise wins in Week 14. Keep setting those lineups and hoping for the best, and remember that fantasy football is never a fait accompli.