Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
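For anyone who wants to try the Group A / Group B setup at home, here is a minimal Python sketch of the procedure described above. The player names, the `td_rate` and `points_per_game` fields, and every number are hypothetical placeholders, not real data, and `split_groups` is just an illustrative helper.

```python
from statistics import mean

def split_groups(players, metric, group_size):
    """Rank players by `metric` (highest first) and return the top and
    bottom `group_size` players as Group A and Group B."""
    if 2 * group_size > len(players):
        raise ValueError("groups would overlap")
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:group_size], ranked[-group_size:]

# Hypothetical players and numbers, for illustration only.
players = [
    {"name": "Player 1", "td_rate": 0.10, "points_per_game": 20.0},
    {"name": "Player 2", "td_rate": 0.08, "points_per_game": 18.0},
    {"name": "Player 3", "td_rate": 0.05, "points_per_game": 15.0},
    {"name": "Player 4", "td_rate": 0.04, "points_per_game": 14.0},
    {"name": "Player 5", "td_rate": 0.02, "points_per_game": 12.0},
    {"name": "Player 6", "td_rate": 0.01, "points_per_game": 11.0},
]

group_a, group_b = split_groups(players, "td_rate", group_size=2)

# Step 1: confirm Group A has outscored Group B so far.
a_ppg = mean(p["points_per_game"] for p in group_a)
b_ppg = mean(p["points_per_game"] for p in group_b)
print(f"Group A: {a_ppg:.1f} ppg, Group B: {b_ppg:.1f} ppg")

# Step 2: the prediction itself is that Group B outscores Group A from
# here on, which can only be checked with future weeks of data.
```

The important design choice is that the groups fall out of the ranking mechanically; nobody gets hand-picked into or out of a sample.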
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is touchdown rate, and Christian McCaffrey is one of the high outliers in touchdown rate, then Christian McCaffrey goes into Group A and may the fantasy gods show mercy on my predictions.
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of my predictions from 2020 and their final results. Here's the same list from 2019 and their final results, here's the list from 2018, and here's the list from 2017. Over four seasons, I have made 30 specific predictions and 24 of them have proven correct, a hit rate of 80%.
The Scorecard
In Week 2, I broke down what regression to the mean really is, what causes it, how we can benefit from it, and what the guiding philosophy of this column would be. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I talked about yard-to-touchdown ratios and why they were the most powerful regression target in football that absolutely no one talks about, then predicted that touchdowns were going to follow yards going forward (but the yards wouldn't follow back).
In Week 5, we looked at ten years' worth of data to see whether early-season results better predicted rest-of-year performance than preseason ADP, and we found that, while the exact details fluctuated from year to year, overall they did not. No specific prediction was made.
In Week 6, I taught a quick trick to tell how well a new statistic actually measures what you think it measures. No specific prediction was made.
In Week 7, I went over the process of finding a good statistic for regression and used team rushing vs. passing touchdowns as an example.
In Week 8, I talked about how interceptions were an unstable statistic for quarterbacks, but also for defenses.
In Week 9, we took a look at Ja'Marr Chase's season so far. He was outperforming his opportunities, which is not sustainable in the long term, but I offered a reminder that everyone regresses to a different mean, and the "true performance level" that Chase will trend towards over a long timeline is likely a lot higher than for most other receivers. No specific prediction was made.
In Week 10, I talked about how schedule luck in fantasy football was entirely driven by chance and, as such, should be completely random from one sample to the next. Then I checked Footballguys' staff leagues and predicted that the teams with the worst schedule luck would outperform the teams with the best schedule luck once that random element was removed from their favor.
In Week 11, I walked through how to tell the difference between regression to the mean and gambler's fallacy (which is essentially a belief in regression past the mean). No specific prediction was made.
In Week 12, I showed how to use the concept of regression to the mean to make predictions about the past and explained why the average fantasy teams were close but the average fantasy games were not. As a bonus, I threw in another quick prediction on touchdown over- and underachievers (based on yardage gained).
In Week 13, I went down the rabbit hole and investigated how performance in Regression Alert was also subject to regression to the mean, and how our current winning streak was unsustainable and destined to end sometime.
In Week 14, I talked about why larger samples were almost always better than smaller subsamples and how "hot streaks" were often just an illusion. I made two specific predictions: that a group of "hot" players would cool back down to their season average, and that the ice-cold Ja'Marr Chase would heat back up again.
Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining |
---|---|---|---|
Yards per Carry | Group A had 10% more rushing yards per game | Group B has 4% more rushing yards per game | None (Win!) |
Yards per Touchdown | Group A scored 9% more fantasy points per game | Group B scored 13% more fantasy points per game | None (Win!) |
Passing vs. Rushing TDs | Group A scored 42% more RUSHING TDs | Group A is scoring 33% more PASSING TDs | None (Win!) |
Defensive Interceptions | Group A had 33% more interceptions | Group B had 24% more interceptions | None (Win!) |
Schedule Luck | Group A had a 3.7% better win% | Group B has an 18.5% better win% | None (Win!) |
Yards per Touchdown | Group A scored 10% more fantasy points per game | Group B has 26% more fantasy points per game | 1 |
"Hot" Players Regress | Players were performing at an elevated level | They have regressed 53.7% to their season average | 3 |
Yards per Route Run | Group A led by 109% | Not yet evaluable | 3 |
Even with the Patriots' 3-attempt game gumming up the works, Groups A and B are performing true to form in our yard-to-touchdown prediction. Both groups are averaging essentially the same number of yards per game as before (down 12% for Group A, down 9% for Group B), which makes sense because yards per game is a very sticky statistic. But both groups have seen their yard-to-touchdown ratios regress dramatically into the sustainable band, and now Group B is outscoring Group A by 26%.
As for our "hot" players, some continued or even increased their scorching streaks, but when you look at the group as a whole, it regressed 53.7% of the way back to its full-season average. Put differently, in Week 14 our "hot" players scored about halfway between their full-season performance and their last-four-weeks performance. Now recall that in order to "win" this prediction they'll need to regress at least 66% (that is, finish twice as close to their full-season average as to their hot-streak average), but Week 14 at least puts us within striking distance.
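To make "regressed X% of the way back" concrete, here is a small Python sketch of how that fraction can be computed. The scoring averages below are hypothetical round numbers chosen for illustration, not the actual group data, and I'm reading the 66% target as two-thirds of the gap, which is what "twice as close to the full-season average" works out to.

```python
def regression_fraction(season_avg, hot_avg, observed):
    """Fraction of the gap between the hot-streak average and the
    full-season average that the observed score has closed."""
    gap = hot_avg - season_avg
    if gap == 0:
        raise ValueError("hot-streak and season averages are identical")
    return (hot_avg - observed) / gap

# Hypothetical numbers for illustration only (not the column's data).
season_avg, hot_avg, observed = 12.0, 18.0, 15.0
frac = regression_fraction(season_avg, hot_avg, observed)
print(f"Regressed {frac:.1%} of the way back")

# Twice as close to the season average as to the hot-streak average
# means closing two-thirds of the gap (assumed reading of "66%").
WIN_THRESHOLD = 2 / 3
print("Prediction wins" if frac >= WIN_THRESHOLD else "Not there yet")
```

A value of 0% means the group kept scoring at its hot-streak pace, 100% means it landed exactly on its full-season average, and anything above 100% means it overshot.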
As for Ja'Marr Chase, he already shed the title of "coldest player in fantasy football" with a 25-point game that's his third-best of the season. Looking specifically at his yards per route run average, he gained 77 yards in 39 routes for an average of 1.97 that's quite close to his full-season average. How does that compare to the other rookies? Well... with DeVonta Smith and Jaylen Waddle on bye and Elijah Moore on IR, we won't have anything to compare to until next week.
Playoff Teams Regress
My s*** doesn't work in the playoffs. My job is to get us to the playoffs. What happens after that is f****** luck.
-Billy Beane
We spend all year working to get our teams into the playoffs. We seek out every edge we can exploit. We learn about regression to the mean and we harness the forces of randomness to carry us onward.
But randomness does not hold a harness well. It is wild and it is chaotic. And so, despite our efforts, our most likely reward for reaching the postseason is a season-ending loss.
It's important to realize that this is not a failure on our parts. It's tempting to think that we can control the chaos, that we can find the exact right sleeper against the exact right matchup and push ourselves over the finish line. Or maybe we're focused on the order and not the chaos. Maybe most teams are more likely than not to lose, but certainly not our best ones. Certainly, our 12-2 team, armed with a bye and outscoring all competition by 10 points per game, has at least a better-than-a-coinflip shot at the title.
I've written several columns so far this year with an eye towards demonstrating that that's simply not the case. Sure, someone is going to end the season holding a trophy. But for 99.9% of fantasy teams out there, it's more likely than not that that someone isn't you.
Nor is that someone me. I have a stacked team that's rolled its way to a #1 seed and a first-round bye. This team is averaging 157.3 points per game, which is 15 ahead of second place and 22 points per game ahead of the average among playoff teams. I'd love to think that this dominant team is destined to hang a banner at the end of the year.
Except... that's "arrow of time" thinking; remember, per-game averages seem stable but they hide wild game-to-game swings. A 15 or 22 point edge would be a big deal if it were rare for the margin of a game to be greater than 15 or 22 points. But as we demonstrated earlier this year, margins that large are remarkably common. A 22-point head start is hardly as daunting as it seems.
Instead of points per game, let's look at other metrics of team quality. This team has an all-play winning percentage of 75.3% (116 wins against 38 losses), and we've demonstrated that all-play records are relatively stable across samples. In order to have a 50/50 shot at winning back-to-back games, you'd need a 70.7% all-play winning percentage. (70.7% * 70.7% = 49.98%.) We easily clear that threshold here; our chances of winning back-to-back games against random opponents would be around 56.7%. So we're at least the odds-on favorite for the title, right?
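If you'd like to run the same numbers for your own team, the arithmetic above can be sketched in a few lines of Python. This assumes, as the paragraph does, that the all-play winning percentage doubles as the chance of beating a random opponent and that the two games are independent; the function name is just for illustration.

```python
import math

def back_to_back_odds(wins, losses):
    """Return the all-play win rate and the chance of winning two
    straight games, treating that rate as the per-game win probability."""
    win_pct = wins / (wins + losses)
    return win_pct, win_pct ** 2

# All-play record quoted above: 116 wins against 38 losses.
win_pct, two_game = back_to_back_odds(116, 38)
print(f"All-play win rate: {win_pct:.1%}")   # 75.3%
print(f"Back-to-back odds: {two_game:.1%}")  # 56.7%

# Per-game win rate needed for a 50/50 shot at two straight wins:
# solve p * p = 0.5, so p is the square root of 0.5.
threshold = math.sqrt(0.5)
print(f"Break-even win rate: {threshold:.1%}")  # 70.7%
```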
Not so fast; "random opponents" is doing a lot of heavy lifting in that sentence. The average playoff team is better than the average team, and the average championship-game team is better than the average playoff team. How can we adjust for this? There's no perfect solution, but I do have a few quick hacks.
Collectively, the other eleven teams in the league have 808 combined all-play wins, with playoff teams combining for 481 of them (59.5%). Meanwhile, the rest of the league has 886 all-play losses, with the playoff teams comprising 289 (32.6%).
Since the playoff teams account for 32.6% of all the all-play losses, I'll assume that they account for 32.6% of my 116 all-play wins. And likewise, I'll assume they provided 59.5% of my 38 all-play losses. Multiplying out, my "playoff-adjusted all-play record" would be 37.8 wins and 22.6 losses, a winning percentage of 62.6%. And if that represents my odds of winning a game against a playoff-caliber team, my chances of winning two such games back-to-back are just 39.2%. My chances of taking home a title are barely better than one out of three.
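Here is the same quick hack as a Python sketch, using the league totals quoted above. The function and variable names are mine, and the whole thing rests on the rough assumption already stated: that my wins and losses are spread across opponents in the same proportions as the rest of the league's losses and wins.

```python
def playoff_adjusted_record(my_wins, my_losses,
                            others_wins, playoff_wins,
                            others_losses, playoff_losses):
    """Estimate how many of my all-play wins and losses came against
    playoff teams, using their share of the league's losses and wins."""
    loss_share = playoff_losses / others_losses  # share of losses taken by playoff teams
    win_share = playoff_wins / others_wins       # share of wins earned by playoff teams
    adj_wins = my_wins * loss_share      # my wins are their losses
    adj_losses = my_losses * win_share   # my losses are their wins
    return adj_wins, adj_losses

adj_wins, adj_losses = playoff_adjusted_record(
    my_wins=116, my_losses=38,
    others_wins=808, playoff_wins=481,
    others_losses=886, playoff_losses=289,
)
adj_pct = adj_wins / (adj_wins + adj_losses)
title_odds = adj_pct ** 2  # two straight wins, assumed independent

print(f"Adjusted record: {adj_wins:.1f} and {adj_losses:.1f}")  # 37.8 and 22.6
print(f"Adjusted win rate: {adj_pct:.1%}")                      # 62.6%
print(f"Title odds: {title_odds:.1%}")                          # 39.2%
```

Swap in your own league's totals to get a comparable estimate, keeping in mind the caveats that follow.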
And these odds are likely an overestimate. They assume playoff teams are roughly equal in quality; the bigger the differences between the best and the worst teams, the worse your title odds become. Imagine a league where two teams had the top two scores every week and both finished with identical all-play winning percentages of 95.4%. Despite the tremendous record, neither team can have odds better than 50/50 because each team would almost certainly have to get past the other.
Additionally, these calculations assume that all-play records are a good predictor. They are in a broad sense, but this approach doesn't allow for chaos. Players get injured, surprise breakouts happen, and there's generally nothing more inevitable than the unexpected. To the extent that chaos has a bias, it favors weaker teams and cuts against stronger ones, because weaker teams have more to gain and stronger teams have more to lose.
This isn't to say that no team's chances at a title can ever top 50%. I've been running variations of this exercise for nearly a decade now and in that time I've seen one team that came out above 50%. (Not one of my own, unfortunately.) It had an all-play winning percentage of 87.7% and finished as the #1 overall scorer in 9 out of the 17 weeks of the season. It did, in fact, win a title, but even that wasn't inevitable; over the course of the year, it happened to lose four times in the eight weeks in which a loss was possible (despite an average weekly finish of 4th in those weeks).
For the rest of us, title odds somewhere in the neighborhood of 30-40% are the best we can hope for. My team is starting the #3 quarterback, three Top 5 running backs, three Top 15 receivers, and the #2 tight end per Footballguys' rest-of-season rankings. (Plus carrying another Top 5 quarterback, Top 10 running back, Top 20 receiver, and Top 10 tight end behind them as depth.) I'm proud of the team that I've assembled and I think even pushing my odds as high as 30-40% is a testament to its quality. I hope I don't lose sight of this accomplishment if the most likely thing happens and I lose sometime in the next three weeks.
And I hope you don't lose sight of your accomplishments, either. When your playoff teams lose, let yourself off the hook. It's not your fault. Our s*** just doesn't work in the playoffs.
Most importantly, if you buck the odds and take home a title, don't take it for granted or view it as an inevitable outcome to a great season. Cherish it like the rare and unlikely event that it truly was.