Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is touchdown rate, and Christian McCaffrey is one of the high outliers in touchdown rate, then Christian McCaffrey goes into Group A and may the fantasy gods show mercy on my predictions.
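For readers who like to see the procedure spelled out, the split described above can be sketched in a few lines of Python. This is only an illustration: the player names, the `td_rate` and `points_per_game` fields, and the group size of two are all made up, and the helper names `split_groups` and `average` are hypothetical.

```python
from statistics import mean

def split_groups(players, metric, group_size):
    """Rank players by `metric` (highest first); return the top and bottom slices."""
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:group_size], ranked[-group_size:]

def average(group, field):
    """Average a numeric field across every player in a group."""
    return mean(p[field] for p in group)

# Hypothetical players and numbers, invented purely for illustration.
players = [
    {"name": "Player 1", "td_rate": 0.12, "points_per_game": 18.0},
    {"name": "Player 2", "td_rate": 0.10, "points_per_game": 16.5},
    {"name": "Player 3", "td_rate": 0.06, "points_per_game": 14.0},
    {"name": "Player 4", "td_rate": 0.05, "points_per_game": 13.0},
    {"name": "Player 5", "td_rate": 0.02, "points_per_game": 12.5},
    {"name": "Player 6", "td_rate": 0.01, "points_per_game": 11.0},
]

# The metric decides who lands in each group; nobody is hand-picked.
group_a, group_b = split_groups(players, "td_rate", group_size=2)

# Step 1: confirm Group A has outscored Group B so far.
a_so_far = average(group_a, "points_per_game")
b_so_far = average(group_b, "points_per_game")
print("Group A:", [p["name"] for p in group_a], a_so_far)
print("Group B:", [p["name"] for p in group_b], b_so_far)

# Step 2 (not shown): rerun the same comparison on later weeks and
# check whether Group B has pulled ahead, as the prediction claims.
```

The only real choice in the sketch is the metric passed to `split_groups`, which mirrors the rule that the metric, not the analyst, chooses the samples.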
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of my predictions from 2020 and their final results. Here's the same list from 2019 and their final results, here's the list from 2018, and here's the list from 2017. Over four seasons, I have made 30 specific predictions and 24 of them have proven correct, a hit rate of 80%.
The Scorecard
In Week 2, I broke down what regression to the mean really is, what causes it, how we can benefit from it, and what the guiding philosophy of this column would be. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I talked about yard-to-touchdown ratios and why they were the most powerful regression target in football that absolutely no one talks about, then predicted that touchdowns were going to follow yards going forward (but the yards wouldn't follow back).
In Week 5, we looked at ten years' worth of data to see whether early-season results better predicted rest-of-year performance than preseason ADP, and we found that, while the exact details fluctuated from year to year, overall they did not. No specific prediction was made.
In Week 6, I taught a quick trick to tell how well a new statistic actually measures what you think it measures. No specific prediction was made.
In Week 7, I went over the process of finding a good statistic for regression and used team rushing vs. passing touchdowns as an example.
In Week 8, I talked about how interceptions were an unstable statistic for quarterbacks, but also for defenses.
In Week 9, we took a look at Ja'Marr Chase's season so far. He was outperforming his opportunities, which is not sustainable in the long term, but I offered a reminder that everyone regresses to a different mean, and the "true performance level" that Chase will trend towards over a long timeline is likely a lot higher than for most other receivers. No specific prediction was made.
In Week 10, I talked about how schedule luck in fantasy football was entirely driven by chance and, as such, should be completely random from one sample to the next. Then I checked Footballguys' staff leagues and predicted that the teams with the worst schedule luck would outperform the teams with the best schedule luck once that random element was removed from their favor.
Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining |
---|---|---|---|
Yards per Carry | Group A had 10% more rushing yards per game | Group B has 4% more rushing yards per game | None (Win!) |
Yards per Touchdown | Group A scored 9% more fantasy points per game | Group B scored 13% more fantasy points per game | None (Win!) |
Passing vs. Rushing TDs | Group A scored 42% more RUSHING TDs | Group A is scoring 33% more PASSING TDs | None (Win!) |
Defensive Interceptions | Group A had 33% more interceptions | Group B has 48% more interceptions | 1 |
Schedule Luck | Group A had a 3.7% better win% | Group B has a 54.8% better win% | 3 |
Our run-heavy teams had 43 passing touchdowns and 61 rushing touchdowns in the first six weeks of the season. They totaled 24 passing touchdowns and 18 rushing touchdowns in the four weeks since, giving us our third win of the year. Today's NFL is a passing league and virtually no team should be expected to score more rushing touchdowns than passing touchdowns over a long timeline.
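The percentages in the table come straight from those touchdown counts. A quick check of the arithmetic, where `pct_more` is just a helper name chosen here:

```python
def pct_more(a, b):
    """How much larger a is than b, expressed as a percentage of b."""
    return (a / b - 1) * 100

# First six weeks: 61 rushing touchdowns against 43 passing touchdowns.
rushing_edge_before = pct_more(61, 43)
# Four weeks since: 24 passing touchdowns against 18 rushing touchdowns.
passing_edge_since = pct_more(24, 18)

print(f"Before: {rushing_edge_before:.0f}% more rushing TDs")
print(f"Since:  {passing_edge_since:.0f}% more passing TDs")
```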
As for our interception prediction, two defenses from our high-interception group (the Bills and the Cowboys) combined to intercept a whopping seven passes (assisted by a favorable matchup against the hapless Jets). Despite that, 56% of our "high-interception" teams finished without a single interception compared to just 41% of our "low-interception" teams, which means our low-interception defenses managed to widen their advantage. Unless Group A combines to intercept 13 more passes than Group B this week, that prediction should cruise to another victory.
I said last week that schedule luck should immediately vanish from one sample to the next, and boy did it. In fact, our "lucky" teams finished with slightly fewer wins than their all-play record would have predicted last week, while our "unlucky" teams notched slightly more. Our "good but unlucky" squads went 5-2 in Week 10, but the real story of the day was our six "bad but lucky" squads combining for just a single win (and five losses). Somewhat appropriately, the one win belonged to a team that posted the second-lowest score of the week and lucked his way into a game against the one team that scored even less.
Gambler's Fallacy and Regression to the Mean
The goal of this column is to convince you to view regression to the mean as a force of nature, implacable and inevitable, a mathematical certainty. I can generate a list of players and, without knowing a single thing about any of them, predict which ones will perform better going forward and which will perform worse. I like to say that I don't want any analysis in this column to be beyond the abilities of a moderately precocious 10-year-old.
But it's important that we give regression to the mean as much respect as it deserves... and not one single solitary ounce more.
This is difficult, because regression is essentially the visible arm of random variation, and our brains are especially bad at dealing with genuine randomness. We're just not wired that way. We see patterns in everything. There's even a name for this hardwired tendency to "discover" patterns in random data: apophenia.
A fun example of apophenia is pareidolia, or the propensity to "see" faces in random places. Our ancestors used to tell stories of the "Man in the Moon". We... type silly faces to communicate emotion over the internet. Yes, pareidolia is why I can type a colon and a close paren and you'll immediately know that I'm happy and being playful. :)
Our ability to "see" these faces is surprisingly robust. -_- is just three short lines, and not only do most people see a face, they also mentally assign it a specific mood. '.' works as well. With very subtle changes, I can convey massive differences in that mood. (¬‿¬) and (¬_¬) are remarkably similar, and yet the interpreted moods are drastically different.
(Did I start this entire digression just to have a thinly-veiled excuse to post some of my favorite emoticons? ¯\_(ツ)_/¯)
If some of the faces look weird to you, you might lack the necessary fonts to render them properly, in which case I'm terribly sorry. :-(
Another less-endearing manifestation of apophenia is formally called gambler's fallacy (and informally called "the reason Las Vegas keeps building bigger casinos"). We look at random sequences of events and instead of seeing faces, we see trends. A roulette wheel might land on 7 three times in six spins and suddenly we think the number 7 is "hot". Or a wheel might not land on 00 for three hundred straight spins and now we believe that 00 is "due". But randomness doesn't work that way; the odds of a roulette wheel landing on a number when it's "hot" are exactly the same as the odds of it landing on that number when it's "cold" (1 in 38 on an American-style "double zero" roulette wheel).
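If you'd rather test that claim than take my word for it, here is a rough simulation sketch. The definitions are my own simplifications, chosen so that enough cases show up in the sample: a number counts as "hot" if it hit at least once in the previous six spins and "cold" if it hasn't hit in at least 38 straight spins. The function name, the seed, and the number of spins are arbitrary, and the pockets are simply numbered 0 through 37.

```python
import random

def simulate_roulette(spins=500_000, pockets=38, target=7, seed=0):
    """Estimate how often `target` hits right after a 'hot' or 'cold' stretch."""
    rng = random.Random(seed)
    tally = {"hot": [0, 0], "cold": [0, 0]}  # label -> [hits, spins observed]
    gap = 0           # spins since the target last hit (or since the start)
    seen_hit = False  # the target cannot be "hot" before its first hit
    for _ in range(spins):
        # Classify the wheel's recent history BEFORE the next spin happens.
        if seen_hit and gap < 6:
            label = "hot"    # target hit within the previous six spins
        elif gap >= 38:
            label = "cold"   # target absent for at least 38 straight spins
        else:
            label = None     # neither hot nor cold, so not tallied
        hit = rng.randrange(pockets) == target
        if label is not None:
            tally[label][0] += hit
            tally[label][1] += 1
        if hit:
            gap, seen_hit = 0, True
        else:
            gap += 1
    return {label: hits / seen for label, (hits, seen) in tally.items()}

rates = simulate_roulette()
print(f"true odds:            {1 / 38:.4f}")
print(f"after a hot stretch:  {rates['hot']:.4f}")
print(f"after a cold stretch: {rates['cold']:.4f}")
```

Both estimated rates should land close to 1 in 38, because the wheel has no memory of what came before.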
It's very tempting to see regression to the mean as the universe's enforcement mechanism for the gambler's fallacy. Wide receiver Jakobi Meyers entered last week with 1,531 career yards and zero career touchdowns, so surely he was "due". Surely Regression to the Mean(tm) would intervene, would guarantee that a touchdown was in the cards. (Never mind the fact that Meyers had been "due" for quite a while before that point and regression to the mean had not yet worked its magic.)
But a player's "true touchdown rate" after a long cold streak is exactly the same as his "true touchdown rate" after a long hot streak. Regression to the mean doesn't magically force cold streaks to follow hot streaks to restore balance to the universe. In fact, a player is just as likely to follow up a hot streak with another hot streak as he is to follow it with a cold streak. (Witness poor Jakobi Meyers following cold streak after cold streak after cold streak before finally breaking the drought.)
Over his first eight seasons in the NFL, DeAndre Hopkins scored one touchdown for every 167 yards he gained. That was probably reasonably close to his "true scoring rate". So far this season he's scored a touchdown for every 69 receiving yards, a far faster pace than his previous career average. If Hopkins has another 700 yards and 0 touchdowns this year, he'll finish right around 167 yards per touchdown again, his "true rate". That would certainly appeal to our sense of fairness. Balance would be restored to the universe.
But going 700 yards without a touchdown would be wildly unlikely for a guy with a "true scoring rate" of 167 yards per touchdown. Just as unlikely as scoring 7 touchdowns on 486 yards. Instead, we'd expect him to score about 4 touchdowns on that kind of workload (700 divided by 167 is roughly 4.2). And if that's the case, because of his hot streak to start the year, Hopkins would finish the year with a touchdown rate much higher than we would have expected based on his previous history.
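Here is the Hopkins arithmetic worked through as a short Python sketch. The yardage and touchdown figures come from the paragraphs above. The probability estimates at the end are my own addition and rest on a loud assumption: that touchdowns arrive like a Poisson process at his career rate, which is a convenient simplification and not a claim about how football actually works.

```python
import math

CAREER_YARDS_PER_TD = 167          # first eight seasons, per the column
yards_so_far, tds_so_far = 486, 7  # this season so far, per the column
remaining_yards = 700              # hypothetical rest-of-season workload

season_yards = yards_so_far + remaining_yards

# Gambler's fallacy scenario: zero touchdowns the rest of the way.
balanced_rate = season_yards / tds_so_far

# Regression scenario: he simply scores at his career rate from here on.
expected_tds = remaining_yards / CAREER_YARDS_PER_TD
regressed_rate = season_yards / (tds_so_far + round(expected_tds))

def poisson_pmf(k, lam):
    """Chance of exactly k events when lam are expected (Poisson assumption)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# ASSUMPTION: touchdowns follow a Poisson process at the career rate.
p_drought = poisson_pmf(0, remaining_yards / CAREER_YARDS_PER_TD)
lam_start = yards_so_far / CAREER_YARDS_PER_TD
# Chance of 7 or more touchdowns on 486 yards: one minus the chance of 0 to 6.
p_hot_start = 1 - sum(poisson_pmf(k, lam_start) for k in range(tds_so_far))

print(f"Yards per TD if he never scores again: {balanced_rate:.0f}")
print(f"Expected TDs on {remaining_yards} more yards: {expected_tds:.1f}")
print(f"Yards per TD if he scores at his career rate: {regressed_rate:.0f}")
print(f"Chance of a 700-yard drought: {p_drought:.1%}")
print(f"Chance of 7+ TDs on 486 yards: {p_hot_start:.1%}")
```

Under that rough model, both the 700-yard drought and the seven-touchdown start come out as low single-digit percentages, while the regression scenario leaves Hopkins at roughly 108 yards per touchdown for the season, still well ahead of his career pace.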
This burning need to find patterns whether any patterns exist or not can be a real hindrance in fantasy football. When we see a player on a lucky streak, we'll think he's "hot" and his luck will continue going forward. Or we'll think he's "due" and his luck will reverse going forward.
But the universe, the very nature of randomness itself, is unimpressed by our expectations. This is why many smart analysts prefer the term "reversion to the mean" instead of "regression to the mean", because it doesn't imply any specific directional force. When a player is coming off a particularly lucky stretch, the most likely result isn't another lucky stretch. And it's not an unlucky stretch, either. The expectation instead should be neutral luck. The expectation should be that the player in question simply... reverts back to his true mean going forward.