Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions. On a case-by-case basis, it's easy to find reasons why any given player is going to buck the trend and sustain production. So I constrain myself and remove my ability to rationalize on a case-by-case basis.
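If it helps to see that discipline spelled out, here's a minimal sketch in Python of how the groups get built: rank everyone by the chosen metric, take the top of the list as Group A and the bottom as Group B, no exceptions. (The player names and yards-per-target figures below are invented purely for illustration.)

```python
# A minimal sketch of the Group A / Group B split described above.
# Player names and yards-per-target values are made up for illustration.
def build_groups(players, metric, group_size):
    """Rank players by a metric; top outliers become Group A, bottom outliers Group B."""
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:group_size], ranked[-group_size:]

players = [
    {"name": "Receiver A", "yards_per_target": 11.2},
    {"name": "Receiver B", "yards_per_target": 9.8},
    {"name": "Receiver C", "yards_per_target": 7.4},
    {"name": "Receiver D", "yards_per_target": 5.9},
]
group_a, group_b = build_groups(players, "yards_per_target", group_size=2)
# The standing prediction: group_b outscores group_a going forward.
```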
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of all my predictions from last year and how they fared. Here's a similar list from 2017.
The Scorecard
In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I explained why touchdowns follow yards (but yards don't follow back) and predicted that the players with the fewest touchdowns per yard gained would outscore the players with the most touchdowns per yard gained going forward.
Statistic For Regression | Performance Before Prediction | Performance Since Prediction | Weeks Remaining |
---|---|---|---|
Yards per Carry | Group A had 20% more rushing yards per game | Group B has 42% more rushing yards per game | 2 |
Yard:Touchdown Ratio | Group A had 23% more points per game | Group B has 15% more points per game | 3 |
Another week, another big edge for our high-volume, low-ypc backs. Alvin Kamara led all Group A running backs with just 17 carries, a total that seven out of nine backs in Group B managed to top. Through two weeks, Group B has a crushing 50% advantage in carries per game, while Group A's yards-per-carry edge has fallen from 2.31 yards all the way down to 0.21 yards.
As for our high-touchdown receivers, they did manage to keep a slight touchdown edge over their Group B counterparts, averaging 0.33 per game compared to 0.18 per game. But five out of nine Group A receivers failed to even reach 15 yards (compared to just one out of eleven Group B receivers), and Group B took an early scoring lead.
REVISITING PRESEASON EXPECTATIONS
In October of 2013, I wondered just how many weeks it took before early-season performance wasn't a fluke anymore. In "Revisiting Preseason Expectations", I looked back at the 2012 season and compared how well production in a player's first four games predicted production in his last 12 games. And since that number was meaningless without context, I also measured how well his preseason ADP predicted production in those same 12 games.
It was a fortuitous time to ask that question, as it turns out, because I discovered that after four weeks in 2012, preseason ADP still predicted performance going forward better than early-season production did.
This is the kind of surprising result that I love, but the thing about surprising results is that sometimes the reason they're surprising is really just because they're flukes. So in October of 2014, I revisited "Revisiting Preseason Expectations". This time I found that in the 2013 season, preseason ADP and week 1-4 performance held essentially identical predictive power for the rest of the season.
With two different results in two years, I decided to keep up my quest for a definitive answer about whether early-season results or preseason expectations were more predictive down the stretch. In October of 2015, I revisited my revisitation of "Revisiting Preseason Expectations". This time, I found that early-season performance held a slight predictive edge over preseason ADP.
With things still so inconclusive, in October of 2016, I decided to revisit my revisitation of the revisited "Revisiting Preseason Expectations". As in 2015, I found that early-season performance carried slightly more predictive power than preseason ADP.
To no one's surprise, I couldn't leave well enough alone in October 2017, once more revisiting the revisited revisitation of the revisited "Revisiting Preseason Expectations". This time I once again found that preseason ADP and early-season performance were roughly equally predictive, with a slight edge to preseason ADP.
And of course, as a creature of habit, when October 2018 rolled around I simply had to revisit my revisitation of the revisited revisited revisitation of "Revisiting Preseason Expectations". That time, early-season performance once again came out ahead.
And now, as you've probably guessed, it's time for an autumn tradition as sacred as turning off the lights and pretending I'm not home on October 31st. It's time for "Revisiting Preseason Expectations"! (Or, I guess technically for Revisiting Revisiting Revisiting Revisiting Revisiting Revisiting Revisiting Preseason Expectations.)
METHODOLOGY
If you've read the previous pieces, you have a rough idea of how this works, but here's a quick rundown of the methodology. I have compiled a list of the top 24 quarterbacks, 36 running backs, 48 wide receivers, and 24 tight ends by 2018 preseason ADP.
From that list, I have removed any player who missed more than one of his team’s first four games or more than two of his team’s last twelve games, so that any fluctuations represent performance rather than injury. As always, we’re counting team games rather than weeks, so players with an early bye don’t skew the comparisons.
I’ve used PPR scoring for this exercise because that was easier for me to look up with the databases I had on hand. For the remaining players, I tracked where they ranked at their position over the first four games and over the final twelve games. Finally, I’ve calculated the correlation between preseason ADP and stretch performance, as well as the correlation between early performance and stretch performance.
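For readers who like to see the bookkeeping, here's a rough sketch of that pipeline in Python. The record layout (the games_missed and ppr_points fields) is hypothetical and just meant to show the filtering and ranking steps, not the actual databases I pulled from.

```python
# A rough sketch of the methodology above; field names are hypothetical.
def rank_players(players, score_fn):
    """Return each player's rank (1 = best) under the given scoring function."""
    ordered = sorted(players, key=score_fn, reverse=True)
    return {p["name"]: rank for rank, p in enumerate(ordered, start=1)}

def early_and_late_ranks(players):
    # Drop players who missed more than one of the first four team games or
    # more than two of the last twelve, so injury doesn't masquerade as regression.
    eligible = [p for p in players
                if p["games_missed_1_4"] <= 1 and p["games_missed_5_16"] <= 2]
    early = rank_players(eligible, lambda p: sum(p["ppr_points"][:4]))  # games 1-4
    late = rank_players(eligible, lambda p: sum(p["ppr_points"][4:]))   # games 5-16
    return eligible, early, late
```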
Correlation is a measure of how strongly one list resembles another. The highest possible correlation is 1.000, which is what you get when two lists are identical. A correlation of 0.000 is what you get when you compare one list of numbers to a second list that has no relationship to it whatsoever. (Correlations can actually go as low as -1.000, which means the higher something ranks in one list, the lower it tends to rank in the other, but negative correlations aren’t really relevant for this exercise.)
So if guys who were drafted high in preseason tend to score a lot of points from weeks 5-16, and this tendency is strong, we’ll see correlations closer to 1. If they don’t tend to score more points, or they do but the tendency is very weak, we’ll see correlations closer to zero. The numbers themselves don’t matter beyond “higher = more predictable”.
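If the scale is hard to picture, a quick toy example (assuming the garden-variety Pearson correlation, via numpy) shows the three landmarks; the lists themselves are made up.

```python
# Toy illustration of the correlation scale described above.
import numpy as np

a = [1, 2, 3, 4, 5]
print(np.corrcoef(a, [1, 2, 3, 4, 5])[0, 1])  #  1.0 -- identical lists
print(np.corrcoef(a, [4, 1, 3, 5, 2])[0, 1])  #  0.0 -- no relationship at all
print(np.corrcoef(a, [5, 4, 3, 2, 1])[0, 1])  # -1.0 -- perfectly reversed
```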
Here's the raw data for anyone curious. If you're willing to take my word for it, I'd recommend just skipping ahead to the "Overall" section below for averages and key takeaways.
QUARTERBACK
Player | ADP | Games 1-4 | Games 5-16 |
---|---|---|---|
Aaron Rodgers | 1 | 16 | 7 |
Deshaun Watson | 2 | 9 | 5 |
Russell Wilson | 3 | 21 | 6 |
Tom Brady | 4 | 19 | 10 |
Cam Newton | 5 | 7 | 14 |
Drew Brees | 6 | 5 | 12 |
Kirk Cousins | 8 | 6 | 16 |
Andrew Luck | 9 | 12 | 3 |
Matthew Stafford | 10 | 17 | 24 |
Philip Rivers | 11 | 8 | 13 |
Ben Roethlisberger | 12 | 10 | 2 |
Matt Ryan | 13 | 2 | 4 |
Jared Goff | 14 | 4 | 11 |
Patrick Mahomes II | 15 | 1 | 1 |
Dak Prescott | 16 | 24 | 8 |
Marcus Mariota | 17 | 29 | 25 |
Derek Carr | 18 | 18 | 23 |
Mitchell Trubisky | 20 | 14 | 17 |
Case Keenum | 22 | 27 | 19 |
Eli Manning | 23 | 23 | 18 |
The correlation between ADP and late-season performance was 0.435.
The correlation between early-season performance and late-season performance was 0.505.
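For the curious, these figures fall straight out of a plain Pearson correlation over the rank columns above; a quick check in Python lands on the same numbers.

```python
# Reproducing the quarterback correlations from the table above.
import numpy as np

adp        = [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 22, 23]
games_1_4  = [16, 9, 21, 19, 7, 5, 6, 12, 17, 8, 10, 2, 4, 1, 24, 29, 18, 14, 27, 23]
games_5_16 = [7, 5, 6, 10, 14, 12, 16, 3, 24, 13, 2, 4, 11, 1, 8, 25, 23, 17, 19, 18]

print(round(np.corrcoef(adp, games_5_16)[0, 1], 3))        # ADP vs. late-season (~0.435)
print(round(np.corrcoef(games_1_4, games_5_16)[0, 1], 3))  # early vs. late-season (~0.505)
```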
RUNNING BACK
Player | ADP | Games 1-4 | Games 5-16 |
---|---|---|---|
Ezekiel Elliott | 2 | 5 | 4 |
David Johnson | 3 | 15 | 9 |
Alvin Kamara | 4 | 1 | 5 |
Saquon Barkley | 5 | 4 | 2 |
Todd Gurley | 6 | 2 | 3 |
Christian McCaffrey | 11 | 6 | 1 |
Jordan Howard | 13 | 27 | 22 |
LeSean McCoy | 14 | 55 | 37 |
Kenyan Drake | 15 | 37 | 13 |
Derrick Henry | 16 | 54 | 11 |
Lamar Miller | 17 | 32 | 23 |
Royce Freeman | 18 | 30 | 59 |
Sony Michel | 19 | 42 | 29 |
Jamaal Williams | 21 | 51 | 43 |
Dion Lewis | 22 | 23 | 30 |
Tevin Coleman | 23 | 21 | 21 |
Rashaad Penny | 24 | 63 | 63 |
Carlos Hyde | 26 | 9 | 80 |
Tarik Cohen | 27 | 22 | 10 |
Duke Johnson Jr | 29 | 50 | 36 |
Adrian Peterson | 32 | 14 | 25 |
Chris Carson | 33 | 38 | 16 |
Nick Chubb | 34 | 47 | 17 |
Peyton Barber | 35 | 59 | 24 |
The correlation between ADP and late-season performance was 0.428.
The correlation between early-season performance and late-season performance was 0.387.
WIDE RECEIVER
Player | ADP | Games 1-4 | Games 5-16 |
---|---|---|---|
Antonio Brown | 1 | 17 | 2 |
DeAndre Hopkins | 2 | 5 | 4 |
Julio Jones | 4 | 8 | 5 |
Michael Thomas | 5 | 1 | 7 |
Keenan Allen | 6 | 31 | 10 |
Davante Adams | 7 | 16 | 1 |
Mike Evans | 8 | 3 | 11 |
Tyreek Hill | 9 | 6 | 3 |
Stefon Diggs | 10 | 9 | 13 |
T.Y. Hilton | 11 | 24 | 14 |
Amari Cooper | 12 | 33 | 20 |
Adam Thielen | 13 | 2 | 8 |
Jarvis Landry | 14 | 23 | 22 |
JuJu Smith-Schuster | 15 | 12 | 6 |
Larry Fitzgerald | 17 | 69 | 21 |
Brandin Cooks | 18 | 10 | 17 |
Golden Tate | 20 | 4 | 53 |
Chris Hogan | 21 | 63 | 70 |
Corey Davis | 23 | 29 | 37 |
Michael Crabtree | 25 | 51 | 61 |
Jordy Nelson | 27 | 30 | 51 |
Robert Woods | 29 | 15 | 12 |
Devin Funchess | 30 | 42 | 71 |
Robby Anderson | 32 | 90 | 27 |
Nelson Agholor | 34 | 40 | 40 |
Sterling Shepard | 36 | 26 | 38 |
Kenny Stills | 37 | 36 | 67 |
D.J. Moore | 38 | 75 | 30 |
Calvin Ridley | 39 | 11 | 32 |
Kelvin Benjamin | 40 | 84 | 97 |
Mike Williams | 41 | 37 | 31 |
Keelan Cole | 42 | 44 | 106 |
Kenny Golladay | 43 | 20 | 26 |
Josh Doctson | 44 | 114 | 50 |
Michael Gallup | 45 | 105 | 65 |
Tyler Lockett | 47 | 27 | 18 |
Anthony Miller | 48 | 89 | 47 |
The correlation between ADP and late-season performance was 0.645.
The correlation between early-season performance and late-season performance was 0.568.
TIGHT END
Player | ADP | Games 1-4 | Games 5-16 |
---|---|---|---|
Travis Kelce | 2 | 2 | 1 |
Zach Ertz | 3 | 3 | 2 |
Jimmy Graham | 4 | 11 | 13 |
Kyle Rudolph | 6 | 6 | 10 |
Trey Burton | 7 | 8 | 8 |
David Njoku | 8 | 22 | 7 |
George Kittle | 9 | 4 | 3 |
Mike Gesicki | 10 | 53 | 50 |
Eric Ebron | 11 | 5 | 4 |
Jared Cook | 12 | 1 | 6 |
Vance McDonald | 13 | 14 | 12 |
Cameron Brate | 14 | 26 | 22 |
Austin Hooper | 17 | 16 | 5 |
Antonio Gates | 18 | 28 | 33 |
Ricky Seals-Jones | 19 | 18 | 36 |
Dallas Goedert | 20 | 24 | 23 |
Ben Watson | 21 | 17 | 27 |
Nick Vannett | 22 | 41 | 28 |
Vernon Davis | 23 | 39 | 26 |
Gerald Everett | 24 | 63 | 18 |
The correlation between ADP and late-season performance was 0.537.
The correlation between early-season performance and late-season performance was 0.856.
Overall
Across all positions, the correlation between ADP and late-season performance was 0.642.
The correlation between early-season performance and late-season performance was 0.598.
After seven years of running this article and with nine years of collected data, how do things stand? Here are the correlations at each position. (I've only run positional breakdowns for the past five years and the two-factor averages for the past three years, hence the shorter charts.)
Quarterback
Season | ADP | Early-Season | Avg of Both |
---|---|---|---|
2014 | 0.422 | -0.019 | |
2015 | 0.260 | 0.215 | |
2016 | 0.200 | 0.404 | 0.367 |
2017 | 0.252 | 0.431 | 0.442 |
2018 | 0.435 | 0.505 | 0.579 |
Average | 0.314 | 0.307 | 0.463 |
Running Back
Season | ADP | Early-Season | Avg of Both |
---|---|---|---|
2014 | 0.568 | 0.472 | |
2015 | 0.309 | 0.644 | |
2016 | 0.597 | 0.768 | 0.821 |
2017 | 0.540 | 0.447 | 0.610 |
2018 | 0.428 | 0.387 | 0.447 |
Average | 0.488 | 0.544 | 0.627 |
|
Wide Receiver
Season | ADP | Early-Season | Avg of Both |
---|---|---|---|
2014 | 0.333 | 0.477 | |
2015 | 0.648 | 0.632 | |
2016 | 0.551 | 0.447 | 0.576 |
2017 | 0.349 | 0.412 | 0.443 |
2018 | 0.645 | 0.568 | 0.650 |
Average | 0.505 | 0.507 | 0.556 |
Tight End
Season | ADP | Early-Season | Avg of Both |
---|---|---|---|
2014 | -0.051 | 0.416 | |
2015 | 0.295 | 0.559 | |
2016 | 0.461 | 0.723 | 0.716 |
2017 | 0.634 | 0.857 | 0.891 |
2018 | 0.537 | 0.856 | 0.708 |
Average | 0.375 | 0.682 | 0.772 |
|
Overall
Season | ADP | Early-Season | Avg of Both |
---|---|---|---|
2010-2012 | 0.578 | 0.471 | |
2013 | 0.649 | 0.655 | |
2014 | 0.466 | 0.560 | |
2015 | 0.548 | 0.659 | |
2016 | 0.599 | 0.585 | 0.682 |
2017 | 0.456 | 0.570 | 0.608 |
2018 | 0.642 | 0.598 | 0.668 |
Average | 0.566 | 0.555 | 0.645 |
At quarterback, two of the last five seasons have favored preseason ADP. At running back and wide receiver, three of the past five seasons have favored preseason ADP. On the whole, ADP and early-season performance have been virtually identical at predicting rest-of-year performance.
At tight end, early-season performance has outperformed preseason ADP all five times, and it's never been close. We can speculate about why this is all we want, but at this point it is abundantly clear that early-season performance is substantially more predictive than preseason ADP at the tight end position (and only at the tight end position).
On the whole, though, 2018 continues to reinforce my prior belief that four games' worth of stats gives us no more and no less information about a player than an offseason of study. If one person drafted a new team today straight from preseason ADP, and another drafted straight from current year-to-date rankings, both teams would probably do about equally well.
But the idea that it has to be either preseason ADP or early-season production is a false dichotomy. Most of us are closet Bayesians, which means we start with an opinion and update it with new evidence. In that case, we've reached the point of the season where we should give roughly equal weight to both factors.
Indeed, a simple average of preseason ADP and ranking through four games correlates with rest-of-year outcomes better than either factor alone; over the past three years, the correlation has been 0.645, nearly ten points higher than either ADP or early-season performance alone. Except, again, at the tight end position, where early-season performance is so important that adding preseason ADP into the mix actually reduces predictive power. Again, we could speculate all day as to why this is the case, but at tight end, you really should just be going off of early-season performances and ignoring ADP entirely by this point.
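As a concrete sketch, the blend is nothing fancier than averaging the two rank lists before correlating. Run against the quarterback columns from the earlier snippet (adp, games_1_4, games_5_16), it lands right around the 0.579 in the quarterback chart above, though re-ranking the blended values first could nudge the third decimal.

```python
# A minimal sketch of the "average of both" blend described above.
import numpy as np

def blended_correlation(adp, early, late):
    """Correlate the mean of preseason ADP rank and early-season rank with late-season rank."""
    blend = [(a + e) / 2 for a, e in zip(adp, early)]
    return np.corrcoef(blend, late)[0, 1]

# e.g. blended_correlation(adp, games_1_4, games_5_16) with the quarterback lists
# from the earlier snippet gives roughly 0.58, edging out either factor on its own.
```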
And while I've only been looking today at ADP and early-season performance, there are plenty of other factors you can keep in mind when valuing players going forward. How have they looked? Has their role changed? Are they hurt? Is there any other news coming out about them? I've demonstrated in the past that an award-winning projector like Bob Henry can outperform even a simple average of ADP and early-season results because Bob is a smart guy and knows how to account for all that other stuff. (Though for everything we do we'll probably never get much higher than correlations of 0.700.)
(As an additional aside, in the past I've seen a study similar to this that used points scored from the previous year instead of preseason ADP, and that study discovered that week three is the informational tipping point. This led to the quip that all of the hard work we put in during the offseason is basically to buy us one extra week before being wrong.)
This week's prediction is "early-season performances will tend to regress toward preseason ADP". This prediction isn't easily trackable, so I won't be adding it to the scorecard going forward. I have a hunch that I might just be revisiting this prediction sometime during October of 2020, though...