Regression Alert: Week 5

By Adam Harstad, published 10/03/2019

Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.

For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.

In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
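For anyone who thinks better in code, the Group A / Group B setup can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the process behind the column: the `Player` record, the `split_groups` and `average_points` helpers, the group size, and every number in the sample are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    metric: float           # the stat under test, e.g. yards per carry
    points_per_game: float  # fantasy production so far

def split_groups(players, group_size):
    """Rank by the metric: the top slice is Group A, the bottom slice is Group B."""
    if 2 * group_size > len(players):
        raise ValueError("groups would overlap")
    ranked = sorted(players, key=lambda p: p.metric, reverse=True)
    return ranked[:group_size], ranked[-group_size:]

def average_points(group):
    return sum(p.points_per_game for p in group) / len(group)

# Made-up players and numbers, purely for illustration.
players = [
    Player("Back 1", 6.1, 15.0),
    Player("Back 2", 5.4, 13.5),
    Player("Back 3", 4.2, 12.0),
    Player("Back 4", 3.6, 11.0),
    Player("Back 5", 3.1, 10.5),
    Player("Back 6", 2.8, 9.0),
]

group_a, group_b = split_groups(players, group_size=2)

# Precondition for making a prediction: Group A has outscored Group B so far.
a_leads_so_far = average_points(group_a) > average_points(group_b)
```

The prediction itself is then just the claim that the same comparison will flip when it is rerun on the weeks that follow.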

Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions. On a case-by-case basis, it's easy to find reasons why any given player is going to buck the trend and sustain production. So I constrain myself and remove my ability to rationalize on a case-by-case basis.

Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of all my predictions from last year and how they fared. Here's a similar list from 2017.


The Scorecard

In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.

In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.

In Week 4, I explained why touchdowns follow yards (but yards don't follow back), and predicted that the players with the fewest touchdowns per yard gained would outscore the players with the most touchdowns per yard gained going forward.

| Statistic For Regression | Performance Before Prediction | Performance Since Prediction | Weeks Remaining |
| --- | --- | --- | --- |
| Yards per Carry | Group A had 20% more rushing yards per game | Group B has 42% more rushing yards per game | 2 |
| Yard:Touchdown Ratio | Group A had 23% more points per game | Group B has 15% more points per game | 3 |

Another week, another big edge for our high-volume, low-ypc backs. Alvin Kamara led all Group A running backs with just 17 carries, a total that seven out of nine backs in Group B managed to top. Through two weeks, Group B has a crushing 50% advantage in carries per game, while Group A's yards-per-carry edge has fallen from 2.31 yards all the way down to 0.21 yards.

As for our high-touchdown receivers, they did manage to keep a slight touchdown edge over their Group B counterparts, averaging 0.33 per game compared to 0.18 per game. But five out of nine Group A receivers failed to even reach 15 yards (compared to just one out of eleven Group B receivers), and Group B took an early scoring lead.


REVISITING PRESEASON EXPECTATIONS

In October of 2013, I wondered just how many weeks it took before the early-season performance wasn't a fluke anymore. In "Revisiting Preseason Expectations", I looked back at the 2012 season and compared how well production in a player's first four games predicted production in his last 12 games. And since that number was meaningless without context, I compared how his preseason ADP predicted production in his last 12 games.

It was a fortuitous time to ask that question, as it turns out, because I discovered that after four weeks in 2012, preseason ADP still predicted performance going forward better than early-season production did.

This is the kind of surprising result that I love, but the thing about surprising results is that sometimes the reason they're surprising is really just because they're flukes. So in October of 2014, I revisited "Revisiting Preseason Expectations". This time I found that in the 2013 season, preseason ADP and week 1-4 performance held essentially identical predictive power for the rest of the season.

With two different results in two years, I decided to keep up my quest for a definitive answer about whether early-season results or preseason expectations were more predictive down the stretch. In October of 2015, I revisited my revisitation of "Revisiting Preseason Expectations". This time, I found that early-season performance held a slight predictive edge over preseason ADP.

With things still so inconclusive, in October of 2016, I decided to revisit my revisitation of the revisited "Revisiting Preseason Expectations". As in 2015, I found that this time early-season performance carried slightly more predictive power than preseason ADP.

To no one's surprise, I couldn't leave well enough alone in October 2017, once more revisiting the revisited revisitation of the revisited "Revisiting Preseason Expectations". This time I once again found that preseason ADP and early-season performance were roughly equally predictive, with a slight edge to preseason ADP.

And of course, as a creature of habit, when October 2018 rolled around I simply had to revisit my revisitation of the revisited revisited revisitation of "Revisiting Preseason Expectations".

And now, as you've probably guessed, it's time for an autumn tradition as sacred as turning off the lights and pretending I'm not home on October 31st. It's time for "Revisiting Preseason Expectations"! (Or, I guess technically for Revisiting Revisiting Revisiting Revisiting Revisiting Revisiting Revisiting Preseason Expectations.)

METHODOLOGY

If you've read the previous pieces, you have a rough idea of how this works, but here's a quick rundown of the methodology. I have compiled a list of the top 24 quarterbacks, 36 running backs, 48 wide receivers, and 24 tight ends by 2018 preseason ADP.

From that list, I have removed any player who missed more than one of his team’s first four games or more than two of his team’s last twelve games so that any fluctuations represent performance and not injury. As always, we’re looking by team games rather than by week, so players with an early bye aren't skewing the comparisons.
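The availability filter is simple enough to write down. Here is a minimal sketch of the rule as described; the `PlayerSeason` record, its field names, the `is_eligible` helper, and the sample players are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PlayerSeason:
    name: str
    early_games_missed: int  # out of the team's first 4 games
    late_games_missed: int   # out of the team's last 12 games

def is_eligible(player, max_early_missed=1, max_late_missed=2):
    """Keep a player unless he missed more than one early game or more than two late games."""
    return (player.early_games_missed <= max_early_missed
            and player.late_games_missed <= max_late_missed)

# Made-up players, purely for illustration.
sample = [
    PlayerSeason("Player A", 0, 0),  # kept
    PlayerSeason("Player B", 2, 0),  # dropped: missed two of the first four
    PlayerSeason("Player C", 1, 2),  # kept: right at both limits
    PlayerSeason("Player D", 0, 3),  # dropped: missed three of the last twelve
]

eligible = [p.name for p in sample if is_eligible(p)]
```

Counting missed team games rather than calendar weeks is what keeps an early bye from looking like a missed game.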

I’ve used PPR scoring for this exercise because that was easier for me to look up with the databases I had on hand. For the remaining players, I tracked where they ranked at their position over the first four games and over the final twelve games. Finally, I’ve calculated the correlation between preseason ADP and stretch performance, as well as the correlation between early performance and stretch performance.

Correlation is a measure of how strongly one list resembles another list. The highest possible correlation is 1.000, which is what you get when two lists are identical. The lowest possible correlation is 0.000, which is what you get when you compare one list of numbers to a second list that has no relationship whatsoever. (Correlations can actually go down to -1.000, which means the higher something ranks in one list the lower it tends to rank in the other, but negative correlations aren’t really relevant for this exercise.)

So if guys who were drafted high in preseason tend to score a lot of points from weeks 5-16, and this tendency is strong, we’ll see correlations closer to 1. If they don’t tend to score more points, or they do but the tendency is very weak, we’ll see correlations closer to zero. The numbers themselves don’t matter beyond “higher = more predictable”.

Here's the raw data for anyone curious. If you're willing to take my word for it, I'd recommend just skipping ahead to the "Overall" section below for averages and key takeaways.

QUARTERBACK

Player ADP Games 1-4 Games 5-16
Aaron Rodgers 1 16 7
Deshaun Watson 2 9 5
Russell Wilson 3 21 6
Tom Brady 4 19 10
Cam Newton 5 7 14
Drew Brees 6 5 12
Kirk Cousins 8 6 16
Andrew Luck 9 12 3
Matthew Stafford 10 17 24
Philip Rivers 11 8 13
Ben Roethlisberger 12 10 2
Matt Ryan 13 2 4
Jared Goff 14 4 11
Patrick Mahomes II 15 1 1
Dak Prescott 16 24 8
Marcus Mariota 17 29 25
Derek Carr 18 18 23
Mitchell Trubisky 20 14 17
Case Keenum 22 27 19
Eli Manning 23 23 18

The correlation between ADP and late-season performance was 0.435.
The correlation between early-season performance and late-season performance was 0.505.
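For the code-inclined, the two quarterback numbers can be checked in a few lines of Python. The article doesn't name the formula, so treating it as a plain Pearson correlation computed directly on the positional ranks is an assumption; it does land on 0.435 and 0.505 for the quarterback table above. Whether every other position was computed the same way isn't stated. The `pearson` helper and the `qb_rows` name are just for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# (ADP rank, rank in games 1-4, rank in games 5-16), copied from the quarterback table.
qb_rows = [
    (1, 16, 7), (2, 9, 5), (3, 21, 6), (4, 19, 10), (5, 7, 14),
    (6, 5, 12), (8, 6, 16), (9, 12, 3), (10, 17, 24), (11, 8, 13),
    (12, 10, 2), (13, 2, 4), (14, 4, 11), (15, 1, 1), (16, 24, 8),
    (17, 29, 25), (18, 18, 23), (20, 14, 17), (22, 27, 19), (23, 23, 18),
]
adp, early, late = zip(*qb_rows)

adp_vs_late = pearson(adp, late)      # preseason ADP vs. games 5-16
early_vs_late = pearson(early, late)  # games 1-4 vs. games 5-16

print(f"{adp_vs_late:.3f} {early_vs_late:.3f}")  # prints "0.435 0.505"
```

Two identical lists give 1.0 and a perfectly reversed list gives -1.0, which lines up with the description of correlation in the methodology section.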

RUNNING BACK

Player ADP Games 1-4 Games 5-16
Ezekiel Elliott 2 5 4
David Johnson 3 15 9
Alvin Kamara 4 1 5
Saquon Barkley 5 4 2
Todd Gurley 6 2 3
Christian McCaffrey 11 6 1
Jordan Howard 13 27 22
LeSean McCoy 14 55 37
Kenyan Drake 15 37 13
Derrick Henry 16 54 11
Lamar Miller 17 32 23
Royce Freeman 18 30 59
Sony Michel 19 42 29
Jamaal Williams 21 51 43
Dion Lewis 22 23 30
Tevin Coleman 23 21 21
Rashaad Penny 24 63 63
Carlos Hyde 26 9 80
Tarik Cohen 27 22 10
Duke Johnson Jr 29 50 36
Adrian Peterson 32 14 25
Chris Carson 33 38 16
Nick Chubb 34 47 17
Peyton Barber 35 59 24

The correlation between ADP and late-season performance was 0.428.
The correlation between early-season performance and late-season performance was 0.387.

WIDE RECEIVER

Player ADP Games 1-4 Games 5-16
Antonio Brown 1 17 2
DeAndre Hopkins 2 5 4
Julio Jones 4 8 5
Michael Thomas 5 1 7
Keenan Allen 6 31 10
Davante Adams 7 16 1
Mike Evans 8 3 11
Tyreek Hill 9 6 3
Stefon Diggs 10 9 13
T.Y. Hilton 11 24 14
Amari Cooper 12 33 20
Adam Thielen 13 2 8
Jarvis Landry 14 23 22
JuJu Smith-Schuster 15 12 6
Larry Fitzgerald 17 69 21
Brandin Cooks 18 10 17
Golden Tate 20 4 53
Chris Hogan 21 63 70
Corey Davis 23 29 37
Michael Crabtree 25 51 61
Jordy Nelson 27 30 51
Robert Woods 29 15 12
Devin Funchess 30 42 71
Robby Anderson 32 90 27
Nelson Agholor 34 40 40
Sterling Shepard 36 26 38
Kenny Stills 37 36 67
D.J. Moore 38 75 30
Calvin Ridley 39 11 32
Kelvin Benjamin 40 84 97
Mike Williams 41 37 31
Keelan Cole 42 44 106
Kenny Golladay 43 20 26
Josh Doctson 44 114 50
Michael Gallup 45 105 65
Tyler Lockett 47 27 18
Anthony Miller 48 89 47

The correlation between ADP and late-season performance was 0.645.
The correlation between early-season performance and late-season performance was 0.568.

TIGHT END

Player ADP Games 1-4 Games 5-16
Travis Kelce 2 2 1
Zach Ertz 3 3 2
Jimmy Graham 4 11 13
Kyle Rudolph 6 6 10
Trey Burton 7 8 8
David Njoku 8 22 7
George Kittle 9 4 3
Mike Gesicki 10 53 50
Eric Ebron 11 5 4
Jared Cook 12 1 6
Vance McDonald 13 14 12
Cameron Brate 14 26 22
Austin Hooper 17 16 5
Antonio Gates 18 28 33
Ricky Seals-Jones 19 18 36
Dallas Goedert 20 24 23
Ben Watson 21 17 27
Nick Vannett 22 41 28
Vernon Davis 23 39 26
Gerald Everett 24 63 18

The correlation between ADP and late-season performance was 0.537.
The correlation between early-season performance and late-season performance was 0.856.

OVERALL

Across all positions, the correlation between ADP and late-season performance was 0.642.
The correlation between early-season performance and late-season performance was 0.598.

After seven years of running this article and with nine years of collected data, how do things stand? Here are the correlations at each position. (I've only run positional breakdowns for the past five years and the two-factor averages for the past three years, hence the shorter charts.)

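Quarterback

| Season | ADP | Early-Season | Avg of Both |
| --- | --- | --- | --- |
| 2014 | 0.422 | -0.019 | |
| 2015 | 0.260 | 0.215 | |
| 2016 | 0.200 | 0.404 | 0.367 |
| 2017 | 0.252 | 0.431 | 0.442 |
| 2018 | 0.435 | 0.505 | 0.579 |
| Average | 0.314 | 0.307 | 0.463 |

Running Back

| Season | ADP | Early-Season | Avg of Both |
| --- | --- | --- | --- |
| 2014 | 0.568 | 0.472 | |
| 2015 | 0.309 | 0.644 | |
| 2016 | 0.597 | 0.768 | 0.821 |
| 2017 | 0.540 | 0.447 | 0.610 |
| 2018 | 0.428 | 0.387 | 0.447 |
| Average | 0.488 | 0.544 | 0.627 |

Wide Receiver

| Season | ADP | Early-Season | Avg of Both |
| --- | --- | --- | --- |
| 2014 | 0.333 | 0.477 | |
| 2015 | 0.648 | 0.632 | |
| 2016 | 0.551 | 0.447 | 0.576 |
| 2017 | 0.349 | 0.412 | 0.443 |
| 2018 | 0.645 | 0.568 | 0.650 |
| Average | 0.505 | 0.507 | 0.556 |

Tight End

| Season | ADP | Early-Season | Avg of Both |
| --- | --- | --- | --- |
| 2014 | -0.051 | 0.416 | |
| 2015 | 0.295 | 0.559 | |
| 2016 | 0.461 | 0.723 | 0.716 |
| 2017 | 0.634 | 0.857 | 0.891 |
| 2018 | 0.537 | 0.856 | 0.708 |
| Average | 0.375 | 0.682 | 0.772 |

Overall

| Season | ADP | Early-Season | Avg of Both |
| --- | --- | --- | --- |
| 2010-2012 | 0.578 | 0.471 | |
| 2013 | 0.649 | 0.655 | |
| 2014 | 0.466 | 0.560 | |
| 2015 | 0.548 | 0.659 | |
| 2016 | 0.599 | 0.585 | 0.682 |
| 2017 | 0.456 | 0.570 | 0.608 |
| 2018 | 0.642 | 0.598 | 0.668 |
| Average | 0.566 | 0.555 | 0.645 |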

At quarterback, two of the last five seasons have favored preseason ADP. At running back and wide receiver, three of the past five seasons have favored preseason ADP. Across those three positions, ADP and early-season performance have been virtually identical when it came to predicting rest-of-year performance.

At tight end, early-season performance has outperformed preseason ADP all five times and it's never been close. We can speculate why this has been the case all we want, but at this point, it is abundantly clear that early-season performance is substantially more predictive than preseason ADP at the tight end position (and only at the tight end position).
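The season-by-season tallies above are easy to double-check against the positional tables. A small sketch follows; the correlations are copied from those tables, while the `by_position` layout and the helper names are just illustrative choices.

```python
# (ADP correlation, early-season correlation) for 2014 through 2018,
# copied from the positional tables above.
by_position = {
    "QB": [(0.422, -0.019), (0.260, 0.215), (0.200, 0.404), (0.252, 0.431), (0.435, 0.505)],
    "RB": [(0.568, 0.472), (0.309, 0.644), (0.597, 0.768), (0.540, 0.447), (0.428, 0.387)],
    "WR": [(0.333, 0.477), (0.648, 0.632), (0.551, 0.447), (0.349, 0.412), (0.645, 0.568)],
    "TE": [(-0.051, 0.416), (0.295, 0.559), (0.461, 0.723), (0.634, 0.857), (0.537, 0.856)],
}

def seasons_favoring_adp(pairs):
    """Count the seasons in which ADP had the higher correlation."""
    return sum(1 for adp, early in pairs if adp > early)

def mean(values):
    return sum(values) / len(values)

summary = {
    position: {
        "adp_wins": seasons_favoring_adp(pairs),
        "adp_avg": round(mean([adp for adp, _ in pairs]), 3),
        "early_avg": round(mean([early for _, early in pairs]), 3),
    }
    for position, pairs in by_position.items()
}

for position, stats in summary.items():
    print(position, stats)
```

Run as written, this gives ADP two wins at quarterback, three each at running back and wide receiver, and none at tight end, and the five-year averages match the "Average" rows in the tables.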

On the whole, though, 2018 continues to reinforce my prior belief that four games' worth of stats gives us no more and no less information on a player than an offseason of study. If one person drafted a new team today straight from preseason ADP, and another drafted straight from current year-to-date rankings, both teams would probably do about equally well.

But the idea that it has to be either preseason ADP or early-season production is a false dichotomy. Most of us are closet Bayesians, which means we start with an opinion and update it with new evidence. In that case, we've reached the point of the season where we should give roughly equal weight to both factors.

Indeed, a simple average of preseason ADP and ranking through four games correlates with rest-of-year outcomes better than either factor alone; over the past three years, the correlation has been 0.645, nearly ten points higher than either ADP or early-season performance alone. Except, again, at the tight end position, where early-season performance is so important that adding preseason ADP into the mix actually reduces predictive power. Again, we could speculate all day as to why this is the case, but at tight end, you really should just be going off of early-season performances and ignoring ADP entirely by this point.
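Here is a minimal sketch of the "simple average" idea, using the 2018 quarterback ranks from the table earlier in this article. The `blend` and `pearson` helpers and the `weight_on_adp` parameter are illustrative names, and a plain Pearson correlation on the ranks is an assumption rather than a documented method. At equal weights it comes out to 0.579, which is the 2018 quarterback "Avg of Both" figure.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def blend(adp_ranks, early_ranks, weight_on_adp=0.5):
    """Weighted average of two rankings; 0.5 is the simple average."""
    return [weight_on_adp * a + (1 - weight_on_adp) * e
            for a, e in zip(adp_ranks, early_ranks)]

# (ADP rank, rank in games 1-4, rank in games 5-16), copied from the quarterback table.
qb_rows = [
    (1, 16, 7), (2, 9, 5), (3, 21, 6), (4, 19, 10), (5, 7, 14),
    (6, 5, 12), (8, 6, 16), (9, 12, 3), (10, 17, 24), (11, 8, 13),
    (12, 10, 2), (13, 2, 4), (14, 4, 11), (15, 1, 1), (16, 24, 8),
    (17, 29, 25), (18, 18, 23), (20, 14, 17), (22, 27, 19), (23, 23, 18),
]
adp, early, late = zip(*qb_rows)

adp_only = pearson(adp, late)
early_only = pearson(early, late)
both = pearson(blend(adp, early), late)

print(f"ADP {adp_only:.3f}, early {early_only:.3f}, average of both {both:.3f}")
```

Changing `weight_on_adp` is a quick way to see how the blend behaves between the two extremes: 1.0 is ADP alone and 0.0 is early-season rank alone.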

And while I've only been looking today at ADP and early-season performance, there are plenty of other factors you can keep in mind when valuing players going forward. How have they looked? Has their role changed? Are they hurt? Is there any other news coming out about them? I've demonstrated in the past that an award-winning projector like Bob Henry can outperform even a simple average of ADP and early-season results because Bob is a smart guy and knows how to account for all that other stuff. (Though for everything we do we'll probably never get much higher than correlations of 0.700.)

(As an additional aside, in the past I've seen a study similar to this that used points scored from the previous year instead of preseason ADP, and that study discovered that week three is the informational tipping point. This led to the quip that all of the hard work we put in during the offseason is basically to buy us one extra week before being wrong.)

This week's prediction is "early-season performances will tend to regress toward preseason ADP". This prediction isn't easily trackable, so I won't be adding it to the scorecard going forward. I have a hunch that I might just be revisiting this prediction sometime during October of 2020, though...
