Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions.
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of all my predictions from last year and how they fared.
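For readers who like seeing the mechanics, here's a minimal sketch of that ranking-and-splitting process. Everything below is invented for illustration (hypothetical players and a made-up yards-per-target column); the real metric changes from week to week.

```python
def build_groups(players, metric, group_size):
    """Rank players by a metric: the top outliers form Group A,
    the bottom outliers form Group B."""
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:group_size], ranked[-group_size:]

# Invented example data: yards per target for four hypothetical receivers.
players = [
    {"name": "Receiver A", "ypt": 11.2},
    {"name": "Receiver B", "ypt": 9.8},
    {"name": "Receiver C", "ypt": 6.1},
    {"name": "Receiver D", "ypt": 5.3},
]

group_a, group_b = build_groups(players, "ypt", 2)
print([p["name"] for p in group_a])  # high outliers -> Group A
print([p["name"] for p in group_b])  # low outliers -> Group B
```

The key design constraint is that the split is purely mechanical: once the metric is chosen, the sort decides the groups, not me.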
THE SCORECARD
In Week 2, I laid out our guiding principles for Regression Alert. No specific prediction was made.
In Week 3, I discussed why yards per carry is the least useful statistic and predicted that the rushers with the lowest yard-per-carry average to that point would outrush the rushers with the highest yard-per-carry average going forward.
In Week 4, I explained why touchdowns follow yards (but yards don't follow back) and predicted that the players with the fewest touchdowns per yard gained would outscore the players with the most touchdowns per yard gained going forward.
| Statistic For Regression | Performance Before Prediction | Performance Since Prediction | Weeks Remaining |
| --- | --- | --- | --- |
| Yards per Carry | Group A had 24% more rushing yards per game | Group B has 8% more rushing yards per game | 2 |
| Yards:Touchdown Ratio | Group A had 28% more fantasy points per game | Group B has 69% more fantasy points per game | 3 |
"Yards per carry is completely random", exhibit #103,962,411: in week 4, our "high-YPC" cohort averaged 4.219 yards per carry. Our "low-YPC" cohort averaged 4.224 yards per carry. Yards per carry is totally not actually a thing.
Also, if you were hoping for a dramatic result, our yard-to-touchdown ratio prediction last week certainly delivered one. Consider: over the first three weeks, our high-touchdown cohort scored 36 touchdowns in 39 player-games, or 0.92 touchdowns per game. Our low-touchdown cohort scored 4 touchdowns in 36 games, or 0.11 touchdowns per game.
In Week 4, 66% of "low-touchdown" players reached the end zone, and the group collectively scored 0.75 touchdowns per player game. Meanwhile, only 23% of "high-touchdown" players reached the end zone, and the group collectively scored 0.38 touchdowns per game, about half as many as Group B scored.
There's still a long way to go on both predictions, but the early returns show why I chose to focus attention here first.
REVISITING PRESEASON EXPECTATIONS
In October of 2013, I wondered just how many weeks it took before the early-season performance wasn't a fluke anymore. In "Revisiting Preseason Expectations", I looked back at the 2012 season and compared how well production in a player's first four games predicted production in his last 12 games. And since that number was meaningless without context, I also measured how well his preseason ADP predicted production in those last 12 games.
It was a fortuitous time to ask that question, as it turns out, because I discovered that after four weeks in 2012, preseason ADP still predicted performance going forward better than early-season production did.
This is the kind of surprising result that I love, but the thing about surprising results is that sometimes the reason they're surprising is really just because they're flukes. So in October of 2014, I revisited "Revisiting Preseason Expectations". This time I found that in the 2013 season, preseason ADP and week 1-4 performance held essentially identical predictive power for the rest of the season.
With two different results in two years, I decided to keep up my quest for a definitive answer about whether early-season results or preseason expectations were more predictive down the stretch. In October of 2015, I revisited my revisitation of "Revisiting Preseason Expectations". This time, I found that early-season performance held a slight predictive edge over preseason ADP.
With things still so inconclusive, in October of 2016, I decided to revisit my revisitation of the revisited "Revisiting Preseason Expectations". As in 2015, I found that early-season performance carried slightly more predictive power than preseason ADP.
To no one's surprise, I couldn't leave well enough alone in October 2017, once more revisiting the revisited revisitation of the revisited "Revisiting Preseason Expectations". This time I once again found that preseason ADP and early-season performance were roughly equally predictive, with a slight edge to preseason ADP.
And now, as you've probably guessed, it's time for an autumn tradition as sacred as turning off the lights and pretending I'm not home on October 31st. It's time for "Revisiting Preseason Expectations"! (Or, I guess technically for Revisiting Revisiting Revisiting Revisiting Revisiting Revisiting Preseason Expectations.)
METHODOLOGY
If you've read the previous pieces, you have a rough idea of how this works, but here's a quick rundown of the methodology. I have compiled a list of the top 24 quarterbacks, 36 running backs, 48 wide receivers, and 24 tight ends by 2017 preseason ADP.
From that list, I have removed any player who missed more than one of his team’s first four games or more than two of his team’s last twelve games so that any fluctuations represent performance and not injury. As always, we’re looking by team games rather than by week, so players with an early bye aren't skewing the comparisons.
I’ve used PPR scoring for this exercise because that was easier for me to look up with the databases I had on hand. For the remaining players, I tracked where they ranked at their position over the first four games and over the final twelve games. Finally, I’ve calculated the correlation between preseason ADP and stretch performance, as well as the correlation between early performance and stretch performance.
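Concretely, that final step is just a Pearson correlation between two rank lists. Here's a small self-contained sketch of the calculation; the ranks below are invented for illustration, not the actual 2017 data:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Invented positional ranks for five hypothetical players.
adp_rank   = [1, 2, 3, 4, 5]   # preseason ADP
early_rank = [2, 3, 1, 5, 4]   # finish over games 1-4
late_rank  = [2, 1, 4, 3, 5]   # finish over games 5-16

print(round(pearson(adp_rank, late_rank), 3))    # ADP vs. stretch run
print(round(pearson(early_rank, late_rank), 3))  # early games vs. stretch run
```

Since both inputs are already ranks, correlating them this way is equivalent to a Spearman rank correlation on the underlying scoring.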
Here's the data.
QUARTERBACK
| Player | ADP | Games 1-4 | Games 5-16 |
| --- | --- | --- | --- |
| | 2 | 1 | 9 |
| | 3 | 6 | 13 |
| | 4 | 22 | 15 |
| | 5 | 3 | 1 |
| | 6 | 23 | 21 |
| | 9 | 13 | 22 |
| | 10 | 15 | 2 |
| | 11 | 10 | 5 |
| | 12 | 21 | 8 |
| | 13 | 20 | 3 |
| | 14 | 5 | 14 |
| | 15 | 14 | 6 |
| | 16 | 24 | 18 |
| | 17 | 12 | 28 |
| | 20 | 18 | 16 |
| | 21 | 30 | 27 |
| | 22 | 2 | 7 |
The correlation between ADP and late-season performance was 0.252.
The correlation between early-season performance and late-season performance was 0.431.
RUNNING BACK
| Player | ADP | Games 1-4 | Games 5-16 |
| --- | --- | --- | --- |
| | 2 | 3 | 2 |
| | 3 | 15 | 6 |
| | 4 | 5 | 25 |
| Melvin Gordon | 5 | 17 | 5 |
| | 7 | 33 | 35 |
| | 8 | 32 | 20 |
| | 9 | 11 | 17 |
| | 10 | 1 | 3 |
| | 12 | 47 | 28 |
| | 13 | 42 | 18 |
| | 14 | 20 | 9 |
| | 15 | 2 | 8 |
| | 17 | 14 | 21 |
| | 18 | 8 | 11 |
| | 20 | 34 | 34 |
| C.J. Anderson | 21 | 12 | 30 |
| | 23 | 22 | 52 |
| Mark Ingram | 24 | 25 | 4 |
| | 26 | 21 | 37 |
| | 30 | 25 | 33 |
| | 31 | 29 | 57 |
| | 32 | 19 | 29 |
| | 34 | 30 | 22 |
The correlation between ADP and late-season performance was 0.540.
The correlation between early-season performance and late-season performance was 0.447.
WIDE RECEIVER
| Player | ADP | Games 1-4 | Games 5-16 |
| --- | --- | --- | --- |
| | 1 | 2 | 2 |
| | 2 | 27 | 5 |
| | 4 | 9 | 25 |
| | 5 | 4 | 18 |
| | 6 | 6 | 56 |
| | 7 | 5 | 8 |
| | 8 | 12 | 15 |
| | 9 | 24 | 26 |
| | 11 | 15 | 13 |
| | 12 | 10 | 3 |
| | 13 | 23 | 32 |
| | 15 | 3 | 1 |
| | 16 | 38 | 16 |
| | 17 | 52 | 44 |
| | 18 | 19 | 23 |
| | 19 | 49 | 47 |
| | 20 | 29 | 35 |
| | 21 | 22 | 11 |
| | 22 | 8 | 10 |
| | 23 | 17 | 12 |
| | 24 | 7 | 6 |
| | 26 | 1 | 39 |
| | 28 | 88 | 24 |
| | 29 | 13 | 4 |
| | 32 | 31 | 45 |
| | 37 | 81 | 49 |
| | 38 | 35 | 50 |
| | 39 | 34 | 38 |
| | 40 | 33 | 54 |
| | 42 | 14 | 9 |
| Marvin Jones | 44 | 59 | 7 |
| | 47 | 30 | 41 |
| Ted Ginn Jr | 48 | 51 | 34 |
The correlation between ADP and late-season performance was 0.349.
The correlation between early-season performance and late-season performance was 0.412.
TIGHT END
| Player | ADP | Games 1-4 | Games 5-16 |
| --- | --- | --- | --- |
| | 1 | 1 | 2 |
| | 2 | 3 | 1 |
| | 4 | 18 | 3 |
| | 5 | 23 | 6 |
| | 6 | 2 | 4 |
| | 7 | 7 | 8 |
| | 8 | 24 | 9 |
| | 9 | 15 | 5 |
| | 10 | 19 | 28 |
| | 11 | 55 | 65 |
| | 12 | 13 | 17 |
| | 13 | 6 | 11 |
| | 14 | 8 | 7 |
| | 15 | 20 | 24 |
| | 16 | 9 | 29 |
| | 17 | 4 | 14 |
| | 20 | 10 | 12 |
| | 21 | 28 | 30 |
| | 22 | 83 | 63 |
| | 23 | 32 | 44 |
| | 24 | 42 | 35 |
The correlation between ADP and late-season performance was 0.636.
The correlation between early-season performance and late-season performance was 0.857.
Overall
Across all positions, the correlation between ADP and late-season performance was 0.456.
The correlation between early-season performance and late-season performance was 0.570.
After six years of running this article and with eight years of collected data, how do things stand? Here are the correlations at each position. (I've only run positional breakdowns for the past four years and the two-factor averages for the past two years, hence the shorter charts.)
Quarterback

| Season | ADP | Early-Season | Avg of Both |
| --- | --- | --- | --- |
| 2014 | 0.422 | -0.019 | |
| 2015 | 0.260 | 0.215 | |
| 2016 | 0.200 | 0.404 | 0.367 |
| 2017 | 0.252 | 0.431 | 0.442 |
| Average | 0.284 | 0.258 | 0.405 |
Running Back

| Season | ADP | Early-Season | Avg of Both |
| --- | --- | --- | --- |
| 2014 | 0.568 | 0.472 | |
| 2015 | 0.309 | 0.644 | |
| 2016 | 0.597 | 0.768 | 0.821 |
| 2017 | 0.540 | 0.447 | 0.610 |
| Average | 0.503 | 0.583 | 0.715 |
Wide Receiver

| Season | ADP | Early-Season | Avg of Both |
| --- | --- | --- | --- |
| 2014 | 0.333 | 0.477 | |
| 2015 | 0.648 | 0.632 | |
| 2016 | 0.551 | 0.447 | 0.576 |
| 2017 | 0.349 | 0.412 | 0.443 |
| Average | 0.470 | 0.492 | 0.510 |
Tight End

| Season | ADP | Early-Season | Avg of Both |
| --- | --- | --- | --- |
| 2014 | -0.051 | 0.416 | |
| 2015 | 0.295 | 0.559 | |
| 2016 | 0.461 | 0.723 | 0.716 |
| 2017 | 0.634 | 0.857 | 0.891 |
| Average | 0.335 | 0.639 | 0.803 |
Overall

| Season | ADP | Early-Season | Avg of Both |
| --- | --- | --- | --- |
| 2010-2012 | 0.578 | 0.471 | |
| 2013 | 0.649 | 0.655 | |
| 2014 | 0.466 | 0.560 | |
| 2015 | 0.548 | 0.659 | |
| 2016 | 0.599 | 0.585 | 0.682 |
| 2017 | 0.456 | 0.570 | 0.608 |
| Average | 0.557 | 0.555 | 0.645 |
At quarterback, running back, and wide receiver, two of the last four seasons have favored preseason ADP and two of the last four seasons have favored early-season performance. Overall, I'd say the two factors are basically in perfect balance (or as close to perfect as you'll get from something of this nature).
At tight end, each of the last four years has favored early-season performance, which is enough for me to believe this might be a trend. This isn't to say that regression doesn't happen-- players still tend to move in the direction of preseason ADP. It's just to say that the spot they finally settle in tends to be closer to early-season performance.
Overall, though, this year just reinforces my prior belief that four games' worth of stats gives us no more and no less information on a player than an offseason of study. If one person drafted a new team today straight from preseason ADP, and another drafted straight from current year-to-date rankings, both teams would probably do about equally well.
But the idea that it has to be either preseason ADP or early-season production is a false dichotomy. Most of us are closet Bayesians, which means we start with an opinion and update it with new evidence. In that case, we've reached the point of the season where we should give roughly equal weight to both factors.
Indeed, a simple average of preseason ADP and ranking through four games correlates with rest-of-year outcomes better than either factor alone, producing a robust 0.608 last year after a 0.682 correlation in 2016. And I've demonstrated in the past that an award-winning projector like Bob Henry can outperform even that average, though no matter what we do, we'll probably never get correlations much higher than 0.700.
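That blend is nothing fancier than averaging the two rank lists before correlating. A sketch with invented ranks (hypothetical players, not the actual data) shows how an equal-weight average can beat either input alone:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (math.sqrt(sum((x - mx) ** 2 for x in xs))
                  * math.sqrt(sum((y - my) ** 2 for y in ys)))

# Invented ranks for six hypothetical players.
adp   = [1, 2, 3, 4, 5, 6]  # preseason ADP
early = [4, 1, 6, 2, 5, 3]  # finish over games 1-4
late  = [2, 1, 5, 3, 6, 4]  # finish over games 5-16

# Equal-weight blend of the two predictors.
blended = [(a + e) / 2 for a, e in zip(adp, early)]

print(round(pearson(adp, late), 3))      # ADP alone
print(round(pearson(early, late), 3))    # early-season alone
print(round(pearson(blended, late), 3))  # blend beats both here
```

The intuition is the Bayesian one from above: each predictor carries some independent information, so averaging them cancels some of each one's noise.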
(As an aside, in the past I've seen a study similar to this that used points scored from the previous year instead of preseason ADP, and that study discovered that week three is the informational tipping point. This led to the quip that all of the hard work we put in during the offseason is basically to buy us one extra week before being wrong.)
This week's prediction is "early-season performances will tend to regress toward preseason ADP". This prediction isn't easily trackable, so I won't be adding it to the scorecard going forward. I have a hunch that I might just be revisiting this prediction sometime during October of 2019, though...