Drinen's Notebook: Thursday, October 10, 2002

A week 6 cheatsheet from down the hall

As you may know, I'm a mathematician. If you exit my office and turn in either direction, you'll find more mathematicians. If you get past the mathematicians to the left, you'll hit the chemistry department. To the right, physics. Most of the people on my hallway are fine folks, and I've learned a lot from them. But they all share a common flaw: they don't know anything about football. And I mean literally nothing.

But fantasy football is about statistics. And the people on my hallway do know statistics. They understand how to properly use data to draw conclusions. In my week two notebook, I took some data down the hall (metaphorically) and asked the question: "what would this data say to someone who knew absolutely nothing about football?"

I'm going to do the same thing this week with a different set of data. In particular, I'm going to take a stack of data consisting of every running back's statistics for every game during the past two years, and I'm going to (metaphorically) ask my colleagues to tell me who the top RBs of week 6 will be. I don't pretend this method will produce profound and revolutionary answers. It has obvious drawbacks, which I'll discuss at the end of the article. But it is objective. The people on my hall don't own Priest Holmes. They didn't just trade for Ricky Williams. And they sure have never been burned by a Fred Taylor injury. In short, they don't care. And that puts them in a good position to assess the situation with no preconceived notions.

OK, let's get started. If you want to know how well RB X will do against team Y, you first need to know:

How good is running back X?
How good is team Y at stopping running backs?

Those are the two basic ingredients in the model. The question is: what is the relative importance of each of those factors in making a sensible prediction? Is it 50/50? Is it more like 70/30 toward the RB? 90/10?

Now some details...

To answer the question "How good is RB X?" I'll simply use his average fantasy points per game. To answer the question "how good is team Y at stopping running backs?" I'll use their fantasy points allowed per game to RBs compared to the league average. For example, so far this year, the Jaguars have allowed RBs to collect an average of 13.5 points per game. Pretty good. The league average is 20.4 points per game for RBs. So the Jags D is 6.9 points per game better than average at stopping opposing RBs.
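The rating arithmetic can be written out in a few lines. This is only an illustrative Python sketch: the function name `defense_rating` is made up for this example, and the two inputs are the Jaguars figures quoted above.

```python
def defense_rating(points_allowed_per_game, league_average):
    """Fantasy points allowed to RBs per game, relative to the league average.

    Negative means the defense allows fewer points than average (a good D).
    """
    return points_allowed_per_game - league_average


# Jaguars so far this year: 13.5 allowed per game vs. a league average of 20.4.
jags_rating = defense_rating(13.5, 20.4)
print(round(jags_rating, 1))  # -6.9
```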

So I took data from every game played in 2000 and 2001 and recorded the following:

Going into the game, what was the RB's fantasy point per game average?
Going into the game, what was the defense's rating (e.g. the Jags would be -6.9 -- see the calculation above)?
How many fantasy points did the RB score in that game?

Because we only care about sorting out the RBs who might be startable, I threw out all games where the RB was averaging fewer than 7 points per game. Also, because per-game averages can look pretty funky, I threw out all instances where the RB had not yet played at least four games.

Then we feed all this data (703 games' worth, as it turns out) to the computer and tell the computer to run something called a linear regression. The computer tells me the following:

Both variables (the RB's per-game average and the defense's rating) are unquestionably useful in predicting how well the RB will do. We would have suspected this, but it's nice to have it confirmed.
Based on the data we have, the (linear) equation that best models the situation is:
Predicted fantasy points = .86*(RB fant. pt. average) + .18*(defense rating) + 1.7
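For readers curious what "run a linear regression" involves, here is a rough sketch in plain Python. It is not the software actually used for this article (that isn't specified); it is an ordinary least-squares fit solved through the normal equations. The function names and the handful of rows are invented for illustration, and the rows are synthetic numbers built from a known formula, not real game data.

```python
def keep_game(rb_avg, games_played, min_avg=7.0, min_games=4):
    """Apply the two sample filters described above."""
    return rb_avg >= min_avg and games_played >= min_games


def solve(a, b):
    """Solve the small linear system a*x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column (partial pivoting).
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(rows):
    """Least-squares fit of points = a*rb_avg + b*def_rating + c.

    rows: list of (rb_avg, def_rating, points_scored) tuples.
    Returns (a, b, c), found by solving the normal equations.
    """
    design = [(avg, rating, 1.0) for avg, rating, _ in rows]
    ys = [pts for _, _, pts in rows]
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(3)] for i in range(3)]
    xty = [sum(x[i] * y for x, y in zip(design, ys)) for i in range(3)]
    return tuple(solve(xtx, xty))


# Made-up rows (rb_avg, games_played, def_rating, points_scored), NOT real games.
# Points for the kept rows follow 0.5*avg + 0.25*rating + 2, so the fit is checkable.
raw = [
    (10.0, 5, -4.0, 6.0),
    (12.0, 6, 4.0, 9.0),
    (16.0, 8, 0.0, 10.0),
    (20.0, 4, -8.0, 10.0),
    (8.0, 7, 8.0, 8.0),
    (5.0, 9, 2.0, 30.0),   # dropped: averaging fewer than 7 points
    (15.0, 2, 1.0, 30.0),  # dropped: fewer than four games played
]

sample = [(avg, rating, pts) for avg, games, rating, pts in raw if keep_game(avg, games)]
a, b, c = fit_linear(sample)
print(round(a, 2), round(b, 2), round(c, 2))  # approximately 0.5 0.25 2.0
```

With the real 703-game sample in place of the synthetic rows, the three fitted numbers would play the roles of the .86, .18, and 1.7 in the equation above.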

Take, for example, Eddie George against Jacksonville this weekend. Eddie's current average is 11.6 points per game. The Jags' D, as we discussed earlier, has a rating of -6.9. Plug it all in, and we come up with a projection of .86*11.6 + .18*(-6.9) + 1.7, which is 10.4.
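The same plug-in calculation, written as a small Python function. The name `project_points` is just a label chosen for this sketch, and the coefficients are the rounded ones from the equation above.

```python
def project_points(rb_avg, def_rating):
    """Projection from the fitted equation quoted above (rounded coefficients)."""
    return 0.86 * rb_avg + 0.18 * def_rating + 1.7


# Eddie George (11.6 points per game) against the Jaguars (rating -6.9).
print(round(project_points(11.6, -6.9), 1))  # 10.4
```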

Last week, I spilled a lot of electronic ink on the topic of playing matchups. Using this formula, we can try to get a handle on just how much weight the strength of the opposing defense should get. In particular, every point of defense rating is worth .18 points of projected fantasy points. Typically, the difference between the best defense in the NFL and the worst is about 20 rating points. 20 times .18 is 3.6, which is about what a 4-point gap in RB average is worth (4 times .86 is roughly 3.4). So, at least according to this formula, you should almost never start a 12-point-per-game RB ahead of a 16-point-per-game RB, no matter how good the matchup looks for the 12-pointer or how bad it looks for the 16-pointer; only the most lopsided matchup imaginable turns it into a coin flip. Since every point of defensive rating is worth .18 and every point in the RB's average is worth .86, you could say, very roughly speaking, that the running back's numbers are about five times more important than the defense's numbers in predicting how the RB will do.
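Here is that back-of-the-envelope arithmetic as a short Python sketch. The constant names are invented for this example, and the 20-point spread is the rough figure given above, not a measured value. Note that the break-even gap in RB average comes out slightly above four points, so the 12-versus-16 comparison sits close to the boundary.

```python
RB_WEIGHT = 0.86    # projected points per point of RB average
DEF_WEIGHT = 0.18   # projected points per point of defense rating
DEF_SPREAD = 20     # rough best-to-worst gap in defense rating (approximate)

max_matchup_swing = DEF_WEIGHT * DEF_SPREAD     # most a matchup can move a projection
break_even_gap = max_matchup_swing / RB_WEIGHT  # RB-average gap that offsets that swing
weight_ratio = RB_WEIGHT / DEF_WEIGHT           # relative weight of RB vs. defense

print(round(max_matchup_swing, 1))  # 3.6
print(round(break_even_gap, 1))     # 4.2
print(round(weight_ratio, 1))       # 4.8
```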

I want to take this just one step further. As I've written about on numerous occasions, the ability to gain yards is more stable than the ability to score TDs. As a result, RBs who get most of their points from yards are, in general, more likely to keep scoring points in the future than RBs who get a bigger proportion of their points from TDs. So I'm going to re-run the analysis, but instead of feeding the computer the RB's fantasy point average, I'm going to feed it his yards-per-game average and TDs-per-game average separately.

A remarkable thing happens.

TDs are irrelevant. The people on my hallway would tell me to forget about TDs altogether. If you already know yards, they'd tell me, then knowing TDs does not improve the model at all. Not a bit. And the model is better with just yards than it is with just fantasy points. Here's the new and improved model:

Predicted fantasy points = .14*(RB yards-per-game average) + .17*(defense rating) + .3
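As a sketch, the yards-based formula looks like this in Python. The function name and the 100-yard example back are hypothetical. Because the published coefficients are rounded, this sketch will not necessarily reproduce the week 6 projections to the decimal.

```python
def project_from_yards(yards_per_game, def_rating):
    """Yards-based projection using the rounded coefficients quoted above."""
    return 0.14 * yards_per_game + 0.17 * def_rating + 0.3


# Hypothetical back averaging 100 yards per game:
print(round(project_from_yards(100, 0), 1))   # 14.3 against an average defense
print(round(project_from_yards(100, -7), 1))  # 13.1 against a defense rated -7
```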

Let's see what the formula says about week 6:


 Name                  OPP  FPPG   Y/G  DEF  PROJ
-------------------------------------------------
 Priest Holmes         sdg  28.1   161   -7  20.7
 Charlie Garner        ram  23.9   149   -3  19.9
 LaDainian Tomlinson   kan  20.0   140    2  19.5
 Fred Taylor           ten  18.9   144   -2  19.4
 Ahman Green           nwe  13.5   135    1  18.7
 Ricky Williams        den  20.2   142   -6  18.5
 Jamal Lewis           ind  14.0   125    5  18.0
 Edgerrin James        bal  14.4   129   -4  17.1
 Deuce McAllister      was  18.1   121   -2  16.4
 Shaun Alexander       sfo  19.8   108    0  14.9
 Corey Dillon          pit  13.0   106   -1  14.5
 Garrison Hearst       sea  11.5    85   13  13.9
 Marshall Faulk        oak  15.2   104   -2  13.9
 Tiki Barber           atl  12.6   102   -1  13.9
 Stephen Davis         nor  14.9   104   -4  13.7
 James Stewart         min  10.7    87    7  13.2
 Travis Henry          hou  17.0    98   -3  13.1
 Lamar Smith           dal  17.6   104   -9  12.9
 Moe Williams          det  12.6    81    8  12.6
 Michael Pittman       cle   8.2    82    3  11.9
 Kevan Barlow          sea   8.2    67   13  11.5
 Michael Bennett       det   8.7    72    8  11.4
 Jamel White           tam   9.4    82   -7  10.3
 Antowain Smith        gnb   7.8    66    6  10.1
 Clinton Portis        mia   9.5    71   -4   9.2
 Emmitt Smith          car   8.7    75   -8   9.1
 Eddie George          jax  11.6    68   -7   8.3
 Olandis Gary          mia   7.2    57   -4   7.3
 Warrick Dunn          nyg  11.0    50   -2   6.7
 Stacey Mack           ten   9.0    30   -2   4.1
 John Simon            jax   7.1    26   -7   2.7

FPPG = the RB's fantasy points per game average
Y/G = the RB's yards per game average
DEF = the defense's rating (negative = good D, positive = bad D)
PROJ = projected fantasy points for week 6

I have to turn this in before the official Footballguys week 6 cheatsheets come out, so I can't compare it to what that list looks like, but I'll speculate that it differs in the following ways. First and foremost, it doesn't take into account injuries and role changes. Second, it probably has guys like Ahman Green and Michael Pittman (high yards compared to TDs) higher than most lists. Third, this list probably weights strength-of-opponent less than most lists do. After two weeks of looking pretty hard at the issue, I'm now convinced that most people put too much emphasis on matchups.

Again, it's important to remember where this came from. This list is, in theory, the analysis of someone who knows nothing at all about football. The goal of this exercise is to obtain an objective and unbiased starting point for your RB projections. You are not only allowed but encouraged to tweak this list. But you should only tweak it based on things that are not included in the model. Move Garner down if you think his stratospheric numbers are something of a fluke. Move people down if they're injured or benched. Move them up if their role is likely to expand (Portis?). But don't move them because they're playing a weak defense. That's already figured into the model (unless you believe the defense has suffered significant changes that don't show up in the stats -- a recent string of key injuries, for example).

Lessons learned

The main point of all this was to take these three facts:

1. RBs who have done well in the past will generally continue to do well in the future.
2. RBs who have relied on yardage in the past will generally do better in the future than similar-scoring RBs who have relied on TDs for their scoring.
3. Defenses who have prevented RBs from scoring in the past will generally continue to prevent them from scoring.

And determine exactly how important each of them is in relation to the others. What we learned is that:

Item #1 above (the quality of the RB) is much more important than item #3 (the quality of the D). The last model we looked at suggests that an RB who is averaging 120 yards per game but is facing the best D in the NFL would be a (very slightly) better bet than an RB who is averaging 95 and facing the worst D in the NFL. I said above that the quality of the RB is much more important than the quality of the D, but that's not exactly true. I suspect that, in actuality, both factors are roughly of equal importance. But the problem, as we talked about last week, is that it's very hard to know the true quality of the defense. For that reason, it doesn't make sense to weight that factor very heavily in making your weekly decisions.
Item #2 seems to be somewhat useful in making predictions. Odd as this sounds, past yardage is a better predictor of future fantasy points than past fantasy points are.
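The 120-versus-95 comparison can be checked with the yards-based formula. This Python sketch makes one assumption that is not spelled out in the article: the roughly 20-point spread between the best and worst defenses is split evenly around average, giving ratings of -10 and +10. The function and constant names are invented for this example.

```python
def project_from_yards(yards_per_game, def_rating):
    """Yards-based projection using the rounded coefficients from the model."""
    return 0.14 * yards_per_game + 0.17 * def_rating + 0.3


# Assumption: the ~20-point best-to-worst spread is centered on the league average.
BEST_D, WORST_D = -10, 10

strong_rb_tough_matchup = project_from_yards(120, BEST_D)
weaker_rb_easy_matchup = project_from_yards(95, WORST_D)

print(round(strong_rb_tough_matchup, 1))  # 15.4
print(round(weaker_rb_easy_matchup, 1))   # 15.3
```

Under those assumptions the 120-yard back comes out ahead, but only by about a tenth of a point.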

Is this model any good? Well, "good" is a relative concept. I'm sure this model alone wouldn't fare too well in a cheatsheet competition. For one thing, it has no knowledge of injuries, and it refuses to say anything about RBs who are not averaging at least 7 fantasy points per game or who haven't played in at least 4 games (so it would've missed Damien Anderson last week, for example). But I do believe that if you took the rankings produced by this formula and altered them only to take into account known injuries and role changes, they would stack up very well against other cheatsheets. I could be wrong about that, but I don't think so.

If there is interest (let me know), I'll cook up a WR and QB (and maybe TE, I suppose) version of these rankings as well and post them on a weekly basis.

Unless otherwise noted, all stats come from football-reference.com and the disclaimer applies