Sports Reference Blog

Approximate Value: Methodology

The purpose of this page is to post the full details of the Approximate Value method, Doug Drinen's method of putting a single numerical value on any player's season, at any position, from any year (for now, it's just any year since 1950).

AV Editions/Changes

Beta Version 0.9.1, January 2008

Doug sketched out a rough outline of what the AV system is in these two posts.

Beta Version 0.9.2, April 2008

Two more posts were made in which Doug fine-tuned the system for defensive players.

Beta Version 0.9.3, May 2008

In the form of a Top 200 list, Doug posted an AV revision that was nearly the official release version of the stat.

Version 1.0, May 2008

After making a few small tweaks to the formula, Doug released the first official version of Approximate Value.

Version 1.1, June 2013

Added AV for punters and kickers.

What follows are Doug's words (and mine re: kickers and punters), describing how the system works and the thought process that went into its creation...

The Gruesome Details

You should consider this an evolving document, as I will probably be forever tweaking some of the constants described below. At this point, though, it's stable enough that I'm willing to post it.

Offense

Every team gets this many points to divvy up among its offensive players:

team_offense_points = 100 * (team offensive points per drive) / (league average offensive points per drive),

where

offensive points per drive = (7*(rushTD+passTD) + 3*FG) / (rushTD + passTD + turnovers + punts + FGA)
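As a rough sketch of the two formulas above, here is how they might look in Python. The function and argument names are my own, and the season totals are made up purely for illustration.

```python
def points_per_drive(rush_td, pass_td, fg, turnovers, punts, fga):
    """Offensive points per drive, with drives estimated as in the formula above."""
    points = 7 * (rush_td + pass_td) + 3 * fg
    drives = rush_td + pass_td + turnovers + punts + fga
    return points / drives


def team_offense_points(team_ppd, league_ppd):
    """Scale the team's points per drive so a league-average offense gets 100."""
    return 100 * team_ppd / league_ppd


# Hypothetical season totals (not real data).
team_ppd = points_per_drive(rush_td=15, pass_td=25, fg=20,
                            turnovers=25, punts=70, fga=25)
offense_points = team_offense_points(team_ppd, league_ppd=1.7)
```

With these invented numbers the team scores 340 points on 160 estimated drives, so an offense that is 25% better than the assumed league rate of 1.7 ends up with about 125 points to divvy up.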

Offensive line

As a unit, the offensive line for a given team will share this many points:

team_points_for_o_line = 5 / 11 * team_offense_points

How this figure was arrived at is discussed in part II; it goes back to this post, as do many of the constants in this method.

For each offensive lineman (and fullback and tight end), we define:

individual_points = [(games played) + 5*(games started)*(pos_multiplier)] * (all_pro_multiplier),

where pos_multiplier = 1.2 for tackles, 1.0 for guards and centers, 0.3 for fullbacks, and 0.2 for tight ends,

and all_pro_multiplier = 1.9 for first-team AP all-pro, 1.6 for second-team AP all-pro, and 1.3 for a pro bowler who was not first- or second-team all-pro. [NOTE: all_pro_multiplier is for tackles, guards, and centers only, not fullbacks or tight ends.]

Finally, each individual player receives this many points:

approx_value = (individual_points) / (sum of individual_points for all players on team) * (team_points_for_o_line)
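The offensive line steps can be sketched as follows. This is an illustration rather than the production code: the dictionary layout, position codes, and honor labels are assumptions of mine, and I am assuming a multiplier of 1.0 for players with no honors, which the text implies but does not state.

```python
POS_MULTIPLIER = {"T": 1.2, "G": 1.0, "C": 1.0, "FB": 0.3, "TE": 0.2}
# Assumption: a player with no honors gets a multiplier of 1.0.
ALL_PRO_MULTIPLIER = {"first_team": 1.9, "second_team": 1.6, "pro_bowl": 1.3, None: 1.0}
LINE_POSITIONS = {"T", "G", "C"}


def individual_points(pos, games, starts, honor=None):
    """Games/starts score; the honors multiplier applies to T, G, and C only."""
    base = games + 5 * starts * POS_MULTIPLIER[pos]
    multiplier = ALL_PRO_MULTIPLIER[honor] if pos in LINE_POSITIONS else 1.0
    return base * multiplier


def o_line_av(players, team_offense_points):
    """players maps a name to (pos, games, starts, honor); returns name -> AV."""
    team_points_for_o_line = 5 / 11 * team_offense_points
    points = {name: individual_points(*args) for name, args in players.items()}
    total = sum(points.values())
    return {name: p / total * team_points_for_o_line for name, p in points.items()}


# Hypothetical roster, for illustration only.
roster = {
    "LT": ("T", 16, 16, "first_team"),
    "LG": ("G", 16, 16, None),
    "C": ("C", 16, 16, "pro_bowl"),
    "RG": ("G", 12, 10, None),
    "RT": ("T", 16, 16, None),
    "TE": ("TE", 16, 16, "first_team"),  # honor is ignored for tight ends
    "FB": ("FB", 16, 8, None),
}
shares = o_line_av(roster, team_offense_points=110)
```

Whatever the roster looks like, the shares always add up to team_points_for_o_line, so honors and starts only change how the pie is cut, not its size.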

Skill-position players

Since we know the entire offensive unit will get team_offense_points, and we gave team_points_for_o_line of those to the line, we have:

team_points_for_skill_positions = team_offense_points - team_points_for_o_line

Now we split that up into two pieces:

team_points_for_rushers = team_points_for_skill_positions * (.22) * [(team_rsh_yards / team_total_yards ) / .37 ]

The .22 figure is again based on theory described in part II. The .37 is the average rushing-yards-to-total-yards ratio of all teams from 1970--present. So a team with a typical run-pass ratio will have 22% of its skill position points allotted to rushing. A team that was more run heavy will have more of its points allotted to rushing, and so on.

Now every individual player gets the following share:

approx_value = (rushing yards) / (team rushing yards) * team_points_for_rushers
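A minimal sketch of the rushing split, with constant and function names of my own choosing and invented yardage totals:

```python
RUSH_WEIGHT = 0.22        # share of skill-position points for a typical team
LEAGUE_RUSH_SHARE = 0.37  # average rushing yards / total yards


def team_points_for_rushers(team_offense_points, team_rush_yards, team_total_yards):
    """Skill-position points allotted to rushing, scaled by how run-heavy the team is."""
    team_points_for_o_line = 5 / 11 * team_offense_points
    skill_points = team_offense_points - team_points_for_o_line
    rush_share = team_rush_yards / team_total_yards
    return skill_points * RUSH_WEIGHT * rush_share / LEAGUE_RUSH_SHARE


def rushing_av(player_rush_yards, team_rush_yards, rusher_points):
    """A player's cut of the rushing pool, proportional to his rushing yards."""
    return player_rush_yards / team_rush_yards * rusher_points


# Hypothetical team: 110 offense points, 1850 of 5000 yards on the ground (37%).
pool = team_points_for_rushers(110, team_rush_yards=1850, team_total_yards=5000)
lead_back = rushing_av(1200, 1850, pool)
```

Because this made-up team runs exactly 37% of the time, its rushing pool is simply 22% of its 60 skill-position points, or about 13.2.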

Finally, we give a small bonus (or impose a small penalty) to running backs who had 200 or more carries and whose yards per carry average was much higher or lower than the league average:

bonus = .75 * [(yards per rush) - (league yards per rush by RBs)], if the player's yards per rush is better than league average.

penalty = 2 * [(yards per rush) - (league yards per rush by RBs)], if the player's yards per rush is worse than league average.
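The efficiency adjustment might be written like this. The function name and signature are assumptions of mine, and I am assuming the result is simply added to the player's rushing approx_value; note that the "penalty" comes out negative on its own because the difference is negative.

```python
def rushing_efficiency_adjustment(carries, yards, league_rb_ypc, min_carries=200):
    """Bonus (positive) or penalty (negative) for backs with enough carries."""
    if carries < min_carries:
        return 0.0
    diff = yards / carries - league_rb_ypc
    if diff > 0:
        return 0.75 * diff
    return 2 * diff


# Hypothetical backs against an assumed league average of 4.0 yards per rush.
good_back = rushing_efficiency_adjustment(250, 1250, league_rb_ypc=4.0)  # 5.0 ypc
poor_back = rushing_efficiency_adjustment(250, 875, league_rb_ypc=4.0)   # 3.5 ypc
part_timer = rushing_efficiency_adjustment(100, 600, league_rb_ypc=4.0)  # too few carries
```

The asymmetry is deliberate in the formulas: a back a full yard above average gains 0.75 points, while a back half a yard below average loses a full point.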

Note that quarterbacks, wide receivers, and anyone else who compiles rushing yards is eligible to get approximate value points at this stage.

Now onto the passers and receivers....

team_points_for_passers = (team_points_for_skill_positions - team_points_for_rushers) * .26

(See part II for an explanation of the .26.)

So that leaves:

team_points_for_receivers = (team_points_for_skill_positions - team_points_for_rushers) * .74

Anyone who had a receiving yard gets this many AV points:

approx_value = (receiving yards) / (team receiving yards) * team_points_for_receivers

(Eventually, I might want to work in a touchdown bonus here, but for now there isn't one.)

And similarly for passers.

approx_value = (passing yards) / (team passing yards) * team_points_for_passers
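Here is a small sketch of the passer/receiver split and the yardage shares. The helper names are mine and the numbers are invented; the same share function serves both passers and receivers since the two formulas have the same shape.

```python
def split_passing_game(skill_points, rusher_points):
    """Divide what is left after rushing: 26% to passers, 74% to receivers."""
    remaining = skill_points - rusher_points
    return {"passers": 0.26 * remaining, "receivers": 0.74 * remaining}


def yardage_share_av(player_yards, team_yards, pool):
    """A player's cut of a pool, proportional to his share of team yards."""
    return player_yards / team_yards * pool


# Hypothetical team with 60 skill-position points, 13.2 of them given to rushers.
pools = split_passing_game(skill_points=60, rusher_points=13.2)
qb_av = yardage_share_av(3000, 3150, pools["passers"])
wr_av = yardage_share_av(1100, 3150, pools["receivers"])
```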

And, as with rushers, we add an efficiency adjustment here:

bonus = .5 * [(Adjusted yards per attempt) - (League average adjusted yards per attempt)], if the player's AYPA was better than league average.

penalty = 2 * [(Adjusted yards per attempt) - (League average adjusted yards per attempt)], if the player's AYPA was worse than league average.

Defense

First a bit of a preface...

In part II I said this:

I’ll just state upfront that this is a case where I’m not necessarily opposed to tweaking the metric until it gives us results we’re happy with, instead of picking a theoretical basis and forcing ourselves to stick with it. As I quoted Bill James in the last post: “These approximations are not intended to tell you anything at all about the player that you do not already know.” They’re not supposed to teach us new things; they’re merely supposed to codify the things we already know, so it’s OK to cook the books a little bit until they do tell us what we already know. The problem here is that none of us really knows how to compare Tarik Glenn’s 2006 to Gary Clark’s 1991. And to the extent that we do “know,” we all “know” different things. The point is: while I do think we need some sort of theory to get us started in certain areas, I won’t be too apologetic about making some arbitrary changes if a strict application of the theory leads us to “wrong” answers.

On the offensive side, I let the theory drive the method for the most part, only tweaking the constants a little bit.

On the defensive side of the ball, things just aren't so clear. If we split a defense's production into rushing defense and passing defense, to what extent do we then divvy up the passing defense points between pass rushers and pass defenders? I really don't have a clue how to answer that question in general. How much credit do linebackers get for pass defense versus run defense? I don't know. How do we account for the fact that some teams use three linemen and four linebackers and some do the opposite?

I'd rather admit ignorance than pretend to know. So where that leaves me is with defensive players being treated somewhat similarly to offensive linemen. That is, while stats will figure into it to some extent, a defensive player's rating will be largely based on how many games he played, how many games he started, how good his team was defensively, and whether he garnered any all-pro or pro bowl honors.

Here goes....

team_defense_points = 100 * [ (1 + 2 M - M^2) / (2 M) ],

where M = (team defensive points allowed per drive) / (league average defensive points allowed per drive)

team_points_for_front_7 = (2/3) * team_defense_points

team_points_for_secondary = (1/3) * team_defense_points

These two calculations come from the same theory that was used to determine some of the constants on the offensive side. Namely, this post.
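To get a feel for the shape of the team defense formula, here is a short sketch (function names are mine). A league-average defense has M = 1 and gets exactly 100 points; allowing fewer points per drive than average pushes M below 1 and the total above 100.

```python
def team_defense_points(m):
    """m = team points allowed per drive / league average points allowed per drive."""
    return 100 * (1 + 2 * m - m ** 2) / (2 * m)


def split_defense(points):
    """Two thirds to the front seven, one third to the secondary."""
    return {"front7": 2 / 3 * points, "secondary": 1 / 3 * points}


average = team_defense_points(1.0)  # league-average defense
stingy = team_defense_points(0.8)   # allows 20% fewer points per drive
porous = team_defense_points(1.2)   # allows 20% more points per drive
pools = split_defense(stingy)
```

With these inputs the stingy defense gets about 122.5 points and the porous one about 81.7, so the curve rewards good defenses somewhat more than it punishes bad ones by the same margin.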

Now, for all defensive players, we compute:

individual_points = [(games played) + 5*(games started) + sacks + 4*(fumble recoveries) + 4*(interceptions) + 5*(defensive TDs) + (tkl_constant)*(tackles)] + (all_pro_bonus),

where

tkl_constant = 0 if the year is before 1994, and otherwise, tkl_constant = .6 if the player is a defensive lineman, .3 if the player is a linebacker, and 0 if the player is a defensive back.

all_pro_bonus = (all_pro_level)*(year_multiplier),

where

all_pro_level = 1.5 for first-team all-pro, 1.0 for second-team all-pro, and 0.5 for pro bowler

year_multiplier = (year_constant) * (number_of_games_multiplier),

where year_constant = 40 for the pre-sack years (1970--1981) and 80 for the post-sack years (1982--present), and
number_of_games_multiplier = (number of games played by each team in that season) / 16

The point of all these fudge factors is to try to maintain a somewhat equal weight on making the pro bowl across seasons where the stat part of the equation is larger or smaller leaguewide. For example, in 2007, when players get credit for sacks and play 16 games, the stat part of the equation will tend to be larger than in 1972, where players don't get credit for sacks and played 14 games. Thus we need different bonuses, or else all-pro-ness will be diluted in 2007 compared to 1972.
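The defensive individual_points calculation, including the fudge factors, can be sketched as below. The position-group codes and honor labels are my own, a player with no honors is assumed to get a bonus of 0, and since the text only defines year_constant from 1970 on, treating earlier seasons like 1970--1981 is an assumption of this sketch.

```python
ALL_PRO_LEVEL = {"first_team": 1.5, "second_team": 1.0, "pro_bowl": 0.5, None: 0.0}


def tackle_constant(year, pos_group):
    """Tackles only count from 1994 on, and only for linemen and linebackers."""
    if year < 1994:
        return 0.0
    return {"DL": 0.6, "LB": 0.3, "DB": 0.0}[pos_group]


def all_pro_bonus(year, team_games, honor=None):
    # Assumption: seasons before 1970 use the pre-sack constant as well.
    year_constant = 40 if year <= 1981 else 80
    year_multiplier = year_constant * team_games / 16
    return ALL_PRO_LEVEL[honor] * year_multiplier


def defensive_individual_points(year, team_games, pos_group, games, starts,
                                sacks=0, fumble_rec=0, ints=0, def_td=0,
                                tackles=0, honor=None):
    stats = (games + 5 * starts + sacks + 4 * fumble_rec + 4 * ints
             + 5 * def_td + tackle_constant(year, pos_group) * tackles)
    return stats + all_pro_bonus(year, team_games, honor)


# Hypothetical 2007 linebacker who made the pro bowl.
lb_points = defensive_individual_points(2007, 16, "LB", games=16, starts=16,
                                        sacks=5, fumble_rec=1, ints=2, def_td=1,
                                        tackles=100, honor="pro_bowl")
```

A first-team all-pro is worth 120 points under this sketch in 2007 but 52.5 in 1972, which is the scaling the paragraph above describes.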

Now, each front-seven player gets:

approx_value = [ (individual_points) / (sum of individual_points for all front-seven players on the team) ] * team_points_for_front_7

and each defensive back gets:

approx_value = [ (individual_points) / (sum of individual_points for all defensive backs on the team) ] * team_points_for_secondary
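The final defensive step, sketched with a hypothetical four-man "roster" so the arithmetic is easy to follow. The data layout and unit labels are assumptions of mine.

```python
def defensive_av(players, team_defense_points):
    """players maps name -> (unit, individual_points), unit is 'front7' or 'secondary'."""
    pools = {
        "front7": 2 / 3 * team_defense_points,
        "secondary": 1 / 3 * team_defense_points,
    }
    # Sum individual_points within each unit separately.
    totals = {unit: 0.0 for unit in pools}
    for unit, points in players.values():
        totals[unit] += points
    return {
        name: points / totals[unit] * pools[unit]
        for name, (unit, points) in players.items()
    }


# Hypothetical individual_points, not real players.
example = defensive_av(
    {
        "DE": ("front7", 150.0),
        "MLB": ("front7", 100.0),
        "CB": ("secondary", 120.0),
        "FS": ("secondary", 80.0),
    },
    team_defense_points=120.0,
)
```

Note that front-seven players and defensive backs never compete with each other for points: each group splits its own fixed pool (80 and 40 here).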

Returns

Every player gets one point of approx_value for each kick or punt return TD.

Kickers

At the moment, Kicking AV is based solely on field goal & extra point performance. The core stat that determines a kicker's performance is Kicking Points Above Average (PAA), which is derived by comparing a player's XP% and his FG% at various distances (0-19 yds, 20-29 yds, 30-39 yds, 40-49 yds, 50+ yds) to the league average in the same category, to determine the number of points he added above what a league-average kicker would produce in the same number of chances.

PAA_total = PAA_xp + PAA_fg1 + PAA_fg2 + PAA_fg3 + PAA_fg4 + PAA_fg5 + PAA_fg_u

where

PAA_xp = xpm - xpa * lg_xp_pct

PAA_fg1 = 3 * (fgm1 - fga1 * lg_fg1_pct)

PAA_fg2 = 3 * (fgm2 - fga2 * lg_fg2_pct)

PAA_fg3 = 3 * (fgm3 - fga3 * lg_fg3_pct)

PAA_fg4 = 3 * (fgm4 - fga4 * lg_fg4_pct)

PAA_fg5 = 3 * (fgm5 - fga5 * lg_fg5_pct)

PAA_fg_u = 3 * (fgm_u - fga_u * lg_fg_pct)

where fgm_u and fga_u are field goals made and attempted that are unaccounted for by the distance categories. We have complete distance data going back to 1969; from 1960-1968 we have partial distances for some players; before 1959, no distances are known. In the case of PAA from unaccounted field goals, kickers are compared to the overall league-average FG%.
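A sketch of the PAA sum, written with a loop over the five distance buckets instead of five separate lines. The argument layout is my own invention, and the league percentages in the example are made up, not actual league averages.

```python
def kicking_paa(xpm, xpa, lg_xp_pct, fg_by_range, lg_fg_pct_by_range,
                fgm_u=0, fga_u=0, lg_fg_pct=0.0):
    """fg_by_range holds (made, attempted) for 0-19, 20-29, 30-39, 40-49, 50+ yards."""
    paa = xpm - xpa * lg_xp_pct                      # PAA_xp
    for (made, att), lg_pct in zip(fg_by_range, lg_fg_pct_by_range):
        paa += 3 * (made - att * lg_pct)             # PAA_fg1 .. PAA_fg5
    paa += 3 * (fgm_u - fga_u * lg_fg_pct)           # PAA_fg_u (unknown distance)
    return paa


# Hypothetical kicker and hypothetical league rates.
paa_total = kicking_paa(
    xpm=40, xpa=40, lg_xp_pct=0.975,
    fg_by_range=[(1, 1), (8, 8), (9, 10), (7, 10), (2, 4)],
    lg_fg_pct_by_range=[1.00, 0.95, 0.85, 0.70, 0.50],
)
```

Here the kicker picks up 1.0 point on extra points, 1.2 from 20-29 yards, 1.5 from 30-39 yards, and nothing elsewhere, for a total of about 3.7 points above average.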

At this point, I should digress into how I derived the AV scale for kickers and punters. Thanks to Rodney Fort, we have NFL salary data for the 2002-2009 seasons, all of which delineate punters/kickers from other positions, and some of which delineate Ks and Ps from each other. From this data and the existing scale of AV, we determined that the typical 16-game, 32-team season should see 170 points of AV given out to kickers/punters -- 100 (or 3.125/team) given to kickers in total, and 70 (or 2.1875/team) given to punters. The scale should be such that the average full-time kicker gets around 3 AV, with the best getting 6-7 in a given season.

From this, we can convert PAA into Approximate Value. First, determine what share of the team's "kicking playing time" the player has received:

k_playing_time = xpa + 3 * fga

pct_team_playing_time = k_playing_time / Σ(team k_playing_time)

Determine the amount of AV the kicker would get in his playing time if he had an average PAA (prorating the league-average 3.125 figure from above downwards for seasons of fewer than 16 games):

avg_AV = (3.125 / 16) * team_games * pct_team_playing_time

Then adjust up or down based on his PAA_total (dividing by 5 arbitrarily calibrates the results to match the AV scale we described above):

raw_AV = avg_AV + (PAA_total / 5)

The final step is to prorate this back up to 16 team games for seasons with unusual schedules:

approx_value = 16 * (raw_AV / team_games)
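The conversion from PAA to AV, sketched end to end. The function signature is an assumption of mine; team_xpa and team_fga stand for the team totals across all of its kickers.

```python
def kicker_av(xpa, fga, team_xpa, team_fga, team_games, paa_total):
    """Convert a kicker's PAA_total into approx_value on a 16-game scale."""
    k_playing_time = xpa + 3 * fga
    team_k_playing_time = team_xpa + 3 * team_fga
    pct_team_playing_time = k_playing_time / team_k_playing_time
    avg_av = (3.125 / 16) * team_games * pct_team_playing_time
    raw_av = avg_av + paa_total / 5
    return 16 * (raw_av / team_games)


# Hypothetical cases: a team's only kicker over a 16-game season.
average_kicker = kicker_av(40, 30, 40, 30, team_games=16, paa_total=0.0)
good_kicker = kicker_av(40, 30, 40, 30, team_games=16, paa_total=5.0)
# Same average kicker in a 14-game season.
short_season = kicker_av(35, 25, 35, 25, team_games=14, paa_total=0.0)
```

An exactly average full-time kicker comes out at 3.125 regardless of schedule length, and every 5 points of PAA in a 16-game season moves him by one point of AV.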

Punters

Right now, punting AV is determined using gross punting average and the ability to avoid blocked punts.

Just like with kickers, the first step is determining productivity above/below average. We're using Adjusted Punt Yards, which are gross punt yards with a 13-yard penalty for blocks (the rationale being that the punter is 13 yards behind the LOS when a block occurs, but we don't assess the full 50-yard fumble/turnover penalty -- as described in The Hidden Game of Football -- because the team was punting/turning the ball over anyway).

adj_punt_ypa = (punt_yds - 13 * punt_blocked) / (punt + punt_blocked)

Then, for each league-season, compute the league's average adj_punt_ypa using individual punters' (punt + punt_blocked) totals as the weights. Then figure out how many adjusted punt yards the player added above/below the league average:

adj_punt_yds_above_avg = (punt + punt_blocked) * (adj_punt_ypa - lg_adj_punt_ypa)
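These punting steps might look like the following in Python. The tuple layout (punt_yds, punt, punt_blocked) and the two-punter "league" are invented for illustration.

```python
def adj_punt_ypa(punt_yds, punt, punt_blocked):
    """Gross yards per punt, charging 13 yards for each block."""
    return (punt_yds - 13 * punt_blocked) / (punt + punt_blocked)


def league_adj_punt_ypa(punters):
    """Average of adj_punt_ypa, weighted by each punter's (punt + punt_blocked)."""
    total_weight = sum(p + b for _, p, b in punters)
    weighted = sum((p + b) * adj_punt_ypa(y, p, b) for y, p, b in punters)
    return weighted / total_weight


def adj_punt_yds_above_avg(punt_yds, punt, punt_blocked, lg_adj_punt_ypa):
    return (punt + punt_blocked) * (adj_punt_ypa(punt_yds, punt, punt_blocked)
                                    - lg_adj_punt_ypa)


# Hypothetical two-punter league: (punt_yds, punt, punt_blocked).
league = [(4000, 90, 0), (3600, 79, 1)]
lg_avg = league_adj_punt_ypa(league)
above = [adj_punt_yds_above_avg(y, p, b, lg_avg) for y, p, b in league]
```

Because the league average is weighted by attempts, the yards above average sum to zero across the league (up to rounding), which is a handy sanity check.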

We then begin to convert adj_punt_yds_above_avg into Approximate Value. Like with kickers, we determine what share of the team's "punting playing time" the player has received:

pct_team_playing_time = (punt + punt_blocked) / Σ[team (punt + punt_blocked)]

Then calculate the amount of AV the punter would get in his playing time if he had an average adj_punt_yds_above_avg (again prorating the league-average 2.1875 figure from above downwards for seasons of fewer than 16 games):

avg_AV = (2.1875 / 16) * team_games * pct_team_playing_time

Then adjust this up or down based on his adj_punt_yds_above_avg (dividing by 200 arbitrarily calibrates the results to match the AV scale we described above):

raw_AV = avg_AV + (adj_punt_yds_above_avg / 200)

The final step is to prorate this back up to 16 team games for seasons with unusual schedules:

approx_value = 16 * (raw_AV / team_games)
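And the punter conversion, sketched under the same caveats as before: the signature is mine, and team_punt_plays stands for the team's total (punt + punt_blocked) across all of its punters.

```python
def punter_av(punt, punt_blocked, team_punt_plays, team_games, yds_above_avg):
    """Convert adj_punt_yds_above_avg into approx_value on a 16-game scale."""
    pct_team_playing_time = (punt + punt_blocked) / team_punt_plays
    avg_av = (2.1875 / 16) * team_games * pct_team_playing_time
    raw_av = avg_av + yds_above_avg / 200
    return 16 * (raw_av / team_games)


# Hypothetical punters who handled all of their team's punts in 16 games.
average_punter = punter_av(80, 0, 80, team_games=16, yds_above_avg=0.0)
strong_punter = punter_av(80, 0, 80, team_games=16, yds_above_avg=200.0)
weak_punter = punter_av(79, 1, 80, team_games=16, yds_above_avg=-100.0)
```

So an average full-time punter lands at 2.1875, and in a 16-game season each 200 adjusted yards above or below the league average is worth one point of AV.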