Sports Reference Blog

Archive for the 'Advanced Stats' Category

FBref Scouting Reports and Similar Players Launched

10th February 2021

FBref is happy to announce a feature we've been excited about for a while: player Scouting Reports, which give you a quick look at how a player compares with others at the same position across a range of statistics. Reports are currently available for players in the Big Five men's European leagues (example: Mohamed Salah), Major League Soccer (example: Diego Rossi) and the Women's Super League (example: Sam Kerr). The main Scouting Report at the top of a player's page shows 20 categories, selected based on feedback from user research and industry experts, and you can click through to a Complete Scouting Report that covers many more categories.

In addition, we have added a Similar Players table that lists the players whose percentiles in the Scouting Report stats are most similar. That table also offers Compare links, which take you to our Player Comparison tool so you can see the players' statistics side by side.

For more information on how the Scouting Report works, we have a longer explainer on FBref. This would not be possible without the wide array of advanced stats provided by Statsbomb, so thanks to them.

Depending on how people react, we could even adapt this feature for our other Sports-Reference sites in the future. Because of that, we are eager to hear people's thoughts on this new feature, so feel free to contact us via our feedback form.

Posted in Advanced Stats, Announcement, FBref, Features, Statgeekery | Comments Off on FBref Scouting Reports and Similar Players Launched

December 2020 WAR Update

14th December 2020

We recently fixed an issue in which, because of the abbreviated 2020 season, we were not allocating enough wins to position players when calculating Wins Above Replacement. The fix has been applied across Baseball-Reference. With this change, no position player gained more than 0.3 WAR, and no position player lost WAR. All pitcher WAR remained the same.

You can review the changes for each player here: https://docs.google.com/spreadsheets/d/18WY53wSt0GrBMMijLiIFMhVtvbmjuhbYNOaTvHfs-gE/edit?usp=sharing

If you have any questions or concerns, feel free to contact us through our feedback form.

Posted in Advanced Stats, Announcement, Baseball-Reference.com, Data, Statgeekery, WAR | Comments Off on December 2020 WAR Update

Adjusted Shooting Stats Added to Basketball Reference

1st June 2020

There's been much debate about the greatest players in NBA history of late. One of the most difficult things about ranking players in a league with 70+ years of history is that the game has changed a lot over the years. Sure, some of it has to do with the skill and quality of the players. But some of it also has to do with the quality of the balls, the floors, the rims, the training, the travel, the accommodations, available nutrition and pretty much any other variable you can think of. For a better idea of how the league has changed over time, please see this table of league averages for each season in the history of the NBA. As you can see, 2019-20 is the fifth straight season in which a new league-wide eFG% record has been set. There are clearly things at play here beyond just player improvement, though today's players are certainly more skilled than the ones who produced a league-wide 27.9 FG% in 1946-47 (the first year of the NBA's 'official' forerunner the BAA, which was objectively worse than the league it eventually merged with, the NBL).

To help bring a bit of objectivity to cross-era comparisons, we have added an Adjusted Shooting table to all player, team and season pages. These tables show a player's shooting percentages and tendencies alongside the league-wide percentages and tendencies, and then scale the player's figures against the league's. Like OPS+ on our baseball site, each figure is scaled so that 100 represents a league-average shooter: 125 is 25% better than average and 75 is 25% worse than average. The figures are obtained by dividing the player's shooting percentage by the league-wide shooting percentage and then multiplying by 100. So 125 doesn't mean a player was 25 percentage points above average, but 25 percent above average. We are also publishing adjusted versions of 3-point Attempt Rate and Free Throw Rate to give a better idea of how often the player shot 3s or got to the line relative to their era.
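As a minimal sketch of the scaling described above, the index is simply the player's rate divided by the league rate, times 100. The function name and the sample percentages below are illustrative assumptions, not values from the site.

```python
def adjusted_index(player_value: float, league_value: float) -> float:
    """Scale a player's rate against the league rate so that 100 is league average."""
    if league_value <= 0:
        raise ValueError("league_value must be positive")
    return 100.0 * player_value / league_value


# Hypothetical inputs: a 50.0% shooter in a league that shoots 40.0%.
player_fg_pct = 0.500
league_fg_pct = 0.400

index = adjusted_index(player_fg_pct, league_fg_pct)

# The player is 10 percentage points above the league,
# but 25 percent above it, so the index is 125.
print(round(index))
```

The same function could be applied to 3-point Attempt Rate or Free Throw Rate, since those are also player rates compared with a league rate.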

Additionally, we have calculated Field Goal Points Added and True Shooting Points Added to show how many points each player scored above or below what a league average player would have scored given an equal number of field goal attempts or true shot attempts, respectively. This is to show which players combined volume and efficiency (or those that combined volume with inefficiency, for that matter).
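One way to read the description above is "actual points minus the points a league-average shooter would score on the same attempts." The sketch below assumes the standard definitions of effective field goal percentage and true shooting percentage (with the common 0.44 free throw weight); the function names and the sample season are hypothetical, and the site's exact calculation may differ.

```python
def field_goal_points_added(fg: int, fg3: int, fga: int, league_efg: float) -> float:
    """Points from field goals above what a league-average shooter
    would score on the same number of field goal attempts."""
    points_from_fg = 2 * fg + fg3          # each made three is worth one extra point
    expected = 2 * league_efg * fga        # eFG% is points per attempt divided by 2
    return points_from_fg - expected


def true_shooting_points_added(pts: int, fga: int, fta: int, league_ts: float) -> float:
    """Total points above what a league-average scorer would produce
    on the same number of true shot attempts."""
    tsa = fga + 0.44 * fta                 # true shot attempts (0.44 weight is an assumption)
    expected = 2 * league_ts * tsa
    return pts - expected


# Hypothetical season line and league rates, for illustration only.
fg_added = field_goal_points_added(fg=600, fg3=150, fga=1200, league_efg=0.500)
ts_added = true_shooting_points_added(pts=1650, fga=1200, fta=400, league_ts=0.550)

print(round(fg_added, 1), round(ts_added, 1))
```

Because both results are totals rather than rates, a high-volume shooter who is only slightly above average can outrank an efficient low-volume shooter, which is the volume-plus-efficiency point made above.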

Read the rest of this entry

Posted in Advanced Stats, Announcement, Basketball-Reference.com, Data, Features, History, Statgeekery | 7 Comments »

BPM 2.0 on College Basketball Reference

28th May 2020

In February, our pro basketball site incorporated Daniel Myers' BPM 2.0, the update to the classic Box Plus Minus measurement. We have now completed that update for College Basketball Reference as well. BPM 2.0 aims to estimate a player's performance relative to league average by using a player's box score information and his team's overall performance.

BPM 2.0 will appear on College Basketball Reference in the same places you found old BPM, and is available back to the 2010-11 season. Leaderboards have been updated to reflect the new measurement. BPM 2.0 also allows for game-level calculations, which means that our box scores since the 2010-11 season will now include BPM 2.0 in the Advanced table.

For more information on why the update was made, you can refer to our February blog post on the BBR update, as well as Daniel Myers' in-depth explainer on how BPM 2.0 is calculated. We thank Myers for his contributions and we hope you enjoy the addition to College Basketball Reference.

Posted in Advanced Stats, Announcement, CBB at Sports Reference, Features | Comments Off on BPM 2.0 on College Basketball Reference

Big 5 Leagues Pages on FBref

27th May 2020

FBref covers basic and advanced statistics for dozens of domestic leagues around the world, with the English Premier League, Spanish La Liga, German Bundesliga, Italian Serie A and French Ligue 1, commonly referred to as the "Big 5", being the most visited league stat pages. Up to this point, if you wanted to compare statistics between the leagues, you'd need to have a tab open for each one.

That will no longer be as necessary now that FBref has added combined Big 5 stat pages, with a combined league table, leaderboards across the 5 leagues and stat registers that include players who've played in any of the leagues. In the Player Standard Stats section, sorting by G+A-PK per 90 minutes gives you Jadon Sancho (Bundesliga), Kylian Mbappé (Ligue 1) and Lionel Messi (La Liga) at the top this season. In the Squad Goal and Shot Creation section, you can see Bayern Munich and Dortmund are leading all Big 5 teams in Goal Creating Actions per 90 minutes.

Check out our new Combined Big 5 pages and so much more that we offer at FBref! You can keep up with the latest additions of statistical coverage and new features here on the Sports Reference Blog, or by signing up for the This Week in Sports Reference mailing list. Feel free to send us any questions or suggestions through our feedback form or FBref's official Twitter account.

Posted in Advanced Stats, Announcement, FBref, Features, Leaders | Comments Off on Big 5 Leagues Pages on FBref

Game-Level BPM In Play Index + Box Score Mouseovers

4th May 2020

In February, Basketball Reference made a major update in incorporating Daniel Myers' BPM 2.0, which aims to estimate a player's performance relative to league average by using a player's box score information and his team's overall performance. This statistic is also calculable at the game level, and we've made it easier to look through this by making BPM searchable in Basketball Reference's Game Finder, one of the many tools you can find in the site's Play Index.

BPM 2.0 is searchable back to the 1984-85 season, the first for which we have 100% coverage of all the statistical components needed to calculate it. Because BPM is a rate stat, setting a minutes played threshold is important. Here's a look at the top games in our system using several different thresholds:

Minimum 10 MP

| Player | Date | Tm | MP | TRB | AST | PTS | BPM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| James Robinson | 1996-12-30 * | MIN | 10 | 1 | 1 | 23 | 74.6 |
| Henry James | 1997-04-15 * | ATL | 10 | 2 | 1 | 24 | 63.9 |
| Jrue Holiday | 2009-11-24 * | PHI | 10 | 6 | 1 | 11 | 61.1 |

Provided by Basketball-Reference.com. Generated 5/5/2020.

Minimum 20 MP

| Player | Date | Tm | MP | TRB | AST | PTS | BPM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Brent Barry | 2006-03-24 * | SAS | 20 | 2 | 4 | 23 | 45.5 |
| Manu Ginóbili | 2009-01-20 * | SAS | 21 | 8 | 3 | 26 | 41.9 |
| Victor Oladipo | 2018-01-06 * | IND | 24 | 6 | 9 | 23 | 40.6 |

Minimum 30 MP

| Player | Date | Tm | MP | TRB | AST | PTS | BPM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Nikola Jokić | 2018-10-20 * | DEN | 31 | 11 | 11 | 35 | 44.4 |
| Gilbert Arenas | 2006-02-25 * | WAS | 30 | 1 | 2 | 46 | 40.5 |
| Damian Lillard | 2016-02-19 * | POR | 31 | 0 | 7 | 51 | 38.1 |

Minimum 40 MP

| Player | Date | Tm | MP | TRB | AST | PTS | BPM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Damian Lillard | 2017-04-08 * | POR | 42 | 6 | 5 | 59 | 35.7 |
| Manu Ginóbili | 2008-02-13 * | SAS | 41 | 5 | 8 | 46 | 34.2 |
| Vince Carter | 2001-05-11 * | TOR | 45 | 6 | 7 | 50 | 34.0 |
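To illustrate why the minutes threshold matters for a rate stat, here is a small sketch that filters game lines by minutes played before ranking by BPM. It uses only the top row from each table above; the `GameLine` type and `top_games` function are hypothetical names and are not part of the Game Finder.

```python
from typing import List, NamedTuple


class GameLine(NamedTuple):
    player: str
    date: str
    mp: int      # minutes played
    bpm: float   # game-level BPM


# The leading row from each of the four tables above.
GAMES = [
    GameLine("James Robinson", "1996-12-30", 10, 74.6),
    GameLine("Brent Barry", "2006-03-24", 20, 45.5),
    GameLine("Nikola Jokić", "2018-10-20", 31, 44.4),
    GameLine("Damian Lillard", "2017-04-08", 42, 35.7),
]


def top_games(games: List[GameLine], min_mp: int, limit: int = 3) -> List[GameLine]:
    """Keep games with at least min_mp minutes, then rank by BPM, highest first."""
    eligible = [g for g in games if g.mp >= min_mp]
    return sorted(eligible, key=lambda g: g.bpm, reverse=True)[:limit]


for threshold in (10, 20, 30, 40):
    leader = top_games(GAMES, threshold, limit=1)[0]
    print(f"min {threshold} MP: {leader.player} ({leader.bpm})")
```

Raising the threshold drops the short, extreme outings first, so the leading BPM falls as the minimum rises.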

Beyond the Game Finder update, Basketball Reference now has mouseovers in the advanced section of box scores that display the offensive and defensive BPM breakdowns, as well as Value Over Replacement Player prorated to 82 games. For more information on how BPM 2.0 is calculated, please consult Daniel Myers' explainer. Stay tuned to the Sports Reference Blog for the latest additions to Basketball Reference!

Posted in Advanced Stats, Announcement, Basketball-Reference.com, Features, History, Play Index, Statgeekery | 1 Comment »

Launching Stathead

27th April 2020

If you haven't read it already, please read Mike Lynch's rundown of our new Stathead/Baseball service. I'm going to lay out some of the background for this change and explain some of the changes.

As I laid out in our post from early March, we are making changes to our Ad-Free and Play Index products.

Here is the thrust of what we said in March.

So we are making some changes. The Play Index for each site will be moving to Stathead.com. Stathead.com will become the center for all of our subscription products. We expect these products to include tools and information beyond just a redesigned set of Play Index tools. This won't happen all at once, but we'll start with baseball and then proceed through the remainder of our sports. Also, we will be ending our ad-free product and instead Stathead memberships will have ad-free built-in. There just aren't enough users to justify a separate ad-free product. These changes will begin this month and continue through April on baseball and then continue with the other sites after that.

If you are a subscriber, we will make every effort to make certain you are happy with the options we provide to convert your ad-free or Play Index subscription over to Stathead including the option of a refund on your subscription. You will be hearing more from us about the changes over the next few weeks as we will email users directly.

If you've looked at the cost of Stathead/Baseball vs the Play Index, you'll notice we've gone from $36/year (+ $20/year for ad-free) to $8/month. I realize this is a significant increase. As I said in my original post, we are extraordinarily reliant on ad revenue. Back in early March this seemed problematic. Now with the complete collapse of the advertising market it has the potential to be lethal. If you don't block our ads, you may have noticed that we now have more ads on our pages. This is in response to the downturn in ad revenue. Sports Reference is doing fine right now, but if we want to continue to succeed and also be aligned with the needs of our users, a healthy stream of subscription revenue is vital.

We also feel our products warrant this price. The only comparable products to our Stathead tools come from Elias and STATS LLC and would cost you $10,000+ a year to subscribe to. You could create your own from Retrosheet data, but that would probably take more than $8/month of your time to maintain.

We are using monthly billing for at least the first few quarters, so that we can monitor more directly the success we are having in recruiting and maintaining subscribers. We have discussed adding an annual billing option in the future.

For the time being, we will be maintaining both the legacy Play Index site (which has been free since the start of March) and the new site, but before too long we will take down the old Play Index site, probably in late May. We are also working on converting the other Play Index sites. First will be hockey and then probably basketball after that.

We realize there aren't games being played and that you might be facing your own financial challenges at this time. Therefore, we are offering the first month free for all users. And then, until the leagues start playing games, we will be giving users the option of claiming additional free monthly subscriptions. We'll provide more details on the latter plan as we approach the time for subscriptions to be renewed.

If you are a current subscriber, we will be emailing you with information about how we will be converting your subscription to the new system and of course, we will provide your money back if you are unhappy with the conversion to Stathead that we are offering you. Our goal is to give you a more than fair deal and see you join us on stathead.com.

Please feel free to reach out to us if you have questions or concerns.

--sean forman

Posted in Advanced Stats, Announcement, Baseball-Reference.com, Stat Questions, Statgeekery, Stathead | 14 Comments »

Goal Creation, Possession, Passing and More Advanced Stats on FBref

14th April 2020

FBref carries a wide array of advanced stats powered by Statsbomb to give you the context you need to analyze a player's performance from as many angles as possible. You can look through our blog's FBref tag for the most recent additions we've made to the site, and today we have another mass addition of advanced stats you can now access on player pages. Here's the list:

Passing

- Total distance of completed passes
- Progressive distance of completed passes (distance toward goal)
- Progressive Passes

Pass Types

- live-ball, dead-ball, under pressure
- corner kick types (in-swinging, out-swinging, straight)
- pass height (ground, low, high)
- by body part (left/right foot, head, throw-ins, other)
- pass outcomes (completed, offsides, out of bounds, intercepted, blocked)

Goal and Shot Creation

- Goal Creating Actions (GCA) and Shot Creating Actions (SCA), meaning the two offensive actions leading to a goal or a shot, respectively. These include live-ball passes, dead-ball passes, successful dribbles, shots which lead to another shot, and being fouled

Defensive Actions

- tackles by location on pitch
- pressures, successful pressures, pressures by location on pitch
- blocks (shots, shots saved, and passes)
- clearances
- errors leading to an opponent's shot

Possession

- touches, touches by location on pitch
- carries (total and progressive distance)
- pass receiving (targets and completions)
- miscontrols and dispossessions

Miscellaneous

- Aerials won/lost

From league pages, these stats can be accessed by using the Squad & Player Stats tab between Scores/Fixtures and Nationalities. Of course, you can also see this on player pages if they've played in competitions we have xG data for.

We're excited to see what analysis people can derive from this new information, which is available thanks to the hard work of Statsbomb. You can keep up with the latest additions of statistical coverage and new features here on the Sports Reference Blog, or by signing up for the This Week in Sports Reference mailing list. Feel free to send us any questions or suggestions through our feedback form or FBref's official Twitter account.

Posted in Advanced Stats, Announcement, Data, FBref, Features, Statgeekery | Comments Off on Goal Creation, Possession, Passing and More Advanced Stats on FBref

Advanced Player Game Logs on Pro Football Reference

10th April 2020

In 2019, Pro-Football-Reference added advanced statistics provided by Sportradar, such as air yards, yards after contact, drops, and passer rating allowed, among others. Those are available at the season level on player pages, as well as at the game level within box scores. We have now added advanced game logs, accessible from player pages, so you can see an individual's advanced stats at the single-game level.

Here are links to some examples:

Aaron Rodgers

Christian McCaffrey

Richard Sherman

If you have any questions or suggestions, feel free to contact us through our feedback form or Pro Football Reference's official Twitter account. Thanks for following us!

Posted in Advanced Stats, Announcement, Features, Play Index, Pro-Football-Reference.com | Comments Off on Advanced Player Game Logs on Pro Football Reference

2020 WAR Update

16th March 2020

As we approach the beginning of the 2020 season, we have made some updates to our Wins Above Replacement calculations.  You may notice some small changes to figures as you browse the site. As always, you can find full details on how we calculate WAR here.

Defensive Runs Saved Changes

Last week, we updated Defensive Runs Saved (DRS) totals across the site with new figures from Baseball Info Solutions.  The new methodology involves breaking down infielder defense using the PART system - assigning run values to Positioning, Air Balls, Range, and Throwing.  Under the new system, an infielder’s total DRS is the sum of his Air Balls, Range, and Throwing runs saved, while Positioning runs saved are credited to the team as a whole.  You can read more about the updates in the Sports Info Solutions blog.  The PART system applies to all infielders since 2013.

Folding these numbers into WAR, we see some significant changes for individual player seasons.  The 2019 Oakland A’s get even more recognition for defense on the left side of their infield, with shortstop Marcus Semien gaining 0.7 WAR and third baseman Matt Chapman gaining 1.6 WAR from the new DRS numbers, lifting both players above Mike Trout and into second and third place respectively on the 2019 AL WAR leaderboard.  Chapman’s 1.6 additional WAR represents the largest single-season change in this update.

On the other end of the spectrum, we see Adrian Beltre with the most significant drop in this update, losing 1.5 WAR in 2015.

Since we use DRS to measure the quality of a team’s defense, these new values also impact pitcher WAR values.  Team total DRS changed by as much as 46 runs for a given team and season - the 2019 Dodgers defense improved from 75 DRS to 121 DRS by non-pitchers under the new system.  Once applied to a specific pitcher, however, the changes to WAR are much smaller in magnitude than the changes to individual fielders. The most extreme example is Hyun-Jin Ryu, who pitched 182.2 innings in front of the 2019 Dodgers defense.  Considering the Dodgers defense to be 46 runs better across the entire season, and considering that Ryu was the pitcher for 13.52% of the Dodgers’ balls in play in 2019, we adjust our expected runs allowed for Ryu by 6.2 runs for the season. After following the rest of the steps in our pitching WAR calculation, the end result is a drop of 0.3 WAR for the season.  All other changes to pitching WAR from this change to team defense are smaller than Ryu’s 0.3 WAR drop in 2019.
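The Ryu adjustment above is a proportional allocation: the change in team defense multiplied by the pitcher's share of the team's balls in play. A minimal sketch of just that step follows. The function name is hypothetical, and the later steps that turn runs into WAR are not shown.

```python
def pitcher_defense_adjustment(team_drs_change: float, share_of_balls_in_play: float) -> float:
    """Portion of a team-level defensive change attributed to one pitcher,
    based on his share of the team's balls in play."""
    if not 0.0 <= share_of_balls_in_play <= 1.0:
        raise ValueError("share_of_balls_in_play must be between 0 and 1")
    return team_drs_change * share_of_balls_in_play


# Figures from the post: Dodgers non-pitcher DRS moved from 75 to 121,
# and Ryu was on the mound for 13.52% of the team's balls in play.
team_change = 121 - 75
ryu_adjustment = pitcher_defense_adjustment(team_change, 0.1352)

print(round(ryu_adjustment, 1))  # 6.2
```

Because a single pitcher accounts for only a fraction of the team's balls in play, even a 46-run swing in team defense moves his expected runs allowed by only a few runs.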

Park Factors

Park factors for 2018 have been re-computed to include the 2019 season, since WAR uses a three-year average for park factors when computing pitching WAR.  The most significant change here is the Miami Marlins, whose pitching park factor rose from 90 to 95 (where <100 represents a pitcher’s park and >100 represents a hitter’s park).  José Ureña sees the biggest benefit from this, with his 2018 WAR rising by 0.7 wins. All other changes to pitching WAR from updated park factors are smaller than Ureña’s 0.7 WAR gain in 2018.

New Game Logs from Retrosheet (1904-1907)

Last month, we updated the site with new data from Retrosheet, including new game logs for players from 1904 to 1907.  Having game-level data allows us to be more precise in our WAR calculations, since we can consider the specific ballparks a pitcher played in and the opponents he faced.

Take Christy Mathewson in 1907 as an example.  Prior to this change, we used the league average (excluding his team) of 3.36 runs per nine innings as the expected quality of his opposition.  However, with game-level data, we can see that Mathewson’s actual opponents averaged 3.55 runs per nine innings, showing that Mathewson was probably used strategically and started more games against better opponents.  Indeed, Mathewson pitched in 10 of the Giants’ 22 games against the league’s best offense, the Pirates, as well as 7 of the Giants’ 22 games against the Cubs, the NL’s second-best offense. Against the Dodgers and Cardinals, who each struggled offensively and scored fewer than 3 runs per game, Mathewson pitched in just 8 games total.

Knowing this about his usage, we can set more accurate expectations for how many runs an average player would have allowed under Mathewson’s circumstances.  By adjusting the quality of his opposition, we expect an average pitcher to have allowed about 7 more runs over the course of the season, resulting in a bump of 0.9 WAR in 1907.  All other changes to pitching WAR from new game log data are smaller than Mathewson’s 0.9 WAR gain in 1907.
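As a rough check on the "about 7 more runs" figure, the difference in opponent quality can be scaled by a pitcher's workload. The sketch below is an assumption about how that step might look, not the site's actual calculation. The 315-inning workload is an assumed input that does not appear in the post, and the function name is hypothetical.

```python
def extra_expected_runs(opponent_r9: float, league_r9: float, innings: float) -> float:
    """Additional runs an average pitcher would be expected to allow when the
    opponents actually faced score opponent_r9 per nine innings instead of league_r9."""
    return (opponent_r9 - league_r9) * innings / 9.0


# Rates from the post; the innings total is an assumption for illustration.
INNINGS_ASSUMED = 315.0
extra_runs = extra_expected_runs(opponent_r9=3.55, league_r9=3.36, innings=INNINGS_ASSUMED)

print(round(extra_runs))  # 7
```

Under that assumed workload, the 0.19 runs-per-nine gap works out to roughly 7 runs over the season, in line with the adjustment described above.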

Baserunning and Double Plays from Play-by-Play Data (1931-1947)

When calculating runs from baserunning and double plays, we use play-by-play data from seasons where it is complete enough to credit players for things like scoring from first on a double, advancing from first to third on a single, and hitting into fewer double plays than expected.

In the past, we have taken play-by-play data into account back to 1948 for baserunning and double plays, because the data further back than that has been incomplete and could give players an advantage in their WAR simply by having more complete play-by-play records than their peers.  As this data has become more complete over time, we have moved this cutoff back to 1931. The data is still somewhat sparse for games that took place during World War II (1943-45), but we felt it was worth including those years as well.

Pete Reiser of the Brooklyn Dodgers was skilled at taking extra bases, and it showed in the play-by-play accounts.  In 1942, he took extra bases at a rate of 55%, compared to the league average of 45%. Additionally, the Dodgers were tied with the Cardinals as the league’s top scoring offense, so Reiser had many opportunities to put his speed to use.  He scored from first on doubles a league-leading ten times in just 15 opportunities, and also scored from second on a single 24 times, good for 5th in the NL that year, in just 29 opportunities. Using this play-by-play data while computing WAR gives Reiser an additional 1.2 WAR in 1942.  All other changes to batting WAR from this change are smaller than Reiser’s 1.2 WAR gain in 1942.

Caught Stealing Totals from Game Logs (1926-1940)

When crediting runners for how many runs they contributed with their baserunning, we take into account their stolen base and caught stealing totals.  Caught stealing totals are missing for many players between 1926 and 1940, but we have complete game logs for players in that span.

In the past, when we didn’t have a caught stealing total for a player, we would estimate how many times they were likely to have been caught stealing based on the league’s stolen base success rate and the ways the player reached base during the season.

We are now using actual caught stealing totals from the players’ game logs, so there are some changes for players who did considerably better or worse than we had been estimating.

Take, for example, Freddie Lindstrom.  In 1928, the Giants third baseman stole 15 bases, but his official season stat line does not have caught stealing available.  Previously, we had estimated that he was caught stealing 11.57 times, based on everything else we knew about his performance and the league he played in.  However, game logs indicate that Lindstrom was caught 21 times, nearly twice as often as we had estimated. This difference gets folded into our baserunning runs calculation and results in a drop of 0.4 WAR.  All other changes to batting WAR from this change are smaller than Lindstrom’s 0.4 WAR drop in 1928.

Biggest Career Movers

Hall of Famer Ernie Lombardi sees the biggest change to his career WAR with this update, sinking from 46.8 WAR to 39.5 WAR, a drop of 7.3 wins.  The largest gain goes to infielder Lonny Frey, who picks up 5.2 wins. Both these players played in the 1930s and 1940s and saw big changes because of their baserunning.  Lombardi is known for being one of the slowest runners in baseball history, and this update shows that the numbers back that reputation. Frey was a fast runner in an era where stolen bases were rare, so he has been underrated to this point when it comes to his baserunning contributions.

On the mound, previously cited Hall of Famer Christy Mathewson is the big winner.  As discussed above, his WAR now recognizes how his manager would use him against tougher opponents, and he sees his career WAR jump by 2.2 wins.  Barney Pelty experiences the biggest drop of 1.9 wins.

We’ve highlighted some of the more extreme changes here, but to see full lists of the largest changes to season and career WAR totals, please see the spreadsheet here.

We're very excited about these new additions and hope you enjoy them as well. Thanks to Baseball Info Solutions for their contributions. Please let us know if you have any comments, questions or concerns.

Posted in Advanced Stats, Announcement, Baseball-Reference.com, Data, Features, History, Leaders, Play Index, Statgeekery, WAR | 5 Comments »