Sports Reference Blog

Archive for the 'Advanced Stats' Category

2021 WAR Update

31st March 2021

As we approach the beginning of the 2021 season, we have made some updates to our Wins Above Replacement calculations. You may notice some small changes to figures as you browse the site. As always, you can find full details on how we calculate WAR here.

Defensive Runs Saved Changes

Last week, we updated Defensive Runs Saved (DRS) totals across the site with new figures from Sports Info Solutions that incorporate more accurate hit timing data. This impacts some fielders from 2017 to 2020. You can read more about the updates in the Sports Info Solutions blog, including which teams and fielders were most impacted.

2019 Park Factors

Park factors for 2019 have been re-computed to include the 2020 season, since WAR uses a three-year average for park factors when computing pitching WAR. The most significant change here is for the Cincinnati Reds, whose pitching park factor rose from 103 to 108 (where <100 represents a pitcher’s park and >100 represents a hitter’s park). Luis Castillo sees the biggest benefit from this, with his 2019 WAR rising by 0.7 wins. All other changes to pitching WAR from updated park factors are smaller than Castillo’s 0.7 WAR gain in 2019.

2020 Park Factors

When a season is in progress, our three-year average park factors are computed using a prorated combination of the current season and two years prior. Due to the shortened 2020 schedule, the park factors for 2020 were still using some data from 2018, because the 60-game schedule was being treated as a partial in-progress season. We’ve addressed this in our park factor calculations so that the 2020 park factors only include 2019 and 2020. This change was reflected in OPS+, ERA+, Rbat+, and rOBA in the past week, but it is now also incorporated in WAR, leading to small changes for a handful of players.

Lance Lynn gains the most from this, adding 0.3 wins with Globe Life Field moving from a slight hitter’s park (102) to a more extreme hitter’s park (107). Trea Turner has the largest change on offense, also gaining 0.3 wins with Nationals Park moving from a slight hitter’s park (102) to a slight pitcher’s park (98).
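To make the proration issue concrete, here is a minimal sketch of how an in-progress season could be blended with the two seasons before it. The weighting scheme, the function name `three_year_park_factor`, and the single-season factors in the example are all assumptions for illustration; they are not our actual park factor code or real data.

```python
from typing import Optional


def three_year_park_factor(prev2: Optional[float], prev1: float,
                           current: float, season_fraction: float) -> float:
    """Blend single-season park factors (100 = neutral park).

    Assumed weighting: the current season counts in proportion to how much
    of a full schedule has been played, and the season two years back fills
    in the remainder. Pass prev2=None to leave that season out entirely.
    """
    if not 0.0 <= season_fraction <= 1.0:
        raise ValueError("season_fraction must be between 0 and 1")
    if prev2 is None:
        # Two-season blend: last year plus the completed current year.
        return (prev1 + current) / 2.0
    weight_current = season_fraction
    weight_oldest = 1.0 - season_fraction
    # The three weights (oldest, 1.0 for last year, current) always sum to 2.
    return (weight_oldest * prev2 + prev1 + weight_current * current) / 2.0


# Made-up factors: 100 two years ago, 104 last year, 110 this year.
# Treating a 60-game season as 60/162 of a full one keeps the oldest year in.
in_progress = three_year_park_factor(100.0, 104.0, 110.0, 60 / 162)
# Treating the 60-game season as complete drops the oldest year.
completed = three_year_park_factor(None, 104.0, 110.0, 1.0)
print(round(in_progress, 1), round(completed, 1))
```

Under these assumptions, the oldest season still carries more than half a season of weight when 60 games are treated as a partial year, which is the kind of lingering 2018 influence described above.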

New Game Logs from Retrosheet (1901-1903)

Last summer, we updated the site with new data from Retrosheet, including new game logs for players from 1901 to 1903. Having game-level data allows us to be more precise in our WAR calculations, since we can consider the specific ballparks a pitcher played in and the opponents he faced.

We presented a more in-depth example of this in our last WAR update, when Hall-of-Famer Christy Mathewson’s WAR rose after we added new game logs. This time around, pitcher Doc White saw the biggest change, gaining 1.5 WAR over the course of his career.

Biggest Career Movers

The top mover for position players in career WAR is Trea Turner, gaining 1.8 wins through a combination of additional runs saved and beneficial park factor changes. Trevor Story is close behind at 1.7 wins, primarily through additional runs saved.

On the pitching side, we see Doc White with 1.5 wins gained as described above. Among modern players, Patrick Corbin saw his career total drop by 0.8 wins. This is the flipside to how Turner gained credit. Corbin is debited for playing in a more pitcher-friendly park than previously thought, and for playing in front of defenders like Turner who are getting additional credit for their defense. Both of these changes decrease the number of runs we’d expect Corbin to have allowed, and as a result his performance is not as valuable as previously calculated.

We’ve highlighted some of the more extreme changes here, but to see full lists of the largest changes to season and career WAR totals, please see the spreadsheet here.

Thanks to Sports Info Solutions and Retrosheet for their contributions. Please let us know if you have any comments, questions or concerns.

Posted in Advanced Stats, Announcement, Baseball-Reference.com, Data, Features, History, Statgeekery, Stathead, WAR | No Comments »

Advanced Stats on Player Pages: How We Made It

26th February 2021

On Tuesday night, we added a new table of Advanced Stats to player pages.

This is what it looks like for hitters:

Mike Trout Advanced Stats

And for pitchers:

Gerrit Cole Advanced Stats

Rather than simply explain what we added, I’m going to describe how we added it. How does something go from an idea to a feature on Baseball-Reference? The entire process starts with you, the user.

At the beginning of January, we began conducting interviews with several users to discuss their experience using Baseball Reference and Stathead. By the time we launched the feature, we had spoken with nearly 50 users. It’s important to note that when we started the interview process, we didn’t have a particular solution or even a particular problem in mind.

There were several goals for these interviews. We wanted to find:

  1. What is the general perception of Baseball Reference compared with other sites?
  2. What features would users like us to add to Baseball Reference?
  3. What features would users like us to add to Stathead?
  4. What features of Baseball Reference and Stathead are users having a hard time using, finding, or just remembering to find?
  5. In what ways are people using our sites that we hadn’t anticipated?

Many of the interviews confirmed what we already knew. But every interview had at least one piece of gold that we could learn from. One interview in particular stood out to me and sent me on a path towards designing the feature you see on the site today.

I spoke with Mark Gorosh (@sportz5176 on Twitter) on February 3. Mark was lamenting that we don’t have advanced metrics such as BB% and K% on Baseball-Reference player pages. He didn’t understand why we had so many columns about the inner workings of WAR (in the Player Value table), but not established advanced stats like walk rate.

The issue, of course, is that we do have those stats. At this point I showed Mark the Advanced Batting page and… I’m not going to say Mark yelled at me, but he gave us some tough love that we really needed to hear. He couldn’t understand why all these great stats were not on a player’s main page.

And he was right.

There were a few different paths we could take.

  1. We could take all of the tables on the Advanced Batting pages and put them on the main player page. This wasn’t practical, however. There’s also an Advanced Fielding page and, of course, an Advanced Pitching page for pitchers. Adding all of these for a pitcher would lead to dozens of tables. Having so much on one page would negatively affect user experience.
  2. We could pick and choose certain things to bring over to the main page. Perhaps we could do this in a way that also leads users to click to the Advanced pages.
  3. We could move nothing, but focus on doing a better job of directing users to the player sub-pages (such as advanced batting and pitching, splits, and game logs).

We opted for the second option, but will also be looking to address the third option. The solution for the immediate job at hand is getting some advanced stats on the main player pages. But the fact that Mark (and other users) didn’t even know we had these advanced stats is a symptom of another issue—some users either are not noticing these sub-pages or they know about them but don’t think to use them (because they’re a click away).

This is a big deal because Baseball-Reference has a lot of users, but the super-users are the ones that have discovered the game logs, splits, and other advanced features. From there, they move on to Stathead to get even more powerful tools for their research. We want as many users discovering those features as possible so they can also turn into power users. So, in the future I’ll be looking to improve the player (and team and league) sub-navigation.

Now that we had chosen a path to explore, there were still different ways to proceed. One was to move the Player Value table (where we show WAR and its components) to the Advanced Batting page, but keep its most important columns (such as WAR, WAA, oWAR, dWAR, etc.) on the main page in a single table alongside the most important columns from other Advanced Batting tables.

We began testing with that.

Francisco Lindor Advanced Stats Mockup

This early mockup tested well, but some users showed a very strong preference for keeping the Player Value table where it was and adding a separate Advanced Stats table below it. Honestly, that was probably the right solution all along, but I wanted to see if we could solve this without increasing the number of tables on player pages. We ended up adding one, but that’s fine.

There were several key things from this mockup that tested well, such as:

  1. The collection of stats we chose (which were the result of team discussions and also a survey we shared on Twitter).
  2. The addition of rOBA (our version of wOBA: Reference weighted OBA) and Rbat+ (our version of wRC+, based on the Rbat used in WAR). Although these stats are brand new, I was impressed by how many users guessed right away what they were.
  3. The links under the table to let users quickly jump to any table on the Advanced Batting page from the main player page. Not only does this help raise awareness of the Advanced Batting page, but also lets users know what tables are specifically on the page before they even go there.

The next version we tested kept all of these features, but put them in a separate Advanced Batting table. We also added base-running data, more batted ball data (such as the oft-requested Exit Velocity and Hard Hit %), and a row to display league averages for each stat (because users may not know what a good XBT% is).

That version of the mockup looked much like what you see today:

Francisco Lindor Advanced Stats

This version tested exceedingly well. Now it came down to building it. I asked Kenny Jackelen (@kennyjackelen on Twitter), Baseball-Reference’s developer, for a summary of the development process for a new feature like this. Kenny said he:

  1. Iterated multiple times with the team internally to get feedback on the table implementation (including how the table should render for players from different eras).
  2. Created new database tables for exit velocity data (which also powers the Hard Hit %).
  3. Added columns to existing tables to store rOBA and Rbat+ more permanently (previously these calculations were done as an intermediate step to get to WAR, so the database structure needed some updates to make it easier to pull them into the page-building process alongside other stats).
  4. Added logic to our play-by-play processing to assign batted balls a Pull/Center/Oppo location so that we can get a count of each type and compute the percentages for the Advanced Batting table.
  5. Read a lot of Slack messages in ALL CAPS from Adam D, like a marathon runner being handed a cup of water.
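As a rough illustration of the Pull/Center/Oppo step, the sketch below counts batted balls by direction and turns the counts into percentages. The input format, the `_DIRECTION` mapping, and the function name `spray_percentages` are hypothetical; our play-by-play processing may classify locations quite differently.

```python
from collections import Counter

# Assumed convention: a right-handed batter pulls the ball to left field,
# a left-handed batter pulls it to right field.
_DIRECTION = {
    ("R", "left"): "Pull", ("R", "center"): "Center", ("R", "right"): "Oppo",
    ("L", "right"): "Pull", ("L", "center"): "Center", ("L", "left"): "Oppo",
}


def spray_percentages(batted_balls):
    """Return Pull/Center/Oppo percentages.

    batted_balls: iterable of (bats, field) pairs such as ("R", "left"),
    where bats is the side the batter hit from on that play.
    """
    counts = Counter(_DIRECTION[(bats, field)] for bats, field in batted_balls)
    total = sum(counts.values())
    if total == 0:
        return {"Pull": 0.0, "Center": 0.0, "Oppo": 0.0}
    return {kind: 100.0 * counts[kind] / total
            for kind in ("Pull", "Center", "Oppo")}


# Four batted balls by a right-handed batter (made-up data).
sample = [("R", "left"), ("R", "left"), ("R", "center"), ("R", "right")]
print(spray_percentages(sample))
```

Keying on the batting side for each play, rather than for each player, is one way a sketch like this could handle switch hitters.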
When it was ready, I got Mark back on Zoom to see his reaction. He said “it’s a 10.” He elaborated further, saying “It’s not enough to be baseball’s best data aggregator. You have to present the information in a way that fans will be able to find it. I was honored that BRef and Adam took my suggestions to heart. The new player page designs put so many great pieces of data in easy to find places… near the top of the page.”

As a researcher, it was very fulfilling to come full circle with Mark. He went from tough love to delight.

As helpful as it was, not all user interviews revolve around tough love. Many users I have spoken with weren’t sure what to expect when they hopped on a call. Far more often than not, it’s just a casual conversation about baseball, the different ways people use the site, and what they’d like to be able to do.

I asked interview subject Jim Passon (@PassonJim on Twitter) if he had any thoughts on the interview process (so you don’t just have to take my word for it). He said “When Adam reached out to me to have a conversation about features that I’d like to see in the future, I couldn’t get the meeting set up quick enough. As expected, the meeting was awesome! I got to make a few suggestions, learn some new tricks, and catch a glimpse of the cool features that were already being developed for the site (which I absolutely loved). I now feel like I’m a part of my favorite site on the web... and that feels pretty good!”

Interviewee Jessica Brand (@JessicaDBrand on Twitter) echoed a similar sentiment, saying “I felt at ease, just discussing sports in depth in every which way with friends. It’s a great way to get those endorphins going to see and meet up with friends at your local stadium/arena/pitch you can’t necessarily see because of social distancing. Interviewing with Adam and Kenny provided the same warm and fuzzies.”

And honestly, in this time of social distancing and quarantine, hopping on the phone to talk about Baseball Reference with some of my favorite writers and analysts has been incredibly fulfilling. If you’d like to chat with me about how you’re using Baseball-Reference and Stathead, feel free to reach out at @baseballtwit on Twitter or go ahead and book a time on my calendar to chat.

Posted in Advanced Stats, Baseball-Reference.com, Data, Features, History, WAR | 1 Comment »

FBref Scouting Reports and Similar Players Launched

10th February 2021

FBref is happy to announce the release of a feature we've been excited about for a while, player Scouting Reports that give you a quick look at how players compare in various statistics to other players at their position. This is currently available for players in the Big Five men's European leagues (example: Mohamed Salah), Major League Soccer (example: Diego Rossi) and the Women's Super League (example: Sam Kerr). We show 20 categories on the main Scouting Report at the top of a player's page, selected based on feedback from user research and industry experts, but you can also click through to a Complete Scouting Report which shows many more categories to compare the players by.

In addition, we have added a Similar Players table which locates the players that have the most similar percentiles in the stats used in the Scouting Reports. That table also offers Compare links which take you to our Player Comparison tool so you can see the players' statistics side-by-side.
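For readers curious how a percentile-based comparison could work, here is a small sketch. It is not the method FBref uses: the percentile definition, the Euclidean distance, the function names, and the sample players are all assumptions chosen for illustration.

```python
from math import sqrt


def percentile_rank(value, peer_values):
    """Percent of positional peers whose value is at or below `value`."""
    if not peer_values:
        raise ValueError("peer_values must not be empty")
    at_or_below = sum(1 for v in peer_values if v <= value)
    return 100.0 * at_or_below / len(peer_values)


def most_similar(target, percentiles, top_n=3):
    """Rank other players by distance between their percentile profiles.

    percentiles: dict of player name -> dict of stat name -> percentile.
    A smaller distance means a more similar profile (assumed metric).
    """
    stats = list(percentiles[target])

    def distance(name):
        return sqrt(sum((percentiles[target][s] - percentiles[name][s]) ** 2
                        for s in stats))

    others = [name for name in percentiles if name != target]
    return sorted(others, key=distance)[:top_n]


# Hypothetical players and percentiles.
profiles = {
    "Player A": {"shots": 90.0, "passes": 40.0},
    "Player B": {"shots": 88.0, "passes": 45.0},
    "Player C": {"shots": 20.0, "passes": 95.0},
}
print(most_similar("Player A", profiles, top_n=2))
```

Comparing percentiles rather than raw totals keeps every stat on the same 0 to 100 scale, so no single category dominates the distance.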

For more information on how the Scouting Report works, we have a longer explainer on FBref. This would not be possible without the wide array of advanced stats provided by Statsbomb, so thanks to them.

Depending on how people react, we could even adapt this feature for our other Sports-Reference sites in the future. Because of that, we are eager to hear people's thoughts on this new feature, so feel free to contact us via our feedback form.

Posted in Advanced Stats, Announcement, FBref, Features, Statgeekery | No Comments »

December 2020 WAR Update

14th December 2020

We recently fixed an issue where, because of the abbreviated 2020 season, we were not allocating enough wins to position players when calculating Wins Above Replacement. The fix is now live across Baseball-Reference. With this change, no position player gained more than 0.3 WAR, and no position player lost WAR. All pitcher WAR remained the same.

You can review the changes for each player here: https://docs.google.com/spreadsheets/d/18WY53wSt0GrBMMijLiIFMhVtvbmjuhbYNOaTvHfs-gE/edit?usp=sharing

If you have any questions or concerns, feel free to contact us through our feedback form.

Posted in Advanced Stats, Announcement, Baseball-Reference.com, Data, Statgeekery, WAR | Comments Off on December 2020 WAR Update

Adjusted Shooting Stats Added to Basketball Reference

1st June 2020

There's been much debate about the greatest players in NBA history of late. One of the most difficult things about ranking players in a league with 70+ years of history is that the game has changed a lot over the years. Sure, some of it has to do with the skill and quality of the players. But some of it also has to do with the quality of the balls, the floors, the rims, the training, the travel, the accommodations, available nutrition and pretty much any other variable you can think of. For a better idea of how the league has changed over time, please see this table of league averages for each season in the history of the NBA. As you can see, 2019-20 is the fifth straight season in which a new league-wide eFG% record has been set. There are clearly things at play here beyond just player improvement, though today's players are certainly more skilled than the ones who produced a league-wide 27.9 FG% in 1946-47 (the first year of the NBA's 'official' forerunner, the BAA, which was objectively worse than the league it eventually merged with, the NBL).

To help bring a bit of objectivity to cross-era comparisons, we have added an Adjusted Shooting table to all player, team and season pages. These tables show a player's shooting percentages and tendencies alongside league-wide percentages and tendencies, and then scale the player's figures to the league's. Like OPS+ on our baseball site, each figure is scaled so that 100 represents a league-average shooter: 125 is 25% better than average and 75 is 25% worse than average. These figures are obtained by taking the player's shooting percentage, dividing it by the league-wide shooting percentage and then multiplying by 100. So 125 doesn't mean a player was 25 percentage points above average, but 25 percent above average. We are also publishing adjusted versions of 3-point Attempt Rate and Free Throw Rate to give a better idea of how often the player shot 3s or got to the line relative to their era.
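The scaling described above is simple enough to show in a few lines. This is a minimal sketch with a hypothetical function name and made-up percentages, not the code that builds the tables.

```python
def adjusted_shooting(player_pct: float, league_pct: float) -> float:
    """Scale a shooting percentage so that 100 equals league average."""
    if league_pct <= 0:
        raise ValueError("league_pct must be positive")
    return 100.0 * player_pct / league_pct


# Made-up example: a .500 shooter in a league that shoots .400.
# He is 10 percentage points above average, but 25 percent above average.
print(round(adjusted_shooting(0.500, 0.400)))  # prints 125
print(round(adjusted_shooting(0.300, 0.400)))  # prints 75
```

The same ratio works for any of the rate columns, as long as the player's figure and the league's figure are measured the same way.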

Additionally, we have calculated Field Goal Points Added and True Shooting Points Added to show how many points each player scored above or below what a league average player would have scored given an equal number of field goal attempts or true shot attempts, respectively. This is to show which players combined volume and efficiency (or those that combined volume with inefficiency, for that matter).
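One way to compute these two figures is sketched below. The formulas are an assumption based on the standard definitions of eFG% and TS% (including the usual 0.44 weight on free throw attempts); the function names and sample numbers are made up for illustration.

```python
def field_goal_points_added(fgm: int, fg3m: int, fga: int,
                            league_efg: float) -> float:
    """Points from field goals minus what a league-average shooter
    would have scored on the same number of attempts."""
    points_from_field = 2 * fgm + fg3m  # 2 per make, plus 1 extra per three
    expected = 2.0 * league_efg * fga   # eFG% converts attempts to points
    return points_from_field - expected


def true_shooting_points_added(pts: int, fga: int, fta: int,
                               league_ts: float,
                               ft_weight: float = 0.44) -> float:
    """Total points minus a league-average scorer's points on the same
    number of true shot attempts (FGA + ft_weight * FTA)."""
    true_shot_attempts = fga + ft_weight * fta
    return pts - 2.0 * league_ts * true_shot_attempts


# Made-up season: 400 makes (100 of them threes) on 800 attempts,
# 1,100 total points, 250 free throw attempts.
print(field_goal_points_added(400, 100, 800, league_efg=0.50))
print(round(true_shooting_points_added(1100, 800, 250, league_ts=0.55), 1))
```

Because both figures multiply an efficiency gap by attempts, a high-volume shooter who is slightly above average can add more points than a low-volume shooter who is far above average.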

Posted in Advanced Stats, Announcement, Basketball-Reference.com, Data, Features, History, Statgeekery | 7 Comments »

BPM 2.0 on College Basketball Reference

28th May 2020

In February, our pro basketball site incorporated Daniel Myers' BPM 2.0, the update to the classic Box Plus Minus measurement. We have now completed that update for College Basketball Reference as well. BPM 2.0 aims to estimate a player's performance relative to league average by using a player's box score information and his team's overall performance.

BPM 2.0 will appear on College Basketball Reference in the same places you found old BPM, and is available back to the 2010-11 season. Leaderboards have been updated to reflect the new measurement. BPM 2.0 also allows for game-level calculations, which means that our box scores since the 2010-11 season will now include BPM 2.0 in the Advanced table.

For more information on why the update was made, you can refer to our February blog post on the BBR update, as well as Daniel Myers' in-depth explainer on how BPM 2.0 is calculated. We thank Myers for his contributions and we hope you enjoy the addition to College Basketball Reference.

Posted in Advanced Stats, Announcement, CBB at Sports Reference, Features | Comments Off on BPM 2.0 on College Basketball Reference

Big 5 Leagues Pages on FBref

27th May 2020

FBref covers basic and advanced statistics for dozens of domestic leagues around the world, with the English Premier League, Spanish La Liga, German Bundesliga, Italian Serie A and French Ligue 1, commonly referred to as the "Big 5", being the most visited league stat pages. Up to this point, if you wanted to compare statistics between the leagues, you'd need to have a tab open for each one.

That will no longer be as necessary now that FBref has added combined Big 5 stat pages, with a combined league table, leaderboards across the 5 leagues and stat registers that include players who've played in any of the leagues. In the Player Standard Stats section, sorting by G+A-PK per 90 minutes gives you Jadon Sancho (Bundesliga), Kylian Mbappé (Ligue 1) and Lionel Messi (La Liga) at the top this season. In the Squad Goal and Shot Creation section, you can see Bayern Munich and Dortmund are leading all Big 5 teams in Goal Creating Actions per 90 minutes.

Check out our new Combined Big 5 pages and so much more that we offer at FBref! You can keep up with the latest additions of statistical coverage and new features here on the Sports Reference Blog, or by signing up for the This Week in Sports Reference mailing list. Feel free to send us any questions or suggestions through our feedback form or FBref's official Twitter account.

Posted in Advanced Stats, Announcement, FBref, Features, Leaders | Comments Off on Big 5 Leagues Pages on FBref

Game-Level BPM In Play Index + Box Score Mouseovers

4th May 2020

In February, Basketball Reference made a major update in incorporating Daniel Myers' BPM 2.0, which aims to estimate a player's performance relative to league average by using a player's box score information and his team's overall performance. This statistic is also calculable at the game level, and we've made it easier to look through this by making BPM searchable in Basketball Reference's Game Finder, one of the many tools you can find in the site's Play Index.

BPM 2.0 is searchable back to the 1984-85 season, when we first have 100% coverage of all the statistical components needed to calculate this. It's important to note that BPM is a rate stat, so setting a minutes played threshold will be important. Here's a look at the top games in our system using a couple of different thresholds:

Minimum 10 MP

Query Results Table
Player          Date          Tm   MP  TRB  AST  PTS   BPM
James Robinson  1996-12-30 *  MIN  10    1    1   23  74.6
Henry James     1997-04-15 *  ATL  10    2    1   24  63.9
Jrue Holiday    2009-11-24 *  PHI  10    6    1   11  61.1
Provided by Basketball-Reference.com: View Original Table
Generated 5/5/2020.

Minimum 20 MP

Query Results Table
Player          Date          Tm   MP  TRB  AST  PTS   BPM
Brent Barry     2006-03-24 *  SAS  20    2    4   23  45.5
Manu Ginóbili   2009-01-20 *  SAS  21    8    3   26  41.9
Victor Oladipo  2018-01-06 *  IND  24    6    9   23  40.6
Provided by Basketball-Reference.com: View Original Table
Generated 5/5/2020.

Minimum 30 MP

Query Results Table
Player          Date          Tm   MP  TRB  AST  PTS   BPM
Nikola Jokić    2018-10-20 *  DEN  31   11   11   35  44.4
Gilbert Arenas  2006-02-25 *  WAS  30    1    2   46  40.5
Damian Lillard  2016-02-19 *  POR  31    0    7   51  38.1
Provided by Basketball-Reference.com: View Original Table
Generated 5/5/2020.

Minimum 40 MP

Query Results Table
Player          Date          Tm   MP  TRB  AST  PTS   BPM
Damian Lillard  2017-04-08 *  POR  42    6    5   59  35.7
Manu Ginóbili   2008-02-13 *  SAS  41    5    8   46  34.2
Vince Carter    2001-05-11 *  TOR  45    6    7   50  34.0
Provided by Basketball-Reference.com: View Original Table
Generated 5/5/2020.
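Because BPM is a rate stat, the minutes threshold changes which games rise to the top. The sketch below mimics that kind of search on a handful of rows taken from the tables above; the `GameLine` type and `top_games` function are hypothetical stand-ins, not the Game Finder itself.

```python
from typing import List, NamedTuple


class GameLine(NamedTuple):
    player: str
    date: str
    minutes: int
    bpm: float


def top_games(games: List[GameLine], min_minutes: int,
              limit: int = 3) -> List[GameLine]:
    """Keep games at or above the minutes threshold, best BPM first."""
    eligible = [g for g in games if g.minutes >= min_minutes]
    return sorted(eligible, key=lambda g: g.bpm, reverse=True)[:limit]


# A few rows copied from the query results above.
GAMES = [
    GameLine("James Robinson", "1996-12-30", 10, 74.6),
    GameLine("Brent Barry", "2006-03-24", 20, 45.5),
    GameLine("Damian Lillard", "2017-04-08", 42, 35.7),
    GameLine("Vince Carter", "2001-05-11", 45, 34.0),
]

for threshold in (10, 40):
    best = top_games(GAMES, threshold, limit=1)[0]
    print(threshold, best.player, best.bpm)
```

With a low threshold, a hot ten minutes off the bench outranks a 59-point night, which is why the threshold matters for any search like this.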

In addition to the Game Finder addition, Basketball Reference now has mouseovers in the advanced section of box scores that display the offensive and defensive BPM breakdowns, as well as Value Over Replacement Player prorated to 82 games. For more information on how BPM 2.0 is calculated, please consult Daniel Myers' explainer. Stay tuned to the Sports Reference Blog for the latest additions to Basketball Reference!

Posted in Advanced Stats, Announcement, Basketball-Reference.com, Features, History, Play Index, Statgeekery | 1 Comment »

Launching Stathead

27th April 2020

If you haven't read it already, please read Mike Lynch's rundown of our new Stathead/Baseball service. I'm going to lay out some of the background for this change and explain some of the changes.

As I laid out in our post from early March, we are making changes to our Ad-Free and Play Index products.

Here is the thrust of what we said in March.

So we are making some changes. The Play Index for each site will be moving to Stathead.com. Stathead.com will become the center for all of our subscription products. We expect these products to include tools and information beyond just a redesigned set of Play Index tools. This won't happen all at once, but we'll start with baseball and then proceed through the remainder of our sports. Also, we will be ending our ad-free product and instead Stathead memberships will have ad-free built-in. There just aren't enough users to justify a separate ad-free product. These changes will begin this month and continue through April on baseball and then continue with the other sites after that.

If you are a subscriber, we will make every effort to make certain you are happy with the options we provide to convert your ad-free or Play Index subscription over to Stathead including the option of a refund on your subscription. You will be hearing more from us about the changes over the next few weeks as we will email users directly.

If you've looked at the cost of Stathead/Baseball vs the Play Index, you'll notice we've gone from $36/year (+ $20/year for ad-free) to $8/month. I realize this is a significant increase. As I said in my original post, we are extraordinarily reliant on ad revenue. Back in early March this seemed problematic. Now with the complete collapse of the advertising market it has the potential to be lethal. If you don't block our ads, you may have noticed that we now have more ads on our pages. This is in response to the downturn in ad revenue. Sports Reference is doing fine right now, but if we want to continue to succeed and also be aligned with the needs of our users, a healthy stream of subscription revenue is vital.

We also feel our products warrant this price. The only comparable products to our Stathead tools come from Elias and STATS LLC and would cost you $10,000+ a year to subscribe to. You could create your own from Retrosheet data, but that would probably take more than $8/month of your time to maintain.

We are using monthly billing for at least the first few quarters, so that we can monitor more directly the success we are having in recruiting and maintaining subscribers. We have discussed adding an annual billing option in the future.

For the time being, we will be maintaining both the legacy Play Index site (which has been free since the start of March) and the new site, but before too long we will take down the old Play Index site, probably late May. We are also working on converting the other Play Index sites. First will be hockey and then probably basketball after that.

We realize there aren't games being played and that you might be facing your own financial challenges at this time. Therefore, we are offering the first month free for all users. And then, until the leagues start playing games, we will be giving users the option of claiming additional free monthly subscriptions. We'll provide more details on the latter plan as we approach the time for subscriptions to be renewed.

If you are a current subscriber, we will be emailing you with information about how we will be converting your subscription to the new system and of course, we will provide your money back if you are unhappy with the conversion to Stathead that we are offering you. Our goal is to give you a more than fair deal and see you join us on stathead.com.

Please feel free to reach out to us if you have questions or concerns.

--sean forman

Posted in Advanced Stats, Announcement, Baseball-Reference.com, Stat Questions, Statgeekery, Stathead | 14 Comments »

Goal Creation, Possession, Passing and More Advanced Stats on FBref

14th April 2020

FBref carries a wide array of advanced stats powered by Statsbomb to give you the context you need to analyze a player's performance from as many angles as possible. You can look through our blog's FBref tag for a look at the most recent additions we've made to the site, and today we have another mass addition of advanced stats you can now access on player pages. Here's the list:

Passing

- Total distance of completed passes
- Progressive distance of completed passes (distance toward goal)
- Progressive Passes

Pass Types

- live-ball, dead-ball, under pressure
- corner kick types (in-swinging, out-swinging, straight)
- pass height (ground, low, high)
- by body part (left/right foot, head, throw-ins, other)
- pass outcomes (completed, offsides, out of bounds, intercepted, blocked)

Goal and Shot Creation

- Goal Creating Actions (GCA) and Shot Creating Actions (SCA), meaning the two offensive actions leading to a shot or goal. This includes live-ball passes, dead-ball passes, successful dribbles, shots which lead to another shot, and being fouled.
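To picture how the "two offensive actions leading to a shot" could be credited, here is a minimal sketch. The event format, the action labels, and the function name `shot_creating_actions` are assumptions for illustration, and the sketch assumes every event belongs to the same team in one unbroken sequence; Statsbomb's actual event model is far richer.

```python
def shot_creating_actions(events, lookback=2):
    """Credit the players behind the last `lookback` actions before each shot.

    events: chronological list of (player, action) pairs for one team,
    where the action "shot" marks a shot. A shot can itself be one of the
    actions leading to a later shot.
    """
    credits = {}
    for i, (_, action) in enumerate(events):
        if action != "shot":
            continue
        # Look at the actions immediately before this shot.
        for player, _ in events[max(0, i - lookback):i]:
            credits[player] = credits.get(player, 0) + 1
    return credits


# Hypothetical sequence: two passes and a dribble, a shot, then a follow-up shot.
sequence = [
    ("Player A", "pass"),
    ("Player B", "dribble"),
    ("Player C", "pass"),
    ("Player D", "shot"),
    ("Player E", "shot"),
]
print(shot_creating_actions(sequence))
```

Goal Creating Actions could be counted the same way by looking back only from the shots that were scored.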

Defensive Actions

- tackles by location on pitch
- pressures, successful pressures, pressures by location on pitch
- blocks (shots, shots saved, and passes)
- clearances
- errors leading to an opponent's shot

Possession

- touches, touches by location on pitch
- carries (total and progressive distance)
- pass receiving (targets and completions)
- miscontrols and dispossessions

Miscellaneous

- Aerials won/lost

From league pages, these stats can be accessed by using the Squad & Player Stats tab between Scores/Fixtures and Nationalities. Of course, you can also see this on player pages if they've played in competitions we have xG data for.

We're excited to see what analysis people can derive from this new information, which is available thanks to the hard work of Statsbomb. You can keep up with the latest additions of statistical coverage and new features here on the Sports Reference Blog, or by signing up for the This Week in Sports Reference mailing list. Feel free to send us any questions or suggestions through our feedback form or FBref's official Twitter account.

Posted in Advanced Stats, Announcement, Data, FBref, Features, Statgeekery | 1 Comment »