Sports Reference Blog

Archive for the 'Baseball-Reference.com' Category

2021 WAR Update

31st March 2021

As we approach the beginning of the 2021 season, we have made some updates to our Wins Above Replacement calculations. You may notice some small changes to figures as you browse the site. As always, you can find full details on how we calculate WAR here.

Defensive Runs Saved Changes

Last week, we updated Defensive Runs Saved (DRS) totals across the site with new figures from Sports Info Solutions that incorporate more accurate hit timing data. This impacts some fielders from 2017 to 2020. You can read more about the updates in the Sports Info Solutions blog, including which teams and fielders were most impacted.

2019 Park Factors

Park factors for 2019 have been re-computed to include the 2020 season, since WAR uses a three-year average for park factors when computing pitching WAR. The most significant change here is the Cincinnati Reds, whose pitching park factor rose from 103 to 108 (where <100 represents a pitcher’s park and >100 represents a hitter’s park). Luis Castillo sees the biggest benefit from this, with his 2019 WAR rising by 0.7 wins. All other changes to pitching WAR from updated park factors are smaller than Castillo’s 0.7 WAR gain in 2019.

2020 Park Factors

When a season is in progress, our three-year average park factors are computed using a prorated combination of the current season and two years prior. Due to the shortened 2020 schedule, the park factors for 2020 were still using some data from 2018, because the 60-game schedule was being treated as a partial in-progress season. We’ve addressed this in our park factor calculations so that the 2020 park factors only include 2019 and 2020. This change was reflected in OPS+, ERA+, Rbat+, and rOBA in the past week, but it is now also incorporated in WAR, leading to small changes for a handful of players.

Lance Lynn gains the most from this, adding 0.3 wins with Globe Life Field moving from a slight hitter’s park (102) to a more extreme hitter’s park (107). Trea Turner has the largest change on offense, also gaining 0.3 wins with Nationals Park moving from being a slight hitter’s park (102) to being a slight pitcher’s park (98).
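To make the 2020 change concrete, here is a rough sketch of how a prorated blend could behave. The weighting scheme is an assumption for illustration only (current season weighted by the fraction of a 162-game schedule played, the prior season at full weight, and the season two years back picking up the remainder), and the single-season factors are invented numbers, not real park factors. The function name `blended_park_factor` is also hypothetical.

```python
FULL_SEASON_GAMES = 162  # standard schedule length


def blended_park_factor(current, prior, two_prior, season_fraction):
    """Blend single-season park factors into one multi-year figure.

    Hypothetical weighting (an assumption, not the site's actual formula):
    the current season counts in proportion to how much of it has been
    played, the prior season counts fully, and the season two years back
    fills in whatever share the current season has not yet earned.
    """
    if not 0.0 <= season_fraction <= 1.0:
        raise ValueError("season_fraction must be between 0 and 1")
    weights = (season_fraction, 1.0, 1.0 - season_fraction)
    values = (current, prior, two_prior)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Invented single-season factors for one park (illustration only).
pf_2018, pf_2019, pf_2020 = 95.0, 100.0, 105.0

# Old behavior: 60 games treated as a partial 162-game season,
# so some 2018 data still leaks into the 2020 figure.
as_partial = blended_park_factor(pf_2020, pf_2019, pf_2018,
                                 60 / FULL_SEASON_GAMES)

# New behavior: 2020 treated as a complete season,
# so only 2019 and 2020 contribute.
as_complete = blended_park_factor(pf_2020, pf_2019, pf_2018, 1.0)

print(f"treated as partial:  {as_partial:.1f}")
print(f"treated as complete: {as_complete:.1f}")
```

Under these assumed weights, treating the season as complete drops the 2018 term entirely, which is the behavior described above.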

New Game Logs from Retrosheet (1901-1903)

Last summer, we updated the site with new data from Retrosheet, including new game logs for players from 1901 to 1903. Having game-level data allows us to be more precise in our WAR calculations, since we can consider the specific ballparks a pitcher played in and the opponents he faced.

We presented a more in-depth example of this in our last WAR update, when Hall-of-Famer Christy Mathewson’s WAR rose after we added new game logs. This time around, pitcher Doc White saw the biggest change, gaining 1.5 WAR over the course of his career.

Biggest Career Movers

The top mover for position players in career WAR is Trea Turner, gaining 1.8 wins through a combination of additional runs saved and beneficial park factor changes. Trevor Story is close behind at 1.7 wins, primarily through additional runs saved.

On the pitching side, we see Doc White with 1.5 wins gained as described above. Among modern players, Patrick Corbin saw his career total drop by 0.8 wins. This is the flipside to how Turner gained credit. Corbin is debited for playing in a more pitcher-friendly park than previously thought, and for playing in front of defenders like Turner who are getting additional credit for their defense. Both of these changes decrease the number of runs we’d expect Corbin to have allowed, and as a result his performance is not as valuable as previously calculated.

We’ve highlighted some of the more extreme changes here, but to see full lists of the largest changes to season and career WAR totals, please see the spreadsheet here.

Thanks to Sports Info Solutions and Retrosheet for their contributions. Please let us know if you have any comments, questions or concerns.

Posted in Advanced Stats, Announcement, Baseball-Reference.com, Data, Features, History, Statgeekery, Stathead, WAR | No Comments »

Advanced Stats on Player Pages: How We Made It

26th February 2021

On Tuesday night, we added a new table of Advanced Stats to player pages.

This is what it looks like for hitters:

Mike Trout Advanced Stats

And for pitchers:

Gerrit Cole Advanced Stats

Rather than simply explain what we added, I’m going to describe how we added it. How does something go from an idea to a feature on Baseball-Reference? The entire process starts with you, the user.

At the beginning of January, we began conducting interviews with several users to discuss their experience using Baseball Reference and Stathead. By the time we launched the feature, we had spoken with nearly 50 users. It’s important to note that when we started the interview process, we didn’t have a particular solution or even a particular problem in mind.

There were several goals for these interviews. We wanted to find:

  1. What is the general perception of Baseball Reference compared with other sites?
  2. What features would users like us to add to Baseball Reference?
  3. What features would users like us to add to Stathead?
  4. What features of Baseball Reference and Stathead are users having a hard time using, finding, or just remembering to find?
  5. In what ways are people using our sites that we hadn’t anticipated?

Many of the interviews confirmed what we already knew. But every interview had at least one piece of gold that we could learn from. One interview in particular stood out to me and sent me on a path towards designing the feature you see on the site today.

I spoke with Mark Gorosh (@sportz5176 on Twitter) on February 3. Mark was lamenting that we don’t have advanced metrics such as BB% and K% on Baseball-Reference player pages. He didn’t understand why we had so many columns about the inner workings of WAR (in the Player Value table), but not established advanced stats like walk rate.

The issue, of course, is that we do have those stats. At this point I showed Mark the Advanced Batting page and… I’m not going to say Mark yelled at me, but he gave us some tough love that we really needed to hear. He couldn’t understand why all these great stats were not on a player’s main page.

And he was right.

There were a few different paths we could take.

  1. We could take all of the tables on the Advanced Batting pages and put them on the main player page. This wasn’t practical, however. There’s also an Advanced Fielding page and, of course, an Advanced Pitching page for pitchers. Adding all of these for a pitcher would lead to dozens of tables. Having so much on one page would negatively affect user experience.
  2. We could pick and choose certain things to bring over to the main page. Perhaps we could do this in a way that also leads users to click to the Advanced pages.
  3. We could move nothing, but focus on doing a better job of directing users to the player sub-pages (such as advanced batting and pitching, splits, and game logs).

We opted for the second option, but will also be looking to address the third option. The solution for the immediate job at hand is getting some advanced stats on the main player pages. But the fact that Mark (and other users) didn’t even know we had these advanced stats is a symptom of another issue—some users either are not noticing these sub-pages or they know about them but don’t think to use them (because they’re a click away).

This is a big deal because Baseball-Reference has a lot of users, but the super-users are the ones that have discovered the game logs, splits, and other advanced features. From there, they move on to Stathead to get even more powerful tools for their research. We want as many users discovering those features as possible so they can also turn into power users. So, in the future I’ll be looking to improve the player (and team and league) sub-navigation.

Now that we chose the path to explore, there were still different ways to proceed. One was to move the Player Value table (where we show WAR and its components) to the Advanced Batting page, but bring the most important columns (such as WAR, WAA, oWAR, dWAR, etc.) along with the most important columns from other Advanced Batting tables.

We began testing with that.

Francisco Lindor Advanced Stats Mockup

This early mockup tested well, but some users showed a very strong preference for keeping the Player Value table where it was and adding a separate Advanced Stats table below it. Honestly, that was probably the right solution all along, but I wanted to see if we could solve this without increasing the number of tables on player pages. We ended up adding one, but that’s fine.

There were several key things from this mockup that tested well, such as:

  1. The collection of stats we chose (which were the result of team discussions and also a survey we shared on Twitter).
  2. The addition of rOBA (our version of wOBA—Reference weighted OBA) and Rbat+ (our version of wRC+—based on the Rbat used in WAR). Despite the fact that these stats are brand new, I was impressed by how many guessed right away what they were.
  3. The links under the table to let users quickly jump to any table on the Advanced Batting page from the main player page. Not only does this help raise awareness of the Advanced Batting page, but also lets users know what tables are specifically on the page before they even go there.

The next version we tested kept all of these features, but put them in a separate Advanced Batting table. We also added base-running data, more batted ball data (such as the oft-requested Exit Velocity and Hard Hit %), and a row to display league averages for each stat (because users may not know what a good XBT% is).

That version of the mockup looked much like what you see today:

Francisco Lindor Advanced Stats

This version tested exceedingly well. Now it came down to building it. I asked Kenny Jackelen (@kennyjackelen on Twitter), Baseball-Reference’s developer, for a summary of the development process for a new feature like this. Kenny said he:

  1. Iterated multiple times with the team internally to get feedback on the table implementation (including how the table should render for players from different eras).
  2. Created new database tables for exit velocity data (which also powers the Hard Hit %).
  3. Added columns to existing tables to store rOBA and Rbat+ more permanently (previously these calculations were done as an intermediate step to get to WAR, so the database structure needed some updates to make it easier to pull them into the page-building process alongside other stats).
  4. Added logic to our play-by-play processing to assign batted balls a Pull/Center/Oppo location so that we can get a count of each type and compute the percentages for the Advanced Batting table.
  5. Read a lot of Slack messages in ALL CAPS from Adam D, like a marathon runner being handed a cup of water.
When it was ready, I got Mark back on Zoom to see his reaction. He said “it’s a 10.” He elaborated further, saying “It's not enough to be baseball’s best data aggregator. You have to present the information in a way that fans will be able to find it. I was honored that BRef and Adam took my suggestions to heart. The new player page designs put so many great pieces of data in easy to find places… near the top of the page.”

As a researcher, it was very fulfilling to come full circle with Mark. He went from tough love to delight.

As helpful as it was, not all user interviews revolve around tough love. Many users I have spoken with weren’t sure what to expect when they hopped on a call. Far more often than not, it’s just a casual conversation about baseball, the different ways people use the site, and what they’d like to be able to do.

I asked interview subject Jim Passon (@PassonJim on Twitter) if he had any thoughts on the interview process (so you don’t just have to take my word for it). He said “When Adam reached out to me to have a conversation about features that I’d like to see in the future, I couldn’t get the meeting set up quick enough. As expected, the meeting was awesome! I got to make a few suggestions, learn some new tricks, and catch a glimpse of the cool features that were already being developed for the site (which I absolutely loved). I now feel like I’m a part of my favorite site on the web... and that feels pretty good!”

Interviewee Jessica Brand (@JessicaDBrand on Twitter) echoed a similar sentiment, saying “I felt at ease, just discussing sports in depth in every which way with friends. It’s a great way to get those endorphins going to see and meet up with friends at your local stadium/arena/pitch you can’t necessarily see because of social distancing. Interviewing with Adam and Kenny provided the same warm and fuzzies.”

And honestly, in this time of social distancing and quarantine, hopping on the phone to talk about Baseball Reference with some of my favorite writers and analysts has been incredibly fulfilling. If you’d like to chat with me about how you’re using Baseball-Reference and Stathead, feel free to reach out at @baseballtwit on Twitter or go ahead and book a time on my calendar to chat.

Posted in Advanced Stats, Baseball-Reference.com, Data, Features, History, WAR | 1 Comment »

Sports Reference Purchases the Databases of Pete Palmer, Ken Pullis, and Gary Gillette

24th February 2021

February 25, 2021

Sports Reference LLC is pleased to announce that they have purchased the historical, statistical databases of Pete Palmer, Ken Pullis and Gary Gillette. This includes full historical databases for

Major League Baseball,
the National Basketball Association,
the National Hockey League, and
the National Football League.

Since their launch in 2000, the Sports Reference sites have presented and relied upon the groundbreaking and painstaking work of Palmer, Pullis and Gillette. Palmer’s pioneering work in baseball statistics has made his database the gold standard in the field, and his work with John Thorn on the Hidden Game of Baseball and Total Baseball is legendary. Pullis’s award-winning work in the field of pro football statistics formed the basis for the ESPN Pro Football Encyclopedia, the last pro football encyclopedia ever printed. Gillette created and edited the ESPN Baseball and Pro Football Encyclopedias and compiled a set of unique MLB databases for subjects like the Disabled/Injured List that previously had never been covered.

We are excited that we will now be the stewards of these databases. We intend to build upon Ken, Pete and Gary's extraordinary work. At Sports Reference, our purpose is to answer questions, so our users can grow their appreciation, understanding, and love of the game. Owning these databases will allow us to continue doing that, but also open up potential new opportunities such as making free databases available for researchers and publishing new products incorporating these datasets.

We are honored that Pete Palmer and Ken Pullis will continue the work on their databases as consultants to Sports Reference and look forward to expanding the scope of what is known about the history of North American sports. We will also be working with Gary Gillette on several special baseball projects in the future.

Sports Reference LLC is based in Philadelphia, PA and serves millions of users a month through its websites: Baseball-Reference.com, Basketball-Reference.com, Pro-Football-Reference.com, Hockey-Reference.com and others.

Pete Palmer is a titan in the field of baseball research and history and has been one of the foremost chroniclers of the National Pastime for the past five decades. He has edited or contributed to virtually every baseball encyclopedia that has been published in the last 50 years. Along with John Thorn, Palmer served as co-editor for seven editions of Total Baseball. Along with Gary Gillette, Palmer served as co-editor for five editions of the ESPN Baseball Encyclopedia. Palmer was also the co-author with Thorn of the seminal 1984 analytics book The Hidden Game of Baseball—a landmark work republished by the University of Chicago Press in 2015. Along with Gillette and Pullis, he served as co-editor of the ESPN Pro Football Encyclopedia. Palmer is also known as co-author of The Hidden Game of Pro Football and as a contributor to Total Football. He lives in Hollis, New Hampshire.

Gary Gillette is the founder and current chair of the Friends of Historic Hamtramck Stadium, a nonprofit that is working to restore the former Negro League ballpark near his home in Detroit. Gillette also served for a decade on the Tiger Stadium Conservancy’s board of directors. He has four decades of baseball research, writing, and editing experience, beginning with his work with Bill James and Project Scoresheet in the mid-1980s. A contributor to six editions of Total Baseball, Gillette later designed and co-edited with Pete Palmer the five editions of the ESPN Baseball Encyclopedia. Gillette also designed the ESPN Pro Football Encyclopedia and served as executive editor for both editions of that reference work. A former member of the Society for American Baseball Research’s (SABR) board of directors, Gillette is a past co-chair of two of SABR’s major research committees—the Business of Baseball Committee and the Ballparks Committee. He was the founder and president of SABR’s Detroit Chapter and is now the chair of SABR’s new Southern Michigan Chapter.

Ken Pullis is a retired air traffic controller and former US Air Force pilot. He has had a lifelong interest in pro football statistics and began doing original research in the late 1980s. Pullis is the 2002 PFRA Ralph Hay Award winner for Pro Football Research and Historiography and was co-editor with Gillette and Palmer of the ESPN Pro Football Encyclopedia, volumes 1 and 2. He currently resides in Vermilion, Ohio.

Posted in Announcement, Baseball-Reference.com, Basketball-Reference.com, Expire30d, Hockey-Reference.com, Pro-Football-Reference.com, Statgeekery, Stathead | 3 Comments »

Katie Sharp joins Sports Reference

7th January 2021

Katie Sharp has joined Sports Reference and will be working on both social media and customer success for our Stathead subscription service. Katie spent seven years as a researcher with ESPN's Stats and Info Group, and since then has worked as a writer, editor and researcher on dozens of articles and books. Most recently, she has been a recurring guest on Jomboy Media's Talkin' Yanks podcast. Katie graduated from Williams College and also has an MBA from the University of Oregon. She's on Twitter at @ktsharp.

Posted in Announcement, Baseball-Reference.com, Basketball-Reference.com, FBref, Hockey-Reference.com, Play Index, Play Index 101, Pro-Football-Reference.com, Stathead, Uncategorized | Comments Off on Katie Sharp joins Sports Reference

December 2020 WAR Update

14th December 2020

We recently found an issue where, because of the abbreviated 2020 season, we were not allocating enough wins to position players when calculating Wins Above Replacement. This has now been fixed across Baseball-Reference. With this change, no position player gained more than 0.3 WAR, and no position player lost WAR. All pitcher WAR remained the same.

You can review the changes for each player here: https://docs.google.com/spreadsheets/d/18WY53wSt0GrBMMijLiIFMhVtvbmjuhbYNOaTvHfs-gE/edit?usp=sharing

If you have any questions or concerns, feel free to contact us through our feedback form.

Posted in Advanced Stats, Announcement, Baseball-Reference.com, Data, Statgeekery, WAR | Comments Off on December 2020 WAR Update

Sports Reference adds a DevOps Engineer

10th November 2020

Nick Pazoles has joined Sports Reference as a DevOps engineer. Nick previously worked for Cleo in the Chicagoland area and is the first White Sox and University of Michigan football fan on staff.

Posted in Announcement, Baseball-Reference.com, Basketball-Reference.com, CBB at Sports Reference, CFB at Sports Reference, Expire30d, FBref, Hockey-Reference.com, Pro-Football-Reference.com | Comments Off on Sports Reference adds a DevOps Engineer

Sports Reference Adds Two

3rd November 2020

I'm pleased to announce that Charlotte Eisenberg and Adam Darowski have joined Sports Reference in full-time roles.

Charlotte is joining us as a Data Developer working on baseball, basketball and hockey. She was an intern with Sports Reference previously and also spent a year with the Texas Rangers front office.

Adam is our new Head of User Experience. He has been a long-time consultant for Sports Reference and is responsible for the responsive redesign we undertook in 2016-17 and the design of our new Stathead service. You can follow him on Twitter @baseballtwit.

In another move, Jaclyn Mahoney has been promoted from Data Developer and will now manage our Business Intelligence efforts.

About Sports Reference.

Posted in Announcement, Baseball-Reference.com, Basketball-Reference.com, CBB at Sports Reference, CFB at Sports Reference, expire21d, General, Hockey-Reference.com, Stathead | Comments Off on Sports Reference Adds Two

Annual Stathead Subscriptions Now Available!

29th October 2020

By popular demand, we're thrilled to announce that annual Stathead subscriptions are now available at Stathead.com. Stathead is the premier set of sports research tools available to the public and is available for Baseball, Basketball, Football & Hockey. Monthly subscriptions, which are $8/month for a single sport and $16/month for all sports, also remain available. The new annual subscriptions are priced to give users who choose them two months for free: $80/year for a single sport or $160/year for all sports. If you're a current subscriber and wish to change from monthly to annual, you may do so here. To learn more about Stathead, please visit here.

Posted in Announcement, Baseball-Reference.com, Basketball-Reference.com, Features, Hockey-Reference.com, Pro-Football-Reference.com, Stathead | 2 Comments »

Stathead Baseball Adds the Pivotal Play Finder

21st October 2020

Last month, we added Championship Leverage Index (cLI) and Championship Win Probability Added (cWPA) to Baseball-Reference. These stats measure how much of an impact each player had on their team's chances of winning the World Series. Today, we are launching the Pivotal Play Finder, which measures the impact that each individual play had on a team's World Series win probability. This tool allows you to customize your query using a number of different filters to find the most impactful plays in a given situation.

It's not surprising to see that the most pivotal play in MLB history occurred in Game 7 of the 1960 World Series. But many would be shocked to find out that it was not Bill Mazeroski's walk-off (which is 6th all-time). The most pivotal play actually occurred an inning earlier. In the bottom of the 8th inning with 2 outs, the Pirates were down 7-6 with runners on the corners when Hal Smith put his team up by two runs with a 3-run home run. This play increased the Pirates' chances of winning the World Series from 30% to 93%. Unfortunately for Smith, the Yankees erased the lead in the top of the 9th, and then Mazeroski became the hero.

With the Pivotal Play Finder, you can search by event type and find out that Babe Ruth's caught stealing to end the 1926 World Series was the most impactful caught stealing in MLB history (10.22%), or that Fred Snodgrass' muff in the 1912 World Series was the most critical error in history (24.39%).

We can also search for plays involving a particular player. Derek Jeter was involved in many memorable moments during his career, but none more pivotal than his walk-off home run in Game 4 of the 2001 World Series.

We can drill down even further and search by team to see that Randy Arozarena's home run off Lance McCullers Jr. in Game 7 of the 2020 ALCS was the most pivotal home run in Tampa Bay Rays history.

In addition to sorting by Championship Win Probability Added, we can also sort by Championship Leverage Index to find the most crucial moments. These situations are usually the most pressure-packed because the difference between an out and a run has an enormous impact on a team's World Series win probability. The situation with the highest cLI in MLB history came in Game 7 of the 1962 World Series. In the bottom of the 9th inning, the Giants were down 1-0 with 2 outs and runners on 2nd and 3rd with Willie McCovey at the plate. A hit would likely tie or win the game (and World Series) for the Giants, while an out would mean a championship for the Yankees. As we know, McCovey lined out sharply to Bobby Richardson to end the series.

Please note that at the time of this writing, Regular Season event data is complete back to 1973, mostly complete back to 1950, and somewhat complete back to 1916. Postseason event data is complete back to 1903. Please see the data coverage page for details.

Posted in Announcement, Baseball-Reference.com, Features, Stathead | 1 Comment »

Baseball-Reference Adds Championship Win Probability Added

30th September 2020

Just in time for the 2020 postseason, Baseball-Reference has added championship win probability added (cWPA) and championship leverage index (cLI) to the site.

Just as single-game win probability added (WPA) measures how a player impacts their team's chances of winning a game, cWPA measures how a player impacts their team's chances of winning the World Series. Similarly, championship leverage index applies the concept of single-game leverage index (LI) on a larger scale, measuring the importance of a particular play by how much it can affect a team's chances of winning the World Series.

These stats are highly dependent on context and are best used as "story stats" rather than determining which player was better. When telling the story of the history of baseball, we point to the greatest moments such as Bobby Thomson's Shot Heard 'Round the World, Bucky Dent's home run over the monster, David Freese's clutch performance in Game 6, or Madison Bumgarner's Game 7 performance. Moments like these are captured in cWPA and cLI, but it's not just history's greatest moments. Every event in our play-by-play database has a value.

How are cWPA and cLI calculated?
Let's look at Bobby Thomson's Shot Heard 'Round the World as an example. This was the third and final game of the National League tiebreaker series. A win for the Giants would clinch the pennant and give them a 50% chance of winning the World Series. A loss, however, would end their season, meaning a 0% chance of winning the World Series. The difference between a win and a loss in this game is therefore 50 percentage points. To get the championship leverage index, we simply divide .5 by our baseline of .006 (the baseline is explained here). This means that the Giants' cLI for the game is 83.33 (.5/.006). The LI for Bobby Thomson's final at-bat was 4.74. To get the cLI for the at-bat, we simply multiply the game cLI by the at-bat's LI, which gives us 395.0 (83.33*4.74). This means that this at-bat was 395x more important to the Giants' chances of winning the World Series than the average play on opening day.
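The cLI arithmetic above can be reproduced in a few lines. This is only a sketch of the worked example, not the code that runs on the site: the function names are made up for illustration, and the .006 baseline is simply the value quoted above.

```python
# Baseline value quoted in the post; its derivation is explained on the
# linked page and is taken as given here.
BASELINE = 0.006


def game_cli(champ_prob_if_win, champ_prob_if_loss):
    """Championship leverage of a whole game, relative to the baseline."""
    return (champ_prob_if_win - champ_prob_if_loss) / BASELINE


def play_cli(game_level_cli, leverage_index):
    """Championship leverage of one play: game cLI scaled by the play's LI."""
    return game_level_cli * leverage_index


# 1951 NL tiebreaker, Game 3: a win gives the Giants a 50% title chance,
# a loss gives them 0%.
giants_game_cli = game_cli(0.5, 0.0)

# Thomson's final at-bat had a single-game LI of 4.74.
thomson_cli = play_cli(giants_game_cli, 4.74)

print(f"game cLI:   {giants_game_cli:.2f}")
print(f"at-bat cLI: {thomson_cli:.1f}")
```

Keeping the game-level and play-level steps separate mirrors the two-step description above: the game cLI depends only on what is at stake in the game, while the play's LI depends only on the situation within it.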

When Thomson stepped to the plate, the Giants were down 3-1 with 1 out in the bottom of the 9th, giving them just a 29% probability of winning the game at the start of the at-bat. Since the home run ended the game, the probability of winning the game at the end of the at-bat was 100%. To get the cWPA for the play, we multiply the difference between game win probability at the start and end of the at-bat by the difference between the championship win probability of a win and a loss. This gives Thomson .355 cWPA ((1.0 - .29) * (.5-0)). This means that Thomson's home run increased the Giants' probability of winning the World Series by 35.5 percentage points. On the flip side, the opposing pitcher Ralph Branca is given -.355 cWPA for the play.

Note: cWPA values are displayed in percentage format, so the example above displays as 35.5%.
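The cWPA calculation for the same play can be sketched the same way. As before, this is an illustration of the arithmetic in the example rather than our actual implementation, and the function name `cwpa` and its parameters are hypothetical.

```python
def cwpa(wp_before, wp_after, champ_prob_if_win, champ_prob_if_loss):
    """Championship win probability added by a single play.

    Multiplies the change in single-game win probability by the gap
    between the team's title chances after a win and after a loss.
    """
    game_wp_change = wp_after - wp_before
    championship_swing = champ_prob_if_win - champ_prob_if_loss
    return game_wp_change * championship_swing


# Thomson's home run: game win probability goes from 29% to 100%,
# and the game is worth 50 points of championship probability.
thomson_cwpa = cwpa(0.29, 1.0, 0.5, 0.0)

# The pitcher is charged the mirror image of the batter's credit.
branca_cwpa = -thomson_cwpa

# cWPA is shown on the site in percentage format.
print(f"Thomson: {thomson_cwpa:.1%}")
print(f"Branca:  {branca_cwpa:.1%}")
```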

There are currently a number of places to find cWPA and cLI on Baseball-Reference:
Regular Season Leaderboards: Career Regular Season Batting Leaders
Postseason Leaderboards: All-Time Batting Leaders
Batting and Pitching Game Logs: Yaz's amazing 1967 season
Batting and Pitching Win Probability Tables: Sandy Koufax's Pitching Win Probability
Postseason Series Pages: 1991 World Series
Box Scores: 1960 World Series Game 7
League Batting and Pitching Win Probability Pages: 2020 MLB Batting cWPA
Team Batting and Pitching Win Probability Tables: 1975 Reds Batting
Team Schedules: 1978 Yankees

If you have any questions or feedback on this new feature, feel free to contact us through our feedback form.

Posted in Announcement, Baseball-Reference.com, Uncategorized | Comments Off on Baseball-Reference Adds Championship Win Probability Added