Back in 2003, Baseball Mogul simulated each game one plate appearance at a time. The simulation engine determined the result of each batter-pitcher matchup, but it didn’t simulate each individual pitch. Because I wanted to include realistic pitch counts, I needed a way to estimate the number of pitches thrown in each outing, based on the results of that outing.

Baseball fans know that some at-bats end on the first pitch, but most last longer, with walks and strikeouts using up the most pitches. I wrote code to randomly generate balls and strikes for each plate appearance, with some interesting tweaks for realism. For example, batting averages are a bit higher on the first pitch than on later pitches of the at-bat, meaning that batters who get a hit will see fewer pitches, on average, than batters who put the ball in play but are put out.

Then I stumbled onto an article at Baseball Prospectus where Nate Silver ran a linear regression on data from 2001 and 2002 to come up with a formula he called "Implied Pitch Count". His regression generated this formula:

Implied pitch count (IPC) = (3.17 * BF) + (3.44 * BB) + (1.53 * K)

(BF = batters faced)

I still have the following comment in my code:

Additional Formulas

I went back and adjusted my sim code until the number of pitches thrown in an outing closely matched the result given by the Baseball Prospectus formula.

That same year, Tom Tango looked at more data and published a formula that he called the Basic Pitch Count Estimator:

Pitch Count Estimate = (3.3 * BFP) + (2.2 * BB) + (1.5 * K)

(BFP = batters faced by pitcher, the same quantity as BF above)

In 2007, Brian Yonushonis used more recent data to create a new formula:

Pitch Count Estimate = (3.29 x BF) + (1.92 x K) + (2.04 x BB)

Simplifying the Formula

In 2020, I wanted to create a “pitch count generator” for Season Ticket Baseball. I thought it would be cool if you could include a pitch count in your box score without having to do a lot of extra math.
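Before simplifying, it may help to see the three published formulas side by side. The snippet below is only a sketch: the function names and the sample stat line (27 batters faced, 2 walks, 6 strikeouts) are made up for illustration, and the coefficients are copied straight from the formulas quoted above.

```python
# Sketch of the three published pitch count estimators.
# Function names and the sample outing are hypothetical; only the
# coefficients come from the formulas quoted in the post.

def silver_ipc(bf, bb, k):
    """Nate Silver's Implied Pitch Count."""
    return 3.17 * bf + 3.44 * bb + 1.53 * k


def tango_basic(bf, bb, k):
    """Tom Tango's Basic Pitch Count Estimator (BFP is the same as BF)."""
    return 3.3 * bf + 2.2 * bb + 1.5 * k


def yonushonis_2007(bf, bb, k):
    """Brian Yonushonis's 2007 formula."""
    return 3.29 * bf + 2.04 * bb + 1.92 * k


if __name__ == "__main__":
    # Made-up outing: 27 batters faced, 2 walks, 6 strikeouts.
    bf, bb, k = 27, 2, 6
    for name, formula in [
        ("Silver", silver_ipc),
        ("Tango", tango_basic),
        ("Yonushonis", yonushonis_2007),
    ]:
        print(f"{name}: {formula(bf, bb, k):.2f} pitches")
```

For that made-up outing the three estimates land within about three pitches of each other (roughly 101.7, 102.5 and 104.4), which gives a sense of how little the exact coefficients matter for a typical start.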
When I played around with the numbers, I found that lowering Tango’s 3.3 coefficient to exactly 3 and raising the other two coefficients to exactly 3 gave slightly more accurate estimates for seasons in the 21st century. The new coefficients also have the advantage of being integers, which makes the math really easy.

This spreadsheet compares this formula to the formulas published by Tom Tango and Brian Yonushonis. For each season, we know the number of walks/game, strikeouts/game and pitches/PA. The ‘Estimate’ column shows the Pitches/PA calculated by the formula and the ‘Diff’ column shows the difference between the real-life number and the estimate. (The next column shows the difference in percentage terms.)

Tango's formula is very good (only off by 1.37% on average), but it suffers a bit in this analysis from the fact that most of the baseball data I’m using was generated after his formula was published.

Anyway… using a coefficient of 3 for all the terms gives us the following super-simple pitch count estimator for Season Ticket Baseball:

Pitch Count Estimate = (BF + BB + K) x 3

Just add up the batters faced, walks and strikeouts. Then multiply that total by 3.

Strikes Thrown

I was curious if I could come up with a formula for the number of strikes in each plate appearance. The league average number of strikes thrown per outing is usually around 62% of all pitches thrown.
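Here is a minimal sketch of the simplified estimator, paired with a rough strike count. Note the assumption: applying a flat 62% strike rate to the estimated pitch count is my illustration of the league-average figure above, not a per-plate-appearance strike formula. The function names and the sample stat line are also hypothetical.

```python
# Sketch of the simplified Season Ticket Baseball estimator.
# Function names and the sample outing are hypothetical.

def simple_pitch_count(bf, bb, k):
    """Pitch Count Estimate = (BF + BB + K) x 3."""
    return 3 * (bf + bb + k)


def rough_strike_count(pitches, strike_rate=0.62):
    """Assumption: scale total pitches by a flat league-average strike
    rate (about 62%). This is an illustration, not a per-PA formula."""
    return round(pitches * strike_rate)


if __name__ == "__main__":
    # Made-up outing: 27 batters faced, 2 walks, 6 strikeouts.
    pitches = simple_pitch_count(27, 2, 6)
    strikes = rough_strike_count(pitches)
    print(f"{pitches} pitches, about {strikes} strikes")
```

For the made-up outing, (27 + 2 + 6) x 3 gives 105 pitches, and 62% of that is about 65 strikes.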