Back in 2003, Baseball Mogul simulated each game one plate appearance at a time. The simulation engine determined the result of each batter-pitcher matchup, but it didn’t simulate each individual pitch. Because I wanted to include realistic pitch counts, I needed a way to estimate the number of pitches thrown in each outing, based on the results of that outing.

Baseball fans know that some at-bats end on the first pitch, but most last longer, with walks and strikeouts using up the most pitches. I wrote code to randomly generate balls and strikes for each plate appearance, with some interesting tweaks for realism. For example, batting averages are a bit higher on the first pitch than on later pitches of the at-bat, meaning that batters who get a hit will see fewer pitches, on average, than batters who put the ball in play but are put out.

Then I stumbled onto an article at Baseball Prospectus where Nate Silver ran a linear regression on data from 2001 and 2002 to come up with a formula he called "Implied Pitch Count". His regression generated this formula:

Implied pitch count (IPC) = (3.17 * BF) + (3.44 * BB) + (1.53 * K)

(BF = batters faced)

I still have the following comment in my code:

Additional Formulas

I went back and adjusted my sim code until the number of pitches thrown in an outing closely matched the result given by the Baseball Prospectus formula.

That same year, Tom Tango looked at more data and published a formula that he called the Basic Pitch Count Estimator:

Pitch Count Estimate = (3.3 * BFP) + (2.2 * BB) + (1.5 * K)

In 2007, Brian Yonushonis used more recent data to create a new formula:

Pitch Count Estimate = (3.29 * BF) + (1.92 * K) + (2.04 * BB)

Simplifying the Formula

In 2020, I wanted to create a “pitch count generator” for Season Ticket Baseball. I thought it would be cool if you could include a pitch count in your box score without having to do a lot of extra math.
When I played around with the numbers, I found that lowering Tango’s 3.3 coefficient to exactly 3 and raising the other two coefficients to exactly 3 provided slightly more accurate estimates for seasons in the 21st century. The new coefficients also have the advantage of being integers, which makes the math really easy.

This spreadsheet compares this formula to the formulas published by Tom Tango and Brian Yonushonis. For each season, we know the number of walks/game, strikeouts/game and pitches/PA. The ‘Estimate’ column shows the Pitches/PA calculated by the formula and the ‘Diff’ column shows the difference between the real-life number and the estimate. (The next column shows the difference in percentage terms.)

Tango's formula is very good (only off by 1.37% on average) but it suffers a bit in this analysis from the fact that most of the baseball data I’m using was generated after his formula was published.

Anyway… using a coefficient of '3' for all the terms gives us the following super-simple pitch count estimator for Season Ticket Baseball:

Pitch Count Estimate = (BF + BB + K) * 3

Just add up the batters faced, walks and strikeouts. Then multiply that total by 3.

Strikes Thrown

I was curious whether I could come up with a formula for the number of strikes in each plate appearance. The league-average number of strikes thrown per outing is usually around 62% of all pitches thrown.
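As a rough illustration, the four estimators discussed above can be written as small Python functions. The function names and the sample outing (27 batters faced, 2 walks, 7 strikeouts) are made up for this sketch. The strike estimate simply applies the 62% league-average share to the pitch total, which is an assumption for illustration and not a formula from this post.

```python
def silver_ipc(bf, bb, k):
    """Nate Silver's Implied Pitch Count (regression on 2001-2002 data)."""
    return 3.17 * bf + 3.44 * bb + 1.53 * k


def tango_basic(bf, bb, k):
    """Tom Tango's Basic Pitch Count Estimator."""
    return 3.3 * bf + 2.2 * bb + 1.5 * k


def yonushonis(bf, bb, k):
    """Brian Yonushonis's 2007 formula."""
    return 3.29 * bf + 1.92 * k + 2.04 * bb


def simple_estimate(bf, bb, k):
    """The all-threes estimator: (BF + BB + K) * 3."""
    return 3 * (bf + bb + k)


def estimated_strikes(pitches, strike_share=0.62):
    """Assumption: apply a flat league-average strike share to a pitch total."""
    return round(pitches * strike_share)


if __name__ == "__main__":
    # Hypothetical outing: 27 batters faced, 2 walks, 7 strikeouts.
    bf, bb, k = 27, 2, 7
    for name, fn in [
        ("Silver", silver_ipc),
        ("Tango", tango_basic),
        ("Yonushonis", yonushonis),
        ("Simple", simple_estimate),
    ]:
        print(f"{name:>10}: {fn(bf, bb, k):.1f} pitches")
    print(f"Estimated strikes: {estimated_strikes(simple_estimate(bf, bb, k))}")
```

For this sample line the four estimates land within about five pitches of each other, which is why the integer version is good enough for filling in a box score by hand.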
When I first researched injuries for Baseball Mogul, I collected a LOT of data:
One thing that surprised me: most injuries didn’t officially “occur” in the middle of a game. In other words, when a player missed 30 games because of a broken toe, there was nothing in the play-by-play record showing them leaving the game. In some cases, the broken toe was a “stress injury” that accumulated over time until the player needed time off. But even when there was a specific injury event (like a collision or hit-by-pitch), the player often stayed in the game.[1] It wasn’t until they got into the locker room (to see the trainer or team doctor, get X-rays etc.) that they got put on the DL.

The original version of Baseball Mogul was a pure “GM” simulation. You traded and signed players, set your lineup and rotation, simulated games, and viewed the results. There was no in-game Play-By-Play Mode. In this context, it made perfect sense to implement injuries at the end of each day. Only when you set up your lineup for the next game did you see that your shortstop was dealing with a “hairline wrist break”.

[1] This has changed somewhat over the years. The new concussion protocols are leading to more players leaving in the middle of a game. But if you look at injuries from the 1960s through the 1980s, it seems that most games missed are due to “nagging” injuries that don’t have a single causal event.

In-Game Injuries

When I added Play-By-Play Mode, it was tempting to change the game to inflict injuries in the middle of games instead of after the game. If you played other baseball games from 20+ years ago, you might remember a ton of annoying popups that would hold up gameplay. Something like “Your lineup is invalid – you must fix this error to proceed”. I hated these popups and did my best to eliminate them completely from Baseball Mogul. One of the ways I did this was by keeping injuries out of Play-By-Play Mode.

There are hardcore baseball replayers on YouTube (Kurt Bergland etc.) who choose to completely ignore the injury rules in whatever game they are playing. They don’t want anything to break up the flow of the game and I have sympathy for this viewpoint.

Nevertheless, people kept asking for in-game injuries and I kept postponing this feature. One reason was that I felt I had made the correct design decision to minimize the number of “annoying popups” that got in the way of gameplay. But the biggest reason was that I didn’t want to break the existing system. I had fine-tuned it over the years to generate the correct number of injuries for each player. For example, a player who missed 26 games per season in real life would miss that many games (on average) in Baseball Mogul[1]. Linking injuries to on-field events instead of the statistical record messes up this math: shortstops suffer more collisions than DHs; base stealers and aggressive runners require more injury checks; etc.

Baseball Mogul 2025
How Does It Work?

When an injury occurs to a player on your team during Play-By-Play, you will be asked to replace that player. In a long extra-inning game where you have used all of your bench and bullpen players, it is theoretically possible for this to create a situation where you can’t continue. If this happens, click “Edit Player”. This will give you the option to erase that player’s injury.
There's also a new option for watching games between two computer-controlled teams. In previous versions, the plays and animations would zip by without stopping. So I've implemented a suggested option to require a click between each play (similar to the interface when you are controlling one or both teams). This option can be turned on in the Tools Menu, or by clicking the 'Pause' button on the Play-By-Play Screen.
Baseball Mogul 2025 adds detailed pitch-by-pitch data for more than 5,000 pitchers. This includes pitch arsenals and pitching styles for more than 59,000 pitcher-seasons from 1901 through 2024 and pitch usage patterns for more than 16,000 pitcher-seasons from 2002 through 2024. A "pitcher-season" is data for one season of a pitcher's career. For example, our database includes 14 pitcher-seasons for A.J. Burnett (2002-2015). New pitch ratings have been calculated for all pitchers based on available data regarding each pitch's effectiveness and usage rate, ensuring that Baseball Mogul continues to be the most realistic plate appearance simulator in video game history.
Update (November 2024): Replay Mode is being revised for Baseball Mogul 2025 to generate more realistic results for players with very little playing time in the given season (e.g. previous versions had trouble creating accurate ratings for a player with only a handful of plate appearances or batters faced).

Baseball Mogul is a multi-season General Manager simulator in which player talent is calculated from overall career stats. An example can be seen below. Although Dwight Gooden pitched poorly in 1994, his ratings at the beginning of the season are quite high because of his performance in the rest of his career.

However, I've gotten a number of emails from people who would like to use Baseball Mogul to replay single historical seasons, in the same way that they might use a game like Strat-O-Matic or Season Ticket Baseball. Replay Mode makes this possible by determining each player's performance from their real-life statistics for the single season that is being simulated.

This is Gooden's Scouting Report when a new game is started in Replay Mode:

Replay Mode loads each player's historical stats for the season that is being simulated, and uses these as the player's "Predicted" stats for the upcoming season. Because Gooden had a bad year in 1994, his ratings are much lower than they were in the previous example.

Playing Time Limits
Continued Development

As mentioned above, Replay Mode is designed for single-season historical replays. Playing multiple seasons can cause problems, primarily because teams will no longer have enough playing time at every position. I could have limited Replay Mode to a single season, but I left it open so that I can continue to test and get feedback. If you are using Replay Mode in a multi-season sim, please let me know how I can improve this feature going forward. (For example, I could add a feature that forced team trades to match historical trades if there is enough interest.)
Thanks!