Statistical Discussion: Count Leverage and the Mystery of K-BB%

Recently the Podroach Angle podcast did an episode called ERA Estimates Made Simple, where they compared the different ERA estimators available for evaluating a pitcher's performance (ERA) against his underlying peripherals. The idea here is that a pitcher's ERA is noisy (BABIP luck, HR/FB luck, LOB%, all kinds of luck), but his underlying "skill" should be consistent. A pitcher goes from a 2.50 ERA to a 4.50 ERA the next season? Maybe his FIP or SIERA said his ERA should have been 3.50 both times, and he just got lucky one year and unlucky the next. The concept of plotting an ERA estimator like SIERA was covered in my previous article, Scatman Plotter and New Scatter Plot Tool.

There are many of these "estimators" that we use to predict future ERA. Those covered in the podcast include FIP, xFIP, SIERA, xERA, and the biggest one, K-BB%. The concept of K-BB% is simple: it's just your walk rate (walks divided by batters faced) subtracted from your strikeout rate (strikeouts divided by batters faced). Formulas like FIP, xFIP, and SIERA are just fancier extensions of K-BB%. They basically start with strikeouts minus walks and add something: home runs allowed for FIP, a "normal" home run rate for xFIP, and a blend of contact rates based on the type of contact you give up (line drives, fly balls, ground balls) for SIERA.
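The definition above is short enough to write down directly. Here is a minimal Python sketch of it; the function name and the example stat line are made up for illustration and do not come from any real pitcher.

```python
def k_minus_bb_pct(strikeouts: int, walks: int, batters_faced: int) -> float:
    """Return K-BB% as a fraction: strikeout rate minus walk rate."""
    if batters_faced <= 0:
        raise ValueError("batters_faced must be positive")
    k_pct = strikeouts / batters_faced   # strikeouts per batter faced
    bb_pct = walks / batters_faced       # walks per batter faced
    return k_pct - bb_pct


# Hypothetical stat line: 200 K and 50 BB over 750 batters faced.
print(f"{k_minus_bb_pct(200, 50, 750):.1%}")
```

Note that both rates share the same denominator, so K-BB% is the same thing as (K - BB) divided by batters faced.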

The mystery begins when you look at how well K-BB% does at predicting future performance: it consistently beats the "popular" estimators like xFIP and SIERA, and does a better job than more advanced metrics like xERA or PLV. But how is that possible? Strikeouts and walks account for only about 30% of the outcomes of an average plate appearance; the other 70% of the time, the result is neither a strikeout nor a walk. Imagine someone watching 30% of what you eat and being able to predict your future calorie consumption. You don't know what I eat the other 70% of the time!

To dig into this mystery, I decided to look at what happens before a strikeout or a walk. The thing about strikeouts and walks is that there is only one way to get to each. To walk a batter, you need three balls first, and to strike him out, you need two strikes first. So if a pitcher strikes out more batters than he walks, presumably he gets to two strikes more often than he gets to three balls. This means the pitcher is likely working in 1-2 or 0-2 counts more often, which means he has Count Leverage over the hitter. The thing about Count Leverage is that it has a huge impact on at-bat results. If you watch Foolish Baseball's The Only Thing That Matters In Baseball, he talks about the importance of starting the at-bat 1-0 vs 0-1. My thought was to create a stat called "Count Leverage" (CoLev for short): it would not only serve as a proxy for K-BB%, but would also capture the other 70% of plate appearances, because hitters are hitting soft grounders in 1-2 counts.
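The article does not give the exact CoLev formula, so the following Python sketch is only one plausible way to turn the idea into a number, not the actual definition. It assumes CoLev is the share of pitches thrown in pitcher-ahead counts minus the share thrown in hitter-ahead counts; the count groupings and the function name are assumptions made for illustration.

```python
# ASSUMPTION: these groupings are illustrative, not the article's definition.
PITCHER_AHEAD = {(0, 1), (0, 2), (1, 2)}
HITTER_AHEAD = {(1, 0), (2, 0), (3, 0), (2, 1), (3, 1)}


def count_leverage(counts):
    """Hypothetical CoLev from (balls, strikes) counts, one tuple per pitch.

    Each tuple is the count *before* the pitch was thrown. Counts in neither
    set (0-0, 1-1, 2-2, 3-2) are treated as neutral.
    """
    ahead = behind = total = 0
    for count in counts:
        total += 1
        if count in PITCHER_AHEAD:
            ahead += 1
        elif count in HITTER_AHEAD:
            behind += 1
    if total == 0:
        return 0.0
    return (ahead - behind) / total


# Made-up plate appearance: the pitcher works 0-0, 0-1, 0-2, 1-2.
print(count_leverage([(0, 0), (0, 1), (0, 2), (1, 2)]))
```

Under this assumed definition, a positive value means the pitcher throws more pitches while ahead than while behind, and a negative value means the opposite.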

Analyzing pitch-by-pitch data for 1.6 million pitches across 400 pitchers, I was able to calculate CoLev for every pitcher in each season from 2021 to 2025. For example, here is the top of the leaderboard for 2025:

CoLev 2025 leaderboard

It's not too surprising that many of last year's best pitchers show up here. The interesting addition, of course, is Janson Junk. Junk represents an interesting finding with CoLev, namely that it doesn't correlate well with K-BB%, which means that a positive CoLev doesn't guarantee you'll get more strikeouts and avoid walks. A major factor was that CoLev was good at predicting walks, but terrible at predicting strikeouts.

  • r(CoLev, BB/9) = -0.686 (fairly strongly correlated)
  • r(CoLev, K/9) = +0.271 (not well correlated)
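For readers who want to check correlations like these on their own data, here is a minimal Python sketch of the Pearson r calculation. The function name and the sample numbers are invented for illustration; they are not the pitcher data behind the values above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values: one CoLev and one BB/9 figure per pitcher.
colev = [0.10, 0.05, 0.00, -0.04, 0.08]
bb_per_9 = [1.9, 2.6, 3.4, 3.9, 2.8]
print(round(pearson_r(colev, bb_per_9), 3))
```

A value near -1 or +1 means the two stats move together almost in lockstep, while a value near 0 means knowing one tells you little about the other.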

The reason for that is very simple. If the pitcher falls behind in the count, there is no skill needed to walk someone; he just needs to throw ball four. However, getting to 0-2 or 1-2 still requires the pitcher to do something difficult: get a swing and miss or a called third strike. That's why someone like Janson Junk can have strong Count Leverage but not strike out many hitters.

Moving on to the bigger question, though: is CoLev better than K-BB% at predicting ERA? Here's a summary table showing the predictive R2 value, from one pitcher season to the next, for a bunch of different metrics, including CoLev (remember, the higher the R2 number the better):

CoLev R2 results

Womp Womp Gif

What a bust! It actually did *worse* than every other metric. Which, if you read what I wrote about Janson Junk, you probably already figured out. CoLev already misses a large part of the strikeout and walk picture because it doesn't predict future K-BB%, and given the importance of K-BB%, if CoLev couldn't get that right, there was very little chance it would get something like future ERA right. So this made me say, aha! But what about the 70% of plate appearances that end without a walk or a strikeout, those weakly hit ground balls in 1-2 counts? Certainly CoLev will do a good job predicting those. It turned out CoLev didn't tell us much about anything in that 70%. Using wOBACON (wOBA on contact), I found that CoLev has an R2 of 0.055 with wOBACON. Compare that to an R2 of 0.170 for freakin' K-BB%: a number that *ignores* balls in play does a better job than the new stat that was supposed to help explain batted balls!
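The season-to-season R2 comparisons discussed above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it pairs each pitcher's metric in one season with his ERA in the following season and reports the squared Pearson correlation, which equals R2 for a one-variable linear fit. The function names, the dictionary layout, and the sample values are all hypothetical, and the article's actual procedure may differ.

```python
def next_season_pairs(metric, era):
    """Pair each (pitcher, season) metric value with that pitcher's ERA one season later.

    Both arguments are dicts keyed by (pitcher, season); this layout is an
    assumption made for the example.
    """
    pairs = []
    for (pitcher, season), value in metric.items():
        following = (pitcher, season + 1)
        if following in era:
            pairs.append((value, era[following]))
    return pairs


def r_squared(pairs):
    """R2 of a one-variable linear fit, computed as the squared Pearson r."""
    if len(pairs) < 2:
        raise ValueError("need at least two pairs")
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("R2 is undefined when either column is constant")
    return cov ** 2 / (var_x * var_y)


# Made-up data: K-BB% (as a fraction) in 2024 and ERA in 2025.
k_bb = {("A", 2024): 0.22, ("B", 2024): 0.15, ("C", 2024): 0.09, ("D", 2024): 0.18}
era = {("A", 2025): 3.10, ("B", 2025): 4.20, ("C", 2025): 4.05, ("D", 2025): 3.60}
print(round(r_squared(next_season_pairs(k_bb, era)), 3))
```

Running the same two functions with a different metric dictionary (CoLev, FIP, and so on) against the same next-season ERA values gives directly comparable R2 numbers.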

The real lesson here is that the remaining 70% is almost impossible to predict. After running all kinds of different calculations, I found that it is almost impossible to say "this pitcher has the magic power to limit damage on balls in play". Even a statistic like xwOBACON, which explicitly estimates how a hitter *should* perform on balls in play, doesn't correlate well with how much damage a pitcher will allow in the future. So the answer to the K-BB% mystery is that none of these metrics actually do a good job of predicting ERA, and the main reason is that they are trying to predict an unpredictable 70% of plate appearances. Some pitchers will do better in wOBACON next year and some will do worse, but taken together, it's pretty random how that will affect a pitcher's ERA next year. The only things we can count on are strikeouts and walks, and that is exactly what K-BB% measures.

Finally, CoLev is available in the scatter plot tool for anyone who wants to test it. The interesting thing here is that you can visualize how CoLev and K% are not related, while CoLev and BB% appear to be highly related.

CoLev vs Walk Rate - 2025 Starters with 100+ IP

CoLev vs Strikeout Rate - 2025 Starters with 100+ IP
