Talk Sox

Recommended Posts

Posted

If you follow baseball in the modern day, you’ve probably heard of the statistic WAR. WAR (wins above replacement) is a relatively new, advanced metric to quantify a player’s value compared to a replacement-level player. MLB provides the following definition: “WAR measures a player's value in all facets of the game by deciphering how many more wins he's worth than a replacement-level player at his same position (e.g., a Minor League replacement or a readily available fill-in free agent).”

WAR has solid calculations for position players. For pitchers, however, WAR is a different animal. This piece will focus on pitcher WAR. FanGraphs and Baseball Reference, the two leading baseball statistics sites, use different formulas to calculate pitcher WAR: FanGraphs (fWAR) builds its formula on Fielding Independent Pitching (FIP), while Baseball Reference (bWAR) uses Runs Allowed Per Nine Innings Pitched (RA9).

FanGraphs’ equation: 

WAR = [[([(League “FIP” – “FIP”) / Pitcher Specific Runs Per Win] + Replacement Level) * (IP/9)] * Leverage Multiplier for Relievers] + League Correction

Read more in FanGraphs' breakdown of their formula, WAR for Pitchers, and their explanations of FIP, Why Our Pitcher WAR Uses FIP and Why Our Pitcher WAR Uses FIP, Part Two.
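For readers who like to see the arithmetic, here is a minimal sketch of the formula above in Python. All the numeric inputs below (league FIP, runs per win, replacement level) are illustrative placeholders, not FanGraphs' actual seasonal constants.

```python
def fangraphs_pitcher_war(fip, league_fip, runs_per_win, replacement_level,
                          ip, leverage=1.0, league_correction=0.0):
    """Sketch of: [[([(lgFIP - FIP) / RPW] + Repl) * (IP/9)] * Leverage] + Correction."""
    wins_above_avg_per_9 = (league_fip - fip) / runs_per_win
    war = (wins_above_avg_per_9 + replacement_level) * (ip / 9) * leverage
    return war + league_correction

# Illustrative numbers: a starter with a FIP half a run better than league average.
war = fangraphs_pitcher_war(fip=3.70, league_fip=4.20, runs_per_win=9.5,
                            replacement_level=0.10, ip=180)
print(round(war, 2))  # 3.05 with these made-up inputs
```

With plausible inputs, a starter half a run of FIP better than league over 180 innings comes out around 3 WAR, which matches the intuition that such a pitcher is a solid mid-rotation arm.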

FanGraphs’ FIP formula:

FIP = (13*HR + 3*(BB + HBP) - 2*K) / IP + FIP constant
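Translated into code, the formula is straightforward. The constant used below (~3.15) is a stand-in: FanGraphs recomputes it each season so that league FIP matches league ERA.

```python
def fip(hr, bb, hbp, k, ip, fip_constant=3.15):
    # FIP = (13*HR + 3*(BB + HBP) - 2*K) / IP + constant
    # fip_constant is a placeholder; the real value is set yearly (~3.0-3.2).
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + fip_constant

# Illustrative line: 5 HR, 12 BB, 2 HBP, 45 K over 50 IP.
print(round(fip(hr=5, bb=12, hbp=2, k=45, ip=50), 2))  # 3.49
```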

Baseball Reference’s equation: 

WAR = WAR_rep + WAA + WAA_adj

Read more about Baseball Reference's explanation of their formula here.

Since Baseball Reference and FanGraphs use different statistics in their formulas, there are some discrepancies between their WAR values. FIP measures a pitcher's performance using only the events he has the most control over (strikeouts, walks, hit-by-pitches, and home runs), whereas RA9 counts all runs, earned and unearned, scored while a pitcher is on the mound. By design, FIP doesn't account for factors like defense, pitch sequencing, and luck.

A pitcher who induces weak contact with his pitches may post a relatively high FIP, but because he limits the number of runs scored, his RA9 will be lower, resulting in a higher bWAR and a lower fWAR.
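To make that divergence concrete, here is a hypothetical contact manager's line, with all numbers invented, run through both lenses (the FIP constant is a placeholder, as FanGraphs sets the real one each season):

```python
def ra9(runs_allowed, ip):
    # All runs (earned and unearned) allowed per nine innings.
    return 9 * runs_allowed / ip

def fip(hr, bb, hbp, k, ip, fip_constant=3.15):
    # Placeholder constant; the real value is recalculated yearly.
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + fip_constant

# Invented line for a contact manager: modest strikeouts, but few runs allowed.
print(round(ra9(runs_allowed=30, ip=90), 2))            # 3.0
print(round(fip(hr=10, bb=25, hbp=3, k=55, ip=90), 2))  # 4.31
```

His RA9 (3.00) flatters him in bWAR terms, while his FIP (4.31) drags his fWAR down.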

A comparison of Hunter Dobbins and Brayan Bello's fWAR is a great case study in the usefulness and flaws of WAR. Generally speaking, pitchers who throw more innings and limit the runs scored against them generate higher WARs. However, Dobbins has a 4.35 ERA (41.1 IP) and 0.6 fWAR, compared to Bello's 3.96 ERA (52.1 IP) and 0.3 fWAR. Despite his higher ERA in fewer innings, Dobbins has posted twice Bello's fWAR.

Pitcher          fWAR   FIP    ERA
Hunter Dobbins   0.6    3.68   4.35
Brayan Bello     0.3    4.57   3.96

The simple explanation for these discrepancies is that Dobbins' FIP is significantly lower than Bello's. FIP factors in home runs, walks, hit-by-pitches, and strikeouts, and Bello has yielded more walks (25) and hit-by-pitches (four) than Dobbins (six and two, respectively). Dobbins' positive 0.67 ERA-FIP differential suggests he's been unlucky, while Bello's negative 0.61 differential indicates he's gotten lucky. Meanwhile, Dobbins and Bello share the same bWAR (0.3), and their RA9 marks are closer (Dobbins: 4.38, Bello: 4.47).
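The ERA-FIP differentials can be verified directly from the numbers in the table:

```python
# ERA and FIP from the table above.
pitchers = {
    "Hunter Dobbins": {"era": 4.35, "fip": 3.68},
    "Brayan Bello":   {"era": 3.96, "fip": 4.57},
}

for name, line in pitchers.items():
    diff = line["era"] - line["fip"]
    verdict = "unlucky" if diff > 0 else "lucky"
    print(f"{name}: ERA-FIP {diff:+.2f} ({verdict})")
```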

Depending on whom you ask, ERA is a flawed statistic that depends on factors outside a pitcher's control, including defense, pitch sequencing, and pure luck. For instance, Red Sox reliever Liam Hendriks' ERA benefited when Wilyer Abreu and Ceddanne Rafaela combined on a Gold Glove-caliber play to rob Kerry Carpenter of a home run.

Statistics aren't perfect; bWAR and fWAR are just two of many metrics used to evaluate a pitcher's performance. While bWAR assumes pitchers have control over the outcome of every ball hit into play, fWAR places more emphasis on a pitcher's direct ability to prevent runs. Each has inherent strengths and flaws, and which database fans choose ultimately depends on their preference. Fortunately, FanGraphs and Baseball Reference both offer solid metrics for fans to review and form their own opinions of a pitcher's performance.



Old-Timey Member
Posted

A common criticism of WAR is the difference, sometimes large, between a player's bWAR and fWAR.  One way the difference is described is that bWAR tells you more of what actually happened on the field while fWAR tells you more of what should have happened on the field.  fWAR is the better predictive stat.

The difference between the two stats does not make WAR a bad or meaningless stat, however.

Posted

It's not right to say "bWAR assumes pitchers have control over the outcome of every ball hit into play". First of all, every outcome, including home runs, strikeouts, walks, involves some luck. Great pitches get hit. Bad pitches get missed. Calls get missed. Balls miss being a home run or a fair ball by an inch. Pitchers exert only partial control over every outcome. The best you can say is that pitchers exert more control over some of them (home runs, Ks, BBs, infield pop ups), and less over others (fieldable balls in play).


"Control" is an unfortunate word choice here; I think it gives people some flawed ideas about what's going on. I think it's better to speak in terms of probabilities. By pitching well, a pitcher can reduce the probabilities of bad outcomes happening, and increase the probabilities of good outcomes happening. This applies across all kinds of outcomes, and not just the FIP outcomes. But those probabilities are only partly influenced by the pitcher's performance. They're also influenced by other things. The probability of a hit on a ball in play is influenced by the quality of the defense, and the park dimensions. The probability of a ball or a strike is influenced by the catcher's framing ability, which umpire is at home plate, and how well the batter is doing that day. Home runs are influenced by park dimensions, temperature, humidity, and wind strength and direction. And who is hitting.


The Baseball Reference approach to pitcher WAR is to assess how much influence some of those other things likely had on the pitcher's run total, and adjust the pitcher's allowed run total from there. The quality of the defense behind the pitcher is one of the things they adjust for. This approach considers 100% of a pitcher's outcomes.


The FanGraphs approach to pitcher WAR is to disregard the 63% (on average) of a pitcher's outcomes that are more greatly influenced by other (non-pitching) factors, and to only regard the 37% (on average) of outcomes that are less influenced by other factors. Like Baseball Reference, they also make some adjustments that consider the influence of some of those other factors. But I find this overall approach to be inferior. It just disregards too much.
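A rough sanity check of that 37%/63% split is easy to run. The league totals below are invented but in realistic proportions; the FIP events are strikeouts, walks, hit-by-pitches, and home runs, and everything else is a ball in play.

```python
def fip_outcome_share(k, bb, hbp, hr, balls_in_play):
    # Fraction of outcomes that FIP actually looks at.
    fip_events = k + bb + hbp + hr
    return fip_events / (fip_events + balls_in_play)

# Invented league-wide totals, in realistic proportions.
share = fip_outcome_share(k=41000, bb=15000, hbp=2000, hr=5500, balls_in_play=110000)
print(f"{share:.0%} FIP events, {1 - share:.0%} balls in play")
```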

Here's a study I did of a case where these differing approaches made a huge difference:

Posted
2 hours ago, Tomisphere said:

It's not right to say "bWAR assumes pitchers have control over the outcome of every ball hit into play". First of all, every outcome, including home runs, strikeouts, walks, involves some luck. Great pitches get hit. Bad pitches get missed. Calls get missed. Balls miss being a home run or a fair ball by an inch. Pitchers exert only partial control over every outcome. The best you can say is that pitchers exert more control over some of them (home runs, Ks, BBs, infield pop ups), and less over others (fieldable balls in play).


"Control" is an unfortunate word choice here; I think it gives people some flawed ideas about what's going on. I think it's better to speak in terms of probabilities. By pitching well, a pitcher can reduce the probabilities of bad outcomes happening, and increase the probabilities of good outcomes happening. This applies across all kinds of outcomes, and not just the FIP outcomes. But those probabilities are only partly influenced by the pitcher's performance. They're also influenced by other things. The probability of a hit on a ball in play is influenced by the quality of the defense, and the park dimensions. The probability of a ball or a strike is influenced by the catcher's framing ability, which umpire is at home plate, and how well the batter is doing that day. Home runs are influenced by park dimensions, temperature, humidity, and wind strength and direction. And who is hitting.


The Baseball Reference approach to pitcher WAR is to assess how much influence some of those other things likely had on the pitcher's run total, and adjust the pitcher's allowed run total from there. The quality of the defense behind the pitcher is one of the things they adjust for. This approach considers 100% of a pitcher's outcomes.


The FanGraphs approach to pitcher WAR is to disregard the 63% (on average) of a pitcher's outcomes that are more greatly influenced by other (non-pitching) factors, and to only regard the 37% (on average) of outcomes that are less influenced by other factors. Like Baseball Reference, they also make some adjustments that consider the influence of some of those other factors. But I find this overall approach to be inferior. It just disregards too much.

Here's a study I did of a case where these differing approaches made a huge difference:

Thanks for the great post and welcome!

Posted

NOTE: A warning for the folks who don't want to hear about the massive issues with metrics: please skip this response, because it conflicts with everything you believe about metrics.

Having just read the equation above, one realizes that metrics are a fabrication of data, because the formulas include arbitrary components like the FIP constant. Stats are facts, and this formula is nothing more than an ESTIMATE. Why? The data is normalized, and there is a constant that is arbitrary.

That means the specifics of a pitcher are put into a cauldron with all the other pitchers to create a universal porridge that represents baseball, regardless of whether the data should be further stratified, because all pitchers ARE NOT alike. Do we want to blend data from guys like Skubal, Skenes, deGrom and Wheeler with slugs so we can pretend this particular clustering of the data represents the league? This concept alone negates any relevance of this specific metric. There is no scientific justification for which constants are accurate or why the size of the sample is appropriate.

For those of you who are fantasy players, this might make the most sense, since you must deal with this predicament prior to each season. Say you are in a league with 12 teams, and each team must start 9 starting pitchers or relievers.

When you calculate averages for the league rankings do you use:

1 - All pitchers

2 - 9 times 12 or 108 pitchers (the max that can be played in the league at one time)

3 - Less than 108 because you only want to compare Starting Pitchers and the 108 is comprised of both starters and relievers.

4 - What point in time do you pick to use in the estimate?  2024, the last 3 years or maybe a projection provided by someone else or your own projection?

5 - How does one account for time lost to injury? Are the player's numbers normalized to eliminate the time lost, or is the actual number used, reducing the player's effectiveness due to lost time?

This is a small sample of how many things we need to consider in ranking a pitcher prior to a draft. Each decision above impacts what goes into the cauldron that gets normalized. Each choice creates a unique result for the cauldron. Each choice varies the estimate to varying degrees.
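The baseline-sensitivity point here is easy to demonstrate in a few lines: the same pitcher looks different depending on which pool defines "average." The ERAs below are invented.

```python
# Ten invented ERAs standing in for a pitcher pool.
eras = [2.80, 3.10, 3.40, 3.70, 4.00, 4.30, 4.60, 5.00, 5.50, 6.20]

avg_all = sum(eras) / len(eras)      # baseline 1: everyone in the pool
avg_top = sum(sorted(eras)[:5]) / 5  # baseline 2: only the better half

pitcher_era = 3.70
print(round(avg_all - pitcher_era, 2))  # 0.56 (better than the broad baseline)
print(round(avg_top - pitcher_era, 2))  # -0.3 (worse than the tougher baseline)
```

Same pitcher, two different baselines, two different "values versus average."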

When I see a pitcher's ERA and his historical ERA and look for any trending, I am using real performance data, not something that gets thrown in a cauldron for normalization. The numbers are his and his alone. I can then look at the next pitcher's same numbers and compare them, Facts vs Facts.

Now if a fabricated number comes out of the cauldron, like FIP or any of the other metrics, what do we know about that number?

There is a generation who has grown up on the fabrications and completely believes in them.  They say things like average doesn't matter or fielding percentage doesn't matter without having a clue as to why they actually do matter and are the foundation of baseball statistics. 

If you like what-if analysis, then metrics are great. They pretend their numbers are real and try to compare things that can't be compared effectively. Check out the list of issues above related to drafting in fantasy baseball with respect to the normalization that happens in metrics. Now calculate the number of permutations that could be generated by FanGraphs or any of the other metric sites simply by varying the sample size of players thrown into the cauldron to determine what a fair sample size is. Each permutation provides a different estimate, so which one is the most accurate?

I'll take baseball stats over guesses from companies that provide massive equations filled with constants THEY pick, variables THEY pick and normalization techniques THEY pick.  That's why so much of metrics is questionable with regard to accuracy. 

Here is the key to all metrics from my perspective:

Is the metric (let's use FIP) a stat pulled from a record book, or an interesting estimate of reality that is plus or minus some percentage points wrong? The answer is that it's an estimate suggesting that the unique set of circumstances that occurred on one play, or one week, or one season, or one career rendered a conclusion about how lucky a pitcher was. Is indicating that the pitcher was luckier or unluckier than he could have been relevant to his stats? No, because it might not have been luck. Does it project his future? Nope, because the future is unknown. So then why should an estimate like FIP be invented, and how does it help analyze data when it's not accurate? The ultimate answer is that it doesn't, and for fans like me, I ask who cares and why? Fun trivia? Maybe. Valid future indicator? ABSOLUTELY NOT.

The unique set of circumstances that fabricated a specific FIP is not repeatable because it is historic, and to predict using the number is like guessing heads or tails. The number is of no assistance in predicting the future. It's a poorly constructed recap of real baseball stats that tries to introduce implications that can't be validly calculated. That's why it seems like a massive waste of time, due to the inaccuracy of the data.

If someone came to me and said: I can take your data, as recorded officially in the record books, put it into a computer with a few folks providing interpretations of that data through extremely complicated algorithms that make gross assumptions about the game, the data, and how to use it, and deliver the resulting estimates to you (since you can't really calculate them yourself), would you believe the results as an act of faith? I don't.

I walk away when the salesman says, "trust me." When estimates determine such important things as WAR, and WAR is calculated by every outlet that hustles metrics, the game is in bad shape and needs to rethink the concept of stats versus metrics. Stats represent what you did; metrics are things you might theoretically do, but that's pure conjecture. Players can only produce during real events; those real events can be interpreted to suggest what a player might do if not constrained by reality. Hypotheticals are extrapolations of reality that don't truly exist, so they should have no place in reality, where ballplayers can only do what the circumstances allow.

Baseball is about scoring more runs than the other team, not being more athletic than the other team. Winning is the essence of the game, not the individual feats that can be extrapolated thanks to athleticism. Without the out on each play, the athleticism is of no value at all.

Metrics are designed for showcase events as a way to guess what happens when an apples to oranges situation occurs in baseball.  Stats are the recorded events and reflect the achievements of the player and team. 

Now baseball needs to figure out how to keep fans from thinking the metric estimates representing hypotheticals reflect reality or predict the future.  They are simply fun what-ifs that fans have done since the inception of baseball that try to answer the age-old question:  Was Babe Ruth better than Hank Aaron?  Any precision in the numbers?  NOPE.  Is it fun to make estimates support any argument you want to make to come out on top in your argument?  YEP.  Factual or reliable? NOPE.  Entertaining? YEP

There is a place for metrics in baseball as long as there is a HUGE caveat that nobody takes them seriously. They are hypothetical, because you can't rewind a groundball and have it match the one hit to the guy you are comparing him with, so you can see if both players are defensively comparable. Instead, you assume one has more skills because he had a larger average distance to balls hit to him, and therefore greater range, because the universe of plays was not as kind to the second player, who ended up closer to all balls hit to him. Without complete control over the environment, one player is at a disadvantage, being evaluated by the very estimates that are supposed to clear up the shortcomings of Fielding Percentage. That is the double-edged sword of metrics: seeking more precision by grossly estimating averages and applying them equally to unequal game situations. It simply creates different apples-to-oranges comparisons; it resolves nothing.

I hope a few old school folks got a kick out of the detailed dissection of metrics. So go use those metrics for your entertainment; just remember they are only estimates with massive shortcomings.

Posted

Great example of how large the spread between bWAR and fWAR  can be, not just for a season but for a whole career: our old friend Rick Porcello.

Career bWAR 18.8

Career fWAR 29.6

That's large!

In his final season, the short one of 2020, Porcello had a grisly ERA of 5.64 in 59 innings, but a FIP of 3.33.

fWAR - 1.6 - ace level!

bWAR - 0.2 - barely above replacement level!

Since he never pitched again, did teams reject the usefulness of fWAR for pitchers?
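A back-of-the-envelope conversion shows why those two numbers diverge so much: spread the ERA-FIP gap over Porcello's 59 innings and convert runs to wins at roughly 10 runs per win, a common rule of thumb rather than either site's exact constant.

```python
era, fip_2020, ip = 5.64, 3.33, 59

run_gap = (era - fip_2020) / 9 * ip  # runs separating the two views of his season
win_gap = run_gap / 10               # ~10 runs per win, a rough approximation

print(round(run_gap, 1), round(win_gap, 1))  # 15.1 1.5
```

That ~1.5-win gap lines up with the 1.6 fWAR versus 0.2 bWAR split.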

Posted

Great article and follow-ups, except one.

I have a much better understanding now, and more faith in how close WAR comes to measuring how good pitchers are, than I had before. Thanks.

Posted
14 hours ago, Bellhorn04 said:

Great example of how large the spread between bWAR and fWAR  can be, not just for a season but for a whole career: our old friend Rick Porcello.

Career bWAR 18.8

Career fWAR 29.6

That's large!

In his final season, the short one of 2020, Porcello had a grisly ERA of 5.64 in 59 innings, but a FIP of 3.33.

fWAR - 1.6 - ace level!

bWAR - 0.2 - barely above replacement level!

Since he never pitched again, did teams reject the usefulness of fWAR for pitchers?

For me, bWAR, fWAR, or ANY OTHER WAR is meaningless, because it's not accurate and because the sheer concept of wins above replacement is silly. It's a guess. The index is supposed to measure performance, but it creates a fantasy world where a RUN is inferred from actions universally across all players. When in baseball is anything universally comparable among players?

What makes any of the specific WAR values more accurate in defining the value of a player than an equation that incorporates BA, ISO, Walk Rate, SB, Runs Produced and Fielding Percentage?  Those are the factual components of a player's performance.  Nothing extrapolated, nothing normalized, just the key components of value produced by a player.  Hypothetical runs scored equivalents?  Seriously.  That even sounds bogus.

I appreciate you providing a perfect example supporting my comments!!

The public has bought into the fabricated metrics unfortunately because they want answers, no matter how incorrect, to the question of who is better than this player?  Think about how much the industry grew without a mathematically valid product!!   The fact that maybe the single most critical ESTIMATE called WAR couldn't be agreed upon so companies started adding prefixes to avoid resolving the issue tells you everything about the accuracy of these numbers.

FYI.... there are a billion examples like the one above because the techniques used in calculating WAR are invalid and render inconsistent results as shown above.

Posted
13 hours ago, TedYazPapiMookie said:

For me, bWAR, fWAR, or ANY OTHER WAR is meaningless, because it's not accurate and because the sheer concept of wins above replacement is silly. It's a guess. The index is supposed to measure performance, but it creates a fantasy world where a RUN is inferred from actions universally across all players. When in baseball is anything universally comparable among players?

What makes any of the specific WAR values more accurate in defining the value of a player than an equation that incorporates BA, ISO, Walk Rate, SB, Runs Produced and Fielding Percentage?  Those are the factual components of a player's performance.  Nothing extrapolated, nothing normalized, just the key components of value produced by a player.  Hypothetical runs scored equivalents?  Seriously.  That even sounds bogus.

I appreciate you providing a perfect example supporting my comments!!

The public has bought into the fabricated metrics unfortunately because they want answers, no matter how incorrect, to the question of who is better than this player?  Think about how much the industry grew without a mathematically valid product!!   The fact that maybe the single most critical ESTIMATE called WAR couldn't be agreed upon so companies started adding prefixes to avoid resolving the issue tells you everything about the accuracy of these numbers.

FYI.... there are a billion examples like the one above because the techniques used in calculating WAR are invalid and render inconsistent results as shown above.

I guarantee you that actual major league baseball teams in the year 2025 use a lot more than BA, ISO, Walk Rate, SB, Runs Produced and Fielding Percentage to evaluate players.  

You realize that teams have large analytics departments now? 

Posted

It's not a guess.

When we used to stand around the watercooler trying to debate who was the better player based on stats, defense, and our biases, that was "real scientific," wasn't it?

Posted
13 hours ago, TedYazPapiMookie said:

For me, bWAR, fWAR, or ANY OTHER WAR is meaningless, because it's not accurate and because the sheer concept of wins above replacement is silly. It's a guess. The index is supposed to measure performance, but it creates a fantasy world where a RUN is inferred from actions universally across all players. When in baseball is anything universally comparable among players?

What makes any of the specific WAR values more accurate in defining the value of a player than an equation that incorporates BA, ISO, Walk Rate, SB, Runs Produced and Fielding Percentage?  Those are the factual components of a player's performance.  Nothing extrapolated, nothing normalized, just the key components of value produced by a player.  Hypothetical runs scored equivalents?  Seriously.  That even sounds bogus.

I appreciate you providing a perfect example supporting my comments!!

The public has bought into the fabricated metrics unfortunately because they want answers, no matter how incorrect, to the question of who is better than this player?  Think about how much the industry grew without a mathematically valid product!!   The fact that maybe the single most critical ESTIMATE called WAR couldn't be agreed upon so companies started adding prefixes to avoid resolving the issue tells you everything about the accuracy of these numbers.

FYI.... there are a billion examples like the one above because the techniques used in calculating WAR are invalid and render inconsistent results as shown above.

If you look at Baseball Reference's definition of a Replacement Player, it notes that a team of replacement players, i.e. the sort of guys you can find on the waiver wire or in AAA, would have a winning percentage of about .294, roughly 48-114 over a full season. The Red Sox in 2024 had a combined team total of 26.8 position bWAR and 12.7 pitcher bWAR. That comes out to 87.5 wins, which is a little higher than the actual 81-81, but well in the neighborhood. In 2021 it was 24.4 position and 18.9 pitching, which gets you to 91.3 wins in a season where they actually went 92-70.
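The arithmetic in that paragraph checks out within rounding. A minimal sketch: .294 over 162 games is about 47.6 replacement wins, so the exact totals land a few tenths from the rounded figures above.

```python
replacement_wins = 0.294 * 162  # ~47.6 wins for a team of replacement players

# Red Sox combined bWAR (position + pitching) from the post above.
wins_2024 = replacement_wins + 26.8 + 12.7
wins_2021 = replacement_wins + 24.4 + 18.9

print(round(wins_2024, 1), round(wins_2021, 1))  # 87.1 90.9
```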

The offensive portion of WAR is quite factual - it assigns weights based on historical analysis - to something we all know intuitively: all bases are not created equal and not all batting outcomes are equal for scoring runs.  The defensive part is dicey just because measuring the most important part of defense - GETTING TO THE BALL - is not quantified by any sort of box score "factual" component.

With pitching, I look at it this way.  fWAR is - in general - a better forward looking indicator.  What happens after a player makes contact IS very noisy.  However, FIP assumes that all batted balls in play are effectively random, and that is obviously not true.  The nice thing is, the batted ball data we are getting now helps there.  If there is a good FIP combined with iffy line drive rates, that sounds like something which will correct in the future.  bWAR is a bit better for awards, because it is speculating a bit less.  

Either way, if you are using WAR expecting one stat to explain everything, that is a you problem.  WAR in general is good for doing historical comparisons, and for identifying candidates to be considered further for awards and whatnot.  I do not know whether a player with 6.2 bWAR had a better season than someone with 5.8.  But I do know both of them were terrific.  

Posted
28 minutes ago, Bellhorn04 said:

I guarantee you that actual major league baseball teams in the year 2025 use a lot more than BA, ISO, Walk Rate, SB, Runs Produced and Fielding Percentage to evaluate players.  

You realize that teams have large analytics departments now? 

In particular, a lot of the inputs teams are probably using in-house are incorporating all of that PitchFX and Statcast data that is coming in.  The nice thing is that data is starting to allow the industry to DIRECTLY measure these areas (defense, contact quality) which we've intuitively understood but could only quantify indirectly.  

Posted

We've all tried to figure out how valuable defense, baserunning, getting on base, and power are when determining who the best or better players are. We all have our biases and favorite players, and our favorite stats or areas of skill.

It's been a years long endeavor and debate.

WAR tries to quantify this endeavor by placing value on these aspects of the game. It uses historical data to determine how much a hit, homer, double, SB, error or great play leads to a run, run saved, and ultimately a team win or loss.
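That is the "linear weights" idea, and a toy version is easy to write down. The run values below are in the ballpark of published historical estimates, but they are illustrative, not an official table.

```python
# Approximate average run values per event (illustrative, not official).
run_values = {
    "single": 0.47, "double": 0.77, "triple": 1.04,
    "home_run": 1.40, "walk": 0.32, "stolen_base": 0.20,
}

def runs_created(events):
    # events maps event name -> season count, e.g. {"single": 100, ...}
    return sum(run_values[event] * count for event, count in events.items())

season = runs_created({"single": 100, "double": 30, "home_run": 25, "walk": 60})
print(round(season, 1))  # 124.3
```

From there, a framework like WAR converts those runs into wins against a replacement baseline.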

It's far from perfect, but so are our debates and personal opinions on who is better: the .310 batter who plays GG defense at SS or the .275 batter who smacks 50 Hrs, has a higher OBP than the SS, but is just a so-so defender in LF.

I think I'd trust a system based on data and historical values that determine their formula, than any one person's opinion, including my own.

If you choose to ignore WAR and metrics, that's fine with me, but IMO, nobody can prove a claim that their non data based methodology is a better one.

Take any top 10 list of players by WAR and compare to any other top 10 list by any stat, even OPS or wRC+ and tell me that single stat does a better ranking than WAR.

Posted
1 hour ago, Bellhorn04 said:

I guarantee you that actual major league baseball teams in the year 2025 use a lot more than BA, ISO, Walk Rate, SB, Runs Produced and Fielding Percentage to evaluate players.  

You realize that teams have large analytics departments now? 

All tools are at their fingertips.  Yes, I am well aware.  The point is you start with the reliable data and then you run your what-if analysis just like you do in the business world.  The metrics are supplements whereas the stats are the diet.  You can't live on supplements.

Posted
20 minutes ago, TedYazPapiMookie said:

All tools are at their fingertips.  Yes, I am well aware.  The point is you start with the reliable data and then you run your what-if analysis just like you do in the business world.  The metrics are supplements whereas the stats are the diet.  You can't live on supplements.

They aren't supplements. They are based on the diet, all of it, not just the meat and potatoes. They include the whole diet, not just one part, or several parts we try to eyeball all at once on our own. They are not supplements to it.

Posted
44 minutes ago, sk7326 said:

If you look at baseball reference's definition of Replacement Player - it notes that a team of Replacement Players i.e. the sort of guy you can find on the waiver wire or in AAA ... their winning percentage would be about .294 ... about 48-114 over a full season.  The Red Sox in 2024 had a combined team level of 26.8 position bWAR and 12.7 pitcher bWAR.  That comes out to 87.5 wins, which is a little higher than the actual 81-81, but well in the neighborhood.  In 2021 is 24.4 position, 18.9 pitching which gets you to 91.3 wins in a season where they actually went 91-70.  

The offensive portion of WAR is quite factual - it assigns weights based on historical analysis - to something we all know intuitively: all bases are not created equal and not all batting outcomes are equal for scoring runs.  The defensive part is dicey just because measuring the most important part of defense - GETTING TO THE BALL - is not quantified by any sort of box score "factual" component.

With pitching, I look at it this way.  fWAR is - in general - a better forward looking indicator.  What happens after a player makes contact IS very noisy.  However, FIP assumes that all batted balls in play are effectively random, and that is obviously not true.  The nice thing is, the batted ball data we are getting now helps there.  If there is a good FIP combined with iffy line drive rates, that sounds like something which will correct in the future.  bWAR is a bit better for awards, because it is speculating a bit less.  

Either way, if you are using WAR expecting one stat to explain everything, that is a you problem.  WAR in general is good for doing historical comparisons, and for identifying candidates to be considered further for awards and whatnot.  I do not know whether a player with 6.2 bWAR had a better season than someone with 5.8.  But I do know both of them were terrific.  

Your interpretation of metrics seems normal to me.  You take a very gross estimate with very little accuracy and try to rationalize why the numbers were reasonably close to the actual number of wins.  That's what estimates are for.  They are GUESSES.  Sometimes they are right and sometimes they are very wrong and everything in between.  It's fun to try to retrofit these guesses into a pattern that makes you think there is some validity to the formula but as I always say, predicting the future is simply a guess because it's unknown. 

If you notice that heads comes up every third flip and use it as a predictor, you can then rationalize that it's a great predictor whenever heads actually does come up on the third flip. Baseball fans are always looking for an edge or insight into the future. It's easy to be fooled into thinking these estimates are accurate. They aren't. It's luck, or a coincidence, or simply the odds that occasionally the guess will be a good one. A clock reads 12 twice a day!

If you want to know whether a guy is good, look at his stats, not metrics. Want fun facts about what he can theoretically do, based on a set of assumptions chosen by FanGraphs or Statcast? Check out his metrics. Both sources of information tell a story; the only difference is that stats are after-the-fact results, and metrics are guesses that may or may not reflect reality, but they aren't reality.

Hope you don't mind one question. In bWAR or any estimate you like to use, there is some normalization. How many players are in the normalization, all or some? And how different would it be if you chose a number customized for a specific group of players, like using elite pitchers when calculating bWAR for an elite pitcher and using bad pitchers with ERAs over 5.00 when calculating bWAR for that group? Would you be more accurate, less accurate, or wouldn't it matter since the number is strictly a guess?

Posted
1 hour ago, TedYazPapiMookie said:

Your interpretation of metrics seems normal to me.  You take a very gross estimate with very little accuracy and try to rationalize why the numbers were reasonably close to the actual number of wins.  That's what estimates are for.  They are GUESSES.  Sometimes they are right and sometimes they are very wrong and everything in between.  It's fun to try to retrofit these guesses into a pattern that makes you think there is some validity to the formula but as I always say, predicting the future is simply a guess because it's unknown. 

If you notice that heads comes up every third flip and use it as a predictor, you can then rationalize that it's a great predictor if heads actually does come up every third flip frequently enough. Baseball fans are always looking for an edge or insight into the future. It's easy to be fooled into thinking these estimates are accurate. They aren't. It's luck or a coincidence or simply the odds that occasionally the guess will be a good one. Even a stopped clock reads 12 twice a day!

Want to know if a guy is good? Look at his stats, not metrics. Want fun facts about what he can theoretically do based on a set of assumptions chosen by Fangraphs or Statcast? Check out his metrics. Both sources of information tell a story; the only difference is that stats are after-the-fact results and metrics are guesses that may or may not reflect reality, but they aren't reality.

Hope you don't mind one question. In bWAR or any estimate you like to use, there is some normalization. How many players are in the normalization, all or some? And how different would it be if you chose a number customized for a specific group of players, like using elite pitchers when calculating bWAR for an elite pitcher and using bad pitchers with ERAs over 5.00 when calculating bWAR for that group? Would you be more accurate, less accurate, or wouldn't it matter since the number is strictly a guess?

The term "Estimate" is doing a lot of work here ... given your world view, this would negate almost the entirety of human scientific discovery. Indeed, the Statcast data are not guesses AT ALL. They are just adding more facts to the soup: how often does the player get the barrel of the bat on the ball, what is the launch angle. Those are things being measured.

As far as your question - the metrics are calculated using regression on thousands of game interactions.  I mean there are 200 to 300 of them in every single game played.  Now, non-pitchers and non-catchers are not participating in most of those events - but it still adds up well past the sort of sample sizes where outliers have an outsized impact.  I tend to lean towards more data is better.

If you really think about it, slugging percentage is a guess! It assumes that a double is twice as valuable as a single ... this is something you can actually test, and indeed that's exactly what has been done.
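The "you can actually test it" point above is easy to sketch: compare SLG's built-in total-base weights against linear-weights run values. The run values below are approximate, era-dependent ballpark figures I am assuming for illustration, not official constants:

```python
# SLG weights hits by total bases, so a double counts exactly twice a single.
# Linear-weights run values (the figures below are approximate, era-dependent
# ballpark numbers assumed for illustration) say otherwise.
slg_weights = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}
run_values = {"1B": 0.47, "2B": 0.77, "3B": 1.04, "HR": 1.40}  # approx runs added per event

for event in slg_weights:
    slg_ratio = slg_weights[event] / slg_weights["1B"]
    run_ratio = run_values[event] / run_values["1B"]
    print(f"{event}: SLG says {slg_ratio:.2f}x a single, run values say {run_ratio:.2f}x")
```

On these assumed numbers, a double adds roughly 1.6x the runs of a single, not 2x, and a homer roughly 3x, not 4x, which is the whole case for testing the weights empirically.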

Posted
3 hours ago, TedYazPapiMookie said:

Your interpretation of metrics seems normal to me.  You take a very gross estimate with very little accuracy and try to rationalize why the numbers were reasonably close to the actual number of wins.  That's what estimates are for.  They are GUESSES.  Sometimes they are right and sometimes they are very wrong and everything in between.  It's fun to try to retrofit these guesses into a pattern that makes you think there is some validity to the formula but as I always say, predicting the future is simply a guess because it's unknown. 

If you notice that heads comes up every third flip and use it as a predictor, you can then rationalize that it's a great predictor if heads actually does come up every third flip frequently enough. Baseball fans are always looking for an edge or insight into the future. It's easy to be fooled into thinking these estimates are accurate. They aren't. It's luck or a coincidence or simply the odds that occasionally the guess will be a good one. Even a stopped clock reads 12 twice a day!

Want to know if a guy is good? Look at his stats, not metrics. Want fun facts about what he can theoretically do based on a set of assumptions chosen by Fangraphs or Statcast? Check out his metrics. Both sources of information tell a story; the only difference is that stats are after-the-fact results and metrics are guesses that may or may not reflect reality, but they aren't reality.

Hope you don't mind one question. In bWAR or any estimate you like to use, there is some normalization. How many players are in the normalization, all or some? And how different would it be if you chose a number customized for a specific group of players, like using elite pitchers when calculating bWAR for an elite pitcher and using bad pitchers with ERAs over 5.00 when calculating bWAR for that group? Would you be more accurate, less accurate, or wouldn't it matter since the number is strictly a guess?

So I've read the two thousand or so words you've typed in this thread and I suggest you do one thing:

Go actually read about how and why these metrics are the way they are instead of just calling them "guesses".

Actual statisticians are creating these things and they're pretty rigorously tested throughout the industry. There are hundreds of metrics in the trash bin because they didn't stand up over time. You're saying "why don't we just use stats" except then you use stats that are so deeply flawed they're nearly useless (fielding percentage, for example). The industry moved on from the stats you like because they... well, because they suck for predictive purposes.

Unless you just think that these billion-dollar franchises are stupid and don't understand math, despite about half of them being run by economists, mathematicians, and Ivy League graduates, including your favorite team.

Posted
2 hours ago, Brock Beauchamp said:

So I've read the two thousand or so words you've typed in this thread and I suggest you do one thing:

Go actually read about how and why these metrics are the way they are instead of just calling them "guesses".

Actual statisticians are creating these things and they're pretty rigorously tested throughout the industry. There are hundreds of metrics in the trash bin because they didn't stand up over time. You're saying "why don't we just use stats" except then you use stats that are so deeply flawed they're nearly useless (fielding percentage, for example). The industry moved on from the stats you like because they... well, because they suck for predictive purposes.

Unless you just think that these billion-dollar franchises are stupid and don't understand math, despite about half of them being run by economists, mathematicians, and Ivy League graduates, including your favorite team.

No offense, but I respectfully disagree with your comment. They are guesses. The fact that you want to call them educated guesses doesn't make them any more accurate, because unless something changes the future is ALWAYS unknown. Stats are recorded facts; metrics are estimates. That's a fact of life, and as much as you choose to rationalize the validity of guesses, they are still guesses, which makes them no more valid than if you or I chose to guess how a player will perform at some aspect of baseball.

In business it is common practice to run what-if scenarios to estimate future outcomes.  They don't have to be done by economists, mathematicians or Ivy League graduates for them to achieve their goal of simulation.  That doesn't render the results correct, it simply provides a series of possible outcomes that could be correct and SOMEONE (whether it's STATCAST or the person that ordered the what-if analysis) needs to choose the guess that best suits them for the decision that is to be made.

Each decision maker in baseball has the choice of using fabricated estimates, real statistics, or a combination of the two. Since metrics are filled with constants, variables, and normalizations, they have been tweaked for a couple of decades and will be tweaked for many more. Will they EVER be facts like statistics? No. That's the simple truth. Will the 2050 metrics potentially be more accurate than the 2025 metrics? Absolutely. Will they still be estimates? YEP!!

Posted
1 hour ago, TedYazPapiMookie said:

No offense, but I respectfully disagree with your comment. They are guesses. The fact that you want to call them educated guesses doesn't make them any more accurate, because unless something changes the future is ALWAYS unknown. Stats are recorded facts; metrics are estimates. That's a fact of life, and as much as you choose to rationalize the validity of guesses, they are still guesses, which makes them no more valid than if you or I chose to guess how a player will perform at some aspect of baseball.

In business it is common practice to run what-if scenarios to estimate future outcomes.  They don't have to be done by economists, mathematicians or Ivy League graduates for them to achieve their goal of simulation.  That doesn't render the results correct, it simply provides a series of possible outcomes that could be correct and SOMEONE (whether it's STATCAST or the person that ordered the what-if analysis) needs to choose the guess that best suits them for the decision that is to be made.

Each decision maker in baseball has the choice of using fabricated estimates, real statistics, or a combination of the two. Since metrics are filled with constants, variables, and normalizations, they have been tweaked for a couple of decades and will be tweaked for many more. Will they EVER be facts like statistics? No. That's the simple truth. Will the 2050 metrics potentially be more accurate than the 2025 metrics? Absolutely. Will they still be estimates? YEP!!

This is a lot of words to say basically nothing. You were railing on people for using metrics instead of “real” stats; now you’re saying something different.

Except for maybe the Rockies, basically every major league front office disagrees with pretty much everything you’ve posted in this thread. Maybe think about that for a minute. 

Posted
5 hours ago, sk7326 said:

The term "Estimate" is doing a lot of work here ... given your world view, this would negate almost the entirety of human scientific discovery. Indeed, the Statcast data are not guesses AT ALL. They are just adding more facts to the soup: how often does the player get the barrel of the bat on the ball, what is the launch angle. Those are things being measured.

As far as your question - the metrics are calculated using regression on thousands of game interactions.  I mean there are 200 to 300 of them in every single game played.  Now, non-pitchers and non-catchers are not participating in most of those events - but it still adds up well past the sort of sample sizes where outliers have an outsized impact.  I tend to lean towards more data is better.

If you really think about it, slugging percentage is a guess! It assumes that a double is twice as valuable as a single ... this is something you can actually test, and indeed that's exactly what has been done.

I mean no disrespect, but nearly everything you just wrote is unequivocally incorrect. The comment about negating the entirety of human scientific discovery is such an exaggeration I'm not even going to address it any further. I get that you are passionate about this, but you need to keep your wits about you and not resort to hyperbole because you are passionate.

Look up fact. Look up estimate. Stats are after the fact. Estimates project the future, or retrofit components into historical numbers by normalizing data; either way, the result is an estimate. That's how estimates work. Also, your logic in calling SLG a guess is faulty. SLG is total bases divided by at bats. Each component is a fact, so the result is a fact. Your interpretation of SLG is simply wrong. The definition of a double is not that it's some % better than a single; it's a two-base hit, which is one base more than a single. That can't be clearer.

You have presented a lot that has nothing to do with the discussion.

Posted
1 hour ago, Brock Beauchamp said:

This is a lot of words to say basically nothing. You were railing on people for using metrics instead of “real” stats; now you’re saying something different.

Except for maybe the Rockies, basically every major league front office disagrees with pretty much everything you’ve posted in this thread. Maybe think about that for a minute. 

Any railing that has happened was due to me being called names for not buying into the metric bs. Defending my position is within my rights. Gross exaggerations don't make your arguments any more valid. I know what front offices believe; I've been around the game a long time. I also know every word I've written is true. Facts are facts and metrics are estimates. The definition of each word is simple and straightforward, and if execs pick one or the other or both together, that is their call. That doesn't make it right; it makes it their decision.

A reference book like Baseball Reference gives you facts, and it's expanded to include estimates that are not base statistics. Are the stats more accurate? Absolutely. Could the estimates be as accurate? No way to know. It's up to the interpreter of the data to decide. This is called mathematics. Stats are after the fact; estimates suggest a possible outcome that is based on a hypothetical instance, not a real event.

This isn't controversial.  If you want certainty, you look at history.  If you want to predict the future or how two events that are apples and oranges compare, you use estimates.  I've repeated this mantra through all the hysterics from metrics supporters.  It's all true whether you agree or not.

Posted
9 hours ago, TedYazPapiMookie said:

Any railing that has happened was due to me being called names for not buying into the metric bs. Defending my position is within my rights. Gross exaggerations don't make your arguments any more valid. I know what front offices believe; I've been around the game a long time. I also know every word I've written is true. Facts are facts and metrics are estimates. The definition of each word is simple and straightforward, and if execs pick one or the other or both together, that is their call. That doesn't make it right; it makes it their decision.

A reference book like Baseball Reference gives you facts, and it's expanded to include estimates that are not base statistics. Are the stats more accurate? Absolutely. Could the estimates be as accurate? No way to know. It's up to the interpreter of the data to decide. This is called mathematics. Stats are after the fact; estimates suggest a possible outcome that is based on a hypothetical instance, not a real event.

This isn't controversial.  If you want certainty, you look at history.  If you want to predict the future or how two events that are apples and oranges compare, you use estimates.  I've repeated this mantra through all the hysterics from metrics supporters.  It's all true whether you agree or not.

How is fielding percentage, one of your own examples, "real"? It's not, it's a subjective statistic!

And how is OPS "real"? Implicitly, a single is worth 2.0, a double 3.0, and a homer 5.0. That's not congruous with reality whatsoever; in fact, it's easily the biggest flaw of OPS, a generally "pretty good" stat.

Many of your so-called "real" stats aren't based in reality at all; you just think they are.
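The 2.0/3.0/5.0 figures cited above fall straight out of the arithmetic. A minimal sketch, setting aside the OBP/SLG denominator mismatch (an assumption of this simplification):

```python
# Setting aside the OBP/SLG denominator mismatch (an assumption of this
# simplification), OPS credits one point per time on base plus one per
# total base, which yields the implicit weights cited above.
def ops_weight(times_on_base, total_bases):
    return times_on_base + total_bases

weights = {
    "BB": ops_weight(1, 0),  # walk: reaches base, zero total bases
    "1B": ops_weight(1, 1),
    "2B": ops_weight(1, 2),
    "HR": ops_weight(1, 4),
}
print(weights)  # a homer is only 2.5x a single here, vs roughly 3x by run values
```

That 2.5x homer-to-single ratio versus the roughly 3x from linear-weights run values is the distortion being described.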

Posted
5 hours ago, Brock Beauchamp said:

How is fielding percentage, one of your own examples, "real"? It's not, it's a subjective statistic!

And how is OPS "real"? Implicitly, a single is worth 2.0, a double 3.0, and a homer 5.0. That's not congruous with reality whatsoever; in fact, it's easily the biggest flaw of OPS, a generally "pretty good" stat.

Many of your so-called "real" stats aren't based in reality at all; you just think they are.

Simple. Fielding percentage is an after-the-fact number. Until baseball fixes the error decisions by scorekeepers it will have some issues, but in the end, EVERY time an out is made it is properly recorded into the fielding percentage that lives in the historical statistics of baseball. That makes it real.

OPS is a metric, not a stat. It's an incredibly bad metric since it double-counts batting average.

REAL STATS are those that reside in Baseball Reference as documentation of the games played. I don't know how you can possibly argue that point. It IS the history of baseball. Metrics are supplemental estimates that can vary by sample size or any of the factors that go into calculating the number. That's why it's an estimate. These are very straightforward principles. They are really not up for debate.
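For the record, the fielding-percentage arithmetic both sides are arguing about is trivial. A minimal sketch with a hypothetical stat line:

```python
# Fielding percentage as recorded after the fact:
# (putouts + assists) / (putouts + assists + errors).
# The stat line below is hypothetical.
def fielding_pct(putouts, assists, errors):
    chances = putouts + assists + errors
    return (putouts + assists) / chances

print(round(fielding_pct(250, 400, 10), 3))  # -> 0.985
```

The formula itself is pure arithmetic; the dispute is over the `errors` input, since what counts as an error is a scorer's judgment call.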
