Due to the recent influx of heathens on the site who constantly bash Sabermetrics/Performance Analysis without knowing the first thing about it, I've decided to make this thread in hopes that it helps educate the non-believers. Input from fellow intelligent posters such as vhawk, DesertCat, Thremp, Jack of Arcades, and others is welcome and encouraged.
What is Performance Analysis / Sabermetrics?
Sabermetrics is the analysis of baseball through objective evidence, especially baseball statistics. Wikipedia's information on Sabermetrics is a great jumping-off point for the foundations of it (Bill James, SABR, Baseball Prospectus) and the influence it has had in the game. Sabermetrics looks to analyze the game of baseball by using statistics to evaluate how runs are created/prevented - subsets of these involve baserunning, hitting, fielding, pitching, home park advantage, steroids/PEDs, and many other fields of analysis.
What can be measured with Sabermetrics?
In short, everything. Opponents of Sabermetrics will state that "you can't measure heart/clutch," but they're wrong. We will only ask you how you define "clutch" - perhaps hits with runners in scoring position with 2 outs in the 7th inning and later? Due to Retrosheet, we can go back through hundreds of thousands of game logs in the past and collect situational hitting data to see whether or not Derek Jeter or David Ortiz has the "clutch" ability at the plate, or if certain pitchers are really "big game" pitchers and outperform their regular season lines in a meaningful way.
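Once "clutch" has a definition, checking it is mostly a filtering exercise. Here is a rough sketch in Python using the definition floated above (runners in scoring position, 2 outs, 7th inning or later). The PlateAppearance record and its field names are made up for illustration and are not Retrosheet's actual file format, and the sample rows are toy data, not any real player's numbers.

```python
from dataclasses import dataclass

@dataclass
class PlateAppearance:
    # Hypothetical record layout, NOT the real Retrosheet event format.
    inning: int
    outs: int
    runner_on_second: bool
    runner_on_third: bool
    is_at_bat: bool   # False for walks, sacrifices, etc.
    is_hit: bool

def is_clutch(pa):
    """One possible definition: RISP, 2 outs, 7th inning or later."""
    risp = pa.runner_on_second or pa.runner_on_third
    return risp and pa.outs == 2 and pa.inning >= 7

def batting_average(pas):
    """Hits divided by at-bats, or None if there are no at-bats."""
    at_bats = [pa for pa in pas if pa.is_at_bat]
    if not at_bats:
        return None
    return sum(pa.is_hit for pa in at_bats) / len(at_bats)

def clutch_split(pas):
    """Return (average in clutch spots, average everywhere else)."""
    clutch = [pa for pa in pas if is_clutch(pa)]
    other = [pa for pa in pas if not is_clutch(pa)]
    return batting_average(clutch), batting_average(other)

# Toy data, purely to show the mechanics.
sample = [
    PlateAppearance(8, 2, True, False, True, True),
    PlateAppearance(9, 2, False, True, True, False),
    PlateAppearance(3, 2, True, False, True, True),
    PlateAppearance(7, 1, True, False, True, False),
    PlateAppearance(8, 2, False, False, True, False),
    PlateAppearance(7, 2, True, True, False, False),
]
print(clutch_split(sample))
```

With real data you would run the same split over thousands of plate appearances and then ask whether the gap between the two numbers is bigger than what sample size alone would produce.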
What is the role of sample size in baseball?
Sample size is simply the number of trials being used in a given analysis. The larger the sample size, the more reliable the analysis will be. A player hitting .300/.440/.600 over 10 ABs means very little, since a line observed over so few ABs can land almost anywhere regardless of his true talent. However, if this hitter performs at that level over 600 ABs, that's very meaningful - even more so over 2000 ABs. Likewise, pointing to a hitter and saying that he is 0 for his last 30 ABs doesn't mean much. To see this, simply look at 10 pages of randomly generated numbers from 1 to 1000. Assume that the hitter is a .300 lifetime hitter and nothing would deter him from hitting at this rate. If the number on the page is 1-300, the hitter gets a hit. If it's 301-1000, he doesn't. If you go through the randomly generated numbers, you will see long streaks of numbers where he gets no "hits." This is the essence of random variation around the mean, or average: the long-run rate stays at .300 even though short stretches stray far from it.
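If you don't have 10 pages of random numbers handy, the same experiment is easy to run in Python. This is only a sketch of the thought experiment above: the function names are my own, the seed is arbitrary, and it assumes every AB is an independent draw at a fixed .300 rate, which real hitters only approximate.

```python
import random

HIT_CUTOFF = 300  # numbers 1-300 are hits for a .300 hitter, as in the text

def simulate_at_bats(n_at_bats, rng, hit_cutoff=HIT_CUTOFF):
    """Return a list of booleans (True = hit) from draws of 1-1000."""
    return [rng.randint(1, 1000) <= hit_cutoff for _ in range(n_at_bats)]

def longest_hitless_streak(results):
    """Length of the longest run of consecutive hitless at-bats."""
    longest = current = 0
    for is_hit in results:
        current = 0 if is_hit else current + 1
        longest = max(longest, current)
    return longest

def average_range(n_at_bats, trials, rng):
    """Lowest and highest batting average seen across repeated samples."""
    averages = [
        sum(simulate_at_bats(n_at_bats, rng)) / n_at_bats
        for _ in range(trials)
    ]
    return min(averages), max(averages)

rng = random.Random(2024)  # arbitrary seed so the run is repeatable

# Same true .300 hitter, very different spread depending on sample size.
for n in (10, 600, 2000):
    low, high = average_range(n, 200, rng)
    print(f"{n:>4} ABs: averages ranged from {low:.3f} to {high:.3f}")

# Longest hitless streak over roughly ten seasons' worth of at-bats.
career = simulate_at_bats(6000, rng)
print("Longest hitless streak:", longest_hitless_streak(career))
```

The thing to watch is how much the range of averages narrows as the number of ABs goes up, even though the underlying hitter never changes.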
How can we tell if a pitcher is likely to improve?
Statistical analysis of pitchers shows that these three things are most important in posting good ERA/WHIP numbers:
1) Strikeout rate. The more, the better.
2) Walk rate. The lower, the better.
3) Batter's SLG % against. The lower, the better.
That's it. No opponent batting average against, no ERA, no hits allowed, nothing of the sort. GB/FB% plays a large role, but it correlates highly with #3, so we use SLG % against to represent that.
Opponents of Sabermetrics will often say that Pitcher X sucks because he has a high ERA or doesn't win any games. The truth is that pitcher Wins are largely irrelevant, as they rely so heavily on run support, which the pitcher cannot contribute to in any meaningful fashion (even in the NL). ERA, while a decent statistic, is flawed for many reasons. Let's take a look at the following pitchers:
Pitcher 1: 16-10, 3.83 ERA, 221 IP, 211 H, 27 HR, 99 BB, 151 K
Pitcher 2: 12-10, 4.08 ERA, 169 IP, 146 H, 14 HR, 61 BB, 144 K
Standard baseball analysts will say that Pitcher 1 is much better because he won 4 more games and has a lower ERA. Sabermetric analysts will point to Pitcher 1's high walk rate and HR allowed rate and favor Pitcher 2 because of his higher strikeout rate and lower HR allowed rate.
Pitcher 1 is 2006 Barry Zito. Pitcher 2 is 2007 Dustin McGowan.
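To put the two lines above on equal footing, convert the raw totals into per-nine-inning rates. A quick Python sketch using only the numbers quoted above; the PitchingLine class and its method names are just my own scaffolding for the arithmetic.

```python
from dataclasses import dataclass

@dataclass
class PitchingLine:
    name: str
    ip: float  # innings pitched (both lines above are whole innings)
    hr: int
    bb: int
    k: int

    def per_nine(self, count):
        """Scale a counting stat to a nine-inning rate."""
        return 9 * count / self.ip

    def k9(self):
        return self.per_nine(self.k)

    def bb9(self):
        return self.per_nine(self.bb)

    def hr9(self):
        return self.per_nine(self.hr)

    def k_per_bb(self):
        return self.k / self.bb

# Totals copied from the two pitcher lines in the post.
zito = PitchingLine("2006 Barry Zito", ip=221, hr=27, bb=99, k=151)
mcgowan = PitchingLine("2007 Dustin McGowan", ip=169, hr=14, bb=61, k=144)

for p in (zito, mcgowan):
    print(
        f"{p.name}: K/9 {p.k9():.2f}, BB/9 {p.bb9():.2f}, "
        f"HR/9 {p.hr9():.2f}, K/BB {p.k_per_bb():.2f}"
    )
```

Zito comes out around 6.1 K/9, 4.0 BB/9, and 1.1 HR/9, while McGowan is around 7.7 K/9, 3.2 BB/9, and 0.7 HR/9, which is why the second line looks better despite the higher ERA.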
When projecting pitchers, you should focus on their peripherals - walks per nine innings, strikeouts per nine innings, and home runs allowed per nine innings. They are the best indicators of who is lucky/unlucky in the ERA department and will give you the best idea of who will be successful in the future. Why? These stats measure the Three True Outcomes (walk, strikeout, home run) - situations where the outcome does not depend on anyone but the pitcher and the hitter. When you introduce fielders into the equation, you add an element of randomness and skill that pitchers cannot meaningfully change. Which leads to our next point...
Wait - pitchers can't control balls in play?
Mostly correct, believe it or not! When a hitter puts a ball in play, the outcome is largely out of the control of the pitcher on the mound. This axiom runs counter to common sense, but statistical analysis supports it - Batting Average on Balls in Play (BABIP) is relatively constant regardless of the pitcher. Don't believe me?
Pitcher 1: .296
Pitcher 2: .285
Pitcher 3: .292
Pitcher 4: .292
Pitcher 5: .272
The only one that stands out from the rest is Pitcher 5. Must be an excellent pitcher to allow this type of BABIP rate, right?
Pitcher 1 - Jake Peavy
Pitcher 2 - Johan Santana
Pitcher 3 - Brandon Webb
Pitcher 4 - Jamie Moyer
Pitcher 5 - Barry Zito
Oops. For more information, Google Defense Independent Pitching Statistics (DIPS).
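For reference, BABIP is usually computed as (H - HR) / (AB - K - HR + SF): home runs and strikeouts are stripped out because the ball never reaches a fielder, and sacrifice flies are added back because it does. A small Python sketch of that formula; the function name is mine and the stat line in the example is invented purely to show the arithmetic, not any of the pitchers above.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting Average on Balls in Play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play

# Made-up stat line: 200 H, 20 HR, 800 AB against, 150 K, 5 SF.
print(f"{babip(200, 20, 800, 150, 5):.3f}")
```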
What books should I read to learn more?
Good question. The following books are excellent:
Furthermore, you will learn a lot if you read the following sites:
Baseball Prospectus
The Hardball Times
Fangraphs
If you are interested in how Sabermetrics meets up with traditional pitching/hitting mechanics analysis, you can read the following blogs:
Driveline Mechanics (my site)
Saber-Scouting
The bottom line is that Sabermetrics/Performance Analysis can analyze things that you might not expect. As a player and coach of the game, things I have learned from the analytical study of baseball directly contradict what I feel is the right course of action to take in a baseball game. That does not make the math wrong. Billy Beane, Sabermetrics's hero and GM of the Oakland Athletics, says that watching baseball is so difficult for him because his objective side is constantly at war with his subjective side. So it will be for you as you learn more about the analytical side of the game.
In general, Sabermetrics/Performance Analysis is a way of objectively analyzing the game and studying the reasons behind how the game works. Never take a cliche at face value - "Offense wins games, Defense wins championships," "Pitching is 80% of the game," or "Baseball is about three things: Pitching, fundamentals, and the three run home run." Investigate these claims and see for yourself.
If you are unwilling to learn, then kindly stay out of threads where you are ill-equipped to put forth informative arguments about the subject matter. It will only anger the posters in the forum and make you look idiotic. Take the time to research this area of baseball if you really love the game.