January 23, 2004

Why All The Stats?

Although the comments were lost when I switched from blogspot to chattablogs, Josh Melton, in one of our lengthy baseball debates, called into question the value of stats. I believe his actual quote was, “stats don’t mean anything.” Taken literally, the quote is wrong, but the philosophy behind it is right. If I may be so bold as to interpret his meaning, stats don’t tell us everything. Sometimes stats appear to tell us something when they don’t, or they tell us something other than what they appear to. In the fantasy baseball world, stats are everything, but in the real world of baseball they simply cannot tell us the whole story. In fantasy baseball, Yankees shortstop Derek Jeter is the fifth or sixth best player at his position, but in real baseball, Jeter’s value is immeasurable. The charisma with which he plays, the leadership he provides in the clubhouse, and his uncanny ability to make miracle plays right at a critical moment can’t be measured by the numbers – at least not very well. In these areas, Jeter is arguably better than Alex Rodriguez, who is generally considered to be the best shortstop across the board.


So we have a problem. Based on the raw stats, we can hardly reach a conclusion about the quality of two players from the same position, in the same league, in the same era. To complicate the problem further, if we have such problems in the A-Rod/Jeter case, how can we begin to compare players from different eras? How does Barry Bonds compare to Mike Schmidt? What about Willie Mays or even Babe Ruth? The truth is, stats don’t tell us everything. They only tell us some things, which then allows us to focus and argue about the things the stats don’t tell us.


But there’s another problem. What do the stats tell us? I argue they tell us far more than you might think, but only if we know how to use them. In truth, comparing players from the same season, or even the same era, is rather simple. We really can’t make claims about A-Rod’s leadership or “miracle” ability, since the folks he works for haven’t put him in a position to display those things. Nevertheless, the numbers explicitly tell us that A-Rod is better, and in this case they’re right. It’s not really disputable; everybody knows it, including Brian Cashman and George Steinbrenner. We can, of course, make certain adjustments for both: what ballpark did they play most of their games in? Do these ballparks make a difference? Which players bat in front of or behind them? Does that make a difference? Again, the greater difficulty comes when we want to compare Bonds and Ruth, but there is a way we can use the numbers to give us a better understanding than simply comparing their number of home runs or hits. (This is not the point of this article, but I want to make my position clear on the matter of Bonds and Ruth. The answer to the question of which one is better is debatable. Bonds belongs in the same category as Ruth, and maybe a few others, in the argument over who is the greatest of all time. However, no player, Bonds included, will ever mean more or be a greater figure of legend and baseball heroism than Babe Ruth.)

When we want to compare eras, we must remember the context of the era in which a certain player played. Though the years could be broken down further, I feel there are nine significant eras in the history of professional baseball; these eras are somewhat arbitrarily fixed by me, and the tendencies of an earlier or later era may be present at the beginning or end of the era in question. Anyway, here’s how I separate the eras:


1876-1902: early pro-ball, almost incomparable to future eras
1903-1919: the dead-ball era
1920-1941: a period of tremendous offensive achievement (key year: 1930)
1942-1946: World War II era, with many great players in the service
1947-1959: a largely balanced era, most similar to the 70’s
1960-1970: the most pitching dominated era, most comparable to the dead-ball period (key year: 1968)
1971-1977: largely balanced
1978-1990: good pitching era, and the ultimate stolen base era (aberration year: 1987)
1991-Present: offensive explosion very similar to the 20’s and 30’s


There are reasons for dividing the eras as I did, but that’s really a subject for another project.


In each of these eras there are significant conditions that affect the way a player’s statistics look, but there is a relatively simple way to measure the contribution of a pitcher or hitter in the context of the game’s history. The way to do it is to compare players to the league they played in, rather than comparing them directly to another player from another era. That means we have to approach the numbers in a different way. The question then becomes, “How do Mark McGwire’s numbers compare to the league averages of the era in which he played?” Once we’ve answered that question for McGwire and another player, say Harmon Killebrew, we can then begin to assess how the two players compare. Of course, the other conditions still apply: quality of team, type of ballpark, defensive contribution, and so on.


Let’s look at a particular extreme case and see how it works out. In 1924, Rogers Hornsby won the batting title with a .424 average – the highest mark in the modern era. In 1968, Carl Yastrzemski won the batting title with a .301 average – the lowest average ever to win a batting title. Who’s better? If your answer is Hornsby, you’re right, but it’s not as simple as all that. Hornsby was not .123 better in the same way that Albert Pujols, who hit .359 in 2003, would be .123 better than a player who hit .236 in the same year. Both numbers, .424 and .301, have a context.


The truth is, Hornsby’s 1924 batting average was 50% better than the National League average (.283), while Yaz’s .301 was 31% better than the American League average in 1968 (.230). Using a direct comparison approach, Hornsby’s .424 is 41% better than .301, but by putting the numbers in context, the gap shrinks to 19 percentage points (50% vs. 31%). So if we could place each player in the other league in the other era and the percentages remained the same, we would expect Hornsby to hit .345 in 1968 and Yaz to hit .371 in 1924. That makes a difference. To put the differences between the eras into perspective, the lowest team batting average in the 1968 AL was .214 (Yankees) compared to 1924’s NL low of .256 (Boston Braves). On the high side, the New York Giants hit .300 as a team in 1924, while the A’s topped the AL in 1968 with a .240 mark. The high in 1968 is still .016 lower than the low of 1924.
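For anyone who wants to check the arithmetic, here is a minimal Python sketch of the league-context comparison. The function names are made up for illustration (they are not from Bill James or any stats package), and the league averages are simply the ones quoted in this post.

```python
def relative_to_league(player_avg, league_avg):
    """Ratio of a player's average to his league's average (1.50 means 50% better)."""
    return player_avg / league_avg


def translate(player_avg, own_league_avg, target_league_avg):
    """Project an average into another league, assuming the ratio stays the same."""
    return relative_to_league(player_avg, own_league_avg) * target_league_avg


# League batting averages as quoted in the post (assumed accurate).
NL_1924 = 0.283
AL_1968 = 0.230

hornsby = relative_to_league(0.424, NL_1924)
yaz = relative_to_league(0.301, AL_1968)

print(f"Hornsby vs. 1924 NL: {hornsby - 1:.0%} better")
print(f"Yaz vs. 1968 AL:     {yaz - 1:.0%} better")
print(f"Hornsby in the 1968 AL: {translate(0.424, NL_1924, AL_1968):.3f}")
print(f"Yaz in the 1924 NL:     {translate(0.301, AL_1968, NL_1924):.3f}")
```

Because the sketch carries the unrounded ratio through, Yaz comes out a point lower (.370) than the figure in the text, which rounds his edge to 31% before multiplying. The conclusion doesn't change.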


An old-timer would look at this and say the solution is simple: players were better in 1924. That’s just not true, but that’s a subject for another long research project.


Back to the comparison. Nineteen points is still a big gap, which forces us to conclude that Hornsby was better in 1924 than Yaz was in 1968, and that’s a fair conclusion. But consider this: we just compared the highest batting title average to the lowest. What about another example? Here are four more modern batting title winners, with their averages, their league averages, their percentage improvements, and their expected 1924 averages:


Pete Rose, NL 1968, .335 AVG (.243 league AVG), 38% better - would have hit: .391
George Brett, AL 1980, .390 AVG (.269 league AVG), 45% better - would have hit: .410
Tony Gwynn, NL 1994, .394* AVG (.267 league AVG), 48% better - would have hit: .419
Albert Pujols, NL 2003, .359 AVG (.262 league AVG), 37% better - would have hit: .388
* strike year
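The same calculation can be run over the whole list at once. This is only a sketch: the `champions` table and the `as_1924` helper are names invented here, and the numbers are the ones in the list above.

```python
NL_1924_AVG = 0.283  # 1924 NL league average, as quoted earlier in the post

# (player, league and year, batting average, league average), from the list above
champions = [
    ("Pete Rose", "NL 1968", 0.335, 0.243),
    ("George Brett", "AL 1980", 0.390, 0.269),
    ("Tony Gwynn", "NL 1994", 0.394, 0.267),
    ("Albert Pujols", "NL 2003", 0.359, 0.262),
]


def as_1924(avg, league_avg, target=NL_1924_AVG):
    """Scale an average into the 1924 NL, holding the player-to-league ratio fixed."""
    return avg / league_avg * target


for name, season, avg, league_avg in champions:
    pct_better = avg / league_avg - 1
    print(f"{name:<14} {season}  {pct_better:.0%} better  "
          f"1924 equivalent: {as_1924(avg, league_avg):.3f}")
```

The unrounded math lands within a point of every figure in the list. The small differences for Rose and Gwynn come from rounding the percentage before multiplying.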


For the record, I chose those years at random. It is possible that another batting title winner in another year was more than 50% better than his league’s average. Anyway, the point is, every stat has a context, and there is a way we can find that context and put it to use. Obviously, we can’t really determine how two players compare based on one statistic, so the process would have to be run on several stats to determine which player was actually better. Then, to reach a truly accurate conclusion, you’d need to account for a player’s defensive contribution (both the position they played and whether they played it well), the ballpark they played in, and potentially a time-line adjustment (because I believe that the overall quality of play has improved over time, making it more difficult to dominate today the way players in the past did, especially before WWII; but again, that’s for another time).


Pitching stats create another problem. Most specifically, the way pitchers are used has continually changed, from the first pitch in history to the last pitch of the 2003 World Series, and it is likely to keep doing so. We’ve gone from an era when pitchers went on two days’ rest and completed almost every game - win or lose - to a time when pitchers have four days of rest and rarely complete games. In this case, more so than in the case of fielding or hitting statistics, the raw numbers tell us a whole lot if we use the right ones in the right ways. However, the eras of the game, ballparks, and team defense all come into play as well. The league context method takes care of many of these problems, though.


As an aside on pitchers and the way they’re used, I do not believe that the pitchers who threw more in the past are, by that fact alone, better than the pitchers of today. In a three-way comparison of Walter Johnson, Sandy Koufax, and Roger Clemens, we find extreme differences in starts per season and innings pitched, which produce variances in total wins and losses, walks, strikeouts, hits allowed, total earned runs, complete games and shutouts. However, the rate stats (winning percentage, strikeout-to-walk ratio, WHIP, and ERA) still give us a good starting point to compare pitchers, but always within the context of the league. I’ll do one short comparison: 1968 Bob Gibson vs. 1994 Greg Maddux.


Gibson had a mind-blowing 1.12 ERA in 1968, but Maddux’s 1.56 ERA in 1994 was even better relative to his league. Gibson’s ERA was 2.66 times better than the league ERA (2.98). Maddux’s ERA was 2.71 times better than the league (4.22). If we let them switch, Gibson would have a 1.59 ERA in 1994, while Maddux in 1968 would be at 1.10. In the end we’re only talking about two or three hundredths of a run, which is very close – much closer than the initial comparison of 1.12 vs. 1.56 indicates. You might notice that I used Maddux’s 1994 numbers, which were shortened by the strike. But that’s not a problem for Greg and his dominance of the mid-90’s. He was 2.56 times better than the league in 1995 (1.63 vs. 4.18), which would give him a 1.16 ERA in 1968.
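ERA works the same way with one twist: lower is better, so the ratio is league ERA divided by the pitcher’s ERA rather than the other way around. Here is a rough Python sketch under that assumption; the function names are hypothetical and the league ERAs are the ones quoted above.

```python
def era_ratio(player_era, league_era):
    """How many times better than the league: league ERA / player ERA."""
    return league_era / player_era


def translate_era(player_era, own_league_era, target_league_era):
    """Project an ERA into another league, holding the ratio fixed."""
    return target_league_era / era_ratio(player_era, own_league_era)


# League ERAs as quoted in the post (assumed accurate).
NL_1968, NL_1994, NL_1995 = 2.98, 4.22, 4.18

print(f"Gibson 1968: {era_ratio(1.12, NL_1968):.2f}x better than league")
print(f"Maddux 1994: {era_ratio(1.56, NL_1994):.2f}x better than league")
print(f"Gibson in 1994: {translate_era(1.12, NL_1968, NL_1994):.2f}")
print(f"Maddux '94 in 1968: {translate_era(1.56, NL_1994, NL_1968):.2f}")
print(f"Maddux '95 in 1968: {translate_era(1.63, NL_1995, NL_1968):.2f}")
```

Dividing the target league’s ERA by the ratio keeps the pitcher the same number of “times better” than his new league, which is the same idea as the batting average translation turned upside down.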


Finally, you should know that the philosophical concepts in this article come from Bill James. The words are mine, the particular comparisons are mine, but this line of thinking belongs to him. For similar types of study, I highly recommend his New Historical Baseball Abstract.


In the future, I hope to do other, longer posts on the problems of defensive stats, the overall quality of play in today’s game vs. the past, and the best ways to use the raw stats, which will include the reasons why OPS (on-base percentage + slugging percentage) is a good measuring stick, but not a great one.

Posted by at January 23, 2004 07:48 PM
Comments

Yeah, stats are important, but when it comes down to it, my boy can mash... your boy sucks.

Posted by: Duane at January 24, 2004 10:28 AM

"Stats don't mean anything". I think you're right about putting them in context of league and year. However, I don't understand how one could make that statement unless they were just frustrated about losing an argument. I mean 85 RBI per year is almost always better than 60. Like you said, there is the rare individual like Jeter that you can't talk about without mentioning his "clutch" aspect. But overall there are not many better indicators of a player's past and probable current skill. There are exceptions, namely, Jeff Cirillo.

Posted by: mike aman at January 26, 2004 08:33 PM

85 RBI per year is always better than 60 if you make that comparison in the same year or the same era. But the game changes over the years. Here's an example from our era: In 1989, Kevin Mitchell hit 47 home runs and our jaws hit the floor. In the past 5 years, that figure has been met or exceeded 21 times. That doesn't mean the players today are better power hitters than they were in 1989, it just means the game is different; the balance between pitching and hitting shifted. What Mitchell did in 1989 was a tremendous accomplishment - much more so than when Troy Glaus hit 47 in 2000.

Posted by: the booth at January 26, 2004 11:14 PM

Yeah. Like my TWO dingers last year would have been like FIFTY in 1958.

Posted by: Tony Womack at January 27, 2004 08:41 AM

How much would the 12 rebounds I made yesterday count for in 1984?

Posted by: Aaron Boone at January 27, 2004 10:53 AM

Hey, my brother might have torn his ACL, but at least he didn't crack his dome or bust his grill.

Posted by: Bret Boone at January 27, 2004 02:46 PM

Hey Aaron,

I actually deleted some lines in my first response because they seemed to be too simple. As it turns out, deleting them made the whole response too simple. I was talking about evaluations based upon the statistical history of a player being the most reliable way to size him up, so to speak. Deleting this made the rest of my post seem like I was saying something like, "four is more than two!".

Posted by: mike aman at January 28, 2004 11:48 AM

How ugly would we be if we had played back in the twenties?

Posted by: Otis Nixon and Willie McGee at January 28, 2004 08:31 PM