In my last column, I mentioned that z-scores offer a way of turning raw totals into relative totals. This means the z-score for 40 homeruns will be higher than the z-score for 40 stolen bases only if fewer players are expected to hit 40+ homeruns than are expected to steal 40+ bases.
To determine a hitter’s total value, we calculate his z-score for R, HR, RBI, SB, and BA and add them together to determine his total z-score. Similarly, we calculate a pitcher’s z-score for W, K, ERA, and WHIP and add them together for his total z-score.
A corollary of calculating total value this way is that a z-score of 1 in homeruns is worth the same as a z-score of 1 in stolen bases. In other words, it is just as valuable to be one standard deviation above the mean in homeruns as it is in stolen bases.
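The total score for a hitter is

$$z_{\text{hitter}} = \frac{R - \mu_R}{\sigma_R} + \frac{HR - \mu_{HR}}{\sigma_{HR}} + \frac{RBI - \mu_{RBI}}{\sigma_{RBI}} + \frac{SB - \mu_{SB}}{\sigma_{SB}} + \frac{xH - \mu_{xH}}{\sigma_{xH}}$$

where $R$, $HR$, $RBI$, $SB$, and $xH$ are the number of runs, homeruns, RBI, stolen bases, and “extra hits” a hitter totals (“extra hits” will be explained in more detail below). µ is the mean and σ is the standard deviation of the entire hitter pool, with the subscript indicating the relevant category.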
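The important point to notice is that $\mu_R$, $\mu_{HR}$, $\mu_{RBI}$, $\mu_{SB}$, and $\mu_{xH}$ are constant for all hitters. Subtracting them shifts every hitter's total by the same amount, so the rankings do not change. Therefore, because all hitters are affected the same way, we can exclude the µ terms to make the calculation simpler.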
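The simpler version is given by the formula below:

$$z_{\text{hitter}} = \frac{R}{\sigma_R} + \frac{HR}{\sigma_{HR}} + \frac{RBI}{\sigma_{RBI}} + \frac{SB}{\sigma_{SB}} + \frac{xH}{\sigma_{xH}}$$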
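To see why dropping the µ terms is harmless, here is a minimal Python sketch. The four hitters and their projections are made up for illustration, only four counting categories are used, and `statistics.stdev` stands in for the sample standard deviation. It scores each hitter both ways and checks that the rankings agree.

```python
from statistics import mean, stdev

# Hypothetical projections for a tiny pool: name -> (R, HR, RBI, SB)
pool = {
    "Hitter A": (100, 35, 110, 5),
    "Hitter B": (90, 20, 80, 30),
    "Hitter C": (75, 25, 90, 10),
    "Hitter D": (85, 15, 70, 40),
}

# Transpose so that each entry holds one category's values across the pool.
columns = list(zip(*pool.values()))
mus = [mean(col) for col in columns]
sigmas = [stdev(col) for col in columns]  # sample standard deviation


def full_score(stats):
    """Sum of (x - mu) / sigma over the categories."""
    return sum((x - m) / s for x, m, s in zip(stats, mus, sigmas))


def simple_score(stats):
    """Sum of x / sigma over the categories, with the mu terms dropped."""
    return sum(x / s for x, s in zip(stats, sigmas))


full = {name: full_score(stats) for name, stats in pool.items()}
simple = {name: simple_score(stats) for name, stats in pool.items()}

# Each hitter's two scores differ by the same constant: sum of mu / sigma.
offset = sum(m / s for m, s in zip(mus, sigmas))

rank_full = sorted(pool, key=full.get, reverse=True)
rank_simple = sorted(pool, key=simple.get, reverse=True)
print(rank_full == rank_simple)
```

The two scores for any hitter differ only by `offset`, which is the same for everyone, so sorting by either score gives the same order.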
In order to calculate z-scores, we need to calculate the sample standard deviation (Microsoft Excel command “stdev”). When calculating the standard deviation, we use the pool of all players that will be drafted, which requires us to determine the player pool.
Determining the player pool can be a tricky exercise and is a bit of a catch-22. Determining the player pool requires knowing who the best players are, but determining the best players requires knowing the player pool. There’s no easy way around this that I know of.
One possible solution is to use a third-party source that has already ranked the players (ESPN.com, for example), pick its top x players, where x is the number of players that will be drafted, and use those players as the pool.
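As a rough sketch of that approach in Python: the ranking list, the projections, and the helper names `build_pool` and `category_stdevs` below are all hypothetical, and the only assumption is that you have a ranked list of names plus a projection for each one.

```python
from statistics import stdev


def build_pool(ranked_names, projections, x):
    """Keep the top-x ranked players that have a projection."""
    top = ranked_names[:x]
    return {name: projections[name] for name in top if name in projections}


def category_stdevs(pool, categories):
    """Sample standard deviation of each category across the pool."""
    return {
        cat: stdev([player[cat] for player in pool.values()])
        for cat in categories
    }


# Hypothetical third-party ranking (best first) and projections.
ranking = ["Hitter A", "Hitter B", "Hitter C", "Hitter D", "Hitter E"]
projections = {
    "Hitter A": {"HR": 40, "SB": 5},
    "Hitter B": {"HR": 30, "SB": 25},
    "Hitter C": {"HR": 20, "SB": 15},
    "Hitter D": {"HR": 10, "SB": 2},
    "Hitter E": {"HR": 8, "SB": 1},
}

pool = build_pool(ranking, projections, x=3)
sigmas = category_stdevs(pool, ["HR", "SB"])
print(sorted(pool), sigmas)
```

Players ranked below the cutoff never enter the standard deviation, which is the point of fixing the pool first.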
Determining the z-score for the counting stats is straightforward: just follow the formula. Determining the value of BA is slightly more difficult. Since a .290 batting average over 600 at-bats is worth much more than a .290 batting average over 200 at-bats, we need to convert batting average to a counting stat. The way to do this is to find the number of hits above average a player contributes.
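The formula for this is:

$$xH = H - AB \times BA_{\text{pool}}$$

where $H$ and $AB$ are the player's hits and at-bats, and $BA_{\text{pool}}$ is the (weighted) batting average of the player pool. This can be calculated as:

$$BA_{\text{pool}} = \frac{\sum_{\text{pool}} H}{\sum_{\text{pool}} AB}$$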
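Here is a small Python sketch of the hits-above-average idea, assuming that “extra hits” means a player's hits minus the hits a pool-average hitter would collect in the same number of at-bats. The four hitters and the function names are invented for the example.

```python
# Hypothetical pool: name -> (hits, at_bats)
pool = {
    "Hitter A": (174, 600),  # .290 over 600 at-bats
    "Hitter B": (58, 200),   # .290 over 200 at-bats
    "Hitter C": (135, 500),  # .270
    "Hitter D": (133, 500),  # .266
}


def pool_batting_average(players):
    """Weighted pool average: total hits divided by total at-bats."""
    total_hits = sum(hits for hits, _ in players.values())
    total_at_bats = sum(at_bats for _, at_bats in players.values())
    return total_hits / total_at_bats


def extra_hits(hits, at_bats, pool_ba):
    """Hits above what a pool-average hitter gets in the same at-bats."""
    return hits - at_bats * pool_ba


pool_ba = pool_batting_average(pool)
xh = {name: extra_hits(h, ab, pool_ba) for name, (h, ab) in pool.items()}

for name, value in xh.items():
    print(f"{name}: {value:+.2f}")
```

Hitters A and B have the same batting average, but A ends up with three times as many extra hits because he does it over three times as many at-bats. The extra hits across the whole pool sum to zero by construction.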
I would not bother calculating z-scores for relief pitchers. The number of saves a closer gets is so unpredictable that your time is better spent doing more reliable analysis elsewhere.
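For starting pitchers, we use the following formula:

$$z_{\text{pitcher}} = \frac{W}{\sigma_W} + \frac{K}{\sigma_K} + \frac{xERA}{\sigma_{xERA}} + \frac{xWHIP}{\sigma_{xWHIP}}$$

$W/\sigma_W$ and $K/\sigma_K$ are calculated in the exact same way as the counting stats for hitters. $xERA$ and $xWHIP$ are calculated using the same concept as $xH$.

The formulas are:

$$xERA = -\left(ER - IP \times \frac{ERA_{\text{pool}}}{9}\right)$$

and

$$xWHIP = -\left((BB + H) - IP \times WHIP_{\text{pool}}\right)$$

where

$$ERA_{\text{pool}} = \frac{9 \sum_{\text{pool}} ER}{\sum_{\text{pool}} IP}$$

and

$$WHIP_{\text{pool}} = \frac{\sum_{\text{pool}} (BB + H)}{\sum_{\text{pool}} IP}$$

Note: the negative sign in $xERA$ and $xWHIP$ is because lower values are better.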
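Putting the pitcher side together, here is a hedged Python sketch. It assumes the ERA and WHIP terms are built by analogy with extra hits: earned runs (or walks plus hits) compared with what a pool-average pitcher would allow in the same innings, with the sign flipped. The three starters, their projections, and the function names are hypothetical.

```python
from statistics import stdev

# Hypothetical projections. IP is in decimal innings, not "200.1" notation.
pitchers = {
    "Starter A": {"W": 18, "K": 220, "IP": 210.0, "ER": 65, "BB": 45, "H": 170},
    "Starter B": {"W": 14, "K": 180, "IP": 190.0, "ER": 80, "BB": 60, "H": 185},
    "Starter C": {"W": 10, "K": 150, "IP": 180.0, "ER": 90, "BB": 65, "H": 190},
}

total_ip = sum(p["IP"] for p in pitchers.values())
pool_era = 9 * sum(p["ER"] for p in pitchers.values()) / total_ip
pool_whip = sum(p["BB"] + p["H"] for p in pitchers.values()) / total_ip


def x_era(p):
    """Earned runs saved versus a pool-average pitcher (positive is good)."""
    return -(p["ER"] - p["IP"] * pool_era / 9)


def x_whip(p):
    """Baserunners saved versus a pool-average pitcher (positive is good)."""
    return -((p["BB"] + p["H"]) - p["IP"] * pool_whip)


# One row of counting-style stats per pitcher.
rows = {
    name: {"W": p["W"], "K": p["K"], "xERA": x_era(p), "xWHIP": x_whip(p)}
    for name, p in pitchers.items()
}

categories = ("W", "K", "xERA", "xWHIP")
sigmas = {cat: stdev([r[cat] for r in rows.values()]) for cat in categories}

# Simplified total: each category divided by its standard deviation.
totals = {
    name: sum(r[cat] / sigmas[cat] for cat in categories)
    for name, r in rows.items()
}

for name in sorted(totals, key=totals.get, reverse=True):
    print(f"{name}: {totals[name]:.2f}")
```

Because of the sign flip, a pitcher with a better-than-pool ERA or WHIP gets a positive contribution, so all four categories can be added in the same direction.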