P.Mean: A standard deviation that is too big for its own britches (created 2008-10-22).

I am a medical editor (manuscript editor) at a peer-reviewed journal and have noticed that some authors supply standard deviations (SD) with means even when their SDs are more than half the value of their means. (Hypothetical example: patients recovered function at a mean (+/- SD) of 220 days +/- 190 days after surgery.) It is my understanding that an SD is meaningless when it is this large (relative to the mean).

That's a tricky question. To answer it properly, you need to think of the variety of ways that readers might use a standard deviation.

The first way that readers use a standard deviation is to get an approximate feel for the range of the data. There is an empirical rule that says that approximately 95% of the data lies within two standard deviations of the mean. That rule only works, though, for data that is approximately symmetric and has extreme values reasonably represented by the tail areas of the classic bell-shaped curve. More precisely, you could state that the empirical rule works well with data that is approximately normally distributed.

The empirical rule does not usually work for skewed data, data that is artificially truncated, or data that has many outliers. You could also say that the empirical rule does not work well for data that deviates substantially from a normal distribution.

Let's look at your example. The recovery time has a mean of 220 days and a standard deviation of 190 days. Plus or minus two standard deviations would be -160 days to +600 days. If I were a surgeon, I'd love to operate on those patients who recover 160 days prior to surgery. Maybe they anticipated the benefits that the surgery provided and talked themselves into an early cure.
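The arithmetic behind that range is easy to check. Here is a minimal sketch in Python; the variable names are mine and are only for illustration.

```python
# Mean and standard deviation from the hypothetical example in the question.
mean_days = 220
sd_days = 190

# Empirical rule: roughly 95% of the data within two standard deviations.
lower = mean_days - 2 * sd_days
upper = mean_days + 2 * sd_days

print(f"Empirical rule range: {lower} to {upper} days")
```

The lower limit comes out negative, which is the whole problem: nobody recovers before the operation.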

The reason you are uncomfortable with a standard deviation that is more than half the mean is that it produces a negative lower limit using the empirical rule. Negative values are possible for things like my checkbook balance, but for most physiological measures, they are impossible.

So this leads to a corollary to the empirical rule. If a non-negative set of data has a standard deviation that is more than half of the mean, it is an indication that the data deviates substantially from a bell-shaped curve. Almost always this is an indication of a skewed distribution.
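If you wanted to screen manuscripts for this, the corollary reduces to a one-line check. The function below is a rough sketch under the assumption that the measurement cannot be negative; the name `sd_suggests_non_normal` is hypothetical, not something from a statistics package.

```python
def sd_suggests_non_normal(mean, sd):
    """Return True when mean - 2*sd is below zero, that is, when sd > mean / 2.

    Only meaningful for measurements that cannot take negative values.
    """
    if mean < 0 or sd < 0:
        raise ValueError("mean and sd must both be non-negative")
    return mean - 2 * sd < 0

# The example from the question: 220 days +/- 190 days.
print(sd_suggests_non_normal(220, 190))
```

A True result is only a warning sign. It says the empirical rule gives an impossible lower limit, not what the distribution actually looks like.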

If you can't apply the empirical rule, then (according to some) you should not report a standard deviation. What to report in its place is open to debate, but typically the recommendation is to use the range (minimum and maximum value) or the interquartile range (25th percentile and 75th percentile).
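To make those alternatives concrete, here is a small sketch that computes the range and the interquartile range with Python's standard library. The recovery times are made-up numbers chosen only to be right-skewed with a mean of 220 days; they are not from any real study. Different software uses slightly different percentile definitions, so the quartiles you get elsewhere may differ a little.

```python
import statistics

# Invented, right-skewed recovery times in days (illustration only).
days = [30, 45, 60, 75, 90, 120, 150, 200, 260, 340, 480, 790]

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
q1, median, q3 = statistics.quantiles(days, n=4)

print(f"Mean: {statistics.mean(days)}")
print(f"Median: {median}")
print(f"Range: {min(days)} to {max(days)}")
print(f"Interquartile range: {q1} to {q3}")
```

Notice that the median sits well below the mean, which is what you expect when a few long recoveries pull the mean upward.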

For the record, it is very possible to have a standard deviation that is much smaller than the mean, and yet still have a highly non-normal distribution. So application of the empirical rule to ANY data set is potentially problematic, unless the researchers present a histogram of the data or other measure that allows you to assess how close the data is to a normal distribution. Still, it works pretty well in practice.

Now, I may be in a minority, but I would still like to see a standard deviation, even when it is more than half the mean. The reason is that there are other uses for a standard deviation besides application of the empirical rule.

First, the standard deviation allows you to compute a quick confidence interval. If the sample size in the above example was 100, then you would compute a standard error by dividing the standard deviation by the square root of 100 to get 19. Plus or minus two standard errors would provide an interval of 182 to 258, a reasonably good approximation to the 95% confidence interval for the true mean. With a sample this large, the confidence interval will work reasonably well even for skewed, truncated, or outlier-heavy distributions, because of the Central Limit Theorem.
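Here is that quick calculation as a short sketch, using the mean, standard deviation, and the assumed sample size of 100 from the example. The multiplier of 2 is the usual rough stand-in for 1.96.

```python
import math

mean_days = 220
sd_days = 190
n = 100  # sample size assumed in the example above

# Standard error of the mean.
se = sd_days / math.sqrt(n)

# Approximate 95% confidence interval: mean plus or minus two standard errors.
ci_lower = mean_days - 2 * se
ci_upper = mean_days + 2 * se

print(f"Standard error: {se:.0f}")
print(f"Approximate 95% CI: {ci_lower:.0f} to {ci_upper:.0f} days")
```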

Second, I may want to replicate this study with a few changes, and the standard deviation allows me to calculate power for my new study. The standard deviation is absolutely vital for a parametric test (such as a t-test or ANOVA), but it can also be used to get an approximate power for a non-parametric test.
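As a rough illustration of how the standard deviation feeds into planning, the sketch below uses the common normal-approximation formula for comparing two group means. Everything except the standard deviation of 190 days is my assumption: a two-group design, a hypothetical difference of 60 days worth detecting, a two-sided alpha of 0.05, and 80% power. A proper t-based calculation would give a slightly larger number.

```python
import math
from statistics import NormalDist

sd_days = 190          # standard deviation reported in the original study
difference_days = 60   # hypothetical difference worth detecting (assumption)
alpha = 0.05           # two-sided significance level (assumption)
power = 0.80           # desired power (assumption)

# Standard normal quantiles for the significance level and the power.
z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
z_power = NormalDist().inv_cdf(power)

# Normal-approximation sample size per group for a two-sample comparison.
n_per_group = math.ceil(
    2 * ((z_alpha + z_power) * sd_days / difference_days) ** 2
)

print(f"Approximate sample size per group: {n_per_group}")
```

Without the reported standard deviation, there would be nothing to plug in for `sd_days`, and that is the point.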

Third, I might want to use the standard deviation as a rough indicator of skewness. A set of non-negative numbers with a large standard deviation relative to the mean is almost always skewed, and as the ratio of the two quantities becomes more extreme, the skewness tends to become more extreme as well. It's not a perfect measure, since a standard deviation that is one-tenth the mean could still be associated with a highly skewed distribution, but it works pretty well in practice.
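A small simulation can show the pattern. The sketch below draws from gamma distributions, which is purely my choice of a convenient non-negative family; for a gamma distribution the skewness is exactly twice the ratio of the standard deviation to the mean, so the two should rise together. Real data need not follow this family, and the helper name `cv_and_skewness` is hypothetical.

```python
import random
import statistics

def cv_and_skewness(values):
    """Return (sd / mean, sample skewness) for a list of numbers."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    third_moment = statistics.fmean((v - mean) ** 3 for v in values)
    return sd / mean, third_moment / sd ** 3

rng = random.Random(2008)  # fixed seed so the run is repeatable
results = []
for shape in (16, 4, 1):
    # Gamma with the given shape and scale 1; a smaller shape is more skewed.
    sample = [rng.gammavariate(shape, 1.0) for _ in range(20000)]
    cv, skew = cv_and_skewness(sample)
    results.append((shape, cv, skew))
    print(f"shape={shape:2d}  sd/mean={cv:.2f}  skewness={skew:.2f}")
```

As the standard deviation grows relative to the mean, the estimated skewness grows with it.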

I'm fine with a rule that says something like "report an alternative measure of the spread of your data, such as the interquartile range, when your data is highly skewed," but I would also be fine with a policy that let people report a standard deviation for any continuous data, skewed or symmetric.