Blog post: An example of a poor color choice

Steve Simon

2020/03/11

I ran across a graph in a journal article. The article itself was good, but the graph had a rookie mistake. I shouldn’t point this out, because I myself have been guilty of far worse mistakes. But this graph illustrated the point far better than anything I could have said.

The display is a series of stacked bar charts. It shows that certain types of information were most likely to be entered by a physician and least likely to be entered by a case manager.

The full article can be found here.

Notice that the colors used to distinguish physician from social worker from nurse from case manager form a gradient of oranges, from a dark, almost brown orange to a very light orange.

Gradients are very useful for showing changes in a continuous variable, and possibly in an ordinal variable as well. But the list of providers is nominal. There is no natural ordering from physician to social worker to nurse to case manager. For nominal data, you want to choose a set of colors that are readily distinguishable from one another. This usually means evenly spaced points along a color wheel.

Here’s an example of what I mean.

This color wheel includes the pure colors (pure yellow, pure green, pure cyan, pure red, pure magenta, pure blue) and these colors are sometimes a bit harsh on your eyes. A darker set of colors is a bit easier to view.
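If you want to see how evenly spaced points on a color wheel turn into actual color codes, here is a minimal sketch in Python (my choice for illustration, using only the standard library). The function name `wheel_colors` and the brightness value of 0.8 are assumptions made for this example and are not taken from any particular graphics package.

```python
import colorsys

def wheel_colors(n, saturation=1.0, value=1.0):
    """Return n hex colors at evenly spaced hues around the HSV color wheel.

    Hypothetical helper for illustration only.
    """
    colors = []
    for i in range(n):
        hue = i / n  # fraction of a full turn around the wheel
        r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
        colors.append("#{:02X}{:02X}{:02X}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors

# Six evenly spaced hues at full brightness: the "pure" colors.
pure = wheel_colors(6)

# Same hues with the brightness turned down (0.8 is an arbitrary choice).
darker = wheel_colors(6, value=0.8)

print(pure)
print(darker)
```

With six points at full saturation and brightness, this produces pure red, yellow, green, cyan, blue, and magenta. Lowering `value` keeps the same hues but darkens each color.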

There are lots of choices in various graphics packages. Here is one nice set of categorical color choices in R.

## [1] "#E41A1C" "#377EB8" "#4DAF4A" "#984EA3" "#FF7F00"

These aren’t taken directly from a color wheel but they are well spaced out from a visual perspective, meaning that it is easy to distinguish among these different colors.
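One rough way to check this is to convert each of the five hex codes listed above to a hue angle. The sketch below does that in Python with the standard library. The helper name `hue_degrees` is made up for this example. Hue is only one part of how different two colors look, so treat this as a quick check and not as a measure of visual distance.

```python
import colorsys

# The five hex codes listed above, copied verbatim.
palette = ["#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00"]

def hue_degrees(hex_color):
    """Convert a '#RRGGBB' string to its hue angle in degrees (0 to 360).

    Hypothetical helper for illustration only.
    """
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (1, 3, 5))
    hue, _saturation, _value = colorsys.rgb_to_hsv(r, g, b)
    return hue * 360

hues = [round(hue_degrees(color)) for color in palette]
for color, hue in zip(palette, hues):
    print(color, hue)
```

The hues come out at roughly 359, 207, 118, 292, and 30 degrees. They are spread around the wheel but not evenly: the red and the orange are only about 30 degrees apart in hue, while the other gaps are closer to 70 to 90 degrees.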