Analyzing bad data (created 2009-05-22).

A thread on the MEDSTATS email discussion group dealt with a data set involving blood loss. Blood loss was quantified into categories with values of

The discussion centered on the inefficiencies created when continuous data are reported in categories like these. The criticism was appropriate, but some of the language was a bit harsh, with one person suggesting that we should just "walk away" if a client insists on collecting data like this. There is a potential ethical concern here, because an inefficient data collection scheme needs a larger sample size to reach the same precision, and that puts more subjects at risk than necessary.
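To get a rough feel for the size of the inefficiency, here is a small simulation sketch. Everything in it is an assumption made for illustration and none of it comes from the MEDSTATS data: the outcome is a standardized normal variable rather than real blood loss values, the two groups differ by half a standard deviation, and the categorization is reduced to a single hypothetical cutpoint (`cut`), which is cruder than a multi-category scale. The function names are likewise made up. The sketch estimates the power of a large-sample z-test on the raw values and of a two-proportion z-test on the same values after they have been split at the cutpoint.

```python
import random
from statistics import NormalDist, mean, variance


def z_stat_means(a, b):
    """Absolute large-sample z statistic for a difference in means."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / se


def z_stat_proportions(a, b, cut):
    """Absolute z statistic for a difference in the proportion above `cut`."""
    n1, n2 = len(a), len(b)
    p1 = sum(x > cut for x in a) / n1
    p2 = sum(x > cut for x in b) / n2
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    # If every value falls on one side of the cut, there is nothing to test.
    return abs(p1 - p2) / se if se > 0 else 0.0


def simulate_power(n=50, delta=0.5, cut=0.25, reps=2000, seed=1):
    """Estimate power for the continuous and the categorized analysis.

    All defaults are illustrative assumptions, not values from any real study.
    """
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(0.975)  # two-sided test at the 5% level
    hits_cont = hits_cat = 0
    for _ in range(reps):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(delta, 1.0) for _ in range(n)]
        hits_cont += z_stat_means(a, b) > crit
        hits_cat += z_stat_proportions(a, b, cut) > crit
    return hits_cont / reps, hits_cat / reps


power_cont, power_cat = simulate_power()
print(f"power, continuous outcome:  {power_cont:.2f}")
print(f"power, categorized outcome: {power_cat:.2f}")
```

Under these assumptions the categorized analysis should show noticeably lower power at the same sample size, which is the inefficiency the discussion was about. Raising `n` until the categorized power matches the continuous power gives a crude estimate of how many extra subjects the categorization costs.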

I pointed out, though, that often you "inherit" a study that was planned poorly. Rather than walking away, I would suggest analyzing the data as best you can and explaining the efficiencies that would accrue in future studies if the measurements were recorded on a continuous scale. I'd mention that old saying about catching flies with honey and not vinegar, but most of my clients would not like being compared to flies.

