Dear Professor Mean: I am using a risk stratification tool for patients presenting to the ED with chest pain. This has been a well-validated tool in the ED, but I want to show that the scores are reproducible irrespective of the grade of doctor or assessment nurse calculating the score. I’m going to collect a convenience sample of patients presenting to the ED, and after I get informed consent, I will have those patients assessed separately by a triage-trained nurse, an intern doctor, a registrar and a consultant. I will calculate agreement using the intraclass correlation coefficient (ICC). My question is: How do I calculate the sample size in this context?

There is no formal hypothesis in this setting, so you can’t really do a power calculation. Well, maybe you could, but it would be a rather forced and artificial exercise.

What you want here is a confidence interval for the intraclass correlation coefficient (ICC). And you want that confidence interval to be reasonably narrow. An ICC with a confidence interval that goes from 0.06 to 0.91 is pretty worthless.
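To get a feel for how the number of patients drives the width of that interval, here is a rough sketch in Python. It is not the only way to do this and it rests on several assumptions: it uses the large-sample variance of the ICC from a one-way random effects model, Var(ICC) ≈ 2(1 - ρ)²(1 + (k - 1)ρ)² / (k(k - 1)(n - 1)), with a symmetric normal-theory interval. A design where the same four raters score every patient is really a two-way layout, so treat the answer as a ballpark figure. The function names, the planning value of 0.7, and the target width of 0.2 are all made up for illustration; only k = 4 raters comes from the question.

```python
import math
from statistics import NormalDist


def icc_ci_width(n, k, rho, conf=0.95):
    """Approximate full width of a confidence interval for the ICC.

    n   : number of patients (subjects)
    k   : number of raters per patient
    rho : planning value for the true ICC (an assumption you supply)

    Uses the large-sample variance of the one-way random effects ICC,
    so it is only a planning approximation.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    variance = (2 * (1 - rho) ** 2 * (1 + (k - 1) * rho) ** 2
                / (k * (k - 1) * (n - 1)))
    return 2 * z * math.sqrt(variance)


def smallest_n(k, rho, target_width, conf=0.95, n_max=100_000):
    """Smallest number of patients whose approximate CI width
    is no larger than target_width."""
    for n in range(3, n_max + 1):
        if icc_ci_width(n, k, rho, conf) <= target_width:
            return n
    raise ValueError("target width not reached within n_max patients")


if __name__ == "__main__":
    k = 4                # nurse, intern, registrar, consultant
    rho = 0.7            # hypothetical planning value for the ICC
    target_width = 0.2   # hypothetical "reasonably narrow" interval

    for n in (20, 40, 80, 160):
        print(n, round(icc_ci_width(n, k, rho), 3))
    print("patients needed:", smallest_n(k, rho, target_width))
```

Try a few different planning values for the ICC, because the required sample size is quite sensitive to that choice. If the answer changes a lot across the range you consider plausible, plan for the more demanding case.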

So dig out the formula for the confidence interval for ICC and find a sample size that makes your interval reasonably narrow. Make sure that you plug in a plausible value for