I don’t want to start obsessing about sample size calculations, because most of the time they’re pretty pointless and irrelevant, but I came across a great one recently.
My award for least logical sample size calculation goes to Mitesh Patel et al, Intratympanic methylprednisolone versus gentamicin in patients with unilateral Meniere’s disease: a randomised, comparative effectiveness trial, in The Lancet, 2016, 388: 2753-62.
The background: Meniere’s disease causes vertigo attacks and hearing loss. Gentamicin, the standard treatment, improves vertigo but can worsen hearing. So the question is whether an alternative treatment, methylprednisolone, would be better – as good in reducing vertigo, and better in terms of hearing loss. That’s actually not what the trial did though – it had frequency of vertigo attacks as the primary outcome. You might question the logic here; if gentamicin is already good at reducing vertigo, you might get no or only a small improvement in vertigo with methylprednisolone, but methylprednisolone might not cause as much hearing loss. So what you want is for methylprednisolone to cause less hearing loss, as long as it’s nearly as good as gentamicin at reducing vertigo.
Anyway, the trial used vertigo as its primary outcome, and recruited 60 people, which was its pre-planned sample size. But when you look at the sample size justification, it’s all about hearing loss! Er… that’s a completely different outcome. They based the sample size of 60 people on “detecting” a difference in hearing of 9 dB with a standard deviation of 11 dB (“detecting” meaning getting statistical significance if that were the true difference). Unsurprisingly, the trial didn’t find a difference in vertigo frequency.
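For anyone who wants to see what a calculation like this involves, here is a rough sketch of the usual normal-approximation formula for comparing two means, using the 9 dB difference and SD of 11 from the paper. The two-sided 5% significance level and the power values are my assumptions for illustration, not figures taken from the paper, and the function names are made up. I'm not claiming this is how the authors did it.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.9):
    """Approximate sample size per group for a two-arm comparison of means.

    Normal-approximation formula: n = 2 * ((z_alpha + z_beta) * sd / delta)^2.
    alpha (two-sided) and power are illustrative assumptions.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def approx_power(n, delta, sd, alpha=0.05):
    """Approximate power with n per group, by the same normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    standard_error = sd * math.sqrt(2 / n)
    return z.cdf(delta / standard_error - z_alpha)


if __name__ == "__main__":
    # Hearing outcome: difference of 9 dB, SD 11 dB (numbers from the paper).
    print(n_per_group(9, 11, power=0.9))       # per group, assuming 90% power
    print(n_per_group(9, 11, power=0.8))       # per group, assuming 80% power
    print(round(approx_power(30, 9, 11), 3))   # power with 30 per group (60 total)
```

Under these assumptions you get 32 per group for 90% power and 24 per group for 80% power, and 30 per group gives power of a bit under 0.89. So 60 people is in the right ballpark for the hearing outcome, but none of this says anything about the power to find a difference in vertigo attacks, which was the actual primary outcome.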
This seems to be cheating. If you’re going to sign up to the idea that it’s meaningful to pre-plan a sample size based on a significance test, it seems important that it should have some relation to the main outcome. Just sticking in a calculation for a different outcome doesn’t really seem to be playing the game. I guess it ticks the box for “including a sample size calculation” though. Hard to believe that the lack of logic escaped the reviewers here, or maybe the authors managed to convince them that what they did made sense (in which case, maybe they could get involved in negotiating Brexit?).
Here’s their section on sample size, from the paper in The Lancet:
Original post 21 September 2017 http://blogs.warwick.ac.uk/simongates/entry/best_sample_size/