Originally posted at http://blogs.warwick.ac.uk/simongates/entry/missing_data_in/ on 6th September 2013.
Most systematic reviews state in their results sections and abstracts how many studies they included. But you usually find that not all outcomes are reported by all studies; it’s quite common for important outcomes to be reported by only a minority of studies. What is usually done in this situation is essentially nothing; the subset of studies that have data is used to calculate the estimated treatment effects and this is presented as the review’s result.
For example, in the Cochrane review of “Interventions for preventing falls in older people living in the community,” there were 40 studies that evaluated multifactorial interventions. These are interventions that consist of several components, for example exercises for strength or balance, medication review, vision assessment and home hazard assessment; patients are assessed to find out what risk factors for falling they have, and specific interventions for these are then provided. The review looked at the number of fallers as one outcome and also, more importantly, the number of participants sustaining fractures. The meta-analysis of the number of fallers included 34 studies, so only six did not provide data on this outcome. However, the meta-analysis of fractures included only 11 studies (27.5% of the 40 multifactorial studies), so the conclusion about fractures is based on an analysis in which most of the data are missing. Obviously, this outcome exists for all studies that were conducted; the participants either had a fracture or didn’t during the follow-up period, but we only know how many did and didn’t for 11 trials. For the other 29, the information is missing.
The big problem here is the risk of introducing bias. When trials are assessed for inclusion in a systematic review, incomplete outcome data are one of the criteria for judging risk of bias. A common rule of thumb is that more than 20% missing data can put a study at high risk of bias (though obviously that is simplistic, and its origin is obscure). More than 50% of data missing would be very worrying and you would not expect to put much credence in the results. So surely in a situation like the falls review, where 72.5% of the studies are missing from the fracture analysis, we should have major reservations about the estimated treatment effect? Yet treatment effects estimated from a subset of studies are routinely presented on an equal footing with results that have only small amounts of missing data. This doesn’t seem right.
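The arithmetic behind these percentages is easy to check. The sketch below uses only the study counts quoted above (40 multifactorial studies, 34 with data on fallers, 11 with data on fractures) and compares the share of studies missing each outcome with the 20% and 50% figures mentioned in this post. Applying those thresholds to whole studies, rather than to participants within a trial, is just the analogy being drawn here and not an established criterion; the function name and labels are made up for illustration.

```python
# Study counts quoted in this post for the multifactorial interventions
# in the falls review.
TOTAL_STUDIES = 40
STUDIES_REPORTING = {"fallers": 34, "fractures": 11}

# Rule-of-thumb figures mentioned in the post. Applying them to the share of
# missing studies (rather than missing participants within a trial) is an
# assumption made purely for illustration.
HIGH_RISK = 0.20
VERY_WORRYING = 0.50


def missing_fraction(reporting: int, total: int) -> float:
    """Fraction of included studies with no data for an outcome."""
    return (total - reporting) / total


for outcome, reporting in STUDIES_REPORTING.items():
    missing = missing_fraction(reporting, TOTAL_STUDIES)
    if missing > VERY_WORRYING:
        label = "very worrying"
    elif missing > HIGH_RISK:
        label = "high risk of bias"
    else:
        label = "below the rule-of-thumb threshold"
    print(f"{outcome}: {missing:.1%} of studies missing ({label})")
```

On these counts, 15% of studies are missing for fallers and 72.5% for fractures, so the two outcomes fall on opposite sides of both thresholds.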
If there is an important outcome (like death) that is only reported by a few studies, and there happens to be a difference in those studies, that is likely to be prominently featured in the review’s results and conclusions. But the participants in all of the other trials either died or didn’t die; the results for these trials exist but weren’t recorded. It is quite possible that if they were known they would completely negate the positive effects in the trials that reported death. Maybe the reason those trials reported it was precisely because of the treatment benefit?
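To see how much damage this kind of selective reporting could do, here is a minimal simulation sketch. Everything in it is hypothetical: the number of trials, the arm size, the 10% risk of death and the rule that a trial reports death only when its own risk ratio is below 0.8 are assumptions chosen for illustration, not data from any review. The treatment has no true effect, and the pooling is a crude ratio of total deaths (which works here only because all arms are the same size), not the weighted methods a real meta-analysis would use.

```python
import random

random.seed(2013)  # fixed seed so the run is repeatable

# All values below are hypothetical, chosen only for illustration.
N_TRIALS = 40             # number of simulated trials
N_PER_ARM = 200           # participants per arm in every trial
TRUE_RISK = 0.10          # same risk of death in both arms: no true effect
REPORT_IF_RR_BELOW = 0.8  # a trial reports death only if its risk ratio is below this


def simulate_arm(n: int, risk: float) -> int:
    """Number of deaths among n participants, each dying with probability risk."""
    return sum(random.random() < risk for _ in range(n))


def pooled_risk_ratio(trials):
    """Crude pooled risk ratio; adequate here because all arms are equal in size."""
    deaths_treatment = sum(t for t, _ in trials)
    deaths_control = sum(c for _, c in trials)
    return deaths_treatment / deaths_control


# Each trial is a pair: (deaths in treatment arm, deaths in control arm).
all_trials = [
    (simulate_arm(N_PER_ARM, TRUE_RISK), simulate_arm(N_PER_ARM, TRUE_RISK))
    for _ in range(N_TRIALS)
]

# Selective reporting: only trials whose own result looks favourable report death.
reported = [(t, c) for t, c in all_trials if c > 0 and t / c < REPORT_IF_RR_BELOW]

pooled_all = pooled_risk_ratio(all_trials)
pooled_reported = pooled_risk_ratio(reported) if reported else float("nan")

print(f"trials reporting death: {len(reported)} of {N_TRIALS}")
print(f"pooled risk ratio, all trials:      {pooled_all:.2f}")
print(f"pooled risk ratio, reported trials: {pooled_reported:.2f}")
```

Because every reporting trial has a risk ratio below 0.8, the pooled estimate from the reporting subset must also be below 0.8, even though the estimate from all trials should sit close to 1. The reporting rule is deliberately extreme, but it shows how a subset of studies can suggest a benefit that the full set would not support.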
Gillespie LD, Robertson MC, Gillespie WJ, Sherrington C, Gates S, Clemson LM, Lamb SE. Interventions for preventing falls in older people living in the community. Cochrane Database of Systematic Reviews 2012, Issue 9. Art. No.: CD007146. DOI: 10.1002/14651858.CD007146.pub3.