Richard Emsley, Darren Dahly and I wrote a review (here) of a trial of lopinavir and ritonavir for people hospitalised with Covid-19 infection. The trial was published in the New England Journal of Medicine (paper here), and was one of the first trials of treatments for the current pandemic to come out. It was a good trial, especially considering the circumstances under which it was conducted and the rapidity with which it was set up, done and reported. Nevertheless, we commented critically on a few aspects of it.
Thomas Jaki, who was one of the trial investigators, contacted us with some criticisms of our review, which he was kind enough to write up and post, linked from our review (comment is here).
I am continuing the discussion here because I think it raises some interesting, more general issues for clinical trials, which are worth exploring further. I should say that I have written this post alone; my review co-authors are not responsible for any of this and I don’t know if they would agree with me.
Main issue: the trial’s primary outcome
The issues centre on the primary outcome measure that was used by the trial. This was a 7-category ordinal outcome:
- not hospitalized with resumption of normal activities;
- not hospitalized, but unable to resume normal activities;
- hospitalized, not requiring supplemental oxygen;
- hospitalized, requiring supplemental oxygen;
- hospitalized, requiring nasal high-flow oxygen therapy, noninvasive mechanical ventilation, or both;
- hospitalized, requiring ECMO, invasive mechanical ventilation, or both;
- death.
However, in the trial it was not used as an ordinal outcome, but instead was used to analyse the time to clinical improvement, which was defined as an improvement of 2 categories on the ordinal scale, or discharge alive from the hospital. So the ordinal outcome was first dichotomised, then was analysed as time to achieving one of the dichotomised states.
This seemed to us an odd way to use an ordinal outcome, and one that was likely to be less efficient than using an ordinal regression model. We noted in the review that a likely reason for using the time to event analysis was that speed of recovery was an important outcome.
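To make the derived outcome concrete, here is a minimal Python sketch of how "time to clinical improvement" could be computed from one patient's daily scores. It is an illustration only, not the trial's code. It assumes the categories are numbered 1 (best) to 7 (death) in the order listed above, that the first score is the baseline, and that patients who die or never improve are treated as censored; the function name and the censoring rule are my own assumptions.

```python
from typing import Optional, Sequence

DISCHARGED_MAX = 2  # assumption: categories 1 and 2 mean "not hospitalized"
DEATH = 7           # assumption: category 7 is death


def time_to_improvement(daily_scores: Sequence[int],
                        min_drop: int = 2) -> Optional[int]:
    """Day of first clinical improvement, or None if it never occurs.

    daily_scores[0] is the baseline score; daily_scores[d] is the score
    on day d. Improvement means a fall of at least `min_drop` categories
    from baseline, or discharge alive (category 1 or 2).
    """
    baseline = daily_scores[0]
    for day, score in enumerate(daily_scores[1:], start=1):
        if score == DEATH:
            return None  # died before improving: censored (assumption)
        if score <= DISCHARGED_MAX or baseline - score >= min_drop:
            return day
    return None  # follow-up ended without improvement: censored


# Hypothetical patients, invented purely for illustration.
on_oxygen = [4, 4, 3, 2, 1]    # improves on day 3 (discharged)
ventilated = [6, 6, 5, 4, 4]   # improves on day 3 (two categories)
no_oxygen = [3, 2]             # "improves" on day 1 with a one-category change
```

The last example shows the point about the definition not being the same for all starting points: a patient who starts in category 3 counts as improved after moving a single category, while a patient who starts in category 6 needs to move two.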
The criticism and some thoughts on it
Thomas Jaki made three criticisms of our suggestion that an ordinal regression model may have been a better analysis. I want to focus on the first two, which are about how to include the important dimension of time into the analysis.
The criticism was that an ordinal regression analysis is usually conducted as a cross-sectional analysis at a single time point, and looks for differences between the trial arms in the number of patients in each of the categories. Eventually, everyone will end up in one of a small number of categories, either recovered or dead (presumably almost everyone in 1 or 7, maybe with some in 2 or 3). Hence, analysing too late may show no difference between the arms, even if there are major differences in rate of recovery between an intervention and standard care. Similarly, an analysis point may be too early and miss the point where the trial arms have diverged. Hence it is difficult, especially in a new disease where the best time point for analysis is uncertain, to specify in advance what the timing of the analysis should be. However, constructing a new outcome of time to recovery (albeit based on dichotomising patients into responders and non-responders, and using an arbitrary definition of improvement that isn’t the same for all starting points) allows us to investigate whether the therapy leads to an improvement in recovery times, and in the proportion who recover.
The second (related) criticism is that time to recovery is a really important outcome, but would not be taken into account if an ordinal regression was used.
My comment relates to the idea that, if we were to do a standard ordinal regression analysis, we would need to specify in advance a single time point for it to be the trial’s “primary outcome.” This would be standard practice for clinical trials. But why? The data to construct the ordinal outcome for each day must surely exist, as they were needed in order to construct the “time to improvement” outcome that was analysed. So why not do the ordinal regression for each day? That would be much more informative, and would allow us to learn more by tracking the trajectory of recovery in the trial arms through time, enabling us to learn from the data, which is after all the point of the trial. Then there wouldn’t be any problem of doing the analysis too early or too late, and we would probably be able to see more clearly if the intervention might be doing something useful.
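As a rough illustration of a day-by-day comparison, the sketch below computes, for each day, the probability that a randomly chosen treated patient is in a better (lower) category than a randomly chosen control patient, counting ties as a half. This is not an ordinal regression: it is the concordance quantity behind the Wilcoxon-Mann-Whitney test, used here as a simple stand-in because it needs no modelling library. A real analysis would fit an ordinal model, probably with covariates. The function names and the data are invented for the example.

```python
from typing import List, Sequence


def prob_better(treated: Sequence[int], control: Sequence[int]) -> float:
    """P(treated category < control category) + 0.5 * P(tie).

    Lower categories are better, so values above 0.5 favour treatment
    and 0.5 means no difference between the arms.
    """
    wins = 0.0
    for t in treated:
        for c in control:
            if t < c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(treated) * len(control))


def daily_comparison(treated_by_day: Sequence[Sequence[int]],
                     control_by_day: Sequence[Sequence[int]]) -> List[float]:
    """Apply prob_better separately to each day's scores."""
    return [prob_better(t, c) for t, c in zip(treated_by_day, control_by_day)]


# Invented scores for four patients per arm on days 7, 14 and 28.
treated_scores = [[3, 3, 4, 2], [1, 2, 2, 3], [1, 1, 1, 7]]
control_scores = [[4, 4, 5, 3], [2, 3, 4, 4], [1, 1, 1, 7]]

trajectory = daily_comparison(treated_scores, control_scores)
for day, p in zip((7, 14, 28), trajectory):
    print(f"day {day}: P(treated better) = {p:.3f}")
```

In this made-up data the arms differ on days 7 and 14 but are identical by day 28, so a single analysis at day 28 would see nothing, while the day-by-day trajectory shows the earlier separation.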
So what are the arguments against doing this? I expect they would involve concerns about multiple analyses and Type 1 error rates – but to my mind that’s just an illustration of the way that those frequentist statistical considerations actually get in the way of learning useful stuff from trials.
We commented in our review that it should be possible to do a repeated measures ordinal analysis, which would both retain the ordinal outcome and allow for the effects of time. This would, I think, be everyone’s preferred analysis. Now Frank Harrell and Chris Lindsell have proposed just such an analysis (here) using Stan and rstanarm. This looks to me like definitely the best way to model these data, so thanks to everyone involved for a fantastic job. Now it’s up to the multiple ongoing and planned trials to use this.