The objective of a non-inferiority trial is to compare a novel treatment to an active treatment, with a view to demonstrating that it isn’t clinically worse with regard to a specified endpoint. As treatments improve, showing superiority of a new therapy becomes more and more difficult because the incremental improvements are ever smaller, whereas showing non-inferiority becomes ever easier. Going further down this rabbit hole: in many conditions (as in oncology), a treatment may be very effective in a minority of patients yet fail to outperform a placebo in all comers. Imagine how easy it would be to produce a study of a novel medication that never has to prove it is better in that responsive minority but instead shows itself to be non-inferior to the older treatment in all comers, which may mean non-inferior to placebo overall.
It’s important to remember that a non-inferiority trial does not have to show the new treatment is as good as the old one (ie, equivalence); it only has to show that the new intervention is “not unacceptably worse” than the intervention used as the control.1 As editor in chief of the Canadian Journal of Emergency Medicine, I believe that non-inferiority trials too often fail to produce data valid enough to warrant publication.
A CLOSER LOOK AT NON-INFERIORITY
Given the above, why would we ever want to see a non-inferiority trial? Some theoretical reasons to show non-inferiority are:
- Marketing: It’s no worse but more “convenient” to take (once a day instead of twice a day, for example).
- Marketing: The adverse-effect profile is superior so that quality of life is potentially better.
- Safety: Its benefit isn’t unacceptably worse, but fewer people suffer harm.
There are inherent problems with this approach. The authors themselves define what they mean by “not unacceptably worse” (non-inferior). Since that margin of difference is based essentially on the opinion of the biased authors rather than on any consensus clinical outcome, it leads us down a slippery slope of sample-size calculation. When calculating sample size for a study, the smaller the difference to be detected between the two groups, the greater the number of patients required to detect it. In a non-inferiority study, assuming a larger acceptable difference therefore means recruiting fewer patients: if the authors decide that a 10 percent worse outcome is acceptable (rather than 5 percent), they need far fewer patients. If the authors also choose less restrictive inclusion criteria (eg, all people with chronic heart failure [CHF] instead of only those with CHF from coronary artery disease), recruitment is easier and fewer patients need to be screened. Since the goal is to prove non-inferiority, fewer patients means it is easier to not find a difference! That is, it must be non-inferior.
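The arithmetic behind that incentive can be sketched with the standard normal-approximation formula for a non-inferiority trial with a binary outcome. This is a minimal illustration; the 50 percent success rate and the two margins are hypothetical, not taken from any particular trial.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(p, margin, alpha=0.025, power=0.80):
    """Approximate patients per arm for a non-inferiority trial with a
    binary outcome, assuming both arms share the same true event rate p:
        n = (z_alpha + z_beta)^2 * 2 * p * (1 - p) / margin^2
    """
    z_a = NormalDist().inv_cdf(1 - alpha)  # one-sided alpha
    z_b = NormalDist().inv_cdf(power)
    return ceil((z_a + z_b) ** 2 * 2 * p * (1 - p) / margin ** 2)

# Hypothetical 50% success rate in both arms:
print(n_per_arm(0.50, margin=0.05))  # 1570 per arm at a 5% margin
print(n_per_arm(0.50, margin=0.10))  # 393 per arm at a 10% margin
```

Doubling the declared margin cuts the required sample to roughly a quarter, which is exactly the incentive described above.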
The researchers doing non-inferiority trials are therefore “encouraged” to be less rigorous in their methodology and assumptions!
In general, non-inferiority trials require fewer patients than a standard superiority trial. Consider what that implies: the vast majority of superiority trials are already too small to assess the number needed to harm (NNH); they are powered only for the number needed to treat (NNT). A superiority trial can successfully reject the null hypothesis (which states that there is no difference between the specified groups, with any observed differences due to sampling or experimental error) and demonstrate superiority while still being far too small to characterize harm.
In a 2014 article in The BMJ, 76 percent of studies in a Cochrane cohort failed to report adequately on harm; even among studies reporting adverse events, 41 percent failed to adequately report harm-related outcomes.2 Non-inferiority studies, which on average recruit fewer patients than superiority trials, consistently fail to report NNH for three reasons:
- It isn’t in the interest of the pharmaceutical industry to publish data about harm when it cannot even show equivalence.
- The sample sizes used are insufficient to accrue enough data to calculate NNH.
- The number to be recruited would increase the cost of doing the study to the point of being outside the budget of most researchers.
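The sample-size gap behind the second reason can be illustrated with the same normal-approximation formula, this time for detecting a true difference between two event rates. The rates below are invented for illustration: a modest efficacy difference versus a doubling of a rare adverse event.

```python
from math import ceil
from statistics import NormalDist

def n_to_detect(p1, p2, alpha=0.025, power=0.80):
    """Approximate patients per arm to detect a true difference between
    event rates p1 and p2 (normal approximation, one-sided alpha)."""
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(z ** 2 * variance / (p1 - p2) ** 2)

# Hypothetical efficacy difference: 60% vs. 50% success.
print(n_to_detect(0.60, 0.50))  # 385 per arm
# Hypothetical harm signal: 2% vs. 1% adverse-event rate.
print(n_to_detect(0.02, 0.01))  # 2316 per arm
```

A trial sized to show the efficacy difference is roughly six times too small to detect the harm signal, so the harm data simply never accrue.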
To confuse the reader further, given that the researchers themselves choose the non-inferiority margin (it isn’t as efficacious, but it is within our assumed limits, so it is non-inferior), a result of such a study may simultaneously demonstrate that the novel treatment is both non-inferior and inferior. (The opposite of non-inferior is “not non-inferior” rather than “inferior.”)1 Such results are never reported by the authors of non-inferiority trials.
A confidence interval works much like the margin of error in a poll: we poll 1,000 people, which makes our result valid to within about ±3 percent. (The margin of error is roughly 1/√n, where “n” is the number of people polled, or about 100/√n percent.) If the margin you declare acceptable for non-inferiority is larger than the difference that actually exists, the analysis may find that your arm is indeed inferior, with a confidence interval lying entirely below zero, yet still within the range you chose as non-inferior. Remember: you choose a difference in performance that you declare acceptable and therefore non-inferior. That choice does not prevent the analysis from finding the treatment inferior while still inside the range you declared at the start of the study.
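That “inferior yet non-inferior” outcome can be shown with a numeric sketch. The trial results below are invented, and the 15-point margin is deliberately generous to make the paradox visible.

```python
from math import sqrt
from statistics import NormalDist

def diff_ci(p_new, p_ctrl, n, conf=0.95):
    """Confidence interval for the difference in success rates
    (new minus control), normal approximation, n patients per arm."""
    diff = p_new - p_ctrl
    se = sqrt(p_new * (1 - p_new) / n + p_ctrl * (1 - p_ctrl) / n)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return diff - z * se, diff + z * se

# Hypothetical results: new drug 70% success, control 77%, 500 per arm.
lo, hi = diff_ci(0.70, 0.77, n=500)
margin = -0.15  # the authors' declared non-inferiority margin

print(hi < 0)       # True: CI entirely below zero, so statistically inferior
print(lo > margin)  # True: CI entirely above the margin, so "non-inferior"
```

The whole confidence interval sits below zero (the new drug is worse) and above the declared margin (so the trial is reported as a non-inferiority success).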
The Center for Drug Evaluation and Research (CDER) and the Center for Biologics Evaluation and Research (CBER) of the US Food and Drug Administration issued a draft Guidance for Industry on non-inferiority trials in 2010, trying to prevent studies that show a novel drug to be non-inferior to an active control but not superior to placebo.3 They have tried to establish that the effect of the novel treatment must fall within the 95 percent confidence interval of the active control (ie, taking the subjective choice of defining non-inferiority out of the researchers’ hands). Unfortunately, as many publications attest, authors are not adhering to that guidance.
More than 500,000 articles were indexed in Medline in 2015. No one will ever read them all. We have to choose selectively what we’ll read. John Ioannidis, a professor of medicine and health research and policy at Stanford University School of Medicine and a professor of statistics at Stanford University School of Humanities and Sciences, once wrote, “Ultimately, over 95 percent of what is published in the medical literature will be proven to be false.”
Non-inferiority trials cannot show a drug is safe. They can prove a drug is inferior while also proving it’s non-inferior. (This still gives me a headache.) They cannot show a medication to be equivalent or superior. All that leaves as justification for doing these studies is marketing, which is why most of these trials are funded by the pharmaceutical industry. Given all that I have to read to try to keep up, I’ve decided that the non-inferiority trial is a class of research I can safely put aside; the quality is just too inferior to be worth it.
Dr. Ducharme is editor in chief of the Canadian Journal of Emergency Medicine, clinical professor of medicine at McMaster University in Hamilton, Ontario, and adjunct professor of family medicine at Queen’s University in Kingston, Ontario.
References
- Schumi J, Wittes JT. Through the looking glass: understanding non-inferiority. Trials website. Accessed Sept. 14, 2016.
- Saini P, Loke YK, Gamble C, et al. Selective reporting bias of harm outcomes within studies: findings from a cohort of systematic reviews. BMJ. 2014;348:g501.
- Guidance for Industry Non-Inferiority Clinical Trials 2010. United States Food and Drug Administration website. Accessed Sept. 14, 2016.