Non-Inferiority Trial 실전

KSIC Summer Conference 2023

Jinseob Kim (차라투)

6/24/23

Contents

  1. 샘플수 계산

  2. 비열등성 마진(margin)

  3. 분석결과 제시

References

  1. LOADSTAR(JAMA `23): YUHS

  2. RACING(Lancet `22): YUHS

  3. FLAVOUR(NEJM `22): SNU

  4. RENOVATE-COMPLEX-PCI(NEJM `23): SMC

샘플 수 계산

Proportion vs Survival based approach

Proportion

  • Event rate 만 이용.
  • Racing, LOADSTAR, FLAVOUR

Survival

  • Event rate, accural/FU time 고려
  • RENOVATE-COMPLEX-PCI

Racing

LOADSTAR

FLAVOUR

RENOVATE-COMPLEX-PCI

Use gsdesign packages

library(gsDesign)
## Annual incidence
Pc <- 0.06; Pt <- 0.036
## Hazard rate: S(t) = exp(-lambda * t)
lambda_c <- -log(1 - Pc); lambda_t <- -log(1 - Pt)
hr <- lambda_t/lambda_c

nSurv(
  lambdaC = lambda_c,  # Hazard rate of control group
  hr = hr,             # Alternative hypothesis
  hr0 = 1,             # Null hypothesis
  T = 4,               # Total study period              
  R = 3,               # Accrural period 
  minfup = 1,          # 4- 3  
  beta = 0.1,       
  alpha = 0.025,       # 1-sided
  eta = 0,             # Annual drop out rate
  ratio = 2)           # Randomization ratio, experimental/control
Fixed design, two-arm trial with time-to-event
outcome (Lachin and Foulkes, 1986).
Solving for:  Accrual rate 
Hazard ratio                  H1/H0=0.5925/1
Study duration:                   T=4
Accrual duration:                   3
Min. end-of-study follow-up: minfup=1
Expected events (total, H1):        165.1735
Expected sample size (total):       1609.307
Accrual rates:
    Stratum 1
0-3  536.4356
Control event rates (H1):
      Stratum 1
0-Inf    0.0619
Censoring rates:
      Stratum 1
0-Inf      0.02
Power:                 100*(1-beta)=90%
Type I error (1-sided):   100*alpha=2.5%
Randomization (Exp/Control):  ratio= 2 

비열등성 마진(margin)

Statistical & Clinical

  • 차이가 없을 때 p < 0.05 나와야 함.

  • 임상적으로 허용가능한 범위

https://www.nejm.org/doi/full/10.1056/NEJMra1510063

LOADSTAR 리비전

An unclear rational for a non-inferiority trial at all, i.e., why is the targeted approach considered less burdensome and therefore advantageous if equally effective?

  • “target이 burden이 작기때문에 non-inferiority를 증명하는 것만으로 충분하다는 논리” 에 대한 의문.
  • targeted approach 가 왜 burden이 작냐는 질문에 답변 필요

Poor justification for the 3% non-inferiority margin, since that would correspond to a NNT of about 33, which would seem to justify the use of fixed high-intensity therapy if that were the true difference

  • margin 3% 너무 큰거 아니냐?
  • 실제로 3% 차이면 기존 fixed therapy가 더 나은것이라는 의견. 임상적 설명 필요.

분석결과 제시

Difference with upper 97.5(or 95)% CI

library(survival)
fit <- summary(survfit(Surv(time, status) ~ sex, data = colon), times = 365)

## Surv2 - Surv1 = Inci1 - Inci2
kmdiff <- diff(1 - fit$surv) 

## Var(Surv1 + Surv2) = Var(Surv1) + Var(Surv2)
sediff <- sqrt(sum(fit$std.err^2))

c(Diff = kmdiff, LCI = kmdiff - 1.96 * sediff, UCI = kmdiff + 1.96 * sediff)
        Diff          LCI          UCI 
-0.024406273 -0.058041994  0.009229448 

Upper limit < margin 이면 유의한 결과.

margin <- 3
pv <- 1 - pnorm(abs((margin - kmdiff)/sediff))
pv
[1] 0

KM estimate vs Proportion

KM 발생률을 대부분 썼으나 proportion 을 요구하는 경우도 있음.

  • RACING: Kaplan-meier estimate 쓰지말고 그냥 proportion 써라.
  • KM: 9.3 vs 10.3% -> 9.1 vs 9.9% (Event/N)

CI for proportion difference

RACING: Wilson CI 써라

  • For the primary and key secondary outcome I would question if the Normal approximation does not hold.

  • 보수적인 지표

Primary outcome만 비열등성 검정

Analyses of secondary endpoints were not adjusted for multiplicity, and findings should be interpreted as exploratory because of the potential for type I error

LOADSTAR 리비전

Some lack of clarity regarding the exact statistical test used to establish the confidence interval around the difference in event rate

  • Kaplan meier 방법으로 두 군의 Incidence 값과 standard error 를 각각 얻습니다.
  • 이것을 이용해 difference 값과 upper 97.5% 신뢰구간을 구합니다.
  • 본 연구에서 Cox 분석은 없습니다.

Uncertainty regarding the responsiveness of the endpoint over the duration of follow up.

  • 3년 F/U 기간 이후엔 어떠냐. 더 오래 관찰해야할 수도 있다는 의견.
  • 3년 이후 outcome이 혹시 있다면 보여주고, 아니면 3년관찰로 충분히 의미있다고 설명.

HR 필요?

LOADSTAR: HR 지표가 아예 없음. Rate difference 만 보여줌.

  • 따라서 Cox 분석도 없음. simple is the best

RENOVATE-COMPLEX-PCI: 반대로 고급분석요구

  • Stratified Cox: 각 병원의 고유 특성을 고려
  • Competing risk analysis: Other cause death 고려

RENOVATE-COMPLEX-PCI 리비전

  1. P10 L53. Earlier you stated that the randomization was stratified by clinical presentation and by treatment center. Did you apply a stratified logrank test to incorporate this design feature, and include these covariates in the Cox model estimation? Please clarify.
  • 다기관 고려한 stratified 분석해라.
  • R coxph 식에 +strata(기관변수) 추가
coxph(Surv(time, status) ~ group + strata(hospital), data = data)

RENOVATE-COMPLEX-PCI 리비전

  1. P13 L36-39. The analyses that properly account for competing risks should be the primary analyses here; the standard K-M or Cox model estimates that ignore competing risks are incorrect. Please describe the method used to accommodate competing risks in the methods section, and present these results as the primary analysis.

-> Competing risk analysis 를 primary로 해라

library(jskm);library(survival)
colon$status2 <- colon$status
colon$status2[1:400] <- 2
colon$status2 <- factor(colon$status2)
fit2 <- survfit(Surv(time,status2)~rx, data=colon)
jskm(fit2, mark = F, surv.scale = "percent", table = T, status.cmprsk = "1")

RENOVATE-COMPLEX-PCI 리비전

  1. P13 L32. What was the “per protocol” analysis? This is the first time this is mentioned; it should be explained in the methods section. Please also note that per protocol analyses that are based on an analysis dataset constructed by eliminating cases based on post-randomization events (e.g., treatment crossover) are generally biased and are not allowed. More principled methods of analyzing a trial subject to non-adherence should be used. See for instance Hernan and Robins, NEJM 2017; 377: 1391-1398.

-> Per-protocol 하면서 연구디자인 깨지는거 아니냐?

Per-protocol: Image군에서 Angio, Angio군에서 Image를 한 환자를 Protocol violation이라고 미리 정의.

PP scenario

A는 OK, B -> C -> D 로 갈수록 복잡한 보정 필요.

  • A 에 해당함을 주장.

감사합니다