Sample Size Calculation With Fixed Follow-up

Kaifeng Lu

12/15/2021

library(lrstat)

This R Markdown document illustrates the sample size calculation for a fixed follow-up design, in which the treatment allocation is 3:1 and the hazard ratio is 0.3. This is a case for which neither the Schoenfeld method nor the Lakatos method provides an accurate sample size estimate, and simulation tools are needed to obtain a more accurate result.

Consider a fixed design in which the control group has a hazard rate of 0.95 per year, the hazard ratio of the experimental group to the control group is 0.3, the randomization ratio is 3:1, the enrollment rate is 5 patients per month, the 2-year drop-out rate is 10%, and each patient has a planned fixed follow-up of 26 weeks. We want the number of patients to enroll to achieve 90% power at a one-sided significance level of 0.025.
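The calls in this document work on a monthly time scale, so the design assumptions have to be converted first. The following base R sketch spells out those conversions; the variable names are illustrative and simply mirror the argument values passed to `lrsamplesize` and `lrsim`, and the dropout conversion assumes exponentially distributed dropout times.

```r
# Convert the design assumptions to the monthly scale used in the calls.
lambda2 <- 0.95 / 12           # control hazard: 0.95 per year -> per month
lambda1 <- 0.3 * lambda2       # experimental hazard = hazard ratio * control hazard
gamma   <- -log(1 - 0.1) / 24  # exponential dropout hazard giving 10% dropout by 24 months
Tf      <- 26 / 4              # follow-up of 26 weeks, entered as 26/4 = 6.5 months in the calls

# Sanity check: the 24-month dropout probability (ignoring events) is recovered as 10%.
dropout24 <- 1 - exp(-gamma * 24)
print(c(lambda1 = lambda1, lambda2 = lambda2, gamma = gamma, Tf = Tf,
        dropout24 = dropout24))
```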

Using the Schoenfeld formula, the required number of events is 39. This requires 191 patients enrolled over 38.2 months. Denote this design as design 1.

lrsamplesize(beta = 0.1, kMax = 1, criticalValues = 1.96, 
             allocationRatioPlanned = 3, accrualIntensity = 5, 
             lambda2 = 0.95/12, lambda1 = 0.3*0.95/12, 
             gamma1 = -log(1-0.1)/24, gamma2 = -log(1-0.1)/24, 
             accrualDuration = NA, followupTime = 26/4, 
             fixedFollowup = TRUE,
             typeOfComputation = "schoenfeld")
#> $resultsUnderH1
#>                                                                        
#> Fixed design for log-rank test                                         
#> Overall power: 0.902, overall significance level (1-sided): 0.025      
#> Number of events: 39                                                   
#> Number of dropouts: 4.8                                                
#> Number of subjects: 191                                                
#> Information: 7.31                                                      
#> Study duration: 43.1                                                   
#> Accrual duration: 38.2, follow-up duration: 6.5, fixed follow-up: TRUE 
#> Allocation ratio: 3                                                    
#>                                                                        
#>                              
#> Efficacy boundary (Z)  1.960 
#> Efficacy boundary (HR) 0.484 
#> Efficacy boundary (p)  0.0250
#> HR                     0.300 
#> 
#> $resultsUnderH0
#>                                                                        
#> Fixed design for log-rank test                                         
#> Overall power: 0.025, overall significance level (1-sided): 0.025      
#> Number of events: 39                                                   
#> Number of dropouts: 2.2                                                
#> Number of subjects: 113                                                
#> Information: 7.31                                                      
#> Study duration: 22.6                                                   
#> Accrual duration: 22.6, follow-up duration: 6.5, fixed follow-up: TRUE 
#> Allocation ratio: 3                                                    
#>                                                                        
#>                              
#> Efficacy boundary (Z)  1.960 
#> Efficacy boundary (HR) 0.484 
#> Efficacy boundary (p)  0.0250
#> HR                     1.000
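The 39 events reported for design 1 can be checked by hand with the textbook Schoenfeld approximation, $D = (z_{1-\alpha} + z_{1-\beta})^2 / \{p(1-p)(\log \mathrm{HR})^2\}$, where $p$ is the fraction of patients allocated to the experimental group. The sketch below uses base R only; whether `lrsamplesize` evaluates exactly this expression internally is an assumption, and the variable names are illustrative.

```r
alpha <- 0.025   # one-sided significance level
beta  <- 0.1     # type II error (90% power)
hr    <- 0.3     # hazard ratio, experimental vs. control
r     <- 3       # allocation ratio, experimental : control
p     <- r / (1 + r)

# Schoenfeld approximation for the required number of events.
d <- (qnorm(1 - alpha) + qnorm(1 - beta))^2 / (p * (1 - p) * log(hr)^2)
d_events <- ceiling(d)   # d is about 38.7, which rounds up to 39

# Information and hazard-ratio boundary implied by 39 events under this approximation.
info     <- d_events * p * (1 - p)
hr_bound <- exp(-1.96 / sqrt(info))
print(c(d = d, events = d_events, information = info, hr_boundary = hr_bound))
```

Under these assumptions the information and hazard-ratio boundary agree with the 7.31 and 0.484 shown in the output above.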

On the other hand, the output from the default lrsamplesize call implies that we only need 26 events with 127 subjects enrolled over 25.4 months, a dramatic difference from the Schoenfeld formula. Denote this design as design 2.

lrsamplesize(beta = 0.1, kMax = 1, criticalValues = 1.96, 
             allocationRatioPlanned = 3, accrualIntensity = 5, 
             lambda2 = 0.95/12, lambda1 = 0.3*0.95/12, 
             gamma1 = -log(1-0.1)/24, gamma2 = -log(1-0.1)/24, 
             accrualDuration = NA, followupTime = 26/4, 
             fixedFollowup = TRUE,
             typeOfComputation = "direct")
#> $resultsUnderH1
#>                                                                        
#> Fixed design for log-rank test                                         
#> Overall power: 0.902, overall significance level (1-sided): 0.025      
#> Number of events: 26                                                   
#> Number of dropouts: 3.2                                                
#> Number of subjects: 127                                                
#> Information: 4.46                                                      
#> Study duration: 31.1                                                   
#> Accrual duration: 25.4, follow-up duration: 6.5, fixed follow-up: TRUE 
#> Allocation ratio: 3                                                    
#>                                                                        
#>                              
#> Efficacy boundary (Z)  1.960 
#> Efficacy boundary (HR) 0.463 
#> Efficacy boundary (p)  0.0250
#> HR                     0.300 
#> 
#> $resultsUnderH0
#>                                                                        
#> Fixed design for log-rank test                                         
#> Overall power: 0.025, overall significance level (1-sided): 0.025      
#> Number of events: 26                                                   
#> Number of dropouts: 1.4                                                
#> Number of subjects: 80.3                                               
#> Information: 4.88                                                      
#> Study duration: 16.1                                                   
#> Accrual duration: 16.1, follow-up duration: 6.5, fixed follow-up: TRUE 
#> Allocation ratio: 3                                                    
#>                                                                        
#>                              
#> Efficacy boundary (Z)  1.960 
#> Efficacy boundary (HR) 0.412 
#> Efficacy boundary (p)  0.0250
#> HR                     1.000

To check the accuracy of either solution, we run simulations using the lrsim function.

lrsim(kMax = 1, criticalValues = 1.96,  
      allocation1 = 3, allocation2 = 1,
      accrualIntensity = 5, 
      lambda2 = 0.95/12, lambda1 = 0.3*0.95/12, 
      gamma1 = -log(1-0.1)/24, gamma2 = -log(1-0.1)/24,
      accrualDuration = 38.2, followupTime = 6.5, 
      fixedFollowup = TRUE,  
      plannedEvents = 39, 
      maxNumberOfIterations = 10000, seed = 12345)
#>                                               
#> Fixed design for log-rank test                
#> Overall power: 0.949                          
#> Expected # events: 37                         
#> Expected # dropouts: 4.4                      
#> Expected # subjects: 186.3                    
#> Expected study duration: 39.3                 
#> Accrual duration: 38.2, fixed follow-up: TRUE 
#> 

lrsim(kMax = 1, criticalValues = 1.96,  
      allocation1 = 3, allocation2 = 1,
      accrualIntensity = 5, 
      lambda2 = 0.95/12, lambda1 = 0.3*0.95/12, 
      gamma1 = -log(1-0.1)/24, gamma2 = -log(1-0.1)/24,
      accrualDuration = 25.4, followupTime = 6.5, 
      fixedFollowup = TRUE,  
      plannedEvents = 26, 
      maxNumberOfIterations = 10000, seed = 12345)
#>                                               
#> Fixed design for log-rank test                
#> Overall power: 0.833                          
#> Expected # events: 24.3                       
#> Expected # dropouts: 2.9                      
#> Expected # subjects: 124.1                    
#> Expected study duration: 27                   
#> Accrual duration: 25.4, fixed follow-up: TRUE 
#> 

The simulated power is about 95% for design 1, and 83% for design 2. Neither is close to the target 90% power.
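One way to see that these gaps are not simulation noise is to attach a Monte Carlo standard error to each simulated power. The sketch below treats the 10,000 iterations as independent Bernoulli trials and uses a normal-approximation 95% interval; this is a rough check added for illustration, not part of the lrstat output.

```r
n_iter    <- 10000
power_hat <- c(design1 = 0.949, design2 = 0.833)  # simulated power reported above

# Binomial standard error and approximate 95% interval for each estimate.
se    <- sqrt(power_hat * (1 - power_hat) / n_iter)
lower <- power_hat - 1.96 * se
upper <- power_hat + 1.96 * se
print(round(cbind(power_hat, se, lower, upper), 4))

# TRUE for a design whose interval excludes the 90% target.
excludes_target <- (upper < 0.9) | (lower > 0.9)
print(excludes_target)
```

Both standard errors are below half a percentage point, so neither interval comes near 90%.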

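We use the following formula to adjust the sample size to attain the target power,
$$D = D_0 \left( \frac{\Phi^{-1}(1-\alpha) + \Phi^{-1}(1-\beta)}{\Phi^{-1}(1-\alpha) + \Phi^{-1}(1-\beta_0)} \right)^2$$
where $D_0$ and $\beta_0$ are the initial event number and the corresponding type II error, and $D$ and $\beta$ are the required event number and the target type II error, respectively. For $\alpha=0.025$ and $\beta=0.1$, plugging in $(D_0=39, \beta_0=0.05)$ and $(D_0=26, \beta_0=0.17)$ yields $D=32$ in both cases. For $D=32$, we need about 156 patients for an enrollment period of 31.2 months,
$$N = \frac{D}{\dfrac{r}{1+r} \dfrac{\lambda_1}{\lambda_1+\gamma_1} \left(1 - \exp(-(\lambda_1+\gamma_1) T_f)\right) + \dfrac{1}{1+r} \dfrac{\lambda_2}{\lambda_2+\gamma_2} \left(1 - \exp(-(\lambda_2+\gamma_2) T_f)\right)}$$
where $r$ is the randomization ratio, $\lambda_i$ and $\gamma_i$ are the event and dropout hazard rates in group $i$ (1 = experimental, 2 = control), and $T_f$ is the fixed follow-up time. Simulation results confirmed the accuracy of this sample size estimate.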

lrsim(kMax = 1, criticalValues = 1.96,  
      allocation1 = 3, allocation2 = 1,
      accrualIntensity = 5, 
      lambda2 = 0.95/12, lambda1 = 0.3*0.95/12, 
      gamma1 = -log(1-0.1)/24, gamma2 = -log(1-0.1)/24,
      accrualDuration = 31.2, followupTime = 6.5, 
      fixedFollowup = TRUE,  
      plannedEvents = 32, 
      maxNumberOfIterations = 10000, seed = 12345)
#>                                               
#> Fixed design for log-rank test                
#> Overall power: 0.905                          
#> Expected # events: 30.1                       
#> Expected # dropouts: 3.6                      
#> Expected # subjects: 152.4                    
#> Expected study duration: 32.6                 
#> Accrual duration: 31.2, fixed follow-up: TRUE 
#>
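The event adjustment and the conversion from events to patients used above can be reproduced in a few lines of base R. This is a minimal sketch: the helper names `adjust_events` and `event_prob` are hypothetical and not part of lrstat, and the parameter values are those used in the calls throughout this document.

```r
# Rescale an initial event count d0 (with simulated type II error beta0)
# to the event count expected to give the target type II error beta.
adjust_events <- function(d0, beta0, alpha = 0.025, beta = 0.1) {
  d0 * ((qnorm(1 - alpha) + qnorm(1 - beta)) /
        (qnorm(1 - alpha) + qnorm(1 - beta0)))^2
}
d1 <- adjust_events(39, 0.05)  # from design 1, about 31.5
d2 <- adjust_events(26, 0.17)  # from design 2, about 32.2

# Probability that a patient has an event within the fixed follow-up Tf,
# with exponential event (lambda) and dropout (gamma) hazards.
event_prob <- function(lambda, gamma, Tf) {
  lambda / (lambda + gamma) * (1 - exp(-(lambda + gamma) * Tf))
}

r       <- 3                   # allocation ratio, experimental : control
lambda2 <- 0.95 / 12           # control hazard per month
lambda1 <- 0.3 * lambda2       # experimental hazard per month
gamma   <- -log(1 - 0.1) / 24  # dropout hazard per month, both groups
Tf      <- 6.5                 # fixed follow-up in months

# Average event probability across groups, weighted by the allocation ratio.
pbar <- r / (1 + r) * event_prob(lambda1, gamma, Tf) +
        1 / (1 + r) * event_prob(lambda2, gamma, Tf)

n_patients      <- 32 / pbar       # about 156 patients
accrual_months  <- n_patients / 5  # about 31.2 months at 5 patients per month
print(c(d1 = d1, d2 = d2, pbar = pbar, N = n_patients, accrual = accrual_months))
```

Both starting points round to 32 events, which is why the two designs lead to the same adjusted sample size.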