Group sequential designs for diagnostic accuracy studies

1 Group sequential designs for diagnostic accuracy studies. Oke Gerke, Mie H Vilstrup, Ulrich Halekoh, Malene G Hildebrandt, PF Høilund-Carlsen. Workshop on flexible designs for diagnostic studies, UMC Göttingen, November 2017

2 Overview: Department of Nuclear Medicine, Odense/DK & PET/CT; Group sequential trials; Clinical example; Perspective; References

3 Department of Nuclear Medicine, Odense University Hospital. Staff: around 90 (physicians, bioanalysts, secretaries, physicists, chemists, technicians, biologists, statistician). Associated research unit at the University of Southern Denmark. Positron emission tomography (PET) center since …; …, 2,020, and 3,065 PET/CT scans in 2006, 2007, and 2008, respectively; 9,700 in 2016.

4 Positron emission tomography/computed tomography (PET/CT). Tracer principle: chemical compound (carrier) + radioactive substance (tracer) = radiopharmaceutical. 18F-FDG (fludeoxyglucose (18F); fluorodeoxyglucose). Hybrid imaging of PET and CT.

5 PET/CT

6 Research areas

7 PET/CT/MRI in Denmark. [Map: the five regions, 5.5 million inhabitants in total, with regional populations of 0.6, 1.2, 1.7, 1.2, and 0.8 million; PET centers with cyclotron, PET/CT scanners, and PET/MRI scanners marked at Aalborg, Aarhus, Odense, and Copenhagen. Scanner density labels: 1 per approximately 275,000 population and 1 per 150,000 population. Similar density in Norway: 34 scanners.]

9 Group sequential trials. Conduct of pre-planned interim analyses of accrued data. Usual reasons (Jennison & Turnbull 2000): Ethical considerations: no exposure to unsafe or ineffective treatments. Administrative reasons: is the study executed as planned? Eligibility criteria (inclusion/exclusion): are patients representative of the target patient population? Trial procedures, dose regimen, treatment duration: adherence? Economic constraints: verification of critical assumptions made at the planning stage (e.g. sample size); savings in sample size, time, and cost; early stopping due to efficacy ("fertility") => earlier market entry; early stopping due to futility => no waste of resources.

10 Group sequential trials (cont'd). Assessment of the treatment effect with respect to efficacy and/or safety prior to completion of the trial. Detailed description of the interim analysis in the protocol: number, timing, consequences, personnel. Data Monitoring Committee: trial-independent interim assessment; investigational products of potentially major health significance (FDA guidance 2006; Ellenberg 2002).

11 Group sequential trials (cont'd). Focus: confirmatory hypothesis testing. Issue: control (at least approximately) of the Type I error. Adaptive designs: generalization of group sequential designs; data-driven redesign of the trial, e.g. adaptive hypotheses, dose escalation, sample size reassessment, seamless phase II/III, treatment switching. Blinded sample size reassessment: one of the simplest adaptive designs, usually conducted in two stages (Friede & Kieser 2006). Bauer & Köhne 1994, Chow & Chang 2012, Wassmer & Brannath 2016.

12 Group sequential trials (cont'd). Type I error inflation: repeatedly testing the same hypothesis at interim and final analyses. Stopping boundary/boundaries: set of critical values that the test statistics at the interim analyses are compared to. Boundary scales: standardized z-statistic; sample-mean scale; error-spending scale; sum-mean scale.
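The inflation described on this slide can be illustrated with a small Monte Carlo sketch. It assumes equally sized stages, a two-sided z-test, and an unadjusted nominal level of 0.05 at every look; the function name, seed, and simulation size are illustrative choices, not part of the talk.

```python
import math
import random
from statistics import NormalDist


def simulated_type1_error(n_looks, alpha=0.05, n_sim=20000, seed=1):
    """Estimate the overall Type I error when the same null hypothesis is
    tested at every look with the unadjusted two-sided critical value."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(n_sim):
        cumulative_sum = 0.0
        for look in range(1, n_looks + 1):
            # Under H0 each stage contributes an independent N(0, 1) increment.
            cumulative_sum += rng.gauss(0.0, 1.0)
            # Standardized z-statistic based on all data accrued so far.
            z_stat = cumulative_sum / math.sqrt(look)
            if abs(z_stat) > z_crit:
                rejections += 1
                break  # the trial stops at the first rejection
    return rejections / n_sim


if __name__ == "__main__":
    for looks in (1, 2, 3, 5):
        rate = simulated_type1_error(looks)
        print(f"{looks} look(s): estimated Type I error = {rate:.3f}")
```

With a single look the estimate should sit near the nominal 0.05; with more looks it rises, which is the reason for adjusted stopping boundaries.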

13 Group sequential trials (cont'd). Type I error spending functions:
Pocock 1977: $\alpha_t = \alpha \ln(1 + (e - 1)t)$
O'Brien & Fleming 1979: $\alpha_t = 2 - 2\Phi(z_{\alpha/2} / \sqrt{t})$
Lan & DeMets 1983: $\alpha_t = \alpha \, [(1 - e^{-\gamma t}) / (1 - e^{-\gamma})]$ for $\gamma \neq 0$
Kim & DeMets 1987: $\alpha_t = \alpha t^{\phi}$ for $\phi > 0$
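As a minimal sketch, the four spending functions on this slide can be evaluated with the Python standard library. The function names and the defaults α = 0.05, γ = 1, and φ = 1 are illustrative assumptions, not values from the talk; z_{α/2} is taken to be the upper α/2 quantile of the standard normal distribution, and t is an information fraction in (0, 1].

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def pocock_type(t, alpha=0.05):
    """alpha_t = alpha * ln(1 + (e - 1) * t)"""
    return alpha * math.log(1.0 + (math.e - 1.0) * t)


def obrien_fleming_type(t, alpha=0.05):
    """alpha_t = 2 - 2 * Phi(z_{alpha/2} / sqrt(t))"""
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 - 2.0 * _STD_NORMAL.cdf(z / math.sqrt(t))


def gamma_family(t, alpha=0.05, gamma=1.0):
    """alpha_t = alpha * (1 - exp(-gamma * t)) / (1 - exp(-gamma)), gamma != 0"""
    if gamma == 0:
        raise ValueError("gamma must be non-zero")
    return alpha * (1.0 - math.exp(-gamma * t)) / (1.0 - math.exp(-gamma))


def power_family(t, alpha=0.05, phi=1.0):
    """alpha_t = alpha * t ** phi, phi > 0"""
    if phi <= 0:
        raise ValueError("phi must be positive")
    return alpha * t ** phi


if __name__ == "__main__":
    for t in (0.25, 0.5, 0.75, 1.0):
        print(
            f"t = {t:.2f}: "
            f"Pocock-type {pocock_type(t):.4f}, "
            f"OBF-type {obrien_fleming_type(t):.4f}, "
            f"gamma family {gamma_family(t):.4f}, "
            f"power family {power_family(t):.4f}"
        )
```

All four functions spend the full α at t = 1. They differ in how much is spent early: the O'Brien & Fleming form spends very little at small t, while the Pocock form spends more.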

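14 Group sequential trials (cont'd). Type I error spending functions. [Table: significance levels by number of interim analyses (IAs), at each interim analysis and at the final analysis, for Pocock 1977, O'Brien & Fleming (OBF) 1979, and Kim & DeMets; the numeric entries are missing.]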

16 Clinical example (Hildebrandt 2016). Suspected breast cancer recurrence (N=100): distant, local, or no recurrence? 4-point graded assessments for signs of metastasis: no / probably no / probable / definite. Diagnostic accuracy of FDG PET/CT with dual-time-point imaging (60 and 180 min), contrast-enhanced CT, and bone scintigraphy. Global hypothesis on equality of the areas under the ROC curves. Post hoc interim analyses at: ½; one third and two thirds; ¼, ½, and ¾ of the sample.
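For readers unfamiliar with ROC analysis of graded assessments, the following sketch computes an empirical area under the ROC curve from 4-point ratings, using the Mann-Whitney formulation with ties counted as one half. The ratings are made up for illustration and are not study data; the function name is hypothetical, and the test of equality of several correlated AUCs used in the study is not shown here.

```python
def empirical_auc(ratings_with_condition, ratings_without_condition):
    """Empirical AUC: the probability that a randomly chosen patient with the
    condition receives a higher rating than one without, ties counting 1/2."""
    if not ratings_with_condition or not ratings_without_condition:
        raise ValueError("both groups need at least one rating")
    score = 0.0
    for with_rating in ratings_with_condition:
        for without_rating in ratings_without_condition:
            if with_rating > without_rating:
                score += 1.0
            elif with_rating == without_rating:
                score += 0.5
    return score / (len(ratings_with_condition) * len(ratings_without_condition))


# Hypothetical ratings on the 4-point scale:
# 1 = no, 2 = probably no, 3 = probable, 4 = definite.
with_recurrence = [4, 4, 3, 2]
without_recurrence = [1, 1, 2, 3, 1]

auc = empirical_auc(with_recurrence, without_recurrence)
print(f"Empirical AUC = {auc:.2f}")  # prints "Empirical AUC = 0.90"
```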

17 Results. Distant recurrence N=22 (22%), bone N=18 (18%): 20 distant biopsies, 2 local biopsies; 10 patients died, 12 patients alive. Local recurrence N=19 (19%): clinical follow-up, all patients alive, none with later distant recurrence. No recurrence N=59 (59%): clinical follow-up, all patients alive, 2 distant recurrences after > 20 months.

18 AUC ROC at the end of the study. PET/CT 1h: 0.99 (95% CI: …); PET/CT 3h: 0.99 (95% CI: …); ceCT: 0.84 (95% CI: …); ceCT+BS: 0.86 (95% CI: …). Global hypothesis of equality: p=0.02.

19 Post hoc interim analyses (Gerke 2017). α-spending function: $\alpha_t = \alpha t$ (Kim & DeMets 1987), where $t$ is the proportion of accumulated information and $\alpha_t$ the significance level at a particular analysis time point. Number of interim analyses: 1. One at N=50; 2. Two at N=33, …; 3. Three at N=25, 50, 75. Significance levels at interim: 1. $\alpha_{0.5}$ = …; 2. $\alpha_{0.33}$ = …, $\alpha_{0.67}$ = …; 3. $\alpha_{0.25}$ = …, $\alpha_{0.5}$ = 0.025, $\alpha_{0.75}$ = …
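The linear spending rule on this slide can be sketched as below for the three interim schedules. An overall α of 0.05 is an assumption; it is consistent with the one level that is stated on the slide (0.025 at t = 0.5). The function name is hypothetical, the interim time points are taken as exact fractions (1/3, 2/3, and so on), and converting the cumulative α into critical values for a test statistic is not attempted here.

```python
def linear_alpha_spent(information_fractions, alpha=0.05):
    """Return (cumulative, increments) of alpha spent under alpha_t = alpha * t.

    information_fractions must be increasing and end at 1.0 (final analysis).
    """
    cumulative = [alpha * t for t in information_fractions]
    previous = [0.0] + cumulative[:-1]
    increments = [now - before for now, before in zip(cumulative, previous)]
    return cumulative, increments


# Interim time points followed by the final analysis at t = 1.
SCHEDULES = {
    "one interim analysis": [1 / 2, 1.0],
    "two interim analyses": [1 / 3, 2 / 3, 1.0],
    "three interim analyses": [1 / 4, 1 / 2, 3 / 4, 1.0],
}

for label, fractions in SCHEDULES.items():
    cumulative, increments = linear_alpha_spent(fractions)
    print(label)
    for t, spent, step in zip(fractions, cumulative, increments):
        print(f"  t = {t:.2f}: cumulative alpha = {spent:.4f}, increment = {step:.4f}")
```

Under this rule the α spent grows in proportion to the information accrued, so each equally spaced analysis receives the same increment.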

20 Post hoc interim results. No early termination was suggested at any interim analysis. Employing interim analyses would simply have led to continuation of the study as planned.

22 Fryback & Thornbury classification of diagnostic trials

23 Fryback & Thornbury classification of diagnostic trials
1. Technical efficacy: resolution of line pairs, gray-scale range, sharpness.
2. Diagnostic accuracy efficacy: sensitivity, specificity, positive and negative predictive values, accuracy.
3. Diagnostic thinking efficacy: percentage of cases in a series in which the image was judged helpful to making the diagnosis; difference in clinicians' subjectively estimated diagnosis probabilities pre- to post-test information.
4. Therapeutic efficacy: percentage of times the image was judged helpful in planning management of the patient in a case series; percentage of times clinicians' prospectively stated therapeutic choices changed after test information.
5. Patient outcome efficacy: percentage of patients improved with the test compared with without the test; morbidity avoided after having image information; cost per quality-adjusted life year saved with image information.
6. Societal efficacy: cost-benefit or cost-effectiveness analysis from a societal viewpoint.
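The level 2 measures named above all derive from a 2x2 table of index test result against reference standard. The sketch below shows the arithmetic; the counts are hypothetical and the function name is an illustrative choice.

```python
def accuracy_measures(true_pos, false_pos, false_neg, true_neg):
    """Level 2 (diagnostic accuracy) measures from a 2x2 table."""
    total = true_pos + false_pos + false_neg + true_neg
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "positive predictive value": true_pos / (true_pos + false_pos),
        "negative predictive value": true_neg / (true_neg + false_neg),
        "accuracy": (true_pos + true_neg) / total,
    }


# Hypothetical counts for 100 patients (not taken from any study).
measures = accuracy_measures(true_pos=20, false_pos=5, false_neg=2, true_neg=73)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity condition on the true disease status, whereas the predictive values condition on the test result and therefore depend on prevalence.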

24 Application of group sequential designs in diagnostic studies in the future. Level 2 (diagnostic accuracy efficacy): indirect contribution to patient outcome; simultaneous consideration of sensitivity and specificity; AUC ROC analysis: one-dimensional. Levels 3 and 4 (diagnostic thinking & therapeutic efficacy): indirect contribution to patient outcome; percentages of usefulness and paired differences (pre- vs. post-test information). Level 5 (patient outcome efficacy): direct contribution to patient outcome; parallel arms (as in clinical trials evaluating the therapeutic effect of new drugs).

25 References
Bauer & Köhne. Evaluation of experiments with adaptive interim analysis. Biometrics. 1994;50:
Bauer et al. Twenty-five years of confirmatory adaptive designs: opportunities and pitfalls. Stat Med. 2016;35:
Chow & Chang. Adaptive design methods in clinical trials. 2nd ed. Boca Raton: Chapman & Hall/CRC Press, 2012.
DeMets & Lan. Interim analysis: the alpha spending function approach. Stat Med. 1994;13:
Ellenberg, Fleming & DeMets. Data Monitoring Committees in Clinical Trials: A Practical Perspective. New York: Wiley, 2002.
FDA. Guidance for Clinical Trial Sponsors. Establishment and Operation of Clinical Trial Data Monitoring Committees. The United States Food and Drug Administration, Rockville, Maryland. Available at (accessed 03 Nov 2017):
Friede & Kieser. Sample size recalculation in internal pilot study designs: a review. Biometrical Journal. 2006;48:
Fryback & Thornbury. The efficacy of diagnostic imaging. Med Decis Making. 1991;11:
Gerke et al. Group sequential analysis may allow for early trial termination: illustration by an intra-observer repeatability study. EJNMMI Res. 2017 Sep 26;7(1):79.

26 References (cont'd)
Hildebrandt et al. [18F]Fluorodeoxyglucose (FDG)-Positron Emission Tomography (PET)/Computed Tomography (CT) in suspected recurrent breast cancer: a prospective comparative study of dual-time-point FDG-PET/CT, contrast-enhanced CT, and bone scintigraphy. J Clin Oncol. 2016;34(16):
Jennison & Turnbull. Group Sequential Methods with Applications to Clinical Trials. New York: Chapman and Hall/CRC Press, 2000.
Kim & DeMets. Design and analysis of group sequential tests based on the type I error spending function. Biometrika. 1987;74:
Lan & DeMets. Discrete sequential boundaries for clinical trials. Biometrika. 1983;70:
Moyé. Statistical monitoring of clinical trials: fundamentals for investigators. New York: Springer;
O'Brien & Fleming. A multiple testing procedure for clinical trials. Biometrics. 1979;35:
Pocock. Sequential methods in the design and analysis of clinical trials. Biometrika. 1977;64:
Todd. A 25-year review of sequential methodology in clinical studies. Stat Med. 2007;26:
Wang & Tsiatis. Optimal one-parameter boundaries for group sequential trials. Biometrics. 1987;43:
Wassmer & Brannath. Group sequential and confirmatory adaptive designs in clinical trials. New York: Springer, 2016.
Whitehead. The design and analysis of sequential clinical trials. 2nd ed. Chichester: Wiley; 1997.