Evaluating therapies: Varying challenges in different eras Salim Yusuf


1 Evaluating therapies: Varying challenges in different eras Salim Yusuf

2 Pre-1960s: RCTs uncommon
- Large benefits undetected, unclear or not reliably demonstrated
- Large harms missed, e.g. O2 in newborns, chloramphenicol in childhood sepsis
- Small randomized trials were sufficient to detect such large effects
- Randomization controls for biases (both small and large) between those receiving or not receiving a treatment

3 Post-1960s: Important but moderate-sized benefits (20-30% RRR) or harms were missed because trials were too small
e.g. Thrombolytics in AMI: 24 trials with a total of 6,000 patients
- 5: statistically significant reduction in mortality
- 11: non-significant benefit
- 8: neutral or non-significant harm
Meta-analysis indicates a 25% (p<0.001) reduction in mortality (Eur Heart J 1985), but this was not accepted
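To make the point about small trials concrete, the following is a minimal sketch of inverse-variance fixed-effect pooling of odds ratios. The four trials in `TRIALS` are hypothetical numbers invented for illustration, not the thrombolytic trials cited above, and the pooling method is an assumption (the cited meta-analysis may have used a different one). It shows how several individually non-significant trials can yield a clearly significant pooled estimate.

```python
from math import exp, log, sqrt

# Hypothetical trials: (deaths_treated, n_treated, deaths_control, n_control).
# These numbers are illustrative only and do not come from any real study.
TRIALS = [
    (20, 200, 28, 200),
    (35, 400, 45, 400),
    (12, 150, 15, 150),
    (50, 600, 66, 600),
]


def log_odds_ratio(d_t, n_t, d_c, n_c):
    """Return (log odds ratio, approximate variance) for one 2x2 table."""
    log_or = log((d_t / (n_t - d_t)) / (d_c / (n_c - d_c)))
    variance = 1 / d_t + 1 / (n_t - d_t) + 1 / d_c + 1 / (n_c - d_c)
    return log_or, variance


def trial_z(trial):
    """z statistic of a single trial (log OR divided by its standard error)."""
    log_or, variance = log_odds_ratio(*trial)
    return log_or / sqrt(variance)


def pool_fixed_effect(trials):
    """Inverse-variance fixed-effect pooling: returns (pooled OR, pooled z)."""
    estimates = [log_odds_ratio(*t) for t in trials]
    weights = [1 / variance for _, variance in estimates]
    pooled_log_or = sum(w * lo for w, (lo, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return exp(pooled_log_or), pooled_log_or / pooled_se


if __name__ == "__main__":
    for trial in TRIALS:
        print(trial, "z =", round(trial_z(trial), 2))
    pooled_or, pooled_z = pool_fixed_effect(TRIALS)
    print("pooled OR =", round(pooled_or, 2), "pooled z =", round(pooled_z, 2))
```

In this made-up example no single trial reaches |z| > 1.96, yet the pooled estimate does, which is the pattern the slide describes.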

4 SK in AMI: Meta-analysis of 24 trials vs ISIS-2

5 Plausible mortality reductions in MI

Treatment                    No. of trials   Estimated reduction,   Estimated reduction,
                                             meta-analysis          large trials
A. From acute treatment
I.V. thrombolysis            19              22%                    25%
Glucose-insulin-potassium    5               23%                    0%
I.V. nitrates                6               30%                    5%
Hyaluronidase                5               36%                    0%
Oral beta-blockade           22              7%                     ?
I.V. beta-blockade           27              8%                     10%
I.V. magnesium               11              50%                    0%
B. From long-term treatment
Aspirin                      6               10%                    15%
Sulfinpyrazone               2               15%                    ?
Anticoagulants               10              20%                    20%
Beta-blockade                24              22%                    25%

Yusuf et al, Stat Med 1984

6 Two essential principles of internally valid results
1) Minimize bias: randomization and unbiased (not necessarily precise) ascertainment of outcomes (blinding, blinded evaluation of outcomes or death), e.g. large polio trial
- Adjudication of 100,000 individuals in 10 large trials done by PHRI shows similar results for investigator-reported and centrally adjudicated CVD events (Pogue 2009)
2) Minimize random errors:
- large trials of 1000 to 2000 events
- unbiased meta-analysis of moderate and large trials which collectively include a few thousand events

7 Meta-analysis vs Large Trials
The standards for a good meta-analysis should be the same as for a reliable single trial
- Avoidance of biases: prospective vs retrospective selection of trials or specific outcomes or specific subgroups
- Minimizing random errors: need an adequate number of events (concept of optimal information size for meta-analysis, Pogue & Yusuf, 1998)

8 Problems with current randomized trials
1. Internal validity vs external applicability: both enhanced by large numbers, wide entry and few exclusions ("Uncertainty Principle")
2. Complex data collection (voluminous forms)
3. Complex and relatively unhelpful study procedures (strict definitions, adjudication of clinical outcomes)
4. Complex and wasteful bureaucracy (SAE reporting, on-site monitoring, etc.)
5. Compensation for AEs or SAEs, even those in the control group and those that are part of the usual clinical course

9 Ratio of odds ratios (ORs) for adjudicated vs reported outcomes Pogue et al, Clin Trials 2009

10 Ensuring Data Quality
1. Random errors vs systematic biases
2. Source verification of docs by on-site monitoring is not very helpful unless used as a training and support tool
3. Key info/docs can be sent centrally (e.g. a hosp/lab value/ECG)
4. Detection of fraud is more efficient through central statistical monitoring than on-site visits (the latter generally detect sloppiness and poor record keeping, which are "random" anyway)

11 Strategy for statistical approaches to detecting fraud
Compare center data vs overall or vs other centers in the same country:
1. Frequency of binary data
2. Mean values of variables
3. Digit preference
4. Variance comparisons
5. Distance comparisons
6. Outcome probability
7. Repeated measures
Pogue et al, Clin Trials 2013
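As one illustration of item 3 (digit preference), here is a minimal sketch of a terminal-digit check per center. It is a simplification and not the procedure from Pogue et al: it compares each center's last-digit distribution with a uniform distribution rather than with other centers, and the function names, the example readings and the 5% threshold are all assumptions made for this example.

```python
from collections import Counter

# Chi-square critical value for 9 degrees of freedom at alpha = 0.05.
CHI2_CRIT_DF9_P05 = 16.92


def terminal_digit_chi2(readings):
    """Chi-square statistic of last-digit counts against a uniform distribution."""
    counts = Counter(abs(int(r)) % 10 for r in readings)
    expected = len(readings) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


def flag_centers(readings_by_center, threshold=CHI2_CRIT_DF9_P05):
    """Return the centers whose terminal digits deviate from uniform."""
    return sorted(
        center
        for center, readings in readings_by_center.items()
        if terminal_digit_chi2(readings) > threshold
    )


if __name__ == "__main__":
    # Hypothetical systolic blood pressure readings from two made-up centers.
    data = {
        "center_A": list(range(100, 200)),               # every digit equally often
        "center_B": [120, 130, 125, 140, 135] * 20,      # only 0s and 5s
    }
    print(flag_centers(data))
```

A flagged center is only a prompt for closer review; rounding habits can produce digit preference without any fraud.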

12 Results of the POISE trial with and without centers with fraudulent data
Primary outcome: CV death, MI, cardiac arrest
Table: metoprolol versus placebo, HR (95% CI) and P-value, shown without the fraudulent data and with the fraudulent data (values not recoverable from the transcription)
Pogue et al, Clin Trials 2013

13 SAE: Term misunderstood
- Events that are part of the clinical course are considered AEs and SAEs
- Replace reporting of individual SAEs to regulators by review of group data by independent DSMB members
- "Relatedness" is generally useless
- "Unexpected" is hard to assess in most situations, except for liver failure, agranulocytosis or anaphylaxis, which are rare
- In RE-LY: 6200 SAEs, 123,000 AEs. The only excess was in bleeds, which were predictable and accounted for <3% of SAEs and AEs

14 RCTs: Slow Death by a Thousand Unnecessary Policies? (Yusuf, CMAJ 2004)
The Situation in India:
- Large increase in pharma-sponsored trials in the previous decade; some investigators/CROs reported to make big profits
- Social activists claim (without documentation) that trials are:
  - exploitative (consent not obtained/informed)
  - processes not followed
  - >2500 deaths in trials done in the previous few years
  - headline news

15 RCTs: Slow Death by a Thousand Unnecessary Policies? (Yusuf, CMAJ 2004)
The Situation in India: Government responds with draconian and poorly thought-out rules:
- Added reviews of protocols (delay to start : mos)
- Compensation for medical care and any injury, irrespective of active or control group or relatedness of AEs or SAEs, decided by local ethics committees
- National accreditation of centers (3 layers of approvals: Ethics, Regulatory, and Health Ministry), which could take a year
Impact:
- NIH stops 40 trials in India
- Industry stops most new trials
- Academic groups do not start new trials and stop enrollment into high-risk trials (e.g. CABG patients, cancer)

16 The alternatives to large trials and their meta-analysis in evaluating therapies
- Clinical judgement
- Using observational databases and extensive statistical analyses:
  - matching
  - stratification
  - multivariable adjustments & regressions
  - propensity matched analyses
  - instrumental variable analyses
- Assessing harms of toxic chemicals:
  - regional variations in disease
- Assessing harms of occupational hazards:
  - individual (exposure vs outcomes)
  - ecologic analyses
- Natural experiments (for policy analyses)

17 All alternative methods to RCTs are potentially subject to uncontrolled confounding
- treatment by indication bias
- key features unrecorded
- key features poorly recorded (systematic undercorrection)
- key features partially missing ("informative")
- statistical models are iterative
Even with modern tools, they can be subject to substantial biases and even directionally misleading results, e.g. HRT:
- observational studies suggest large benefits
- RCTs suggest large harm

18 Propensity Matched Analyses
Integrates many different predictors into a propensity score to experience an event
- individuals matched
- matched strata and stratum-specific pooled estimates
However, in practice there is no statistical difference between multivariable regression and propensity matching in 90% of cases (Shah 2005)

19 Instrumental Variable Analyses
- Mendelian randomization
- Health systems research (most commonly geographic factors, e.g. distance from hosp vs emergency cardiac surgery)
The key challenge is to find an instrumental variable that is valid: one that relates to the question of interest (i.e. exposure) but not to the outcome (i.e. event) except through that exposure

20 Effects of invasive cardiac management on AMI survivors (Stukel et al, JAMA 2007)
Among 122,124 elderly AMI patients, does cardiac cath <30 days reduce mortality?
Cardiac cath patients were younger and had less severe MI
RR (CI):
- *Multivariable adjustment: 0.51 ( )
- *Propensity score adjustment: 0.54 ( )
- Regional cardiac cath rate as an instrumental variable: 0.84 ( )
- RCTs: 0.85
*Note: the first two methods are inflated by two-thirds by biases and confounding
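The gap between adjusted and instrumental variable estimates can be illustrated with a small simulation. This is a sketch on entirely synthetic data, not a reanalysis of Stukel et al: the data-generating model, the parameter values and the function names are assumptions. An unmeasured confounder `u` drives both treatment and outcome, so the naive treated-vs-untreated comparison is biased, while the Wald (ratio) estimator based on a binary instrument `z` recovers something close to the true effect.

```python
import random


def simulate(n=20000, true_effect=-1.0, seed=1):
    """Generate synthetic (instrument, treatment, outcome) rows with hidden confounding."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0           # instrument, e.g. a high-rate region
        u = rng.gauss(0, 1)                          # unmeasured confounder
        t = 1 if 1.5 * z + u + rng.gauss(0, 1) > 0.75 else 0
        y = true_effect * t + 2.0 * u + rng.gauss(0, 1)
        rows.append((z, t, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_estimate(rows):
    """Difference in mean outcome between treated and untreated (confounded)."""
    return mean(y for _, t, y in rows if t == 1) - mean(y for _, t, y in rows if t == 0)


def wald_iv_estimate(rows):
    """Wald estimator: effect of z on outcome divided by effect of z on treatment."""
    dy = mean(y for z, _, y in rows if z == 1) - mean(y for z, _, y in rows if z == 0)
    dt = mean(t for z, t, _ in rows if z == 1) - mean(t for z, t, _ in rows if z == 0)
    return dy / dt


if __name__ == "__main__":
    rows = simulate()
    print("naive:", round(naive_estimate(rows), 2))
    print("IV   :", round(wald_iv_estimate(rows), 2))
```

The IV estimate is only as good as the instrument: in the simulation `z` affects the outcome solely through treatment by construction, which real data can never guarantee.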

21 Future of CVD genetics: Mendelian randomization as a tool to understand non-genetic influences
Mendelian randomization can inform on the causality of risk factors
(Diagram: Genetic variant, Risk Factor, ?, Outcome)

22 Lp(a) and Mendelian randomization Causal role of Lp(a) in CAD suggested by genetics Clarke et al. NEJM 2009 Dec 24;361(26):

23 Lack of confirmation of causality of genes for CVD Glucose HDL CRP Uric acid Homocysteine

24 The PRECIS wheel: PRECIS: Pragmatic-Explanatory Continuum Indicator Summary Sackett. Clin Trials 2013

25 Examples of Large Simple Trials (often called Pragmatic Trials)
1. Poliomyelitis field trials of 400,000 children
2. ISIS series of trials, CREATE, DIG, HOPE
3. Trial in a registry (TASTE of mechanical aspiration in addition to PCI)
4. Old vs young age of blood transfused (INFORM: 58,000 people)
5. Cluster cross-over designs (e.g. PADIT: antibiotics to prevent pacemaker infections)
6. CHAPS (educational strategy for BP control)

26 Conclusions
1. The reliable assessment of most therapies (which generally have moderate effects) requires large randomized trials with >1000 events
2. External applicability is increased by wide enrollment criteria and few exclusions
3. Most of the procedures in RCTs can be simplified or eliminated with little loss in validity or study integrity (the crushing bureaucracy has skyrocketed costs, with little benefit)
4. When trials are not practical (e.g. examining the toxicity of some exposures), alternative designs (natural experiments, IVA) can be explored but need to be cautiously interpreted
Need more pragmatic (large and simple) trials at low cost

27 Implications of complexities
Complexities and over-regulation of RCTs may kill RCTs completely in some countries, especially those with a large disease burden (e.g. India). The fundamental loss is to the health of people in these countries.
REVERSING THE PERVERSE COMPLEXITIES AND WASTE IN CLINICAL TRIALS IS A PUBLIC HEALTH EMERGENCY

28 Application of the PRECIS wheel for most trials Sackett. Clin Trials 2013

29 Expected effects of trial size on trial results
Relationship between no. of deaths in a trial and the probability of convincing (1P<0.01) results when treatment reduces death by about a quarter

Total no. of deaths*   (Approx. no. randomized   Approx. probability of failing   Comments before trial begins
                       if risk 10 per cent)      to achieve 1P<0.01 if true
                                                 risk reduction is ¼
0-50                   (under 500)               Over 0.9                         Utterly inadequate
                       (1000)                                                     Probably inadequate
                       (3000)                                                     Possibly adequate, possibly not
                       (6000)                                                     Probably adequate
Over 650               (10,000)                  Under 0.1                        Definitely adequate

*About twice as many patients would be needed to achieve corresponding probabilities of detection of risk reductions of only 1/6 (instead of ¼). Conversely, only about half as many patients might be needed for risk reductions as large as 1/3
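The relationship between the number of deaths and the chance of missing a real effect can be approximated in a few lines. This is a rough sketch, not the calculation behind the slide: it assumes equal allocation, treats the variance of the log risk ratio as about 4 divided by the total number of deaths, and reads "1P<0.01" as a one-sided test at the 0.01 level. The function name and defaults are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def prob_missing_effect(total_deaths, risk_reduction=0.25, one_sided_alpha=0.01):
    """Approximate probability of NOT reaching one-sided significance.

    Assumes var(log risk ratio) ~ 4 / total_deaths (equal allocation), so the
    expected z statistic is |log(1 - risk_reduction)| * sqrt(total_deaths) / 2.
    """
    normal = NormalDist()
    expected_z = abs(log(1 - risk_reduction)) * sqrt(total_deaths) / 2
    critical_z = normal.inv_cdf(1 - one_sided_alpha)
    power = normal.cdf(expected_z - critical_z)
    return 1 - power


if __name__ == "__main__":
    for deaths in (50, 150, 350, 650, 1000):
        print(deaths, round(prob_missing_effect(deaths), 2))
```

Under these assumptions the approximation agrees with the two probabilities that survive in the table: over 0.9 at 50 deaths and under 0.1 at 650 deaths.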

30 Actual effects of trial size on trial results of long term β-blockade
Table columns: Total deaths (β-bl. ± plac); (Mean no. randomized); Statistical power; No. of trials resulting in: P < 0.05 against, Non-sigt. against, Non-sigt. favourable, P < 0.05 favourable (trial counts not recoverable from the transcription)

Total deaths   (Mean no. randomized)     Statistical power
0-50           (255)                     Utterly inadequate
               (861)                     Probably inadequate
               (2925)                    Possibly adequate, possibly not
               (No such trials exist)    Probably adequate
Over 650       (No such trials exist)    Definitely adequate
Total          (866)                     Adequate only in aggregate

Yusuf et al, Stat Med 1984