
Animal Research Ethics Committee

Guidelines for Submitting Protocols for ethical approval of research or teaching involving live animals

AREC Document No: 3. Revised

Guidelines for Submitting Protocols for approval by the Animal Research Ethics Committee

General guidelines:
- Where possible, protocols should be submitted several months prior to the proposed date of commencement of research.
- Even if several individuals will be involved with the project, it is only necessary to submit one protocol form, but the total number of animals to be used should be indicated.
- Protocols should be submitted on the standardised electronic protocol form. Protocols will be assigned a reference number (AREC-P-), to be quoted in all future correspondence concerning the protocol.
- Please complete the form in the boxes provided (make the boxes larger or smaller as needed) in a font size of at least 12 point. All sections of the form must be completed (except Section 18, which is optional). Guidelines to each section are provided on the form and should be deleted in the final submitted version.

Section 16. Experimental design and statistical analyses. Give a short overview of the experimental design and the statistical analyses to be used to analyse the data. In experiments where there are multiple groups, indicate which groups are to be compared.

Section 17. Sample size calculation. Justification for the number of animals to be used should be provided. It is advisable to consult a statistician before completing this section. An outline of the proposed statistical analysis of results should be presented. Note: an Appendix section is being constructed to provide some additional sample calculations for this section.

Section 20. Acknowledgement. Once the application has been approved by the AREC, the applicant will be required to complete this section and deliver it with original signatures to the Office of Research Ethics.

Appendix I - Planning of experiments (Revised 10 April 2006)

Why do we need statistics?
Experiments that use too few animals may fail to pick up biologically important effects, while those that use too many, or use them incorrectly, may be subjecting animals to unnecessary pain, distress or lasting harm. The method of assigning treatments to the experimental units and the decision about how many units should be used are statistical issues. The method of assigning treatments to units is called experimental design. A good choice can reduce the number of animals used in the experiment.

Example: The aim of this research is to develop drugs that improve symptoms of disease A. Specific aim: to investigate the effects of a drug on lung functioning variables in rats with disease A. Experimental design: rats will be divided into 3 groups, each of size 20, with each group exposed to a different level of a carcinogen. The animals in each group will then be treated with 3 different doses of a drug (n = 5 per dose); the remaining 5 will serve as the control for each group. Lung functioning variables will be measured on each rat. (See the randomization sketch below.)

Terminology
The experimental unit is the smallest division of the experimental material such that any two units may receive different treatments in the actual experiment. Here the animals are the experimental units; the procedures, drugs, etc. under comparison are the treatments. Example: suppose that, in order to estimate the effect of a drug on rats, measurements are taken at several time points after administration of the drug. These time points are not the experimental units, because the time points corresponding to the same rat always receive the same treatment. Sample size calculations will refer to the number of rats required.

Continuous or discrete variables
Hormone levels, the level of protein A in the blood, etc. are continuous. Water maze learning (yes/no) is discrete. Time to water maze learning is continuous. Continuous variables typically require smaller sample sizes to show an effect, as they carry far more information than a simple dichotomy such as success or failure.
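The allocation in the example above can be randomized in code. The following is a minimal sketch assuming Python; the group and treatment labels (e.g. "carcinogen_low", "dose_1") are hypothetical placeholders, not names taken from the protocol form.

```python
import random

random.seed(1)  # fixed seed so the allocation can be reproduced

# Design from the example: 3 carcinogen-level groups of 20 rats each;
# within each group, 3 drug doses plus an untreated control, n = 5 per arm.
arms = ["dose_1"] * 5 + ["dose_2"] * 5 + ["dose_3"] * 5 + ["control"] * 5

allocation = {}
for group in ("carcinogen_low", "carcinogen_mid", "carcinogen_high"):
    assigned = arms[:]            # 20 treatment labels for this group's rats
    random.shuffle(assigned)      # randomize which rat receives which arm
    allocation[group] = assigned  # assigned[i] is the arm for rat i

for group, assigned in allocation.items():
    print(group, assigned)
```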

Types of design
We can reduce the effect of uncontrolled variation on the error of the treatment comparisons. The general idea is to group the units into sets, with all the units in a set being as alike as possible, and then to assign the treatments so that each occurs once in each set.

Blocking
When planning a controlled experiment the experimenter can often predict roughly the behaviour of the experimental material. In identical environments, young male rats are known to gain weight faster than female rats. Such knowledge can be used to increase the accuracy of an experiment. If x treatments are to be compared, the experimental units are first arranged into groups of x. Units assigned to the same group should be as similar in responsiveness as possible. Each treatment is then allocated by randomization to one unit in each group. The group is called a block, and the experimental plan is called a randomized block design. Example: if the effects of several drugs on the growth of young animals are to be compared, the drugs could be applied by separate and independent randomization to animals born in the same litter. Then differences in responses to the drugs will be much less affected by random genetic differences between animals in their propensity to grow.

Covariance analysis
Precision can be increased by noting relationships between the variable under examination and some other variable that is unaffected by the treatments. This relationship can be exploited using covariance analysis. Example: in an experiment on milk production, milk yield in the experimental period may be related to milk yield at the start of lactation, before the animal was on the experiment. An important source of error is that, by the luck of the draw, some treatments will have been allotted to a more productive set of cows than others. By adjusting the treatment mean yields so as to remove these differences in yielding ability, we obtain a lower experimental error and a more precise comparison among the treatments. This differs in two important ways from blocking. First, its use is restricted to quantitative prognostic variables, whereas blocking can be used with categorical or quantitative prognostic variables. Second, it is a purely statistical method of control and does not require any special arrangement of the experimental units.

Factorial design
Experiments in which more than one treatment factor is studied, e.g. levels of drug concentration and method of drug delivery, can give great increases in efficiency and can also provide information on whether the factors interact or act independently of each other. This is the subject of the methodology given the name factorial design. Example: two drugs, each at 3 doses, are administered to mice. It is of interest to know whether the differences between drugs are constant over the doses or whether there is an interaction, e.g. a bigger difference between drug 1 and drug 2 at dose 1 than at dose 3.
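As an illustration of how such an interaction can be examined, the following is a minimal sketch of a two-way analysis of variance on simulated data, assuming Python with numpy, pandas and statsmodels; the cell means, error standard deviation and group sizes are invented purely for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Simulated responses for 2 drugs x 3 doses, n = 5 mice per cell.
# The drug difference is made to shrink as the dose increases,
# i.e. the data contain a drug-by-dose interaction.
rows = []
for drug in ("drug_1", "drug_2"):
    for dose in (1, 2, 3):
        cell_mean = 10 + (2.0 if drug == "drug_1" else 0.0) * (4 - dose) / 3
        for _ in range(5):
            rows.append({"drug": drug, "dose": dose,
                         "response": rng.normal(cell_mean, 1.0)})
df = pd.DataFrame(rows)

# Two-way ANOVA with an interaction term; the C(drug):C(dose) row tests
# whether the difference between the drugs is constant across the doses.
model = smf.ols("response ~ C(drug) * C(dose)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```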

Repeated measures
Repeated measures data, in which the same response variable is recorded on each observational unit on several different occasions, occur frequently and can be used in combination with a blocking or factorial structure of the units. Everitt (1995) discusses such data in detail. Example: a neurological variable is measured on rats on six consecutive days following treatment.

Statistical tests
First we have to decide on the significance level of the test. This is the probability of declaring that there is a difference between groups when in fact the difference is zero. This probability (denoted α) is typically set at a suitably small value such as .05. It can be set lower (.01) if the experimenter needs to avoid concluding that an effect exists when there is not one (as when introducing a new drug with serious side effects). The lower this is set, the larger the sample size will be. No matter how small the difference between the groups is, samples of sufficiently large size can virtually guarantee statistical significance.
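This last point can be illustrated with a small simulation, a sketch assuming Python with NumPy and SciPy; the sample sizes and the trivially small true difference (0.02 standard deviations) are chosen only for illustration.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)

# Two populations whose true means differ by only 0.02 standard deviations.
for n in (20, 200_000):
    group_a = rng.normal(0.00, 1.0, size=n)
    group_b = rng.normal(0.02, 1.0, size=n)
    t_stat, p_value = ttest_ind(group_a, group_b)
    print(f"n per group = {n}: p-value = {p_value:.3g}")
```

With 20 animals per group the trivial difference will typically not approach significance, while with 200,000 per group it is virtually always declared highly significant.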

Therefore the investigator must specify just what difference is of sufficient importance to be worth detecting, and must also specify the probability he desires of actually detecting it. This probability, denoted 1−β, is called the power of the test.

Choice of significance level and power (α and β)
One rule of thumb is to set β = 4α. Thus when the significance level is α = .01 we take power = .95; for α = .02, set power = .90; and for α = .05, set power = .80.

Sample size calculation
Having specified the quantities α, 1−β, and the minimum difference between the groups he deems important, the investigator may use tables (Fleiss 1973, Table A.3; Snedecor and Cochran 1980) to find the sample sizes necessary to ensure that (1) any smaller sample sizes will reduce the chances below 1−β of detecting the specified difference, and (2) any larger sample sizes may increase the chance of declaring a trivially small difference to be significant.

Cost
Frequently an investigator is restricted to working with sample sizes dictated by a prescribed budget or by a set time limit. He will still find the values in the tables useful, for he can use them to find those differences that he has a reasonable probability of detecting, and thus obtain a realistic appraisal of the chances of success of his study.

Specifying a difference worth detecting
An investigator will often have some idea of the order of magnitude of the variable or proportions he is studying. This knowledge may come from his own or someone else's previous research, from an accumulation of clinical/laboratory experience, or from small-scale pilot work. Given at least some information, the investigator can, using his imagination and expertise, come up with an estimate of a difference between two groups that is scientifically or clinically important. Given no information, the investigator has no basis for designing his study intelligently, and would be hard put to justify it at all (Fleiss, 1973).

Sample size to compare two groups on a continuous outcome variable
Let σ² be the assumed common variance of the variable being measured in the two groups. This value is usually obtained from a similar study reported in the literature or from a pilot study. Let μ₁ and μ₂ be the two means in the two groups judged clinically to be so different that the statistical test should give a significant result. Here μ₁ − μ₂ represents the difference worth detecting. Let z_p denote the value cutting off the proportion p in the upper tail of the standard normal distribution. The formula

N = 2σ²(z_α/2 + z_β)² / (μ₁ − μ₂)²

approximates the required number of animals in each of two groups under the assumption of a normally distributed response variable.

Example: two groups are to be compared using a two-tailed test at the 5% significance level (α = .05) and a power of 80% (1 − β = .8) is demanded. Suppose that (μ₁ − μ₂)/σ = 1.3. From normal tables z_α/2 = z_.025 = 1.96 and z_β = z_.20 = 0.842, which yields a required sample size per group of

n = 2(1.96 + 0.842)² / (1.3)² = 15.7 / 1.69 = 9.3, i.e. 10 animals per group.

The difference μ₁ − μ₂ is sometimes expressed as a percentage of μ₁, i.e. d = (μ₁ − μ₂)/μ₁. The coefficient of variation, CV, is by definition σ/μ. Therefore the formula can be written as

N = 2(CV/d)²(z_α/2 + z_β)², i.e. N = 15.7(CV/d)² for α = .05 and power 80%.

Example: two groups are to be compared using a significance level of 5% and a power of 80%, with CV = 15% and d = 10%. Then the sample size per group is n = 15.7(15/10)² ≈ 36.

Note on the coefficient of variation: for data from different populations or sources, the mean and standard deviation often tend to change together, so that the coefficient of variation is relatively stable or constant; e.g. the coefficient of variation of male and female rats at age 12 weeks may be 4%. The CV is about 10%-20% for many continuous biological responses, but each scientist needs to establish its level for the response variables they use.

Sample size to compare two proportions
Let p₁ and p₂ be the two proportions in the two groups judged clinically to be so different that the statistical test should give a significant result. Here p₁ − p₂ represents the difference worth detecting. Table A.3 of Fleiss (1973) gives the required sample size for specified α and power. An approximate formula is to use, as above,

N = 2σ²(z_α/2 + z_β)² / (p₁ − p₂)² with σ² = .25,

i.e. with α = .05 and power = .8, N = 15.7(0.25)/(p₁ − p₂)².
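The formulas above can be evaluated directly. The following is a minimal sketch assuming Python with SciPy; the function names are illustrative, and the calls reproduce the worked examples in this section.

```python
from scipy.stats import norm

def n_per_group_means(sigma, diff, alpha=0.05, power=0.80):
    """N = 2*sigma^2*(z_{alpha/2} + z_beta)^2 / (mu1 - mu2)^2 per group."""
    z_a = norm.ppf(1 - alpha / 2)   # z_{alpha/2}, e.g. 1.96 for alpha = .05
    z_b = norm.ppf(power)           # z_beta, e.g. 0.842 for power = .80
    return 2 * sigma**2 * (z_a + z_b)**2 / diff**2

def n_per_group_cv(cv, d, alpha=0.05, power=0.80):
    """Same formula written in terms of CV = sigma/mu and d = (mu1 - mu2)/mu1."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return 2 * (cv / d)**2 * (z_a + z_b)**2

def n_per_group_props(p1, p2, alpha=0.05, power=0.80):
    """Approximation for two proportions, taking sigma^2 = 0.25."""
    return n_per_group_means(0.5, p1 - p2, alpha, power)

print(n_per_group_means(sigma=1.0, diff=1.3))  # 9.3 -> 10 animals per group
print(n_per_group_cv(cv=0.15, d=0.10))         # 35.3 -> 36 animals per group
print(n_per_group_props(0.60, 0.70))           # ~393, cf. the example below
```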

Example: the success rate p₁ associated with a standard treatment might be p₁ = .60. If the investigator will view an alternative treatment (with success rate p₂) as superior to the standard only if it succeeds in removing the symptoms of at least one quarter of those animals who would not otherwise show success, then he is in effect specifying a value p₂ = .60 + .25(1 − .60) = .70 as one that is different to a practically important extent from p₁ = .60. From Table A.3 in Fleiss (1973) the sample size per group is 395 (α = .05 and power .8); using the approximate formula, N = 15.7(0.25)/(.60 − .70)² = 393. If he can afford to study no more than a total of 400 animals, his chances of detecting the hypothesized difference become less than 50:50.

N.B. In any one experiment there may be several outcome variables, measured on the same experimental units, to be compared over groups. In this case the primary outcome variables should be identified. A sample size calculation should then be done for each of these outcome variables, and the maximum of these sample sizes is the sample size required for the experiment.

Multiple comparisons
Suppose four drugs are administered to groups of rats and the responses of the four groups are to be compared, e.g. weight gain after 2 weeks. Suppose none of the drugs is a natural control against which the others might be compared, nor is there any natural ordering to them. Then data-snooping may be undertaken: comparisons among the groups that were suggested by knowledge of the subject matter, as well as comparisons that were suggested by the data, can be made. However, we need to control for multiple comparisons. If interest is in pair-wise differences between the means and there are many such differences, then Tukey's method could be used (Fleiss 1986, p. 58). If interest is exclusively in whether any of several treatment groups differ in mean level from a control, then Dunnett's method should be used (Fleiss 1986). If the number of sensible comparisons between the group means is small (say < 10), then the Bonferroni criterion can be used (Fleiss 1986). If, for example, the Bonferroni criterion is to be used and it is proposed to make 10 comparisons, then we need to set the significance level to α/10, e.g. .05/10 = .005. The value from normal tables is then z_.0025 = 2.807, and the factor 15.7 in the sample size formula for comparing two groups becomes 26.6, so that N = 26.6(CV/d)² (see the sketch below). These methods control the experiment-wise error rate (at, say, 5%); newer methods control the false discovery rate.

The problem of multiple comparisons may also arise in the analysis of repeated measures data. For example, a commonly used method of analysing repeated measures involving a number of treatment groups is to compare the groups at each time point using t-tests. These tests are not independent, and so their interpretation is difficult. A more appropriate approach is to construct a summary measure of each animal's response profile (e.g. the difference between the last and first time points, or the maximum value over time). The analysis of treatment differences can then be based on a t-test applied to this single measure; indeed, sample size calculations can also be based on this summary measure. This and other methods of analysing repeated measures data that avoid multiple comparisons are discussed in Everitt (1995).
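As a check on the Bonferroni adjustment described above, the following sketch (assuming Python with SciPy; the function name is illustrative) recomputes the factor 2(z_α/2 + z_β)² when the per-comparison significance level is α divided by the number of comparisons.

```python
from scipy.stats import norm

def sample_size_factor(n_comparisons, alpha=0.05, power=0.80):
    """Factor in N = factor * (CV/d)**2 with a Bonferroni-adjusted level."""
    adj_alpha = alpha / n_comparisons   # e.g. .05/10 = .005
    z_a = norm.ppf(1 - adj_alpha / 2)   # e.g. z_.0025 = 2.807
    z_b = norm.ppf(power)               # e.g. z_.20 = 0.842
    return 2 * (z_a + z_b)**2

print(sample_size_factor(1))   # ~15.7, the unadjusted factor
print(sample_size_factor(10))  # ~26.6, for 10 Bonferroni-adjusted comparisons
```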

Sample size to find an association between two variables
You need to specify the following:
- the standard deviation of the dependent variable, σ₁
- the standard deviation of the independent variable, σ₂
- the minimal detectable difference, s (the change in the dependent variable per unit change in the independent variable).

Then for a two-sided test of zero association at the 5% level and power 80% the required sample size is

N = 1 + ((z_α/2 + z_β)σ₁ / (σ₂ s))², where for these levels z_α/2 + z_β = 1.96 + 0.842 = 2.80.

Simulation
Some experiments have a complex design, and the statistical analysis may not be simply that of comparing two means or proportions or detecting an association between two variables. To compute the necessary sample sizes in such cases, simulation or other computer-based methods may have to be used.

Example: it is desired to compare each of 3 treatment groups to a control group. The failure rate in the control group is expected to be 90%, while that in the treatment groups is expected to be 25%. It is proposed to use 8 animals in the control group and 4 in each treatment group. Find the power of this experiment to detect any difference. We will use simulation.

Solution (a sketch in code follows the references below):
1. Generate 3 samples from a binomial distribution of size 4 with p = .25, i.e. the 3 treatment groups.
2. Generate a sample of size 8 from a binomial with p = .90, i.e. the control group.
3. Carry out Fisher's exact test 3 times, once for each treatment versus control.
4. If any of the 3 p-values is significant, this counts as a significant result.
5. Repeat steps 1-4 a large number of times, say 1,000, and count the proportion of significant results. This is the power of the test.

References
Everitt, B. (1995). The analysis of repeated measures: a practical review with examples. The Statistician, 44.
Fleiss, J. (1973). Statistical methods for rates and proportions. Wiley.
Fleiss, J. (1986). The design and analysis of clinical experiments. Wiley.
Snedecor, G.W. and Cochran, W.G. (1980). Statistical Methods. The Iowa State University Press.
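The following is a minimal sketch of the simulation in steps 1-5 above, assuming Python with NumPy and SciPy; the function name and the number of replicates are illustrative.

```python
import numpy as np
from scipy.stats import fisher_exact

rng = np.random.default_rng(4)

def simulated_power(n_sims=1000, alpha=0.05,
                    n_control=8, p_control=0.90,
                    n_treat=4, p_treat=0.25, n_treat_groups=3):
    """Power of comparing each treatment group with the control using
    Fisher's exact test, declaring success if any comparison is significant."""
    significant_runs = 0
    for _ in range(n_sims):
        control_failures = rng.binomial(n_control, p_control)
        any_significant = False
        for _ in range(n_treat_groups):
            treat_failures = rng.binomial(n_treat, p_treat)
            table = [[control_failures, n_control - control_failures],
                     [treat_failures, n_treat - treat_failures]]
            _, p_value = fisher_exact(table)
            if p_value < alpha:
                any_significant = True
        if any_significant:
            significant_runs += 1
    return significant_runs / n_sims

print(simulated_power())  # estimated power of the proposed design
```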

Appendix II - Schematic of the procedures to be followed before conducting animal experiments

1. Submit the protocol application to the AREC. The protocol application is reviewed by the AREC. In parallel, apply for the appropriate license from the DOH.
2. Outcome of the AREC review:
   - Approved: the application is approved. The authors submit a completed Section 20.
   - Conditionally approved: the application is approved subject to satisfying defined criteria. The authors submit a revised protocol (including a signed Section 20) with a letter outlining the responses to the criteria.
   - Not approved: the application is rejected for defined reasons. The authors submit a revised application and a letter outlining the responses to the reasons for rejection of the initial application.
3. If necessary, apply for an additional appropriate license from the DOH.
4. The protocol is approved by the Chairperson of the AREC (or designate), or not approved for failing to satisfy the defined criteria.
5. Receive the appropriate license(s) from the DOH.
6. The authors supply the Designated Animal Facility with the approved protocol and a copy of the license(s).
7. Start the experiment.