Overview. Presenter: Bill Cheney. Audience: Clinical Laboratory Professionals. Field Guide To Statistics for Blood Bankers

Field Guide To Statistics for Blood Bankers
A Basic Lesson in Understanding Data and
P.A.C.E. Program: 605-022-09
Presenter: Bill Cheney
Audience: Clinical Laboratory Professionals

Overview
  Statistics is the science of describing data.
  Graphs are visual tools for presenting data.
  "Statistics: The only science that enables different experts using the same figures to draw different conclusions." Evan Esar (1899-1995)

Objectives
  Describe basic statistical terms
  Recognize the parts of a graph
  Identify different types of graphs
  Identify correct questions to ask regarding statistics

Definitions: Descriptive and Inferential Statistics
  What we know versus what we can guess

Descriptive and Inferential Statistics: What We Know Versus What We Can Guess
  Do we have all of the data?
  What are our assumptions?
  Are we using the data to evaluate past performance or to predict future performance?

Definitions: Population and Sample
  The whole versus a piece

Population vs. Sample: The Whole Versus A Piece
  Is the population clearly defined?
  Is the population too big to study?
  Does the sample represent the whole population?
  Can I tell whether the data is the whole population or a sample?

Definitions: Continuous Data and Discrete Data
  Infinite choices versus limited options

Continuous vs. Discrete Data: Infinite Choices Versus Limited Options
  Is the data discrete or continuous?
  Is there any way to measure this value instead of assigning a name?
  Can the values be ranked in a logical order?
  Were all of the measurements made following the same rules in the same way?

Definitions: Mean, Mode, and Median
  The average versus the popular versus the mid-point

Mean, Mode, Median: The Average Versus The Popular Versus The Mid-Point
  Are there any extreme values that are skewing the mean to give a false picture?
  Is the data sorted into too many or too few groups to find the most common value?
  How does the median compare to the mean?
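The effect of an extreme value on the mean can be seen in a minimal Python sketch using the standard-library statistics module. The donation times below are invented for illustration and do not come from the presentation; the last value is a deliberate outlier.

```python
import statistics

# Hypothetical donation times in minutes (illustrative values only).
# The final value, 120, is a deliberate extreme value.
times = [45, 48, 50, 50, 52, 55, 120]

mean_time = statistics.mean(times)      # the average
median_time = statistics.median(times)  # the mid-point of the sorted values
mode_time = statistics.mode(times)      # the most common value

print(f"mean={mean_time:.1f} median={median_time} mode={mode_time}")
```

In this made-up data set the single 120-minute value pulls the mean up to 60 minutes, while the median and the mode both stay at 50 minutes.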

Definitions: Variance and Standard Deviation
  How far or how close

Variance vs. Standard Deviation: How Far Or How Close
  What is the best way to describe the variations in the data?
  How large is the variation compared to the mean?

Symbols
  N
  n
  x
  x̄ or x bar
  x̃ or x tilde
  s or SD
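As a rough illustration of "how far or how close", the sketch below computes the variance and standard deviation with Python's statistics module. The hemoglobin values are hypothetical, and the coefficient of variation is included only as one common way to express the variation relative to the mean.

```python
import statistics

# Hypothetical hemoglobin results in g/dL (illustrative values only).
values = [12.0, 13.0, 14.0, 15.0, 16.0]

mean = statistics.mean(values)
sample_var = statistics.variance(values)  # sample variance, divides by n - 1
sample_sd = statistics.stdev(values)      # s: square root of the sample variance
pop_sd = statistics.pstdev(values)        # treats the data as the whole population (divides by N)
cv_percent = 100 * sample_sd / mean       # variation expressed as a percentage of the mean

print(f"mean={mean:.1f} variance={sample_var:.2f} s={sample_sd:.2f} CV={cv_percent:.1f}%")
```

Whether to divide by n - 1 or by N depends on whether the data is a sample or the whole population, which is the question raised in the Population vs. Sample slide.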

Precision and Accuracy: Repeatable Versus On Target
  What standard was used to measure accuracy?
  How precise can our measurements be with the tools we are using?

Example: Precision and Accuracy in Binomial Data

                   Contaminated Product   Safe Product
  Positive Test    True Positive          False Positive
  Negative Test    False Negative         True Negative

Example #1: Joe's Bacterial Detection Testing Kit Evaluation

                   Contaminated Product   Safe Product
  Positive Test    TP = 2                 FP = 1
  Negative Test    FN = 3                 TN = 94

  Accuracy = (TP + TN)/(TP + FP + FN + TN)
  Precision = TP/(TP + FP)

Definitions: Sensitivity and Specificity in Binomial Data

                   Contaminated Product   Safe Product
  Positive Test    True Positive          False Positive
  Negative Test    False Negative         True Negative

Example: Sensitivity and Specificity in Binomial Data

                   Contaminated Product   Safe Product
  Positive Test    TP = 2                 FP = 1
  Negative Test    FN = 3                 TN = 94

  Sensitivity = TP/(TP + FN)
  Specificity = TN/(FP + TN)
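The four formulas above can be sketched as a small Python function. The function name and the dictionary layout are illustrative choices, not part of the presentation; the counts are those given for Joe's kit.

```python
def binomial_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, sensitivity, and specificity as fractions."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,    # all correct results / all results
        "precision": tp / (tp + fp),      # true positives / all positive tests
        "sensitivity": tp / (tp + fn),    # true positives / all contaminated products
        "specificity": tn / (fp + tn),    # true negatives / all safe products
    }

# Counts from Example #1 (Joe's kit).
joe = binomial_metrics(tp=2, fp=1, fn=3, tn=94)
for name, value in joe.items():
    print(f"{name}: {value:.0%}")
```

Calling the same function with tp=4, fp=4, fn=1, tn=91 gives the figures for the second kit evaluation.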

Example #1: Joe's Bacterial Detection Testing Kit
  Accuracy: 96%
  Precision: 67%
  Sensitivity: 40%
  Specificity: 99%

Example #2: Mary's Bacterial Detection Testing Kit Evaluation

                   Contaminated Product   Safe Product
  Positive Test    TP = 4                 FP = 4
  Negative Test    FN = 1                 TN = 91

  Accuracy = (4 + 91)/(4 + 4 + 1 + 91) = 0.95 or 95%
  Precision = 4/(4 + 4) = 0.50 or 50%
  Sensitivity = 4/(4 + 1) = 0.80 or 80%
  Specificity = 91/(4 + 91) = 0.96 or 96%

Example #2: Mary's Bacterial Detection Testing Kit
  Accuracy: 95%
  Precision: 50%
  Sensitivity: 80%
  Specificity: 96%

Compare

                 Joe's   Mary's
  Accuracy       96%     95%
  Precision      67%     50%
  Sensitivity    40%     80%
  Specificity    99%     96%

  How sensitive and how specific does the method need to be?
  What is the effect or risk or cost of increasing specificity or sensitivity?
  What is the risk of too many false negatives?
  What is the risk of too many false positives?

The Visual Presentation of Data
  Anatomy of a Graph

The Anatomy of a (Bad) Graph
[Chart # 1: bar chart; the y-axis ticks shown run from 40 to 100, the x-axis is labeled only 1 to 4]
  Problems noted on the chart:
    x-axis data vague
    No legend
    y-axis not labeled
    No zero point
  Where is zero?
  Are the bars visually proportional to the data they represent?
  Do I know what is included in the data?
  Are the scales linear?
  Is the graph straightforward, or has it been manipulated to mislead me?
  Is the graph more or less helpful than looking at raw data?

Histograms: A Poor Example of a Histogram
  Problems noted on the chart:
    No labels or titles
    Vague categories
    Span of years different
  Are the categories or bins meaningful?
  Are we comparing equal category sizes?
  Is the graph more or less useful than the data used to create the graph?

The Scatter Graph
[Chart: scatter graph with outliers marked]
  Are any possible correlations visible in the graph?
  Could other factors be influencing the correlation?
  Are there outliers that need to be investigated or verified?

The Normal Distribution
  68% within 1 SD of the mean
  95% within 2 SD of the mean
  99.7% within 3 SD of the mean

The Normal Curve
[Chart: normal curve of times in minutes, centered at 52]
  3 in 25 to 34 minutes
  13 in 34 to 43 minutes
  34 in 43 to 52 minutes
  34 in 52 to 61 minutes
  13 in 61 to 70 minutes
  3 in 70 to 79 minutes
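The 68%, 95%, 99.7% figures and the share of the curve in each bin can be checked with a short Python sketch. It assumes a mean of 52 minutes and a standard deviation of 9 minutes, values inferred from the bin edges on the Normal Curve slide rather than stated in the presentation; percent_between is a helper name made up for this example.

```python
from statistics import NormalDist

# Assumed parameters inferred from the slide's bins: mean 52 min, SD 9 min.
donation_times = NormalDist(mu=52, sigma=9)

def percent_between(dist, low, high):
    """Percentage of a normal distribution that falls between low and high."""
    return 100 * (dist.cdf(high) - dist.cdf(low))

# Share of the distribution within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    pct = percent_between(donation_times, 52 - k * 9, 52 + k * 9)
    print(f"within {k} SD: {pct:.1f}%")

# Share of the distribution inside each 9-minute bin shown on the slide.
bins = [(25, 34), (34, 43), (43, 52), (52, 61), (61, 70), (70, 79)]
bin_percents = [percent_between(donation_times, lo, hi) for lo, hi in bins]
for (lo, hi), pct in zip(bins, bin_percents):
    print(f"{lo} to {hi} minutes: {pct:.1f}%")
```

The computed bin percentages are theoretical values for an exactly normal curve and can be compared with the rounded figures on the slide.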

More Normal Curves
  Is this a graph of actual data, or is it a graph generated by a formula using mean and standard deviation?
  How many points can I expect to fall within my desired range, based on a normal distribution?
  How many points can I expect to fall outside my range, due only to chance?

Box Plot
  How much data was used to generate the box plots?
  Are the sample sizes equal?
  What might the boxes be hiding?
  What other graphs of the same data might be useful?
  What can I tell about the distributions of the different groups?

Run Chart
[Chart: run chart; x-axis: Sequence, 1 to 20; y-axis: Measurement, 0 to 50]
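The question of how many points fall outside a range due only to chance can be sketched in Python, assuming the data really is normally distributed and the range is set at plus or minus k standard deviations. The function name expected_outside is hypothetical.

```python
from statistics import NormalDist

def expected_outside(n_points, k_sd):
    """Expected number of points beyond +/- k_sd standard deviations by chance alone."""
    standard = NormalDist()  # standard normal: mean 0, SD 1
    fraction_inside = standard.cdf(k_sd) - standard.cdf(-k_sd)
    return n_points * (1 - fraction_inside)

# 20 plotted points with limits at 2 SD, and 1000 points with limits at 3 SD.
print(f"{expected_outside(20, 2):.2f}")
print(f"{expected_outside(1000, 3):.2f}")
```

With 20 points and limits at 2 SD, roughly one point is expected outside the limits even when nothing has changed in the process.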

Basic Control Chart
  Control charts are used for monitoring problems with products in manufacturing.

Control Chart for Attribute Data: np-chart
[Chart: Deferred Donors per 100 Donor Appointments; y-axis: NP - Defectives; upper control limit 17.37, center line 8.85, lower control limit 0.33]
  Counting defectives makes sense when ANY error results in a discarded product.
  Counting defects makes sense when individual errors may be correctable.

Control Chart for Attribute Data: p-chart
[Chart: Deferred Donor Rate per Blood Drive; y-axis: P - Defectives; upper control limit 0.186, center line 0.097, lower control limit 0.008]
  The p-chart plots the proportion or % of defectives in each group.
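The presentation does not state how the limits on these charts were calculated. The sketch below uses the conventional 3-sigma binomial formulas for np-charts and p-charts, which reproduce the np-chart values shown when the average deferral proportion is taken as 0.0885 (8.85 per 100 appointments). The function names are illustrative.

```python
import math

def np_chart_limits(p_bar, n):
    """(LCL, center line, UCL) for an np-chart with a fixed subgroup size n."""
    center = n * p_bar
    sigma = math.sqrt(n * p_bar * (1 - p_bar))
    return max(0.0, center - 3 * sigma), center, center + 3 * sigma

def p_chart_limits(p_bar, n):
    """(LCL, center line, UCL) for one p-chart subgroup of size n."""
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - 3 * sigma), p_bar, min(1.0, p_bar + 3 * sigma)

# Assumed inputs: 8.85 deferred donors per 100 appointments on average.
lower, center, upper = np_chart_limits(p_bar=0.0885, n=100)
print(f"np-chart: LCL={lower:.2f} CL={center:.2f} UCL={upper:.2f}")

# p-chart limits depend on each drive's size (the sizes here are hypothetical).
small_drive = p_chart_limits(p_bar=0.097, n=50)
large_drive = p_chart_limits(p_bar=0.097, n=200)
```

Because the p-chart limits are recomputed for each subgroup size, larger blood drives get narrower limits than smaller ones.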

Control Chart for Attribute Data: c-chart
[Chart: Errors per 100 Shipping Forms; y-axis: C - Defects (Errors per 100 Forms); upper control limit 17.77, center line 8.85, lower control limit 0.00]

Control Chart for Attribute Data: u-chart
  p-chart and u-chart: irregular control limits due to the varying sample sizes
  np-chart and c-chart: fixed control limits
  p-chart and np-chart: used to monitor defectives (binomial data)
  u-chart and c-chart: used to monitor defects or errors

  Does the chart show attribute or continuous data?
  Are the sample sizes equal?
  Are we counting each error separately, or are we counting bad units?

Control Charts for Continuous Data
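As with the other attribute charts, the slides do not show the limit calculation. A minimal sketch using the conventional 3-sigma Poisson formulas for c-charts and u-charts is given below; with an average of 8.85 errors per 100 forms it reproduces the c-chart values shown. The function names and the u-chart sample sizes are hypothetical.

```python
import math

def c_chart_limits(c_bar):
    """(LCL, center line, UCL) for a c-chart: defect counts in equal-sized samples."""
    sigma = math.sqrt(c_bar)
    return max(0.0, c_bar - 3 * sigma), c_bar, c_bar + 3 * sigma

def u_chart_limits(u_bar, n_units):
    """(LCL, center line, UCL) for one u-chart sample containing n_units units."""
    sigma = math.sqrt(u_bar / n_units)
    return max(0.0, u_bar - 3 * sigma), u_bar, u_bar + 3 * sigma

# Average of 8.85 errors per 100 shipping forms, as on the c-chart.
c_lower, c_center, c_upper = c_chart_limits(8.85)
print(f"c-chart: LCL={c_lower:.2f} CL={c_center:.2f} UCL={c_upper:.2f}")

# u-chart limits change with the sample size (the sizes here are hypothetical).
few_units = u_chart_limits(u_bar=0.0885, n_units=50)
many_units = u_chart_limits(u_bar=0.0885, n_units=400)
```

The lower limit is clipped at zero because a count of errors cannot be negative.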

Levey-Jennings Control Chart
[Chart: Low Control; y-axis: Hemoglobin, g/dl; x-axis: Sequence; lines shown: Mean, +2 SD, +3 SD, -2 SD, -3 SD]
  How were the control limits calculated?
  Is the current testing method stable?
  Is there enough variation in the initial data to create a useful control chart?
  What rules are applicable?
  Has anything changed in the testing process that would require changes to the chart?

Advanced Control Chart: Individuals and MR Charts
[Chart: Shipping Box Weights, April 2009, Individuals; y-axis: Individuals: Weight, lbs; upper control limit 10.61, mean (center line) 9.99, lower control limit 9.36]

Advanced Control Chart: Individuals and Moving Range Charts
[Chart: Shipping Box Weights, April 2009, Moving Ranges; y-axis: MR: Weight, lbs; upper control limit 0.77, center line 0.23, lower control limit 0.00]
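One conventional way to set limits for Individuals and Moving Range charts is sketched below. It uses the standard constants 2.66 (3 divided by d2 = 1.128) and 3.267 (D4) for moving ranges of two consecutive points; the presentation does not say which method its software used. The box weights are invented for illustration.

```python
def individuals_mr_limits(values):
    """Return (LCL, center line, UCL) for the Individuals chart and the MR chart."""
    # Moving range: absolute difference between each pair of consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mean = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "individuals": (mean - 2.66 * mr_bar, mean, mean + 2.66 * mr_bar),
        "moving_range": (0.0, mr_bar, 3.267 * mr_bar),
    }

# Hypothetical shipping box weights in lbs (illustrative values only).
weights = [10.0, 10.2, 9.9, 10.1, 9.8]
limits = individuals_mr_limits(weights)
for chart, (lcl, cl, ucl) in limits.items():
    print(f"{chart}: LCL={lcl:.2f} CL={cl:.2f} UCL={ucl:.2f}")
```

A real chart would be built from far more than five points; the short list only keeps the arithmetic easy to follow.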

Advanced Control Chart: X-Bar and R Charts
[Chart: Shipping Box Weights, April 2009, X-Bar (n=4); y-axis: X-Bar: Weight, lbs; upper control limit 10.31, center line 9.99, lower control limit 9.67]

Advanced Control Chart: X-Bar and R Charts
[Chart: Shipping Box Weights, April 2009, Range; y-axis: R: Weight, lbs; upper control limit 1.00, center line 0.44, lower control limit 0.00]

Advanced Control Chart: X-Bar and S Charts
[Chart: Shipping Box Weights, April 2009, X-Bar (n=5); y-axis: X-Bar: Weight, lbs; upper control limit 10.28, center line 9.99, lower control limit 9.70]
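The X-Bar and R limits shown for subgroups of 4 are consistent with the conventional control chart constants A2, D3, and D4. The sketch below is one way to apply them and is not taken from the presentation; the function name and table layout are illustrative.

```python
# Standard control chart constants for subgroup sizes 4 and 5.
XBAR_R_CONSTANTS = {
    4: {"A2": 0.729, "D3": 0.0, "D4": 2.282},
    5: {"A2": 0.577, "D3": 0.0, "D4": 2.114},
}

def xbar_r_limits(grand_mean, r_bar, n):
    """Return (LCL, center line, UCL) for the X-bar chart and the R chart."""
    k = XBAR_R_CONSTANTS[n]
    return {
        "xbar": (grand_mean - k["A2"] * r_bar, grand_mean, grand_mean + k["A2"] * r_bar),
        "range": (k["D3"] * r_bar, r_bar, k["D4"] * r_bar),
    }

# Center lines read from the Shipping Box Weights charts (n = 4).
limits = xbar_r_limits(grand_mean=9.99, r_bar=0.44, n=4)
for chart, (lcl, cl, ucl) in limits.items():
    print(f"{chart}: LCL={lcl:.2f} CL={cl:.2f} UCL={ucl:.2f}")
```

Rounded to two decimals, these match the limits on the X-Bar (n=4) and Range charts.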

Advanced Control Chart: X-Bar and S Charts
[Chart: Shipping Box Weights, April 2009, SD (Standard Deviation); y-axis: S: Weight, lbs; upper control limit 0.42, center line 0.20, lower control limit 0.00]
  How were the control limits calculated?
  Were samples averaged or were individual points plotted?
  Is the data continuous or discrete?
  Are the control limits wider or narrower than 3 standard deviations?
  Does the process appear to be stable?
  How do the control limits compare to the specification limits?
  Do I have any outliers that need to be investigated?
  Am I seeing any patterns or breaking any of my rules?

A Quick Note About Continuous versus Attribute Data Control Charts

Control Chart Example
[Chart: Donation times, July 2009; y-axis: Time, in minutes, 0 to 90; lines shown: Mean, Upper Control Limit, Lower Control Limit]

Control Chart Example
[Chart: Donation Times, July 2009, n = 5; y-axis: X-Bar: Time, minutes; upper control limit 62.85, center line 51.75, lower control limit 40.65]
[Chart: Donation Times, July 2009, n = 5; y-axis: Standard Deviation: Time, minutes; upper control limit 16.25, center line 7.78, lower control limit 0.00]

Control Chart Example
[Chart: Donation times, July 2009 (not random); y-axis: Time, in minutes, 0 to 90; lines shown: Mean, Upper Control Limit, Lower Control Limit]
[Chart: Donation Times, July 2009 (not random); y-axis: X-Bar: Time, minutes; upper control limit 60.92, center line 51.75, lower control limit 42.58]
[Chart: Donation Times, July 2009 (not random); y-axis: S: Time; upper control limit 13.42, center line 6.42, lower control limit 0.00]
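The X-Bar and S limits on the first pair of donation time charts are consistent with the conventional constants for subgroups of 5 (A3 = 1.427, B3 = 0, B4 = 2.089). The sketch below applies them to the center lines read from the charts; it is an illustration, not the presenter's stated method.

```python
# Standard X-bar and S chart constants for a subgroup size of 5.
A3, B3, B4 = 1.427, 0.0, 2.089

def xbar_s_limits(grand_mean, s_bar):
    """Return (LCL, center line, UCL) for the X-bar and S charts, subgroups of 5."""
    return {
        "xbar": (grand_mean - A3 * s_bar, grand_mean, grand_mean + A3 * s_bar),
        "s": (B3 * s_bar, s_bar, B4 * s_bar),
    }

# Center lines read from the Donation Times, July 2009, n = 5 charts.
limits = xbar_s_limits(grand_mean=51.75, s_bar=7.78)
for chart, (lcl, cl, ucl) in limits.items():
    print(f"{chart}: LCL={lcl:.2f} CL={cl:.2f} UCL={ucl:.2f}")
```

The same mean with a smaller average standard deviation, as in the "not random" charts, gives narrower X-bar limits, which is why how the subgroups were formed matters.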

Control Chart Summary
  Start with a stable, in-control process
  Collect at least 100 measurements
  Use software calculations to set control limits as narrowly as possible
  Investigate all outliers or patterns

Summary
  Statistics and graphs are tools for describing data
  Questions can help make the picture clearer

Thank You for attending the Field Guide to Statistics for Blood Bankers Class

To Receive PACE Credit
  Complete the Post-Test by clicking the Post-Test link on this page
  Complete the Evaluation by clicking the link in the My Course Status area
  Once you have completed the Post-Test and Evaluation, click the submit Post-Test and Evaluation link
  Your Post-Test will be scored automatically
  If you passed the test, you will be able to download your personalized P.A.C.E. certificate