Don't We Need to Remove the Outliers?

Quality Digest Daily, October 6, 2014
Manuscript 274

Characterization and estimation are different.

Donald J. Wheeler

Much of modern statistics is concerned with creating models which contain parameters that need to be estimated. In many cases these estimates can be severely affected by unusual or extreme values in the data. For this reason students are often taught to polish up the data by removing the outliers. Last month we looked at a popular test for outliers. In this column we shall look at the difference between estimating parameters and characterizing process behavior.

ESTIMATION

To illustrate how polishing the data can improve our estimates we will use the data of Figure 1. These values are 100 determinations of the weight of a ten-gram chrome steel standard known as NB10. These values were obtained once each week at the Bureau of Standards, by one of two individuals, using the same instrument each time. The weights were recorded to the nearest microgram. Since each value has the form 9,999,xxx micrograms, the four nines at the start of each value are not shown in the table; only the last three digits, in the xxx positions, are recorded. The values are in time order by column.

Weeks:   1-10   11-20   21-30   31-40   41-50   51-60   61-70   71-80   81-90   91-100
          591     602     592     597     595     596     596     588     592     599
          600     597     601     600     591     594     595     594     594     593
          594     593     601     590     601     593     608     591     599     588
          601     598     598     599     598     595     593     600     588     625
          598     599     601     593     593     589     594     592     607     591
          594     601     603     577     594     590     596     596     563     594
          599     600     593     594     587     590     597     599     582     602
          597     599     599     594     591     590     592     596     585     594
          599     595     601     598     596     599     596     592     596     597
          597     598     599     595     598     598     593     594     599     596

Figure 1: NB10 Values for Weeks 1 to 100

If we compute the usual descriptive statistics we find that the average of the tabled values is 595.4 micrograms and their standard deviation statistic is 6.47 micrograms. Using these two values to define a normal distribution we would end up with the curve shown superimposed upon the histogram in Figure 2. Both the area under the curve and the area of the histogram are the same. Yet the curve does not really match up with the histogram. It is too heavy in the regions around 585 and 605, and not high enough near 595.
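As an aside, these statistics are easy to verify. The following minimal sketch (standard-library Python, which the article does not otherwise use) reproduces them from the values of Figure 1, listed row by row, and also computes the "polished" statistics obtained after deleting the seven apparent outliers discussed below.

```python
# A minimal sketch of the descriptive statistics quoted above.
# The list reproduces Figure 1 row by row.
import statistics

nb10 = [
    591, 602, 592, 597, 595, 596, 596, 588, 592, 599,
    600, 597, 601, 600, 591, 594, 595, 594, 594, 593,
    594, 593, 601, 590, 601, 593, 608, 591, 599, 588,
    601, 598, 598, 599, 598, 595, 593, 600, 588, 625,
    598, 599, 601, 593, 593, 589, 594, 592, 607, 591,
    594, 601, 603, 577, 594, 590, 596, 596, 563, 594,
    599, 600, 593, 594, 587, 590, 597, 599, 582, 602,
    597, 599, 599, 594, 591, 590, 592, 596, 585, 594,
    599, 595, 601, 598, 596, 599, 596, 592, 596, 597,
    597, 598, 599, 595, 598, 598, 593, 594, 599, 596,
]

print(statistics.mean(nb10), statistics.stdev(nb10))          # 595.4, 6.47

# "Polish" the data: delete the four values below 586 and the three above 606.
polished = [x for x in nb10 if 586 <= x <= 606]
print(statistics.mean(polished), statistics.stdev(polished))  # 595.6, 3.74
```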

Figure 2: Histogram and Normal Curve for NB10 Values

The outliers in the histogram create the mismatch between the fitted model and the data. Seven values look like outliers in Figure 2. If we delete the four values below 586 and the three values above 606, and recompute our descriptive statistics, we find the revised histogram has an average of 595.6 micrograms and a standard deviation statistic of 3.74 micrograms. Using these two values to define a normal distribution we end up with the curve shown in Figure 3. Now we have a much better fit between our model and the histogram.

Figure 3: Histogram and Normal Curve for Revised NB10 Values

The whole operation of deleting outliers to obtain a better fit between the model and the data is based upon computations which implicitly assume that the data are homogeneous. However, when you have outliers, this assumption becomes questionable. If the data are homogeneous, where did the outliers come from? Thus, whether the data are homogeneous or not must be the primary question for any type of analysis. While this is the one question we do not address in our statistics classes, it is precisely the question considered by the process behavior chart.

THE CHARACTERIZATION OF PROCESS BEHAVIOR

What about the seven values we simply deleted in order to obtain the better fit between our assumed model and our revised data set? What were these values trying to tell us about this process? Here the question is not one of estimation, but rather one of using the data to characterize the underlying process represented by the data. Figure 4 contains the XmR Chart for the 100 values of Figure 1. The limits are based upon the Median Moving Range of 4.0 micrograms. Here we have clear evidence of at least three upsets or changes in the process of weighing NB10. Five of the seven outliers that we deleted in order to fit the model in Figure 3 are signals that reveal that this set of values is not homogeneous. This lack of homogeneity undermines the model of Figure 3 and makes it inappropriate. If you want to use your data to gain insight into the underlying process that creates the data, then the outliers are the most important values in the data set! Yet students are routinely taught to delete those pesky outliers. After all, when you are looking for iron and tin, you should not let silver and gold get in the way.

Figure 4: XmR Chart for 100 Weighings of NB10 (X chart: central line 595.4, limits 582.8 and 608.0; mR chart: median moving range 4.0, upper limit 15.5)
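The limits in Figure 4 can be checked with a short sketch that reuses the nb10 list from the block above. The scaling factors used here, 0.954 to convert a median two-point moving range into Sigma(X) and 3.865 for the upper limit of the mR chart, are the standard constants for limits based on a median moving range; they reproduce the values shown on the chart.

```python
# A minimal sketch of the Figure 4 limit computations, reusing `nb10` from
# the previous block. Figure 1 runs in time order down each column, so the
# row-by-row list is first rearranged into time order.
import statistics

in_time_order = [nb10[r * 10 + c] for c in range(10) for r in range(10)]

moving_ranges = [abs(b - a) for a, b in zip(in_time_order, in_time_order[1:])]
median_mr = statistics.median(moving_ranges)         # 4.0 micrograms
average = statistics.mean(in_time_order)             # 595.4 micrograms

sigma_x = median_mr / 0.954                          # approx. 4.2 micrograms
print(average - 3 * sigma_x, average + 3 * sigma_x)  # 582.8 and 608.0
print(3.865 * median_mr)                             # mR upper limit: 15.5
```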

DON'T THE OUTLIERS DISTORT THE LIMITS?

But don't we need to remove the outliers to compute good estimates of location and dispersion? No, we don't. To see why this is so it is helpful to consider the impact of outliers upon the limits of a process behavior chart. We commonly base our limits on the average and an average range. The average may be affected by some very extreme values, but this effect is usually much smaller than people think it will be. In Figure 1 some values are out of line with the bulk of the data by as much as 30 micrograms. However, the average value of approximately 595 micrograms was found by dividing 59,500 by 100. If the total of 59,500 is adjusted up or down by 30, 60, or even 90 units, it will have a very small effect upon the average. In this example deleting the outliers changed the average from 595.4 to 595.6. Thus, the average is a very robust measure of location, which is why we use it as our main statistic for location. Of course, whenever we have reason to think that the average may have been affected by the presence of several extreme values that are all on the same side, we can always use the median instead. Hence, while our primary measure of location is robust, we have an alternative for those cases where one is needed.

Likewise, when we compute an average range, we are once again diluting the impact of any extreme values that are present in the set of ranges. In general, a few large ranges will not have an undue impact upon the average range. However, if they do appear to have inflated the average range, we can resort to using the median range.
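A toy illustration of this robustness, under the assumption of a single extreme value among 100 otherwise routine weighings:

```python
# One value that is off by 30 micrograms moves the average of 100 values
# by only 30/100 = 0.3, and does not move the median at all.
import statistics

values = [595.0] * 100      # stand-in for 100 routine weighings
values[0] += 30             # one value out of line by 30 micrograms

print(statistics.mean(values))    # 595.3 -- the average shifts by only 0.3
print(statistics.median(values))  # 595.0 -- the median does not shift
```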

In Figure 4 the limits are based upon the Median Moving Range of 4.0 micrograms. This results in an estimated dispersion for the individual values of:

Sigma(X) = Median Moving Range / 0.954 = 4.0 / 0.954 = 4.2 micrograms

It is instructive to compare this with the two values for the standard deviation statistic computed from these data. Using all 100 values from Figure 1 we found s = 6.47 micrograms. Using only the 93 values shown in Figure 3 we found s = 3.74 micrograms. Thus, the Median Moving Range (based on all 100 values) gives an estimate for dispersion that is quite similar to the descriptive statistic computed after the outliers had been removed. This robustness, which is built into the computations for the process behavior charts, removes the need to polish the data prior to computing the limits. The computations work even in the presence of outliers and signals of exceptional variation.

DON'T WE NEED A PREDICTABLE RANGE CHART?

The fact that the computations work even in the presence of outliers is important in the light of the advice given in some SPC texts. These texts warn the student to check the Range Chart before computing limits for the X Chart or the Average Chart. If the Range Chart is found to display evidence of unpredictable behavior, then the student is advised to avoid computing limits for the Average Chart or the X Chart. The idea is that signals on the Range Chart will corrupt the average range and hence corrupt the limits on the other chart. This advice is motivated by a desire to avoid using anything less than the best estimates possible. However, the objective of a process behavior chart is not to estimate, but rather to characterize the process as being either predictable or unpredictable. Given the conservative nature of three-sigma limits we do not need high precision in our computations. Three-sigma limits are so conservative that any uncertainty in where our computed limits fall will not greatly affect the coverage of the limits. To characterize this effect Figure 5 shows the coverages associated with limits ranging from 2.8 sigma to 3.2 sigma on either side of the mean. There we see that regardless of the shape of the distribution, and regardless of the fact that there is uncertainty in our computations, our three-sigma limits are going to filter out virtually all of the routine variation. This is what allows us to compute limits and characterize process behavior without having to first delete the outliers. The computations are robust, and as a consequence, the technique is sensitive.
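These coverages can be checked directly. The sketch below (using scipy, which is not part of the article) computes the probability between the 2.8-, 3.0-, and 3.2-sigma limits for the distributions of Figure 5 below. The Burr distribution is omitted because its shape parameters are not given here, and the right-triangular shape is approximated with scipy's triangular distribution at c = 0.

```python
# A sketch that reproduces the coverages in Figure 5: the probability of
# falling within k-sigma limits of the mean, for k = 2.8, 3.0, and 3.2.
from scipy import stats

distributions = {
    "Uniform":             stats.uniform(),      # flat on [0, 1]
    "Right Triangular":    stats.triang(c=0.0),  # right triangle on [0, 1]
    "Normal":              stats.norm(),
    "Chi-Square (4 d.f.)": stats.chi2(df=4),
    "Exponential":         stats.expon(),
}

for name, dist in distributions.items():
    mu, sigma = dist.mean(), dist.std()
    coverages = [dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)
                 for k in (2.8, 3.0, 3.2)]
    print(f"{name:20s}", "  ".join(f"{c:6.1%}" for c in coverages))
```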

Distribution            Coverage of 2.8-sigma to 3.2-sigma limits
Uniform                 100%
Right Triangular        100%
Normal                  99.5% to 99.9%
Burr                    98.9% to 99.4%
Chi-Square (4 d.f.)     98.2% to 98.9%
Exponential             97.8% to 98.5%

Figure 5: Three-Sigma Limits Filter Out Virtually All of the Routine Variation, Regardless of the Shape of the Histogram and Regardless of the Uncertainty in Our Estimates of the Limits

Thus, the advice to make sure the Range Chart is predictable prior to computing limits for the Average Chart or the X Chart is just another version of the "delete the outliers" argument. These arguments are built on a misunderstanding of the objective of process behavior charts and a failure to appreciate that the computations are already robust.

SUMMARY

So, should you delete outliers before you place your data on a process behavior chart? Only if you want to throw away the most interesting and valuable part of your data!

If you fail to identify the signals of exceptional variation as such, if you assume that a collection of data is homogeneous when it is not, then you are likely to have both your analysis and your conclusions undermined. The outliers are the interesting part of your data. In the words of George Box, "The key to discovery is to get alignment between an interesting event and an interested observer." Outliers tell you where to look in order to learn something about the underlying process that is generating your data. When you delete your outliers you are losing an opportunity for discovery.