CHAPTER 4. Labeling Methods for Identifying Outliers


4.1 Introduction

Data mining is the extraction of hidden predictive knowledge from large databases, and outlier detection is one of its powerful techniques. Many authors have defined outliers in different words. Hawkins (1980) defined an outlier as "an observation that deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism". Outliers are also referred to as discordants, deviants, abnormalities or anomalies in the data mining and statistics literature (Aggarwal (2005)).

Some outlier labeling methods, such as the Standard Deviation (SD) method, the MADe method and the Median rule, are commonly used. These methods are quite reasonable when the data distribution is normal. In non-random distributions, outliers can decrease normality. When data depart from a normal distribution, a transformation to normality is a common step, so that outliers can be identified using a method that is quite effective in a normal distribution.

Quesenberry & David (1961), discussing the rejection and location of outlying observations, noted that there might be several ways of approaching the problem, depending to a large extent on the object in view. One might be particularly interested in identifying the genuinely exceptional observations in order to gain new insight into the phenomena under study, on the basis of risks of misclassification rather than of estimation errors. Grubbs (1969) gave procedures for statistically determining whether the highest observation, the highest and lowest observations, the two highest observations, the lowest observations, or more of the observations in the sample are statistical outliers.

Most outlier labeling methods are informal tests: instead of testing a hypothesis, they generate an interval or criterion for outlier detection, and any observation beyond the interval or criterion is considered an outlier. Various location and scale parameters are employed in each labeling method to define a reasonable interval or criterion for outlier detection. There are two reasons for using an outlier labeling method.

1. To find possible outliers as a screening device before conducting a formal test.
2. To find the extreme values away from the majority of the data regardless of the distribution.

While formal tests usually require test statistics based on distribution assumptions and a hypothesis to determine whether the largest extreme value is a true outlier of the distribution, most outlier labeling methods present the interval using the location and scale parameters of the data. Although a labeling method is usually simple to use, some observations outside the interval may turn out to be falsely identified outliers after a formal test, when outliers are defined only as observations that deviate from the assumed distribution. However, if the purpose of outlier detection is not a preliminary step to find the extreme values violating the distribution assumptions of the main statistical analyses, such as the t-test, ANOVA and regression, but mainly to find the extreme values away from the majority of the data regardless of the distribution, the outlier labeling methods may be applicable. In addition, for a large data set that is statistically problematic, e.g. when it is difficult to identify the distribution of the data or to transform it into a proper distribution such as the normal distribution, labeling methods can be used to detect outliers.

Issues of Outliers

Iglewicz & Hoaglin (1993) categorized the following three issues with regard to outliers.

Outlier labeling: flags potential outliers for further investigation, for example as erroneous data or as indicative of an inappropriate distributional model.

Outlier accommodation: uses robust statistical techniques that will not be unduly affected by outliers. That is, if we cannot determine that potential outliers are erroneous observations, do we need to modify our statistical analysis to account for these observations more appropriately?

Outlier identification: formally tests whether observations are outliers.

This chapter focuses on the outlier labeling technique and on issues of outlier identification. Many real-time data sets contain outliers that have unusually large or small values compared with the other values in the data set. Outliers may have a negative effect on data analyses that rely on distribution assumptions, such as ANOVA and regression, or they may provide useful information about the data when we look into an unusual response in a given study. Thus, outlier detection is an important part of data analysis in both cases.

Several outlier labeling methods have been developed. Some methods are sensitive to extreme values, like the SD method, and others are resistant to extreme values, like Tukey's method. Although these methods are quite powerful with large normal data, it may be problematic to apply them to non-normal data or small sample sizes without knowledge of their characteristics in these circumstances. This is because each labeling method uses different measures to detect outliers, and the expected percentage of outliers changes according to the sample size and the distribution type of the data. Many kinds of public health data are skewed, usually to the right, and lognormal distributions can often be applied to such skewed data, for instance surgical procedure times, blood pressure, and the assessment of toxic compounds in environmental analysis.

Methods of Analysis

Several outlier labeling methods are available. Four of them are used in this study: Z-Scores, Modified Z-Scores, the Median Absolute Deviation (MADe) and Tukey's Method (Boxplot).

FIGURE 4.1: Flowchart for Outlier Labeling Methods

Z - Scores

A Z-score is a statistical measurement of a score's relationship to the mean in a group of scores. A Z-score of 0 means the score is the same as the mean. A Z-score can also be positive or negative, indicating whether the score is above or below the mean and by how many standard deviations. Z-scores, computed from the mean and standard deviation, can be used to identify outliers in a data set:

$$Z_{\mathrm{score}}(i) = \frac{x_i - \bar{x}}{s} \qquad (4.1)$$

where

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

The Z-score method rests on the property that if $X$ follows a normal distribution $N(\mu, \sigma^2)$, then $Z = (X - \mu)/\sigma$ follows the standard normal distribution $N(0, 1)$, and Z-scores that exceed 3 in absolute value are generally considered outliers. The method is simple, and it is the same rule as the 3 SD method when the criterion for an outlier is an absolute Z-score of at least 3. According to Shiffler (1988), the maximum possible Z-score depends on the sample size and is $(n-1)/\sqrt{n}$. Since no Z-score exceeds 3 in a sample of size 10 or less, the Z-score method is not very good for outlier labeling, particularly in small data sets. Another limitation of this rule is that the standard deviation can be inflated by a few observations, or even a single observation, having an extreme value. This can cause a masking problem, i.e., the less extreme outliers go undetected because of the most extreme outlier(s). Although it is common practice to use Z-scores to identify possible outliers, this can be misleading (particularly for small sample sizes) because the maximum Z-score is at most $(n-1)/\sqrt{n}$.

Interpretation of Z-Scores

Z-scores are interpreted as follows.

1. A z-score less than 0 represents an element less than the mean.
2. A z-score greater than 0 represents an element greater than the mean.
3. A z-score equal to 0 represents an element equal to the mean.
4. A z-score equal to 1 represents an element that is 1 standard deviation greater than the mean; a z-score equal to 2, 2 standard deviations greater than the mean; etc.
5. A z-score equal to -1 represents an element that is 1 standard deviation less than the mean; a z-score equal to -2, 2 standard deviations less than the mean; etc.
6. If the number of elements in the set is large and the data are approximately normal, about 68% of the elements have a z-score between -1 and 1; about 95% have a z-score between -2 and 2; and about 99.7% have a z-score between -3 and 3.

Standard scores are also called z-values, z-scores, normal scores, and standardized variables; the use of "Z" is because the standard normal distribution is also known as the "Z distribution". They are most frequently used to compare a sample to a standard normal deviate, though they can be defined without assumptions of normality.

FIGURE 4.2: Compares the various grading methods in a normal distribution. Includes: standard deviations, cumulative percentages, percentile equivalents, Z-scores, T-scores, standard nine, percent in stanine.

The z-score is only defined if one knows the population parameters; if one only has a sample, then the analogous computation with the sample mean and sample standard deviation yields the Student's t-statistic. The z-score is often used in the z-test in standardized testing, the analog of the Student's t-test for a population whose parameters are known rather than estimated. As it is very unusual to know the entire population, the t-test is much more widely used. A few applications of z-scores include the following:

1. What percentage of people fall below a specific value?
2. What values can be deemed extreme? For example, in an IQ test, what scores represent the top 5%?
3. What is the relative score of one distribution versus another? For example, Michael is taller than the average male and Emily is taller than the average female, but who is relatively taller within their own gender?

These types of questions can be answered using a z-score. As a general rule, z-scores less than -1.96 or greater than 1.96 are considered unusual and generally very interesting. Under normality such values lie in the outer 5% of the distribution, so they are commonly treated as statistically significant and as candidate outliers.

Modified Z - Scores

The problem with Z-scores is that the two estimators they use, the sample mean ($\bar{x}$) and the sample standard deviation ($s$), can be affected by a few extreme values or even by a single extreme value. To resolve this problem, the modified Z-scores employ the median and the median of the absolute deviations about the median (MAD) instead of the mean and standard deviation of the sample, respectively (Iglewicz & Hoaglin (1993)):

$$\mathrm{MAD} = \mathrm{median}\{|x_i - \tilde{x}|\} \qquad (4.2)$$

where $\tilde{x}$ is the sample median, and

$$M_i = \frac{0.6745\,(x_i - \tilde{x})}{\mathrm{MAD}} \qquad (4.3)$$

where the constant arises because $E(\mathrm{MAD}) = 0.675\sigma$ for large normal data. Iglewicz & Hoaglin (1993) suggested, on the basis of a simulation with pseudo-normal observations for sample sizes of 10, 20 and 40, that observations be labeled outliers when $|M_i| > 3.5$. The $M_i$ score is effective for normal data in the same way as the Z-score.

Median Absolute Deviation (MADe)

The Median Absolute Deviation (MADe) method is one of the basic robust methods, which are largely unaffected by the presence of extreme values in the data set. The approach is similar to the SD method, but the median and MADe are employed instead of the mean and standard deviation. It is defined as follows:

$$\text{2 MADe method:}\quad \mathrm{Median} \pm 2\,\mathrm{MAD}_e \qquad (4.4)$$

$$\text{3 MADe method:}\quad \mathrm{Median} \pm 3\,\mathrm{MAD}_e \qquad (4.5)$$

where $\mathrm{MAD}_e = 1.483 \times \mathrm{MAD}$ for large normal data and is an estimator of the spread of the data, similar to the standard deviation, with

$$\mathrm{MAD} = \mathrm{median}\{|x_i - \mathrm{median}(x)|\}, \quad i = 1, 2, \ldots, n \qquad (4.6)$$

The MAD is scaled by a factor of 1.483 (approximately 1/0.675) so that it is comparable to the standard deviation in a normal distribution. The MAD, also called the absolute deviation around the median, is a robust measure of dispersion.
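The three score-based rules above can be sketched in a few lines. The thesis reports using R; the sketch below uses Python with only the standard library purely as an illustration. The sample is hypothetical (it is not the thesis data), the function names are invented for this example, and the constants 0.6745 and 1.483 are the usual normal-consistency factors assumed here for equations (4.3) to (4.5).

```python
import math
import statistics


def z_scores(data):
    """Classical Z-scores as in eq. (4.1): (x_i - mean) / sample SD."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # uses the n - 1 divisor
    return [(x - mean) / sd for x in data]


def mad(data):
    """Median of absolute deviations about the median, eqs. (4.2)/(4.6)."""
    med = statistics.median(data)
    return statistics.median([abs(x - med) for x in data])


def modified_z_scores(data):
    """Modified Z-scores as in eq. (4.3), assuming the 0.6745 constant."""
    med = statistics.median(data)
    scale = mad(data)
    return [0.6745 * (x - med) / scale for x in data]


def made_interval(data, k):
    """Median +/- k * MADe, assuming MADe = 1.483 * MAD (eqs. 4.4, 4.5)."""
    med = statistics.median(data)
    made = 1.483 * mad(data)
    return med - k * made, med + k * made


# Hypothetical sample of n = 10 with one gross outlier (NOT the thesis data).
sample = [98, 102, 105, 107, 110, 112, 115, 118, 121, 525]

n = len(sample)
max_possible_z = (n - 1) / math.sqrt(n)  # Shiffler's bound on |Z|

z_flags = [x for x, z in zip(sample, z_scores(sample)) if abs(z) > 3]
m_flags = [x for x, m in zip(sample, modified_z_scores(sample)) if abs(m) > 3.5]
low, high = made_interval(sample, 3)
made_flags = [x for x in sample if x < low or x > high]

print("max possible |Z| for n = 10:", round(max_possible_z, 3))
print("Z-score rule flags:", z_flags)
print("modified Z-score rule flags:", m_flags)
print("3 MADe rule flags:", made_flags)
```

With ten observations the bound $(n-1)/\sqrt{n}$ is below 3, so the Z-score rule cannot flag anything in this sample, while the two median-based rules still flag the value 525.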

Tukey's Method (Box Plot)

Tukey's (1977) method of constructing a boxplot is a well-known, simple graphical tool for displaying information about continuous univariate data, such as the median, lower quartile, upper quartile, lower extreme and upper extreme of a data set. This method for finding outliers uses the interquartile range to filter out very large or very small numbers. The formulas are:

$$\text{Low outliers} = Q_1 - 1.5\,(Q_3 - Q_1) = Q_1 - 1.5\,(\mathrm{IQR}) \qquad (4.7)$$

$$\text{High outliers} = Q_3 + 1.5\,(Q_3 - Q_1) = Q_3 + 1.5\,(\mathrm{IQR}) \qquad (4.8)$$

where $Q_1$ is the first quartile, $Q_3$ is the third quartile, and IQR is the interquartile range. These equations give two values, or fences. A fence cordons off the outliers from the values contained in the bulk of the data. The steps for finding outliers using the IQR are as follows.

Step 1. Find the median.
Step 2. Find Q1 and Q3. Q1 can be thought of as the median of the lower half of the data, and Q3 as the median of the upper half. Subtract Q1 from Q3 to obtain the IQR.
Step 3. Calculate 1.5 × IQR and subtract it from Q1 to get the lower fence.
Step 4. Add 1.5 × IQR to Q3 to get the upper fence.
Step 5. Add the fences to the data to identify outliers.

4.3 Computation Results and Discussion

In this study, the diabetes data were obtained from the primary health center in Tirunelveli. The data set has 50 observations of patients' diabetes levels. The labeling methods were computed with the open source R software package. Each method uses a different measure for identifying outliers, so the methods give different results on the same data set, and they behave differently with respect to skewness and sample size.

Computation of Z-Scores

In Table 4.1 (Case 1), with all the data included, it appears that the value 50 is an outlier, yet no observation exceeds 3 in absolute value. In Table 4.2 (Case 2), with the most extreme value 50 excluded from the data, 49 and 48 are considered outliers. This is because the multiple extreme values have artificially inflated the standard deviation.

TABLE 4.1: Computation and Masking Problem of the Z-Scores (Case 1). Columns: Obs. No., $x_i$, Z-Score.

Computation of Modified Z-Scores

For this method, the computation results are tabulated below and compared with the Z-scores. Table 4.3 shows the computed modified Z-scores in absolute value; of these, three observations (236, 236 and 525) may well be outliers.

Computation of MADe

Applying the equations above to the data set gives Median = 110 and MAD = 19, from which MADe is obtained. The 2 MADe method identifies 6 outliers: 172, 525, 175, 236, 169 and 236. The 3 MADe method identifies 3 outliers: 525, 236 and 236.
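As a quick arithmetic check on these counts, the sketch below recomputes the MADe cutoffs from the reported Median = 110 and MAD = 19. It assumes the usual scaling MADe = 1.483 × MAD; the variable and function names are illustrative only, and only the outlier values quoted in the text are used, since the full data set is not reproduced here.

```python
MEDIAN = 110.0  # reported in the text
MAD = 19.0      # reported in the text
SCALE = 1.483   # assumed normal-consistency factor: MADe = 1.483 * MAD

made = SCALE * MAD


def made_limits(k):
    """Return (lower, upper) limits for the k MADe rule: median +/- k * MADe."""
    return MEDIAN - k * made, MEDIAN + k * made


# Outliers quoted in the text for each rule.
reported_2made = [172, 525, 175, 236, 169, 236]
reported_3made = [525, 236, 236]

low2, high2 = made_limits(2)
low3, high3 = made_limits(3)

# Every reported value should lie above the corresponding upper limit.
all_beyond_2 = all(x > high2 for x in reported_2made)
all_beyond_3 = all(x > high3 for x in reported_3made)

# Values flagged by the 2 MADe rule but not by the 3 MADe rule.
only_2made = sorted(x for x in reported_2made if x <= high3)

print(f"MADe = {made:.3f}")
print(f"2 MADe limits: [{low2:.3f}, {high2:.3f}]")
print(f"3 MADe limits: [{low3:.3f}, {high3:.3f}]")
print("only flagged by 2 MADe:", only_2made)
```

Under this assumption the reported outlier lists are consistent with the cutoffs: 169, 172 and 175 exceed the 2 MADe upper limit but not the 3 MADe one, while 236 and 525 exceed both.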

TABLE 4.2: Computation and Masking Problem of the Z-Scores (Case 2). Columns: Obs. No., $x_i$, Z-Score.

TABLE 4.3: Computation of Z-Scores compared with the Modified Z-Scores. Columns: $i$, $x_i$, Z-Scores (Case 1, Case 2), Modified Z-Scores.

Dot Plot for MADe

A dotplot is made up of dots plotted on a graph. Here is how to interpret a dotplot.

1. Each dot represents a specific number of observations from a set of data. (Unless otherwise indicated, assume that each dot represents one observation. If a dot represents more than one observation, that should be explicitly noted on the plot.)
2. The dots are stacked in a column over a category, so that the height of the column represents the relative or absolute frequency of observations in the category.
3. The pattern of data in a dotplot can be described in terms of symmetry and skewness only if the categories are quantitative. If the categories are qualitative (as they often are), a dotplot cannot be described in those terms.

Compared to other types of graphic display, dotplots are used most often to plot frequency counts within a small number of categories, usually with small sets of data.

FIGURE 4.3: Dotplot for visualizing the data with outliers

In Figure 4.3, the extreme value at x = 525 has dragged the outlier cutoff, $\bar{x} + 2s$, above the two points at x = 236. Only the point at x = 525 is therefore caught as an outlier, even though the two points at x = 236 are clearly also outliers.

Computation of Tukey Method (Box Plot)

The results obtained from the data set with this method are summarized in Table 4.4.

TABLE 4.4: Tukey method outlier detection using IQR
Sample size: 50
Lowest value:
Highest value:
Arithmetic mean:
Median:
Standard deviation:
Coefficient of Skewness: (P<0.0001)
Coefficient of Kurtosis: (P<0.0001)
Suspected outliers (Tukey 1977)
Outside values:
Far-out values: 525

FIGURE 4.4: Box and Whisker plot for visualizing outliers

The IQR (interquartile range) is the distance between Q1 = 95.25 and Q3 = 133.5, so IQR = 38.25. Thus the inner fences are [37.875, 190.875] and the outer fences are [-19.5, 248.25]. The two extreme values, 236 and 525, are identified as probable outliers by this method. Figure 4.4 is a boxplot of the data set. In Figure 4.4, the central box represents the values from the lower to the upper quartile (25th to 75th percentile). The middle line represents the median. The horizontal line extends from the minimum to the maximum value, excluding outside and far-out values, which are displayed as separate points. An outside value is defined as a value that is smaller than the lower quartile minus 1.5 times the interquartile range, or larger than the upper quartile plus 1.5 times the interquartile range (inner fences). A far-out value is defined as a value that is smaller than the lower quartile minus 3 times the interquartile range, or larger than the upper quartile plus 3 times the interquartile range (outer fences).
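The fence definitions above can be checked directly from the reported quartiles. The following is a minimal sketch in Python (the thesis itself used R); the function names are illustrative, and the three test values are taken from the text rather than from the full data set.

```python
def tukey_fences(q1, q3):
    """Return (IQR, inner fences, outer fences) from the two quartiles."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)  # beyond these: outside values
    outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)  # beyond these: far-out values
    return iqr, inner, outer


def classify(x, inner, outer):
    """Label a value as 'inside', 'outside' or 'far out' relative to the fences."""
    if x < outer[0] or x > outer[1]:
        return "far out"
    if x < inner[0] or x > inner[1]:
        return "outside"
    return "inside"


# Quartiles reported in the text.
iqr, inner, outer = tukey_fences(95.25, 133.5)

# The median (110) and the two extreme values mentioned in the text.
labels = {x: classify(x, inner, outer) for x in (110, 236, 525)}

print("IQR:", iqr)
print("inner fences:", inner)
print("outer fences:", outer)
print(labels)
```

With these quartiles, 236 falls between the inner and outer upper fences (an outside value) and 525 lies beyond the outer fence (a far-out value), which matches the classification reported in Table 4.4.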

TABLE 4.5: Number of outliers detected by different outlier labeling methods

Method             Case                 Cutoff value         Outliers
Z-Scores           I                    Z_i > 3              525
Z-Scores           II                   Z_i > 3              236, 236
Modified Z-Scores  MAD                  M_i > 3.5            525, 236, 236
MADe               2MADe                MAD > 2              169, 172, 175, 236, 236, 525
MADe               3MADe                MAD > 3              525, 236, 236
Tukey's Method     Outside values       [37.875, 190.875]    236, 236
Tukey's Method     Far outside values   [-19.5, 248.25]      525

Conclusion

The performance of the outlier labeling methods (Z-Scores, Modified Z-Scores, MADe and Tukey's method) has been studied statistically using a real-time data set, in order to evaluate which method is the more powerful way of detecting and handling outliers. Most of the intervals that the labeling methods use to identify possible outliers are effective under the normal distribution. The Z-Score and Tukey methods are affected by the masking problem, and for this reason their detection sensitivity is low. One of the most common ways of finding outliers in one-dimensional data is to mark as a potential outlier any point that lies more than two standard deviations from the center; MADe follows the same idea but uses the median and MADe in place of the mean and standard deviation. Both MADe and the Modified Z-Scores are based on the MAD, and they identified the three values 525, 236 and 236 as outliers. All the methods find that the most extreme value is 525. In the MADe method, the 2 MADe rule identifies six outliers (169, 172, 175, 236, 236, 525) and the 3 MADe rule identifies three outliers (525, 236 and 236). In the univariate case, the Median Absolute Deviation is one of the most robust dispersion scales in the presence of outliers, and we therefore recommend the MADe method for outlier detection.

STAT 2300: Unit 1 Learning Objectives Spring 2019

STAT 2300: Unit 1 Learning Objectives Spring 2019 STAT 2300: Unit 1 Learning Objectives Spring 2019 Unit tests are written to evaluate student comprehension, acquisition, and synthesis of these skills. The problems listed as Assigned MyStatLab Problems

More information

Statistics Chapter Measures of Position LAB

Statistics Chapter Measures of Position LAB Statistics Chapter 2 Name: 2.5 Measures of Position LAB Learning objectives: 1. How to find the first, second, and third quartiles of a data set, how to find the interquartile range of a data set, and

More information

1. Contingency Table (Cross Tabulation Table)

1. Contingency Table (Cross Tabulation Table) II. Descriptive Statistics C. Bivariate Data In this section Contingency Table (Cross Tabulation Table) Box and Whisker Plot Line Graph Scatter Plot 1. Contingency Table (Cross Tabulation Table) Bivariate

More information

Classroom Simulation: Indications of Outliers in Boxplots of Normal Data

Classroom Simulation: Indications of Outliers in Boxplots of Normal Data Classroom Simulation: Indications of Outliers in Boxplots of Normal Data JSM, Seattle, August 6, 2006 Jacob B. Colvin jbcolvin@fastmail.fm Bruce E. Trumbo bruce.trumbo@csueastbay.edu Eric A. Suess eric.suess@csueastbay.edu

More information

Outliers and Their Effect on Distribution Assessment

Outliers and Their Effect on Distribution Assessment Outliers and Their Effect on Distribution Assessment Larry Bartkus September 2016 Topics of Discussion What is an Outlier? (Definition) How can we use outliers Analysis of Outlying Observation The Standards

More information

Math 1 Variable Manipulation Part 8 Working with Data

Math 1 Variable Manipulation Part 8 Working with Data Name: Math 1 Variable Manipulation Part 8 Working with Data Date: 1 INTERPRETING DATA USING NUMBER LINE PLOTS Data can be represented in various visual forms including dot plots, histograms, and box plots.

More information

Math 1 Variable Manipulation Part 8 Working with Data

Math 1 Variable Manipulation Part 8 Working with Data Math 1 Variable Manipulation Part 8 Working with Data 1 INTERPRETING DATA USING NUMBER LINE PLOTS Data can be represented in various visual forms including dot plots, histograms, and box plots. Suppose

More information

A is used to answer questions about the quantity of what is being measured. A quantitative variable is comprised of numeric values.

A is used to answer questions about the quantity of what is being measured. A quantitative variable is comprised of numeric values. Stats: Modeling the World Chapter 2 Chapter 2: Data What are data? In order to determine the context of data, consider the W s Who What (and in what units) When Where Why How There are two major ways to

More information

Chapter 1 Data and Descriptive Statistics

Chapter 1 Data and Descriptive Statistics 1.1 Introduction Chapter 1 Data and Descriptive Statistics Statistics is the art and science of collecting, summarizing, analyzing and interpreting data. The field of statistics can be broadly divided

More information

Slide 1. Slide 2. Slide 3. Interquartile Range (IQR)

Slide 1. Slide 2. Slide 3. Interquartile Range (IQR) Slide 1 Interquartile Range (IQR) IQR= Upper quarile lower quartile But what are quartiles? Quartiles are points that divide a data set into quarters (4 equal parts) Slide 2 The Lower Quartile (Q 1 ) Is

More information

Attachment 1. Categorical Summary of BMP Performance Data for Solids (TSS, TDS, and Turbidity) Contained in the International Stormwater BMP Database

Attachment 1. Categorical Summary of BMP Performance Data for Solids (TSS, TDS, and Turbidity) Contained in the International Stormwater BMP Database Attachment 1 Categorical Summary of BMP Performance Data for Solids (TSS, TDS, and Turbidity) Contained in the International Stormwater BMP Database Prepared by Geosyntec Consultants, Inc. Wright Water

More information

AP Statistics Scope & Sequence

AP Statistics Scope & Sequence AP Statistics Scope & Sequence Grading Period Unit Title Learning Targets Throughout the School Year First Grading Period *Apply mathematics to problems in everyday life *Use a problem-solving model that

More information

JMP TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING

JMP TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING JMP TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING INTRODUCTION JMP software provides introductory statistics in a package designed to let students visually explore data in an interactive way with

More information

Chapter 3. Displaying and Summarizing Quantitative Data. 1 of 66 05/21/ :00 AM

Chapter 3. Displaying and Summarizing Quantitative Data.  1 of 66 05/21/ :00 AM Chapter 3 Displaying and Summarizing Quantitative Data D. Raffle 5/19/2015 1 of 66 05/21/2015 11:00 AM Intro In this chapter, we will discuss summarizing the distribution of numeric or quantitative variables.

More information

Assignment 1 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran

Assignment 1 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran Assignment 1 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran 1. In inferential statistics, the aim is to: (a) learn the properties of the sample by calculating statistics

More information

ISSN (Online)

ISSN (Online) Comparative Analysis of Outlier Detection Methods [1] Mahvish Fatima, [2] Jitendra Kurmi [1] Pursuing M.Tech at BBAU, Lucknow, [2] Assistant Professor at BBAU, Lucknow Abstract: - This paper presents different

More information

VQA Proficiency Testing Scoring Document for Quantitative HIV-1 RNA

VQA Proficiency Testing Scoring Document for Quantitative HIV-1 RNA VQA Proficiency Testing Scoring Document for Quantitative HIV-1 RNA The VQA Program utilizes a real-time testing program in which each participating laboratory tests a panel of five coded samples six times

More information

Module - 01 Lecture - 03 Descriptive Statistics: Graphical Approaches

Module - 01 Lecture - 03 Descriptive Statistics: Graphical Approaches Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B. Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institution of Technology, Madras

More information

Chapter 2 Part 1B. Measures of Location. September 4, 2008

Chapter 2 Part 1B. Measures of Location. September 4, 2008 Chapter 2 Part 1B Measures of Location September 4, 2008 Class will meet in the Auditorium except for Tuesday, October 21 when we meet in 102a. Skill set you should have by the time we complete Chapter

More information

Biostatistics 208 Data Exploration

Biostatistics 208 Data Exploration Biostatistics 208 Data Exploration Dave Glidden Professor of Biostatistics Univ. of California, San Francisco January 8, 2008 http://www.biostat.ucsf.edu/biostat208 Organization Office hours by appointment

More information

Super-marketing. A Data Investigation. A note to teachers:

Super-marketing. A Data Investigation. A note to teachers: Super-marketing A Data Investigation A note to teachers: This is a simple data investigation requiring interpretation of data, completion of stem and leaf plots, generation of box plots and analysis of

More information

Globally Robust Confidence Intervals for Location

Globally Robust Confidence Intervals for Location Dhaka Univ. J. Sci. 60(1): 109-113, 2012 (January) Globally Robust Confidence Intervals for Location Department of Statistics, Biostatistics & Informatics, University of Dhaka, Dhaka-1000, Bangladesh Received

More information

Comparison of Different Methods of Outlier Detection in Univariate Time Series Data

Comparison of Different Methods of Outlier Detection in Univariate Time Series Data Comparison of Different Methods of Outlier Detection in Univariate Time Series Data Egbo Mary Nkechinyere E-mail Address: egbomary4@yahoocom Department of Statistics, Federal University of Technology Owerri

More information

Unit 1 Analyzing One-Variable Data

Unit 1 Analyzing One-Variable Data Unit 1 Analyzing One-Variable Data So what is statistics? Statistics is the science and art of,, and from data. Statistical problem-solving process : Clarify the research problem and ask one or more valid

More information

Using Excel s Analysis ToolPak Add-In

Using Excel s Analysis ToolPak Add-In Using Excel s Analysis ToolPak Add-In Bijay Lal Pradhan, PhD Introduction I have a strong opinions that we can perform different quantitative analysis, including statistical analysis, in Excel. It is powerful,

More information

STATISTICALLY SIGNIFICANT EXCEEDANCE- UNDERSTANDING FALSE POSITIVE ERROR

STATISTICALLY SIGNIFICANT EXCEEDANCE- UNDERSTANDING FALSE POSITIVE ERROR 2017 World of Coal Ash (WOCA) Conference in Lexington, KY - May 9-11, 2017 http://www.flyash.info/ STATISTICALLY SIGNIFICANT EXCEEDANCE- UNDERSTANDING FALSE POSITIVE ERROR Arun Kammari 1 1 Haley & Aldrich,

More information

36.2. Exploring Data. Introduction. Prerequisites. Learning Outcomes

36.2. Exploring Data. Introduction. Prerequisites. Learning Outcomes Exploring Data 6. Introduction Techniques for exploring data to enable valid conclusions to be drawn are described in this Section. The diagrammatic methods of stem-and-leaf and box-and-whisker are given

More information

CHAPTER 8 T Tests. A number of t tests are available, including: The One-Sample T Test The Paired-Samples Test The Independent-Samples T Test

CHAPTER 8 T Tests. A number of t tests are available, including: The One-Sample T Test The Paired-Samples Test The Independent-Samples T Test CHAPTER 8 T Tests A number of t tests are available, including: The One-Sample T Test The Paired-Samples Test The Independent-Samples T Test 8.1. One-Sample T Test The One-Sample T Test procedure: Tests

More information

Part 1. DATA PRESENTATION: DESCRIPTIVE DATA ANALYSIS
