BAR CHARTS. Display frequency distributions for nominal or ordinal data. E.g., injury deaths of 100 children, ages 5-9, USA,


Graphs
BAR CHARTS. Display frequency distributions for nominal or ordinal data. E.g., injury deaths of 100 children, ages 5-9, USA, 1980-85.
HISTOGRAMS. Display frequency distributions for continuous or discrete data. E.g., birthweights from 100 consecutive deliveries at a Boston hospital.
[Figure: bar chart of number of injury deaths by cause (Motor, Drowning, Fire, Homicide, Other); histogram of birthweights (frequency vs. weight)]

Histograms
To construct a frequency histogram, draw two axes: a horizontal axis labeled with the class intervals and a vertical axis labeled with the frequencies. Construct a rectangle over each class interval with a height equal to the number of measurements falling in that interval. In a relative frequency histogram, the vertical axis is labeled as relative frequency and the rectangle over each class interval has a height equal to the class relative frequency. The two histograms will have the same shape.
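The construction above can be sketched in a few lines of Python. This is only an illustration: the function name `histogram_heights`, the class intervals, and the data are made up for the example and do not come from the slides.

```python
from collections import Counter

def histogram_heights(values, start, width, n_classes):
    """Return (frequencies, relative frequencies) for equal-width class intervals.

    Class k covers [start + k*width, start + (k+1)*width); the last class
    also includes its right endpoint.
    """
    counts = Counter()
    upper_end = start + n_classes * width
    for v in values:
        k = int((v - start) // width)
        if k == n_classes and v == upper_end:
            k = n_classes - 1  # right endpoint goes into the last class
        if 0 <= k < n_classes:
            counts[k] += 1
    freqs = [counts[k] for k in range(n_classes)]
    total = sum(freqs)
    rel = [f / total for f in freqs]  # heights of the relative frequency histogram
    return freqs, rel

# Hypothetical measurements, not taken from the slides.
data = [52, 61, 67, 70, 74, 78, 81, 85, 88, 93, 97, 104]
freqs, rel = histogram_heights(data, start=50, width=20, n_classes=3)

for k, (f, r) in enumerate(zip(freqs, rel)):
    lo = 50 + 20 * k
    print(f"[{lo}, {lo + 20}): frequency={f}, relative frequency={r:.2f}")
```

Each relative frequency is the corresponding frequency divided by the same total, which is why the two histograms differ only in the scale of the vertical axis.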

Histograms
A histogram with one major peak is called unimodal.
[Figure: unimodal histogram]

Histograms
A histogram with two major peaks is called bimodal.
[Figure: bimodal histogram]

Histograms
A histogram with roughly the same number of observations per interval is called a uniform histogram.
[Figure: uniform histogram]

Histograms
When the right side of the histogram, which contains the larger half of the observations, extends a greater distance than the left side, the histogram is said to be skewed to the right.
[Figure: right-skewed histogram]

Histograms
When the left side of the histogram extends a greater distance than the right side, the histogram is said to be skewed to the left.
[Figure: left-skewed histogram]

Frequency Polygons
FREQUENCY POLYGONS. Use the same axes as histograms. Useful when comparing two data sets. Cumulative frequency polygons display cumulative relative frequencies and are used to obtain percentiles of the data.
[Figure: frequency polygon of birthweights (frequency vs. birthweight)]

Example 2.1 (P & G)
Assume we want to compare the serum cholesterol levels for two age groups.
[Figure: relative frequency polygons and cumulative frequency polygons of serum cholesterol level for ages 25-34 and ages 55-64]

Example 2.1 (cont.)
Note that the cumulative frequency polygon for 55-64-year-old males lies to the right of the polygon for 25-34-year-old males for each value of serum cholesterol level; that is, the distribution for older men is stochastically larger than the distribution for younger men. E.g., the 60th percentile of the serum cholesterol levels for the group of 25-34-year-olds is approximately 175 mg/100 ml, while the 60th percentile for the 55-64-year-olds is about 220 mg/100 ml.
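Reading a percentile off a cumulative frequency polygon amounts to linear interpolation between class boundaries. The sketch below illustrates the idea. The class boundaries and cumulative percentages are hypothetical numbers chosen for the illustration, not the data of Example 2.1, and `percentile_from_polygon` is a name introduced here.

```python
def percentile_from_polygon(boundaries, cum_percent, p):
    """Read the p-th percentile off a cumulative frequency polygon.

    boundaries[i] is a class boundary and cum_percent[i] is the cumulative
    relative frequency (in %) reached at that boundary. The polygon joins
    these points with straight lines, so we interpolate linearly.
    """
    for i in range(1, len(boundaries)):
        lo, hi = cum_percent[i - 1], cum_percent[i]
        if hi > lo and lo <= p <= hi:
            frac = (p - lo) / (hi - lo)
            return boundaries[i - 1] + frac * (boundaries[i] - boundaries[i - 1])
    raise ValueError("p lies outside the range covered by the polygon")

# Hypothetical class boundaries and cumulative percentages (assumptions).
boundaries = [80, 120, 160, 200, 240, 280]
younger = [0, 10, 40, 80, 95, 100]
older = [0, 5, 20, 50, 80, 100]

p60_younger = percentile_from_polygon(boundaries, younger, 60)
p60_older = percentile_from_polygon(boundaries, older, 60)
print(p60_younger, round(p60_older, 1))

# The older group's polygon lies to the right: at every boundary its
# cumulative percentage is no larger than the younger group's.
older_lies_right = all(o <= y for o, y in zip(older, younger))
print(older_lies_right)
```

With these made-up numbers, the 60th percentile of the older group is larger than that of the younger group, which is what a polygon lying to the right implies for every percentile.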

Percentiles
Percentiles are useful for describing the shape of a distribution. For example, if the 40th and 60th percentiles lie an equal distance from the midpoint, and the same is true for the 30th and 70th percentiles, the 20th and 80th, and so on, the data are symmetric. If there are a number of outlying observations on one side of the midpoint only, the data are skewed: if those observations are smaller than the rest of the values, the data are skewed to the left; if they are larger than the rest, the data are skewed to the right.
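The symmetry check described above can be sketched by comparing the distances of paired percentiles from the median. The helper name `percentile_gaps`, the two data sets, and the choice of the "inclusive" interpolation rule are all assumptions made for this illustration.

```python
import statistics

def percentile_gaps(values):
    """Distances of paired percentiles from the median.

    Returns a list of (median - lower, upper - median) for the pairs
    (40th, 60th), (30th, 70th), (20th, 80th), (10th, 90th).
    """
    # Nine cut points: the 10th, 20th, ..., 90th percentiles.
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    median = deciles[4]
    gaps = []
    for k in range(1, 5):
        lower, upper = deciles[4 - k], deciles[4 + k]
        gaps.append((median - lower, upper - median))
    return gaps

# Hypothetical data sets, not taken from the slides.
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
right_skewed = [1, 1, 2, 2, 3, 4, 6, 9, 14, 22, 35]

print(percentile_gaps(symmetric))     # equal distances on both sides
print(percentile_gaps(right_skewed))  # upper distances exceed lower ones
```

For the symmetric list each pair of distances is equal, while for the second list the upper percentiles lie much farther from the median than the lower ones, the pattern expected for data skewed to the right.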

Box Plots
BOX PLOTS. Display a summary of the data as follows: the central box extends from the 25th percentile to the 75th percentile (these are the quartiles of the data); a line is drawn at the 50th percentile; lines projecting out of the box extend to the adjacent values, i.e. the most extreme observations in the data that are not more than 1.5 times the height of the box beyond either quartile; outliers (points outside the above range) are represented as circles.

Box Plots
A: 25th percentile. B: 50th percentile. C: 75th percentile.
Adjacent values: the smallest and largest observations $x_1$ and $x_2$ such that $x_1 \ge A - 1.5\,(C - A)$ and $x_2 \le C + 1.5\,(C - A)$.
[Figure: box plot with the levels A, B and C marked]
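As a rough sketch, the quantities that define a box plot can be computed as follows. The data are hypothetical, `boxplot_summary` is a name introduced here, and the quartiles use the "inclusive" interpolation rule of Python's `statistics.quantiles`; other software may use a different quartile convention.

```python
import statistics

def boxplot_summary(values):
    """Quartiles A, B, C, adjacent values x1, x2, and outliers."""
    data = sorted(values)
    a, b, c = statistics.quantiles(data, n=4, method="inclusive")
    height = c - a                    # height of the box
    lower_limit = a - 1.5 * height
    upper_limit = c + 1.5 * height
    inside = [x for x in data if lower_limit <= x <= upper_limit]
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return {
        "A": a, "B": b, "C": c,
        "x1": inside[0],              # smallest observation within the limits
        "x2": inside[-1],             # largest observation within the limits
        "outliers": outliers,
    }

# Hypothetical observations, not taken from the slides.
sample = [2, 4, 5, 6, 7, 8, 9, 10, 25]
summary = boxplot_summary(sample)
print(summary)
```

Here the box runs from 5 to 9, so the limits are -1 and 15; the lines extend to the adjacent values 2 and 10, and 25 would be drawn as a circle.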

Example 1.6 (cont.)
Boxplot of birthweights in a Boston hospital.
[Figure: boxplot of birthweights]

Example 2.3
Goal: assess the potency of various constituents of orchard sprays in repelling honeybees. Individual cells of dry comb were filled with measured amounts of lime sulphur emulsion in sucrose solutions. Eight concentrations were used as treatments. The responses were obtained by releasing 100 bees into the chamber for 2 hours and measuring the decrease in volume of the solutions.
[Figure: response by treatment, A through H]

Two-Way Scatter Plots
TWO-WAY SCATTER PLOTS. Used to find relationships between two variables.
Example 2.4. Speed of cars vs. distances taken to stop.
[Figure: scatter plot of dist against speed]

Other graphs
LINE GRAPHS, TIME SERIES PLOTS. Similar to the previous graphs, but usually the points are connected by straight lines and the scale along the horizontal axis represents time.
[Figure: time series plot of pairs of jeans (in 1000s) by year, 1980-1986]

Example 2.5
(Rosner, p 39, ex 1) Infectious Disease. The data are a sample from a larger data set collected on persons discharged from a selected Pennsylvania hospital as part of a retrospective chart review of antibiotic usage in hospitals. It is of clinical interest to know if the duration of hospitalization is affected by whether or not a patient has received antibiotics.
Sex: 1=M, 2=F. Antibiotics: 1=Y, 2=N.
ID  Duration  Age  Sex  Antibiotics
 1         5   30    2            2
 2        10   73    2            2
 3         6   40    2            2
 4        11   47    2            2
 5         5   25    2            2
 6        14   82    1            1
 7        30   60    1            1
 8        11   56    2            2
 9        17   43    2            2
10         3   50    1            2
11         9   59    2            2
12         3    4    1            2
13         8   22    2            1
14         8   33    2            1
15         5   20    2            2
16         5   32    1            2
17         7   36    1            1
18         4   69    1            2
19         3   47    1            1
20         7   22    1            2
21         9   11    1            2
22        11   19    1            1
23        11   67    2            2
24         9   43    2            2
25         4   41    2            2

Example 2.5 (cont.)
Q. What types of variables do we have?
The duration of hospitalization is a discrete variable; age is a discrete variable; sex is nominal (binary) and antibiotics is also binary.
Q. Using numeric methods, describe the duration of hospitalization for the 25 patients.
Q. Is the duration of hospitalization affected by whether or not a patient has received antibiotics?

Example 2.5 (cont.)
We can summarize the duration of hospitalization as follows:
Min.  1st Qu.  Median  Mean  3rd Qu.  Max.
 3.0      5.0     8.0   8.6     11.0  30.0
Range  Int Qu. Range  Variance    SD
 27.0            6.0     32.67  5.72
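These summaries can be reproduced from the 25 durations in the data table of Example 2.5. The sketch below uses Python's standard `statistics` module. The slides do not say which quartile rule or software was used; the "inclusive" method is an assumption that happens to match the quartiles shown, and `summarize` is a name introduced here.

```python
import statistics

# Durations of hospitalization for patients 1-25, from the Example 2.5 table.
durations = [5, 10, 6, 11, 5, 14, 30, 11, 17, 3, 9, 3, 8,
             8, 5, 5, 7, 4, 3, 7, 9, 11, 11, 9, 4]

def summarize(values):
    """Numeric summary: five-number summary, mean, and measures of spread."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": statistics.mean(values),
        "q3": q3,
        "max": max(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "sd": statistics.stdev(values),           # sample standard deviation
    }

summary = summarize(durations)
for name, value in summary.items():
    print(f"{name:>8}: {value:.2f}")
```

The variance and SD are the sample versions, dividing by n - 1 = 24.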

Example 2.5 (cont.)
[Figure: histogram of duration (frequency vs. duration of hospitalization), with a second plot of duration of hospitalization]

Example 2.5 (cont.)
Duration of hospitalization for patients who received antibiotics:
Min.  1st Qu.  Median   Mean  3rd Qu.   Max.
3.00     7.50    8.00  11.57    12.50  30.00
Range  Int Qu. Range  Variance    SD
27.00           5.00     77.62  8.81
Duration of hospitalization for patients who did not receive antibiotics:
Min.  1st Qu.  Median   Mean  3rd Qu.   Max.
3.00     5.00    6.50   7.44     9.75  17.00
Range  Int Qu. Range  Variance    SD
14.00           4.75     13.67  3.70
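The two group summaries can be obtained by splitting the durations on the antibiotics code before summarizing. This is a sketch under the same assumption as before (the "inclusive" quartile rule, which matches the values on the slide); the names `records`, `groups` and `summarize` are introduced for the illustration.

```python
import statistics

# (duration, antibiotics) for patients 1-25 from the Example 2.5 table;
# antibiotics is coded 1 = yes, 2 = no.
records = [(5, 2), (10, 2), (6, 2), (11, 2), (5, 2), (14, 1), (30, 1),
           (11, 2), (17, 2), (3, 2), (9, 2), (3, 2), (8, 1), (8, 1),
           (5, 2), (5, 2), (7, 1), (4, 2), (3, 1), (7, 2), (9, 2),
           (11, 1), (11, 2), (9, 2), (4, 2)]

groups = {
    "antibiotics": [d for d, ab in records if ab == 1],
    "no antibiotics": [d for d, ab in records if ab == 2],
}

def summarize(values):
    """Quartiles, mean, and sample measures of spread for one group."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "q1": q1, "median": median, "q3": q3,
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": statistics.variance(values),
        "sd": statistics.stdev(values),
    }

by_group = {name: summarize(values) for name, values in groups.items()}
for name, stats in by_group.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Seven of the 25 patients received antibiotics and 18 did not, so the first group's summaries rest on far fewer observations.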

Example 2.5 (cont.)
[Figure: duration of hospitalization for the antibiotics and no-antibiotics groups]