Statistics for EES Introduction to R and Descriptive Statistics

Size: px
Start display at page:

Download "Statistics for EES Introduction to R and Descriptive Statistics"

Transcription

1 Statistics for EES Introduction to R and Descriptive Statistics Dirk Metzler April 23, 2017 Contents Contents 1 Intro: What is Statistics? 2 2 Data Visialization Histograms und Polygons Histograms: Densities or Numbers? Stripcharts and Boxplots Example: Darwin Finches Conclusions Summarizing Data Numerically Median and other Quartiles Mean, Standard Deviation and Variance Computing σ with n or n 1? Mean values are usually nice but sometimes mean example: picky wagtails example: spider men & spider women example: copper-tolerant browntop bent

2 1 Intro: What is Statistics? It is easy to lie with statistics. It is hard to tell the truth without it. What is Statistics? Andrejs Dunkels Nature is full of Variability How to make sense of variable data? Use mathematical theory of randomness:[0.5ex] Probability. Statistics = Data Analysis based on Probabilistic Models Descriptive Statistics Descriptive Statistcs is the first look at the data. Statistics Software R 2

3 2 Data Visialization Data Example Data from a biology diploma thesis, 2001, Forschungsinstitut Senckenberg, Frankfurt am Main Crustacea section Advisor: Prof. Dr. Michael Türkay Charybdis acutidens TÜRKAY 1985 The Squat Lobster Galathea intermedia Squat Lobsters, caught 6. Sept 1988 Helgoländer Tiefe Rinne, North Sea Carpace Lengths (mm): Females, not egg-carrying (n = 215) Female Galathea, not carrrying eggs, caught 6. Sept. '88, n=215 Index Carapax Length [mm] 3

4 2.1 Histograms und Polygons 6. Sept. '88, n=215 Female Galathea, not egg carrying, caught 6. Sept. 88, n=215 Number Number Carapax Length [mm] Female Galathea, not egg carrying, caught 6. Sept. 88, n=215 Female Galathea, not egg carrying, caught 6. Sept. 88, n=215 Number How many have Carapace Length between 2.0 and 2.2? Number How many have Carapace Length between 2.0 and 2.2? Female Galathea, not egg carrying, caught 6. Sept. 88, n=215 Female Galathea, not egg carrying, caught 6. Sept. 88, n=215 Number How many have Carapace Length between 2.0 and 2.2? Number How many have Carapace Length between 2.0 and 2.2? Two Months Later (3. Nov 88) 4

5 Nichteiertragende Weibchen am 3. Nov. '88, n=57 Number Carapax Length [mm] Comparing the two Distributions Nichteiertragende Weibchen Female Crabs, not egg carrying, caught 6. Sept. 88, n=215 Number Carapax Length [mm] : n = 215 Problem: different sample sizes : n = 57 axis such that each distribution has total area Idea: scale y- 5

6 Female Crabs, not egg carrying, caught 6. Sept. 88, n=215 Female Crabs, not egg carrying, caught 6. Sept ? = Proportion of Total per mm Total Area=1 Which Proporion had a length between 2.8 and 3.0 mm? ( ) 0.5 =

7 How to compare the two distributions? Nichteiertragende Weibchen Nichteiertragende Weibchen Carapax Length [mm] Carapax Length [mm] My Advice If you are a commercial artist: Impress everybody with cool 3D graphics! If you are a scientist: Visualize your data in clear and simple 2D plots. (As long as you print on 2D paper and project your slides on 2D screens) Simple and Clear: Polygons 7

8 Female Crabs, not egg carrying, caught 6. Sept. 88, n= Sept. '88, n= Carapax Length [mm] 3. Nov. '88, n= Carapax Length [mm] Convenient to show two or more Polygons in one plot Number Nov. '88 6. Sept. ' Carapax Length [mm] Biological Interpretation: What may be the reason for this shift? 8

9 2.1.1 Histograms: Densities or Numbers? Number vs. Number Number Histograms with unequal intervals should show densities, not numbers! Stripcharts and Boxplots Carapax 1.5 Carapax 9

10 Stripchart + Boxplots, horizontal Carapax 1.5 Carapax Boxplots, horizontal Boxplots, vertikal Simplify to understand Comparison of four groups Histograms and density polygons allow a comprehensive view on the data. Sometimes too comprehensive. Dichte Dichte Dichte Dichte

11 The Boxplot Boxplot, simple type Boxplot, simple type Boxplot, simple type Boxplot, simple type 50 % of the data 50 % of the data 50 % of the data 50 % of the data Median Boxplot, simple type 50 % of the data 50 % of the data Boxplot, simple type 25% 25% 25% 25% Min Median Max Min Median Max 11

12 Boxplot, simple type Boxplot, Standard Type 25% 25% 25% 25% Min 1. Quartile 3. Quartile Median Max Boxplot, Standard Type Interquartile Range Boxplot, Standard Type Interquartile Range 1.5*Interquartile Range 1.5*Interquartile Range Boxplot, Standard Type Boxplot Professional 12

13 Boxplot Professional 95 % Confidence Interval for the Median 2.3 Example: Darwin Finches Charles Robert Darwin ( ) Darwin Finches Darwin s collection of Finches References [1] Sulloway, F.J. (1982) The Beagle collections of Darwin s Finches (Geospizinae). Bulletin of the British Museum (Natural History), Zoology series 43: [2] 13

14 Wing Sizes of Darwin s Finches Wing Lengths by Island Flor_Chrl SCris_Chat Snti_Jams Flor_Chrl SCris_Chat Snti_Jams WingL Wing Lengths by Island Barplot of Wing Lengths (Numbers) Flor_Chrl SCris_Chat Snti_Jams SCris_Chat Flor_Chrl Snti_Jams Histogramm (Densities!) with transparen colors Polygons SCris_Chat Flor_Chrl Snti_Jams SCris_Chat Flor_Chrl Snti_Jams Wing Lengths Wing Lengths Beak Sizes of Darwin s Finches 14

15 Beak Sizes by Species Beak Sizes by Species Cam.par Cer.oli Geo.dif Geo.for Geo.ful Geo.sca Pla.cra Cam.par Cer.oli Geo.dif Geo.for Geo.ful Geo.sca Pla.cra 2.4 Conclusions Conclusions Histograms give detailed information. Polygons allow multiple comparisons. Boxplots can simplify large datasets. Stripcharts more appropriate for small datasets. Sophisticated graphics with 3D or semi-transperent colors do not always improve clarity. 3 Summarizing Data Numerically Idea It is often possible to summarize essential information about a sample numerically. e.g.: How large? Location Parameters How variable? Dispersion Parameters 15

16 Already known from Boxplots Location (How large?) Median Dispersion (How variable?) Inter quartile range (Q 3 Q 1 ) 3.1 Median and other Quartiles The median is the 50% quantile of the data. i.e.: half of the data are smaller or equal to the median, the other half are larger or equal. The Quartiles The first Quartile, Q 1 : A quarter of the observations are smaller than or equal to Q 1. Three quarters are larger or equal. i.e. Q 1 is the 25%-Quantile The third Quartile, Q 3 : Tree quarters of the observations are smaller than or equal to Q 3. One quarter are larger or equal. i.e. Q 3 is the 75%-Quantile 3.2 Mean, Standard Deviation and Variance NOTATION: Most frequently used Location Parameter: The Mean x Dispersion Parameter: The Standard Deviation s Given data named x 1, x 2, x 3,..., x n it is common to write x for the mean. 16

17 DEFINITION: Mean[1ex] =[1ex] Sum of oberved values Number of Observations Sum Number The formula for the mean of x 1, x 2,..., x n : x = (x 1 + x x n )/n = 1 n x i n i=1 Geometric Interpretation of the Mean Center of Gravity Mean = Center of Gravity Where is the center of gravity? x m = 1.5? m = 2? m = 1.8? 17

18 The Standard Deviation How far do typical overservations deviate from the mean? typical Deviation Mean = 2.8 = =? ,8 2.8 = 0,8 1,8 1, Die Standard Deviation σ ( sigma ) ist a slightly weired weighted mean of the deviations: σ = Sum(Deviations 2 )/n The formula for the Standard Deviation of x 1, x 2,..., x n : σ = 1 n (x i x) n 2 σ 2 = 1 n n i=1 (x i x) 2 is the Variance. i=1 18

19 Rule of Thumb for the Standard Deviation In more or less bell-shaped (i.e. single peak, symmetic) distributions: probability density ca. 2/3 are located between x σ und x+σ. x σ x x + σ Standard Deviation of Carapace lengths from females, not carrying eggs, caught 6. Sept. '88 females, not carrying eggs, caught 6. Sept. ' x = σ = 0.28 σ 2 = x = Carapax Length [mm] Carapax Length [mm] In this case 72% are between x σ and x + σ Variance of Carapace lengths from All Carace Lengths in North Sea: X = (X 1, X 2,..., X N ).Carapace Length in our Sample: S = (S 1, S 2,..., S n=215 )Sample Variance: σ 2 S = 1 n 215 i=1 (S i S) 2 0,

20 Can we use as estimation for σx 2, the variance in the whole population?yes, we can! However, σs 2 n 1 is on average by a factor of (= 214/215 n 0, 995) smaller than σx 2. Variances Variance in the Population: σ 2 X = 1 N N i=1 (X i X) 2 Sample Variance: σ 2 S = 1 n n i=1 (S i S) 2 (Corrected) Sample Variance: s 2 = = = n n 1 σ2 S n n 1 1 n 1 n 1 n (S i S) 2 i=1 n (S i S) 2 Usually, Standard Deviation (SD) of S refers to the corrected s. i=1 Example: Computing SD Given Data x =? x = 10/5 = 2 x x x (x x) s 2 = ( (x x) 2 ) /(n 1) = 16/(5 1) = 4 s = Computing σ with n or n 1? 20

21 Simulated population (N=10000 adults) Mean: Standard deviation: Length [cm] Sample from the population (n=10) M: SD with (n 1): 1.15 SD with n: Length [cm] Another sample from the population (n=10) M: SD with (n 1): 1.61 SD with n: Length [cm] 1000 samples, each of size n= SD computed with n SD computed with n Computing σ with n or n 1? 21

22 The standard deviation σ of a random variable with n equally probable outcomes x 1,..., x n (z.b. rolling a dice) is clearly defined by 1 n (x x i ) 2. n i=1 If x 1,..., x n is a sample (the usual case in statistics) you should rather use the formula 1 n (x x i ) 2. n 1 i=1 3.3 Mean values are usually nice but sometimes mean Mean and SD... characterize data well if the distribution is bell-shaped and must be interpreted with caution in other cases We will exemplify this with textbook examples from ecology, see e.g. References [BTH08] M. Begon, C. R. Townsend, and J. L. Harper. Ecology: From Individuals to Ecosystems. Blackell Publishing, 4 edition, When original data were not available, we generated similar data sets by computer simulation. So do not believe all data points example: picky wagtails Conjecture Size of flies varies. efficiency for wagtail = energy gain / time to capture and eat lab experiments show that efficiency is maximal when flies have size 7mm 22

23 References [Dav77] N.B. Davies. Prey selection and social behaviour in wagtails (Aves: Motacillidae). J. Anim. Ecol., 46:37 57, available dung flies captured dung flies dung flies: available, captured number mean= 7.99 sd= 0.96 number mean= 6.79 sd= 0.69 fraction per mm captured available length [mm] length [mm] length [mm] numerical comparison of size distributions dung flies: available, captured captured available mean 6.29 < 7.99 sd 0.69 < 0.96 fraction per mm captured available length [mm] Interpretation The birds prefer dung-flies from a relatively narrow range around the predicted optimum of 7mm. The distributions in this example were bell-shaped, and the 4 numbers (means and standard deviations) were appropriate to summarize the data example: spider men & spider women Nephila madagascariensis (c) by Bernard Gagnon 23

24 Simulated Data: 70 sampled spiders mean size: mm sd of size :12.94 mm????? Nephila madagascariensis (n=70) Nephila madagascariensis (n=70) Frequency Frequency Frequency mean= size [mm] size [mm] size [mm] Nephila madagascariensis (n=70) Nephila madagascariensis (n=70) Frequency males females Frequency males females size [mm] size [mm] Conclusion from spider example If data comes from different groups, it may be reasonable to compute mean an sd separately for each group example: copper-tolerant browntop bent References [Bra60] A.D. Bradshaw. Population Differentiation in agrostis tenius Sibth. III. populations in varied environments. New Phytologist, 59(1):92 103, [MB68] T. McNeilly and A.D Bradshaw. Evolutionary Processes in Populations of Copper Tolerant Agrostis tenuis Sibth. Evolution, 22: , Again, we have no access to original data and use simulated data. 24

25 Adaptation to copper? root length indicates copper tolerance measure root lengths of plants near copper mine take seeds from clean meadow and sow near copper mine measure root length of these meadow plants in copper environment Browntop Bent (n=50) Browntop Bent (n=50) density per cm Copper Mine Grass density per cm Grass seeds from a meadow root length (cm) root length (cm) Browntop Bent (n=50) Browntop Bent (n=50) density per cm Grass seeds from a meadow copper tolerant? density per cm meadow plants copper mine plants root length (cm) root length (cm) Browntop Bent (n=50) Browntop Bent (n=50) density per cm copper mine plants m s m m+s density per cm m s m m+s meadow plants root length (cm) root length (cm) 25

26 Browntop Bent n=50+50 copper mine plants meadow plants root length (cm) 2/3 of the data within [m-sd,m+sd]???? No! quartiles of root length [cm] min Q 1 median Q 3 max copper adapted from meadow Conclusion from browntop bent example Sometimes the two numbers m and sd give not enough information. In this example the four quartiles max, Q 1, median, Q 3, max that are shown in the boxplot are more approriate. Conclusions from this section Always visually inspect the data! Never rely on summarising values alone! 26

Plolygenic traits and the central limit theorem

Plolygenic traits and the central limit theorem 1 Plolygenic traits and the central limit theorem José Miguel Ponciano: josemi@ufl.edu University of Florida, Biology Department 2 Darwin was once also frustrated with math... Mathematics... was repugnant

More information

Chapter 3. Displaying and Summarizing Quantitative Data. 1 of 66 05/21/ :00 AM

Chapter 3. Displaying and Summarizing Quantitative Data.  1 of 66 05/21/ :00 AM Chapter 3 Displaying and Summarizing Quantitative Data D. Raffle 5/19/2015 1 of 66 05/21/2015 11:00 AM Intro In this chapter, we will discuss summarizing the distribution of numeric or quantitative variables.

More information

Advanced Higher Statistics

Advanced Higher Statistics Advanced Higher Statistics 2018-19 Advanced Higher Statistics - 3 Unit Assessments - Prelim - Investigation - Final Exam (3 Hours) 1 Advanced Higher Statistics Handouts - Data Booklet - Course Outlines

More information

A is used to answer questions about the quantity of what is being measured. A quantitative variable is comprised of numeric values.

A is used to answer questions about the quantity of what is being measured. A quantitative variable is comprised of numeric values. Stats: Modeling the World Chapter 2 Chapter 2: Data What are data? In order to determine the context of data, consider the W s Who What (and in what units) When Where Why How There are two major ways to

More information

CEE3710: Uncertainty Analysis in Engineering

CEE3710: Uncertainty Analysis in Engineering CEE3710: Uncertainty Analysis in Engineering Lecture 1 September 6, 2017 Why do we need Probability and Statistics?? What is Uncertainty Analysis?? Ex. Consider the average (mean) height of females by

More information

STAT 2300: Unit 1 Learning Objectives Spring 2019

STAT 2300: Unit 1 Learning Objectives Spring 2019 STAT 2300: Unit 1 Learning Objectives Spring 2019 Unit tests are written to evaluate student comprehension, acquisition, and synthesis of these skills. The problems listed as Assigned MyStatLab Problems

More information

Slide 1. Slide 2. Slide 3. Interquartile Range (IQR)

Slide 1. Slide 2. Slide 3. Interquartile Range (IQR) Slide 1 Interquartile Range (IQR) IQR= Upper quarile lower quartile But what are quartiles? Quartiles are points that divide a data set into quarters (4 equal parts) Slide 2 The Lower Quartile (Q 1 ) Is

More information

BAR CHARTS. Display frequency distributions for nominal or ordinal data. Ej. Injury deaths of 100 children, ages 5-9, USA,

BAR CHARTS. Display frequency distributions for nominal or ordinal data. Ej. Injury deaths of 100 children, ages 5-9, USA, Graphs BAR CHARTS. Display frequency distributions for nominal or ordinal data. Ej. Injury deaths of 100 children, ages 5-9, USA, 1980-85. HISTOGRAMS. Display frequency distributions for continuous or

More information

Chapter 1 Data and Descriptive Statistics

Chapter 1 Data and Descriptive Statistics 1.1 Introduction Chapter 1 Data and Descriptive Statistics Statistics is the art and science of collecting, summarizing, analyzing and interpreting data. The field of statistics can be broadly divided

More information

STAT/MATH Chapter3. Statistical Methods in Practice. Averages and Variation 1/27/2017. Measures of Central Tendency: Mode, Median, and Mean

STAT/MATH Chapter3. Statistical Methods in Practice. Averages and Variation 1/27/2017. Measures of Central Tendency: Mode, Median, and Mean STAT/MATH 3379 Statistical Methods in Practice Dr. Ananda Manage Associate Professor of Statistics Department of Mathematics & Statistics SHSU 1 Chapter3 Averages and Variation Copyright Cengage Learning.

More information

MAS187/AEF258. University of Newcastle upon Tyne

MAS187/AEF258. University of Newcastle upon Tyne MAS187/AEF258 University of Newcastle upon Tyne 2005-6 Contents 1 Collecting and Presenting Data 5 1.1 Introduction...................................... 5 1.1.1 Examples...................................

More information

Chapter 17 Section 1: Genetic Variation

Chapter 17 Section 1: Genetic Variation Chapter 17 Section 1: Genetic Variation Section 2 Content Objective Write this down! I will be able to identify and define in my own words key terms associated with genetic variation. Section 2 Language

More information

Biostatistics 208 Data Exploration

Biostatistics 208 Data Exploration Biostatistics 208 Data Exploration Dave Glidden Professor of Biostatistics Univ. of California, San Francisco January 8, 2008 http://www.biostat.ucsf.edu/biostat208 Organization Office hours by appointment

More information

Preliminary Examination GRADE 12 PRELIMINARY EXAMINATION AUGUST 2009 HILTON COLLEGE MATHEMATICS: PAPER III. Time: 2 hours Marks: 100

Preliminary Examination GRADE 12 PRELIMINARY EXAMINATION AUGUST 2009 HILTON COLLEGE MATHEMATICS: PAPER III. Time: 2 hours Marks: 100 Preliminary Examination 009 GRADE PRELIMINARY EXAMINATION AUGUST 009 HILTON COLLEGE MATHEMATICS: PAPER III Time: hours Marks: 00 PLEASE READ THE FOLLOWING INSTRUCTIONS CAREFULLY. This question paper consists

More information

Biostat Exam 10/7/03 Coverage: StatPrimer 1 4

Biostat Exam 10/7/03 Coverage: StatPrimer 1 4 Biostat Exam 10/7/03 Coverage: StatPrimer 1 4 Part A (Closed Book) INSTRUCTIONS Write your name in the usual location (back of last page, near the staple), and nowhere else. Turn in your Lab Workbook at

More information

CHAPTER 4. Labeling Methods for Identifying Outliers

CHAPTER 4. Labeling Methods for Identifying Outliers CHAPTER 4 Labeling Methods for Identifying Outliers 4.1 Introduction Data mining is the extraction of hidden predictive knowledge s from large databases. Outlier detection is one of the powerful techniques

More information

Math 1 Variable Manipulation Part 8 Working with Data

Math 1 Variable Manipulation Part 8 Working with Data Name: Math 1 Variable Manipulation Part 8 Working with Data Date: 1 INTERPRETING DATA USING NUMBER LINE PLOTS Data can be represented in various visual forms including dot plots, histograms, and box plots.

More information

Math 1 Variable Manipulation Part 8 Working with Data

Math 1 Variable Manipulation Part 8 Working with Data Math 1 Variable Manipulation Part 8 Working with Data 1 INTERPRETING DATA USING NUMBER LINE PLOTS Data can be represented in various visual forms including dot plots, histograms, and box plots. Suppose

More information

AP Statistics Scope & Sequence

AP Statistics Scope & Sequence AP Statistics Scope & Sequence Grading Period Unit Title Learning Targets Throughout the School Year First Grading Period *Apply mathematics to problems in everyday life *Use a problem-solving model that

More information

Statistics Definitions ID1050 Quantitative & Qualitative Reasoning

Statistics Definitions ID1050 Quantitative & Qualitative Reasoning Statistics Definitions ID1050 Quantitative & Qualitative Reasoning Population vs. Sample We can use statistics when we wish to characterize some particular aspect of a group, merging each individual s

More information

Introduction to Statistics. Measures of Central Tendency and Dispersion

Introduction to Statistics. Measures of Central Tendency and Dispersion Introduction to Statistics Measures of Central Tendency and Dispersion The phrase descriptive statistics is used generically in place of measures of central tendency and dispersion for inferential statistics.

More information

Week 13, 11/12/12-11/16/12, Notes: Quantitative Summaries, both Numerical and Graphical.

Week 13, 11/12/12-11/16/12, Notes: Quantitative Summaries, both Numerical and Graphical. Week 13, 11/12/12-11/16/12, Notes: Quantitative Summaries, both Numerical and Graphical. 1 Monday s, 11/12/12, notes: Numerical Summaries of Quantitative Varibles Chapter 3 of your textbook deals with

More information

AP Statistics Part 1 Review Test 2

AP Statistics Part 1 Review Test 2 Count Name AP Statistics Part 1 Review Test 2 1. You have a set of data that you suspect came from a normal distribution. In order to assess normality, you construct a normal probability plot. Which of

More information

Core vs NYS Standards

Core vs NYS Standards Core vs NYS Standards Grade 5 Core NYS Operations and Algebraic Thinking -------------------------------------------------------------------------- 5.OA Write / Interpret Numerical Expressions Use ( ),

More information

Visual tolerance analysis for engineering optimization

Visual tolerance analysis for engineering optimization Int. J. Metrol. Qual. Eng. 4, 53 6 (03) c EDP Sciences 04 DOI: 0.05/ijmqe/03056 Visual tolerance analysis for engineering optimization W. Zhou Wei,M.Moore, and F. Kussener 3, SAS Institute Co., Ltd. No.

More information

Statistics Chapter Measures of Position LAB

Statistics Chapter Measures of Position LAB Statistics Chapter 2 Name: 2.5 Measures of Position LAB Learning objectives: 1. How to find the first, second, and third quartiles of a data set, how to find the interquartile range of a data set, and

More information

Applying the central limit theorem

Applying the central limit theorem Patrick Breheny March 11 Patrick Breheny Introduction to Biostatistics (171:161) 1/21 Introduction It is relatively easy to think about the distribution of data heights or weights or blood pressures: we

More information

Classroom Simulation: Indications of Outliers in Boxplots of Normal Data

Classroom Simulation: Indications of Outliers in Boxplots of Normal Data Classroom Simulation: Indications of Outliers in Boxplots of Normal Data JSM, Seattle, August 6, 2006 Jacob B. Colvin jbcolvin@fastmail.fm Bruce E. Trumbo bruce.trumbo@csueastbay.edu Eric A. Suess eric.suess@csueastbay.edu

More information

Statistics: Data Analysis and Presentation. Fr Clinic II

Statistics: Data Analysis and Presentation. Fr Clinic II Statistics: Data Analysis and Presentation Fr Clinic II Overview Tables and Graphs Populations and Samples Mean, Median, and Standard Deviation Standard Error & 95% Confidence Interval (CI) Error Bars

More information

GETTING READY FOR DATA COLLECTION

GETTING READY FOR DATA COLLECTION 3 Chapter 7 Data Collection and Descriptive Statistics CHAPTER OBJECTIVES - STUDENTS SHOULD BE ABLE TO: Explain the steps in the data collection process. Construct a data collection form and code data

More information

Assignment 1 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran

Assignment 1 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran Assignment 1 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran 1. In inferential statistics, the aim is to: (a) learn the properties of the sample by calculating statistics

More information

Applying Statistical Techniques to implement High Maturity Practices At North Shore Technologies (NST) Anand Bhatnagar December 2015

Applying Statistical Techniques to implement High Maturity Practices At North Shore Technologies (NST) Anand Bhatnagar December 2015 Applying Statistical Techniques to implement High Maturity Practices At North Shore Technologies (NST) Anand Bhatnagar December 2015 For our audience some Key Features Say Yes when you understand Say No

More information

Module 1: Fundamentals of Data Analysis

Module 1: Fundamentals of Data Analysis Using Statistical Data to Make Decisions Module 1: Fundamentals of Data Analysis Dr. Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business S tatistics are an important

More information

6. The probability that you win at least $1 both time is (a) 1/2 (b) 4/36 (c) 1/36 (d) 1/4 (e) 3/4

6. The probability that you win at least $1 both time is (a) 1/2 (b) 4/36 (c) 1/36 (d) 1/4 (e) 3/4 AP Statistics ~ Unit 3 Practice Test ANSWERS MULTIPLE CHOICE PRACTICE 1. An assignment of probability must obey which of the following? (a) The probability of any event must be a number between 0 and 1,

More information

Outliers and Their Effect on Distribution Assessment

Outliers and Their Effect on Distribution Assessment Outliers and Their Effect on Distribution Assessment Larry Bartkus September 2016 Topics of Discussion What is an Outlier? (Definition) How can we use outliers Analysis of Outlying Observation The Standards

More information

JMP TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING

JMP TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING JMP TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING INTRODUCTION JMP software provides introductory statistics in a package designed to let students visually explore data in an interactive way with

More information

Chapter 1. * Data = Organized collection of info. (numerical/symbolic) together w/ context.

Chapter 1. * Data = Organized collection of info. (numerical/symbolic) together w/ context. Chapter 1 Objectives (1) To understand the concept of data in statistics, (2) Learn to recognize its context & components, (3) Recognize the 2 basic variable types. Concept briefs: * Data = Organized collection

More information

Social Studies 201 Fall Answers to Computer Problem Set 1 1. PRIORITY Priority for Federal Surplus

Social Studies 201 Fall Answers to Computer Problem Set 1 1. PRIORITY Priority for Federal Surplus Social Studies 1 Fall. Answers to Computer Problem Set 1 1 Social Studies 1 Fall Answers for Computer Problem Set 1 1. FREQUENCY DISTRIBUTIONS PRIORITY PRIORITY Priority for Federal Surplus 1 Reduce Debt

More information

-SQA-SCOTTISH QUALIFICATIONS AUTHORITY HIGHER NATIONAL UNIT SPECIFICATION GENERAL INFORMATION

-SQA-SCOTTISH QUALIFICATIONS AUTHORITY HIGHER NATIONAL UNIT SPECIFICATION GENERAL INFORMATION -SQA-SCOTTISH QUALIFICATIONS AUTHORITY HIGHER NATIONAL UNIT SPECIFICATION GENERAL INFORMATION -Unit number- 7482149 -Unit title- INTRODUCTION TO STATISTICS FOR QUALITY ANALYSIS -Superclass category- -Date

More information

Elementary Statistics Lecture 2 Exploring Data with Graphical and Numerical Summaries

Elementary Statistics Lecture 2 Exploring Data with Graphical and Numerical Summaries Elementary Statistics Lecture 2 Exploring Data with Graphical and Numerical Summaries Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu Chong Ma (Statistics, USC) STAT

More information

Probability and Statistics Cycle 3 Test Study Guide

Probability and Statistics Cycle 3 Test Study Guide Probability and Statistics Cycle 3 Test Study Guide Name Block 1. Match the graph with its correct distribution shape. The distribution shape is categorized as: A. Uniform B. Skewed to the right C. Normal

More information

SPSS 14: quick guide

SPSS 14: quick guide SPSS 14: quick guide Edition 2, November 2007 If you would like this document in an alternative format please ask staff for help. On request we can provide documents with a different size and style of

More information

Data Visualization. Prof.Sushila Aghav-Palwe

Data Visualization. Prof.Sushila Aghav-Palwe Data Visualization By Prof.Sushila Aghav-Palwe Importance of Graphs in BI Business intelligence or BI is a technology-driven process that aims at collecting data and analyze it to extract actionable insights

More information

Statistical Observations on Mass Appraisal. by Josh Myers Josh Myers Valuation Solutions, LLC.

Statistical Observations on Mass Appraisal. by Josh Myers Josh Myers Valuation Solutions, LLC. Statistical Observations on Mass Appraisal by Josh Myers Josh Myers Valuation Solutions, LLC. About Josh Josh Myers is an independent CAMA consultant and owner of Josh Myers Valuation Solutions, LLC. Josh

More information

Table. XTMIXED Procedure in STATA with Output Systolic Blood Pressure, use "k:mydirectory,

Table. XTMIXED Procedure in STATA with Output Systolic Blood Pressure, use k:mydirectory, Table XTMIXED Procedure in STATA with Output Systolic Blood Pressure, 2001. use "k:mydirectory,. xtmixed sbp nage20 nage30 nage40 nage50 nage70 nage80 nage90 winter male dept2 edu_bachelor median_household_income

More information

Bi/Ge105: Evolution Homework 3 Due Date: Thursday, March 01, Problem 1: Genetic Drift as a Force of Evolution

Bi/Ge105: Evolution Homework 3 Due Date: Thursday, March 01, Problem 1: Genetic Drift as a Force of Evolution Bi/Ge105: Evolution Homework 3 Due Date: Thursday, March 01, 2018 1 Problem 1: Genetic Drift as a Force of Evolution 1.1 Simulating the processes of evolution In class we learned about the mathematical

More information

How does having wings help a seed to survive?

How does having wings help a seed to survive? ACTIVITY 3 How does having wings help a seed to survive? Kristína Hudáková Barbora Trubenová 1 TEACHING NOTES 3 How does having wings help a seed to survive? In this activity, students model the process

More information

Section 9: Presenting and describing quantitative data

Section 9: Presenting and describing quantitative data Section 9: Presenting and describing quantitative data Australian Catholic University 2014 ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced or used in any form

More information

Engineering Statistics ECIV 2305 Chapter 8 Inferences on a Population Mean. Section 8.1. Confidence Intervals

Engineering Statistics ECIV 2305 Chapter 8 Inferences on a Population Mean. Section 8.1. Confidence Intervals Engineering Statistics ECIV 2305 Chapter 8 Inferences on a Population Mean Section 8.1 Confidence Intervals Parameter vs. Statistic A parameter is a property of a population or a probability distribution

More information

Module - 01 Lecture - 03 Descriptive Statistics: Graphical Approaches

Module - 01 Lecture - 03 Descriptive Statistics: Graphical Approaches Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B. Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institution of Technology, Madras

More information

Sea of Cortez: the Reef Community Simulation Game

Sea of Cortez: the Reef Community Simulation Game Spencer Cody 9-12 Science Infinite Campus Coordinator Edmunds Central School District, South Dakota Steinbeck Institute 2018 Sea of Cortez: the Reef Community Simulation Game Purpose: This lesson includes

More information

AP Statistics Test #1 (Chapter 1)

AP Statistics Test #1 (Chapter 1) AP Statistics Test #1 (Chapter 1) Name Part I - Multiple Choice (Questions 1-20) - Circle the answer of your choice. 1. You measure the age, marital status and earned income of an SRS of 1463 women. The

More information

Biology Toolkit. Quantifying Biodiversity =? US Department of Homeland Security

Biology Toolkit. Quantifying Biodiversity =? US Department of Homeland Security Biology Toolkit Quantifying Biodiversity + + + =? US Department of Homeland Security Goals: 1. Be able to define biodiversity 2. Be able to define species richness and species evenness 3. Be able to use

More information

Making Sense of Data

Making Sense of Data Tips for Revising Making Sense of Data Make sure you know what you will be tested on. The main topics are listed below. The examples show you what to do. Plan a revision timetable. Always revise actively

More information

Introduction to Statistics. Measures of Central Tendency

Introduction to Statistics. Measures of Central Tendency Introduction to Statistics Measures of Central Tendency Two Types of Statistics Descriptive statistics of a POPULATION Relevant notation (Greek): µ mean N population size sum Inferential statistics of

More information

Chapter 2 Part 1B. Measures of Location. September 4, 2008

Chapter 2 Part 1B. Measures of Location. September 4, 2008 Chapter 2 Part 1B Measures of Location September 4, 2008 Class will meet in the Auditorium except for Tuesday, October 21 when we meet in 102a. Skill set you should have by the time we complete Chapter

More information

Statistics Chapter 3 Triola (2014)

Statistics Chapter 3 Triola (2014) 3-1 Review and Preview Branches of statistics Descriptive Stats: is the branch of stats that involve the organization, summarization, and display of data Inferential Stats: is the branch of stats that

More information

Chapter 3: Distributions of Random Variables

Chapter 3: Distributions of Random Variables Chapter 3: Distributions of Random Variables OpenIntro Statistics, 2nd Edition Slides developed by Mine Çetinkaya-Rundel of OpenIntro The slides may be copied, edited, and/or shared via the CC BY-SA license

More information

DESIGNING A CONTROLLED EXPERIMENT

DESIGNING A CONTROLLED EXPERIMENT AP BIOLOGY NAME DESIGNING A CONTROLLED EXPERIMENT STEP 1: DEFINING THE P ROBLEM Every scientific investigation begins with the question that the scientist wants to answer. The questions addressed by scientific

More information

LAB. POPULATION GENETICS. 1. Explain what is meant by a population being in Hardy-Weinberg equilibrium.

LAB. POPULATION GENETICS. 1. Explain what is meant by a population being in Hardy-Weinberg equilibrium. Period Date LAB. POPULATION GENETICS PRE-LAB 1. Explain what is meant by a population being in Hardy-Weinberg equilibrium. 2. List and briefly explain the 5 conditions that need to be met to maintain a

More information

36.2. Exploring Data. Introduction. Prerequisites. Learning Outcomes

36.2. Exploring Data. Introduction. Prerequisites. Learning Outcomes Exploring Data 6. Introduction Techniques for exploring data to enable valid conclusions to be drawn are described in this Section. The diagrammatic methods of stem-and-leaf and box-and-whisker are given

More information

Exploratory Analysis and Simple Descriptive Statistics.

Exploratory Analysis and Simple Descriptive Statistics. Exploratory Analysis and Simple Descriptive Statistics Download survey1.csv and Metadata_survey1.doc from the website survey1.csv contains data from a large farming survey administered to farmers across

More information

Dr. Allen Back. Aug. 26, 2016

Dr. Allen Back. Aug. 26, 2016 Dr. Allen Back Aug. 26, 2016 AP Stats vs. 1710 Some different emphases. AP Stats vs. 1710 Some different emphases. But generally comparable. AP Stats vs. 1710 Some different emphases. But generally comparable.

More information

Day 1: Confidence Intervals, Center and Spread (CLT, Variability of Sample Mean) Day 2: Regression, Regression Inference, Classification

Day 1: Confidence Intervals, Center and Spread (CLT, Variability of Sample Mean) Day 2: Regression, Regression Inference, Classification Data 8, Final Review Review schedule: - Day 1: Confidence Intervals, Center and Spread (CLT, Variability of Sample Mean) Day 2: Regression, Regression Inference, Classification Your friendly reviewers

More information

small medium large standard $20M $30M $50M horizontal -$20M $40M $90M (a) (5 pts) Find the mean payoffs of the two different drilling strategies.

small medium large standard $20M $30M $50M horizontal -$20M $40M $90M (a) (5 pts) Find the mean payoffs of the two different drilling strategies. Business Statistics 41000 Homework #1 1. An oil company wants to drill in a new location. A preliminary geological study suggests that there is a 20% chance of finding a small amount of oil, a 50% chance

More information

This paper is not to be removed from the Examination Halls

This paper is not to be removed from the Examination Halls This paper is not to be removed from the Examination Halls UNIVERSITY OF LONDON ST104A ZB (279 004A) BSc degrees and Diplomas for Graduates in Economics, Management, Finance and the Social Sciences, the

More information

Chapter 9 Assignment (due Wednesday, August 9)

Chapter 9 Assignment (due Wednesday, August 9) Math 146, Summer 2017 Instructor Linda C. Stephenson (due Wednesday, August 9) The purpose of the assignment is to find confidence intervals to predict the proportion of a population. The population in

More information

Preprocessing Methods for Two-Color Microarray Data

Preprocessing Methods for Two-Color Microarray Data Preprocessing Methods for Two-Color Microarray Data 1/15/2011 Copyright 2011 Dan Nettleton Preprocessing Steps Background correction Transformation Normalization Summarization 1 2 What is background correction?

More information

Conifer Translational Genomics Network Coordinated Agricultural Project

Conifer Translational Genomics Network Coordinated Agricultural Project Conifer Translational Genomics Network Coordinated Agricultural Project Genomics in Tree Breeding and Forest Ecosystem Management ----- Module 4 Quantitative Genetics Nicholas Wheeler & David Harry Oregon

More information

Chapter 5. Statistical Reasoning

Chapter 5. Statistical Reasoning Chapter 5 Statistical Reasoning Measures of Central Tendency Back in Grade 7, data was described using the measures of central tendency and range. Central tendency refers to the middle value, or perhaps

More information

Business Quantitative Analysis [QU1] Examination Blueprint

Business Quantitative Analysis [QU1] Examination Blueprint Business Quantitative Analysis [QU1] Examination Blueprint 2014-2015 Purpose The Business Quantitative Analysis [QU1] examination has been constructed using an examination blueprint. The blueprint, also

More information

Lab 2: Mathematical Modeling: Hardy-Weinberg 1. Overview. In this lab you will:

Lab 2: Mathematical Modeling: Hardy-Weinberg 1. Overview. In this lab you will: AP Biology Name Lab 2: Mathematical Modeling: Hardy-Weinberg 1 Overview In this lab you will: 1. learn about the Hardy-Weinberg law of genetic equilibrium, and 2. study the relationship between evolution

More information

Sample Exam 1 Math 263 (sect 9) Prof. Kennedy

Sample Exam 1 Math 263 (sect 9) Prof. Kennedy Sample Exam 1 Math 263 (sect 9) Prof. Kennedy 1. In a statistics class with 136 students, the professor records how much money each student has in their possession during the first class of the semester.

More information

Online Student Guide Types of Control Charts

Online Student Guide Types of Control Charts Online Student Guide Types of Control Charts OpusWorks 2016, All Rights Reserved 1 Table of Contents LEARNING OBJECTIVES... 4 INTRODUCTION... 4 DETECTION VS. PREVENTION... 5 CONTROL CHART UTILIZATION...

More information

Copyright K. Gwet in Statistics with Excel Problems & Detailed Solutions. Kilem L. Gwet, Ph.D.

Copyright K. Gwet in Statistics with Excel Problems & Detailed Solutions. Kilem L. Gwet, Ph.D. Confidence Copyright 2011 - K. Gwet (info@advancedanalyticsllc.com) Intervals in Statistics with Excel 2010 75 Problems & Detailed Solutions An Ideal Statistics Supplement for Students & Instructors Kilem

More information

Computing Descriptive Statistics Argosy University

Computing Descriptive Statistics Argosy University 2014 Argosy University 2 Computing Descriptive Statistics: Ever Wonder What Secrets They Hold? The Mean, Mode, Median, Variability, and Standard Deviation Introduction Before gaining an appreciation for

More information

LECTURE 10: CONFIDENCE INTERVALS II

LECTURE 10: CONFIDENCE INTERVALS II David Youngberg BSAD 210 Montgomery College LECTURE 10: CONFIDENCE INTERVALS II I. Calculating the Margin of Error with Unknown σ a. We often don t know σ. This requires us to rely on the sample s standard

More information

Introduction to Control Charts

Introduction to Control Charts Introduction to Control Charts Highlights Control charts can help you prevent defects before they happen. The control chart tells you how the process is behaving over time. It's the process talking to

More information

The diagram shows two patterns of cell division. Cell division type A is used in gamete formation. Cell division type B is used in normal growth.

The diagram shows two patterns of cell division. Cell division type A is used in gamete formation. Cell division type B is used in normal growth. The diagram shows two patterns of cell division. Cell division type A is used in gamete formation. Cell division type B is used in normal growth. Name the two types of cell division, A and B, shown in

More information

"Predicting variability in coal quality parameters", Coal Indaba, Johannesburg RSA, November 1998

Predicting variability in coal quality parameters, Coal Indaba, Johannesburg RSA, November 1998 "Predicting variability in coal quality parameters", Coal Indaba, Johannesburg RSA, November 1998 Predicting variability in coal quality parameters Dr Isobel Clark, Geostokos Limited Alloa Business Centre,

More information

(31) Business Statistics

(31) Business Statistics Structure of the Question Paper (31) Business Statistics I Paper - II Paper - Time : 02 hours. 50 multiple choice questions with 5 options. All questions should be answered. Each question carries 02 marks.

More information

Chapter 19. Confidence Intervals for Proportions. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Chapter 19. Confidence Intervals for Proportions. Copyright 2012, 2008, 2005 Pearson Education, Inc. Chapter 19 Confidence Intervals for Proportions Copyright 2012, 2008, 2005 Pearson Education, Inc. Standard Error Both of the sampling distributions we ve looked at are Normal. For proportions For means

More information

Descriptive Statistics Tutorial

Descriptive Statistics Tutorial Descriptive Statistics Tutorial Measures of central tendency Mean, Median, and Mode Statistics is an important aspect of most fields of science and toxicology is certainly no exception. The rationale behind

More information

EVOLUTION OF POPULATIONS Genes and Variation

EVOLUTION OF POPULATIONS Genes and Variation Section Outline Section 16-1 EVOLUTION OF POPULATIONS Genes and Variation When Darwin developed his theory of evolution, he didn t know how HEREDITY worked. http://www.answers.com/topic/gregor-mendel Mendel

More information

Lecture 8: Introduction to sampling distributions

Lecture 8: Introduction to sampling distributions Lecture 8: Introduction to sampling distributions Statistics 101 Mine Çetinkaya-Rundel February 9, 2012 Announcements Announcements Due: Quiz 3 Monday morning 8am. OH change: Monday s office hours moved

More information

INTRODUCTION TO STATISTICS

INTRODUCTION TO STATISTICS INTRODUCTION TO STATISTICS Slides by Pierre Dragicevic WHAT YOU WILL LEARN Statistical theory Applied statistics This lecture GOALS Learn basic intuitions and terminology Perform basic statistical inference

More information

Process Performance and Quality Chapter 6

Process Performance and Quality Chapter 6 Process Performance and Quality Chapter 6 How Process Performance and Quality fits the Operations Management Philosophy Operations As a Competitive Weapon Operations Strategy Project Management Process

More information

Section 7-3 Estimating a Population Mean: σ Known

Section 7-3 Estimating a Population Mean: σ Known Section 7-3 Estimating a Population Mean: σ Known Created by Erin Hodgess, Houston, Texas Revised to accompany 10 th Edition, Tom Wegleitner, Centreville, VA Slide 1 Key Concept Use sample data to find

More information

Unit3: Foundationsforinference. 1. Variability in estimates and CLT. Sta Fall Lab attendance & lateness Peer evaluations

Unit3: Foundationsforinference. 1. Variability in estimates and CLT. Sta Fall Lab attendance & lateness Peer evaluations Announcements Unit3: Foundationsforinference 1. Variability in estimates and CLT Sta 101 - Fall 2015 Duke University, Department of Statistical Science Lab attendance & lateness Peer evaluations Dr. Monod

More information

Computer Simulation of Biological Processes at the High School

Computer Simulation of Biological Processes at the High School Computer Simulation of Biological Processes at the High School Olena V. Komarova 1[0000-0002-3476-3351] and Albert A. Azaryan 2[0000-0003-0892-8332] 1 Kryvyi Rih State Pedagogical University, 54, Gagarina

More information

Unit III: Summarization Measures:

Unit III: Summarization Measures: Unit III: Summarization Measures: a. Measures of Central Tendencies: 1. Definition of Average, Types of Averages: Arithmetic Mean, Combined and Weighted mean. 2. Median, and Mode for grouped as well as

More information

Process Performance and Quality

Process Performance and Quality Process Performance and Quality How Process Performance and Quality fits the Operations Management Philosophy Chapter 6 Operations As a Competitive Weapon Operations Strategy Project Management Process

More information

STAT 225 Fall 2009 Exam 1

STAT 225 Fall 2009 Exam 1 STAT 5 Fall 009 Exam 1 Your name: Your Instructor: Your class time (circle one): 7:0 8:0 9:0 10:0 11:0 1:0 1:0 :0 :0 :0 Note: Show your work on all questions. Unsupported work will not receive full credit.

More information

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means)

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means) = the number obtained when rolling one six sided die once. If we roll a six sided die once, the mean of the probability distribution is P( = x) Simulation: We simulated rolling a six sided die times using

More information

Distinguish between different types of numerical data and different data collection processes.

Distinguish between different types of numerical data and different data collection processes. Level: Diploma in Business Learning Outcomes 1.1 1.3 Distinguish between different types of numerical data and different data collection processes. Introduce the course by defining statistics and explaining

More information

Curtis Banks Limited

Curtis Banks Limited Curtis Banks Limited Introduction You may recall that a year ago, for the first time, all organisations of 250 employees or more, were required by law to publish a report on their Gender Pay Gap. This

More information

DESIGNING A CONTROLLED EXPERIMENT

DESIGNING A CONTROLLED EXPERIMENT Name DESIGNING A CONTROLLED EXPERIMENT Period STEP 1: DEFINING THE PROBLEM Every scientific investigation begins with the question that the scientist wants to answer. The questions addressed by scientific

More information

CHAPTER 23 THE EVOLUTIONS OF POPULATIONS. Section A: Population Genetics

CHAPTER 23 THE EVOLUTIONS OF POPULATIONS. Section A: Population Genetics CHAPTER 23 THE EVOLUTIONS OF POPULATIONS Section A: Population Genetics 1. The modern evolutionary synthesis integrated Darwinian selection and Mendelian inheritance 2. A population s gene pool is defined

More information

PG. DIPLOMA IN RURAL BANKING (PGDRBI) Term-End Examination December, 2011 MCQ-033 : RURAL RESEARCH METHODS AND QUANTITATIVE TECHNIQUES

PG. DIPLOMA IN RURAL BANKING (PGDRBI) Term-End Examination December, 2011 MCQ-033 : RURAL RESEARCH METHODS AND QUANTITATIVE TECHNIQUES No. of Printed Pages : 6 MCQ-033 PG. DIPLOMA IN RURAL BANKING (PGDRBI) 00063 Term-End Examination December, 2011 MCQ-033 : RURAL RESEARCH METHODS AND QUANTITATIVE TECHNIQUES Time : 3 hours Maximum Marks

More information

Chapter 4: Foundations for inference. OpenIntro Statistics, 2nd Edition

Chapter 4: Foundations for inference. OpenIntro Statistics, 2nd Edition Chapter 4: Foundations for inference OpenIntro Statistics, 2nd Edition Variability in estimates 1 Variability in estimates Application exercise Sampling distributions - via CLT 2 Confidence intervals 3

More information