statistical computing in R
|
|
- Victoria Greer
- 6 years ago
- Views:
Transcription
1 statistical computing in R 1 R in Sage the language and environment R Monte Carlo integration making plots with R in Sage 2 Statistical Computing with R histograms of data fitting linear models 3 Data Sets exploring available data sets looking at a downloaded data set 4 Big Data R and Hadoop MCS 507 Lecture 37 Mathematical, Statistical and Scientific Software Jan Verschelde, 20 November 2013 Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
2 statistical computing with R 1 R in Sage the language and environment R Monte Carlo integration making plots with R in Sage 2 Statistical Computing with R histograms of data fitting linear models 3 Data Sets exploring available data sets looking at a downloaded data set 4 Big Data R and Hadoop Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
3 R R is a language and environment for statistical computing and graphics, released under the GNU General Public License. R is an implementation of the S language developed at Bell Labs by Rick Becker, John Chambers, and Allan Wilks. We run R explicitly in Sage via 1 the class R, do help(r); or 2 opening a Terminal Session with R, type r_console(); in a Sage terminal. rpy2 is a Python interface to R, at sourceforge, developed by Laurent Gautier. Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
4 no multiprecision R is NOT a computer algebra package: > choose(5,2) [1] 10 > choose(52,5) [1] > choose(100,30) [1] e+25 Instead of displaying an exact integer of 25 decimal places, the outcome of choose switched to a float. Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
5 statistical computing with R 1 R in Sage the language and environment R Monte Carlo integration making plots with R in Sage 2 Statistical Computing with R histograms of data fitting linear models 3 Data Sets exploring available data sets looking at a downloaded data set 4 Big Data R and Hadoop Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
6 estimating π We estimate x 2 dx = π 4 > u <- runif( ,min=0,max=1) > 4*mean(sqrt(1-u^2)) [1] To check, we can use integrate: using 1,000,000 samples: > integrand <- function(x) { sqrt(1-x^2) } > integrate(integrand,lower=0,upper=1) with absolute error < > pi/4 [1] Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
7 Monte Carlo integration We estimate Applied to b a π/2 0 f(x)dx by (b a) 1 n 4 sin(2x)e x2 dx: x [a,b] f(x). > u <- runif( ,min=0,max=pi/2) > pi/2*mean(4*sin(2*u)*exp(-u^2)) [1] Checking: > integrand <- function(x) { 4*sin(2*x)*exp(-x^2) } > integrate(integrand,lower=0,upper=pi/2) with absolute error < 2.4e-14 Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
8 statistical computing with R 1 R in Sage the language and environment R Monte Carlo integration making plots with R in Sage 2 Statistical Computing with R histograms of data fitting linear models 3 Data Sets exploring available data sets looking at a downloaded data set 4 Big Data R and Hadoop Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
9 r.plot in Sage After generating 10 points with coordinates x, y [0, 1], uniformly distributed, we make a plot: sage: filename = tmp_filename() +.png sage: r.png(filename= "%s" %filename) NULL sage: x = r("runif(10)") sage: y = r("runif(10)") sage: r.plot(x,y) null device 1 sage: filename /Users/jan/.sage//temp/getafix.local/333/\ /tmp_0.png Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
10 the plot in tmp_0.png Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
11 plotting to a specific file Plotting 10 random points: sage: filename = /Users/jan/Pictures/points.png sage: r.png(filename= "%s" % filename) NULL sage: x = r("runif(10)") sage: y = r("runif(10)") sage: r.plot(x,y) null device 1 sage: filename /Users/jan/Pictures/points.png We can see the picture as specified by filename. Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
12 a density function Plotting the density function of the normal distribution: sage: filename = tmp_filename() +.png sage: r.png(filename= "%s" %filename) NULL sage: r.plot( dnorm,-3,+3) null device 1 sage: filename /Users/jan/.sage//temp/getafix.local/650/\ /tmp_0.png Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
13 the plot in tmp_0.png Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
14 a distribution function With pnorm we get the distribution function of the normal distribution, to plot it with Sage: sage: filename = tmp_filename() +.png sage: r.png(filename= "%s" %filename) NULL sage: r.plot( pnorm,-3,+3) null device 1 sage: filename /Users/jan/.sage//temp/getafix.local/650/\ /tmp_1.png Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
15 the plot in tmp_1.png Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
16 using rpy2 in python Plotting 10 random points: $ python Python (v2.7.3:70274d53c1dd, Apr , 20:52:43) [GCC (Apple Inc. build 5666) (dot 3)] on darwin Type "help", "copyright", "credits" or "license" for more i >>> import rpy2.robjects as robjects >>> r = robjects.r >>> r.x11() rpy2.rinterface.null >>> r.plot(r.runif(10),r.runif(10)) rpy2.rinterface.null After r.x11() a new window pops up and the figure gets show in this window by the r.plot. Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
17 statistical computing with R 1 R in Sage the language and environment R Monte Carlo integration making plots with R in Sage 2 Statistical Computing with R histograms of data fitting linear models 3 Data Sets exploring available data sets looking at a downloaded data set 4 Big Data R and Hadoop Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
18 sampled data We take 100 samples of the normal distribution. > R = rnorm(100) > R [1] [7] > sort(r) [1] [7] A summary of sorted data is in a stem and leaf plot. Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
19 stem and leaf plot > stem(r) The decimal point is at the Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
20 histogram > h = hist(r) > h $breaks [1] [7] $counts [1] $intensities [1] [7] $density [1] [7] $mids [1] [7] Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
21 in a Sage session sage: r.quartz() # for OS X NULL sage: data = r.call( rnorm,100) sage: type(data) <class sage.interfaces.r.relement > sage: r.call( hist,data) $breaks [1] \ $counts [1] Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
22 the histogram plot Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
23 making a boxplot sage: r.quartz() NULL sage: r.call( boxplot,data) $stats [,1] [1,] [2,] [3,] [4,] [5,] Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
24 the boxplot Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
25 statistical computing with R 1 R in Sage the language and environment R Monte Carlo integration making plots with R in Sage 2 Statistical Computing with R histograms of data fitting linear models 3 Data Sets exploring available data sets looking at a downloaded data set 4 Big Data R and Hadoop Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
26 calling lm For x we generate 50 points in [0, 10] and y equals x plus some random noise: > x <- 10*runif(50) > y <- x + rnorm(50) > gx <- lm(y ~ x) lm is used to fit linear models, the method applied is QR decomposition. > gx$coefficients (Intercept) x Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
27 plotting the data sage: filename = tmp_filename() +.png sage: r.png(filename= "%s" %filename) NULL sage: x = r("x <- 10*runif(50)") sage: y = r("y <- x + rnorm(50)") sage: r.plot(x,y) null device 1 sage: filename /Users/jan/.sage//temp/getafix.local/1312/\ /tmp_0.png Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
28 the data points Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
29 the output of lm > summary(gx) Call: lm(formula = y ~ x) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) x <2e-16 *** --- Signif. codes: 0 *** ** 0.01 * Residual standard error: on 48 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 48 DF, p-value: < 2.2e-16 Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
30 statistical computing with R 1 R in Sage the language and environment R Monte Carlo integration making plots with R in Sage 2 Statistical Computing with R histograms of data fitting linear models 3 Data Sets exploring available data sets looking at a downloaded data set 4 Big Data R and Hadoop Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
31 the package datasets Typing data() at the R prompt shows a list of data sets in the package datasets. The last one in the list is women, which contains "Average Heights and Weights for American Women" The output of help(women) shows that the data set appears to have been taken from the American Society of Actuaries for some unknown year earlier than Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
32 printing the entire data set > print(women) height weight Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
33 a summary of the data > summary(women) height weight Min. :58.0 Min. : st Qu.:61.5 1st Qu.:124.5 Median :65.0 Median :135.0 Mean :65.0 Mean : rd Qu.:68.5 3rd Qu.:148.0 Max. :72.0 Max. :164.0 The correlation matrix: > cor(women) height weight height weight To make a scatterplot: > pairs Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
34 a scatterplot Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
35 statistical computing with R 1 R in Sage the language and environment R Monte Carlo integration making plots with R in Sage 2 Statistical Computing with R histograms of data fitting linear models 3 Data Sets exploring available data sets looking at a downloaded data set 4 Big Data R and Hadoop Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
36 data about tornadoes NOAA = National Oceanic and Atmospheric Administration NOAA s National Weather Service provides free data. Downloaded from the file 2011_torn.csv is a comma separated file with data about tornadoes in The data file comes without a header in its first row. Information about the data in the columns is provided in a separate.pdf file. The file 2011_torn.csv with an extra header row is renamed into tornadoes.csv. Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
37 reading data into R > torn <- read.csv("tornadoes.csv") > attributes(torn) $names [1] "number" "year" "month" [4] "day" "date" "time" [7] "timezone" "state" "statefips" [10] "statenum" "scale" "injuries" [13] "fatalities" "loss" "croploss" [16] "startinglatitude" "startinglongitude" "endinglatitude" [19] "endinglongitude" "length" "width" [22] "statesaffected" "statenum.1" "tornsegment" [25] "county1fips" "county2fips" "county3fips" [28] "county4fips" These names are the headers for the 28 columns. There are 1778 tornadoes recorded on file. Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
38 selecting data > torn[,8] shows all states in column 8 > torn[8,] shows row 8. > torn[8,8] shows the state: row 8 and column 8. To select specific rows and columns: > torn[c(5,20),c(3,8,11)] month state scale 5 1 MS HI 0 > hist(torn[,11]) Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
39 a histogram of the scales of tornadoes Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
40 correlating scale, injuries, fatalities, and loss > cor(torn[,c(11,12,13,14)]) scale injuries fatalities loss scale injuries fatalities loss Plotting starting latitude and longitudes: > plot(torn[,c(16,17)]) Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
41 starting latitudes and longitudes Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
42 statistical computing with R 1 R in Sage the language and environment R Monte Carlo integration making plots with R in Sage 2 Statistical Computing with R histograms of data fitting linear models 3 Data Sets exploring available data sets looking at a downloaded data set 4 Big Data R and Hadoop Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
43 Large Data Files Process big data with parallel and distributed computing: MySQL scales well. overview of Hadoop: Hadoop has become the de facto standard for processing big data. Hadoop is used by Facebook to store photos, by LinkedIn to generate recommendations, and by Amazon to generate search indices. Hadoop works by connecting many different computers, hiding the complexity from the user: work with one giant computer. Hadoop uses a model called Map/Reduce. RHadoop by Antonio Piccolboni is a project for R and Hadoop, hosted at Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
44 the Map/Reduce model Goal: help with writing efficient parallel programs. Common data processing tasks: filtering, merging, aggregating data and many (not all) machine learning algorithms fit into Map/Reduce. Two steps: Map: Tasks read in input data as a set of records, process the records, and send groups of similar records to reducers. The mapper extracts a key from each input record. Hadoop will then route all records with the same key to the same reducer. Reduce: Tasks read in a set of related records, process the records, and write the results. The reducer iterates through all results for the same key, processing the data and writing out the results. Strength: the map and reduce steps run well in parallel. Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
45 example: predicting user behavior Problem: how likely is a user to purchase an item from a website? Suppose we have already computed (maybe using Map/Reduce) a set of variables describing each user: most common locations, the number of pages viewed, the number of purchases made in the past. Calculate forecast using random forests: Random forests work by calculating a set of regression trees and then averaging them together to create a single model. It can be time consuming to fit the random trees to the data, but each new tree can be calculated independently. One way to tackle this problem is to use a set of map tasks to generate random trees, and then send the models to a single reducer task to average the results and produce the model. Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
46 Summary + Exercises Available at Joseph Adler: R in a Nutshell. 2nd edition, O Reilly, Richard Cotton: Learning R. A Step-by-Step Function Guide to Data Analysis. O Reilly, Write a wrapper around r.plot which takes two arguments: a string with the argument for r.plot and the name of the file for the plot. 2 Annual U.S. Killer Tornado Statistics are listed at Use python to get the annual data from the web page and into R data sets, making one data set for every year. Project Three due Wednesday 27 November at 9AM. Homework Six is due on Wednesday 4 December at 9AM: ex 3 of L-27; ex 2 of L-28; ex 1 of L-29; ex 1 of L-30; ex 2 of L-31; ex 3 of L-32; ex 1 of L-33; ex 2 of L-34; ex 1 of L-35; and ex 1 of L-36. Scientific Software (MCS 507 L-37) statistical computing with R 20 November / 46
= = Intro to Statistics for the Social Sciences. Name: Lab Session: Spring, 2015, Dr. Suzanne Delaney
Name: Intro to Statistics for the Social Sciences Lab Session: Spring, 2015, Dr. Suzanne Delaney CID Number: _ Homework #22 You have been hired as a statistical consultant by Donald who is a used car dealer
More information= = Name: Lab Session: CID Number: The database can be found on our class website: Donald s used car data
Intro to Statistics for the Social Sciences Fall, 2017, Dr. Suzanne Delaney Extra Credit Assignment Instructions: You have been hired as a statistical consultant by Donald who is a used car dealer to help
More informationSPSS 14: quick guide
SPSS 14: quick guide Edition 2, November 2007 If you would like this document in an alternative format please ask staff for help. On request we can provide documents with a different size and style of
More informationSTAT 2300: Unit 1 Learning Objectives Spring 2019
STAT 2300: Unit 1 Learning Objectives Spring 2019 Unit tests are written to evaluate student comprehension, acquisition, and synthesis of these skills. The problems listed as Assigned MyStatLab Problems
More informationMore Multiple Regression
More Multiple Regression Model Building The main difference between multiple and simple regression is that, now that we have so many predictors to deal with, the concept of "model building" must be considered
More informationAnalytical Capability Security Compute Ease Data Scale Price Users Traditional Statistics vs. Machine Learning In-Memory vs. Shared Infrastructure CRAN vs. Parallelization Desktop vs. Remote Explicit vs.
More information1. Contingency Table (Cross Tabulation Table)
II. Descriptive Statistics C. Bivariate Data In this section Contingency Table (Cross Tabulation Table) Box and Whisker Plot Line Graph Scatter Plot 1. Contingency Table (Cross Tabulation Table) Bivariate
More informationStatistical Modelling for Social Scientists. Manchester University. January 20, 21 and 24, Modelling categorical variables using logit models
Statistical Modelling for Social Scientists Manchester University January 20, 21 and 24, 2011 Graeme Hutcheson, University of Manchester Modelling categorical variables using logit models Software commands
More informationSPSS Guide Page 1 of 13
SPSS Guide Page 1 of 13 A Guide to SPSS for Public Affairs Students This is intended as a handy how-to guide for most of what you might want to do in SPSS. First, here is what a typical data set might
More informationStatistical Modelling for Business and Management. J.E. Cairnes School of Business & Economics National University of Ireland Galway.
Statistical Modelling for Business and Management J.E. Cairnes School of Business & Economics National University of Ireland Galway June 28 30, 2010 Graeme Hutcheson, University of Manchester Luiz Moutinho,
More information4.3 Nonparametric Tests cont...
Class #14 Wednesday 2 March 2011 What did we cover last time? Hypothesis Testing Types Student s t-test - practical equations Effective degrees of freedom Parametric Tests Chi squared test Kolmogorov-Smirnov
More informationChecklist before you buy software for questionnaire based surveys
Checklist before you buy for questionnaire based surveys Continuity, stability and development...2 Capacity...2 Safety and security...2 Types of surveys...3 Types of questions...3 Interview functions (Also
More informationCore vs NYS Standards
Core vs NYS Standards Grade 5 Core NYS Operations and Algebraic Thinking -------------------------------------------------------------------------- 5.OA Write / Interpret Numerical Expressions Use ( ),
More informationLab 1: A review of linear models
Lab 1: A review of linear models The purpose of this lab is to help you review basic statistical methods in linear models and understanding the implementation of these methods in R. In general, we need
More informationData from a dataset of air pollution in US cities. Seven variables were recorded for 41 cities:
Master of Supply Chain, Transport and Mobility - Data Analysis on Transport and Logistics - Course 16-17 - Partial Exam Lecturer: Lidia Montero November, 10th 2016 Problem 1: All questions account for
More informationUsing R for Introductory Statistics
R http://www.r-project.org Using R for Introductory Statistics John Verzani CUNY/the College of Staten Island January 6, 2009 http://www.math.csi.cuny.edu/verzani/r/ams-maa-jan-09.pdf John Verzani (CSI)
More informationAP Statistics Scope & Sequence
AP Statistics Scope & Sequence Grading Period Unit Title Learning Targets Throughout the School Year First Grading Period *Apply mathematics to problems in everyday life *Use a problem-solving model that
More informationCreative Commons Attribution-NonCommercial-Share Alike License
Author: Brenda Gunderson, Ph.D., 2015 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution- NonCommercial-Share Alike 3.0 Unported License:
More informationLoblolly Example: Correlation. This note explores estimating correlation of a sample taken from a bivariate normal distribution.
Math 3080 1. Treibergs Loblolly Example: Correlation Name: Example Feb. 26, 2014 This note explores estimating correlation of a sample taken from a bivariate normal distribution. This data is taken from
More informationJMP TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING
JMP TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING INTRODUCTION JMP software provides introductory statistics in a package designed to let students visually explore data in an interactive way with
More informationMathematical Computing
Mathematical Computing IMT2b2β/MSP3b9β Department of Mathematics University of Ruhuna A.W.L. Pubudu Thilan Department of Mathematics University of Ruhuna, Mathematical Computing, IMT2b2β/MSP3b9β 1/77 Method
More informationMathematical Computing
Mathematical Computing IMT2b2β Department of Mathematics University of Ruhuna A.W.L. Pubudu Thilan Department of Mathematics University of Ruhuna, Mathematical Computing, IMT2b2β 1/76 Method of evaluation
More informationIncorporating Predictive Models for Operational Intelligence
Incorporating Predictive Models for Operational Intelligence Presented by Curt Hertler Partner Solutions Architect, OSIsoft Copyright 2015 OSIsoft, LLC History The only has thing a way new of repeating
More informationADVANCED DATA ANALYTICS
ADVANCED DATA ANALYTICS MBB essay by Marcel Suszka 17 AUGUSTUS 2018 PROJECTSONE De Corridor 12L 3621 ZB Breukelen MBB Essay Advanced Data Analytics Outline This essay is about a statistical research for
More informationLIR 832: MINITAB WORKSHOP
LIR 832: MINITAB WORKSHOP Opening Minitab Minitab will be in the Start Menu under Net Apps. Opening the Data Go to the following web site: http://www.msu.edu/course/lir/832/datasets.htm Right-click and
More informationMachine Learning Models for Sales Time Series Forecasting
Article Machine Learning Models for Sales Time Series Forecasting Bohdan M. Pavlyshenko SoftServe, Inc., Ivan Franko National University of Lviv * Correspondence: bpavl@softserveinc.com, b.pavlyshenko@gmail.com
More informationUsing Excel s Analysis ToolPak Add-In
Using Excel s Analysis ToolPak Add-In Bijay Lal Pradhan, PhD Introduction I have a strong opinions that we can perform different quantitative analysis, including statistical analysis, in Excel. It is powerful,
More informationRisk Simulation in Project Management System
Risk Simulation in Project Management System Anatoliy Antonov, Vladimir Nikolov, Yanka Yanakieva Abstract: The Risk Management System of a Project Management System should be able to simulate, to evaluate
More informationREPORTING CATEGORY I NUMBER, OPERATION, QUANTITATIVE REASONING
Texas 6 th Grade STAAR (TEKS)/ Correlation Skills (w/ STAAR updates) Stretch REPORTING CATEGORY I NUMBER, OPERATION, QUANTITATIVE REASONING (6.1) Number, operation, and quantitative reasoning. The student
More informationUsing SPSS for Linear Regression
Using SPSS for Linear Regression This tutorial will show you how to use SPSS version 12.0 to perform linear regression. You will use SPSS to determine the linear regression equation. This tutorial assumes
More informationBackground Information. Instructions. Problem Statement. HOMEWORK HELP PROJECT INSTRUCTIONS Homework #3 Help Charleston Federal Grant Problem
Background Information As one might expect, it is generally in a city s best financial interest to have more residents. A larger population generally yields a larger tax base and more income. There is
More informationUnravelling Airbnb Predicting Price for New Listing
Unravelling Airbnb Predicting Price for New Listing Paridhi Choudhary H John Heinz III College Carnegie Mellon University Pittsburgh, PA 15213 paridhic@andrew.cmu.edu Aniket Jain H John Heinz III College
More informationLinear model to forecast sales from past data of Rossmann drug Store
Abstract Linear model to forecast sales from past data of Rossmann drug Store Group id: G3 Recent years, the explosive growth in data results in the need to develop new tools to process data into knowledge
More informationMATH 2070 Final Exam Mixed Practice (MISSING 6.5 & 6.6)
The profit for Scott s Scooters can be estimated by P(x, y) = 0.4x + 0.1y + 0.3xy + 2y + 1.5x hundred dollars, where x is the number of mopeds sold and y is the number of dirt bikes sold. (Check: P(3,2)
More informationSTAT 350 (Spring 2016) Homework 12 Online 1
STAT 350 (Spring 2016) Homework 12 Online 1 1. In simple linear regression, both the t and F tests can be used as model utility tests. 2. The sample correlation coefficient is a measure of the strength
More informationStatistics, Data Analysis, and Decision Modeling
- ' 'li* Statistics, Data Analysis, and Decision Modeling T H I R D E D I T I O N James R. Evans University of Cincinnati PEARSON Prentice Hall Upper Saddle River, New Jersey 07458 CONTENTS Preface xv
More informationQuantitative Methods
THE ASSOCIATION OF BUSINESS EXECUTIVES DIPLOMA PART 2 QM Quantitative Methods afternoon 4 June 2003 1 Time allowed: 3 hours. 2 Answer any FOUR questions. 3 All questions carry 25 marks. Marks for subdivisions
More informationIntroduction of STATA
Introduction of STATA News: There is an introductory course on STATA offered by CIS Description: Intro to STATA On Tue, Feb 13th from 4:00pm to 5:30pm in CIT 269 Seats left: 4 Windows, 7 Macintosh For
More informationWeka Evaluation: Assessing the performance
Weka Evaluation: Assessing the performance Lab3 (in- class): 21 NOV 2016, 13:00-15:00, CHOMSKY ACKNOWLEDGEMENTS: INFORMATION, EXAMPLES AND TASKS IN THIS LAB COME FROM SEVERAL WEB SOURCES. Learning objectives
More informationOpening SPSS 6/18/2013. Lesson: Quantitative Data Analysis part -I. The Four Windows: Data Editor. The Four Windows: Output Viewer
Lesson: Quantitative Data Analysis part -I Research Methodology - COMC/CMOE/ COMT 41543 The Four Windows: Data Editor Data Editor Spreadsheet-like system for defining, entering, editing, and displaying
More informationBusiness Data Analytics
MTAT.03.319 Business Data Analytics Lecture 4 Marketing and Sales Customer Lifecycle Management: Regression Problems Customer lifecycle Customer lifecycle Moving companies grow not because they force people
More informationChapter 5 Regression
Chapter 5 Regression Topics to be covered in this chapter: Regression Fitted Line Plots Residual Plots Regression The scatterplot below shows that there is a linear relationship between the percent x of
More informationBusiness Quantitative Analysis [QU1] Examination Blueprint
Business Quantitative Analysis [QU1] Examination Blueprint 2014-2015 Purpose The Business Quantitative Analysis [QU1] examination has been constructed using an examination blueprint. The blueprint, also
More informationProblem Points Score USE YOUR TIME WISELY SHOW YOUR WORK TO RECEIVE PARTIAL CREDIT
STAT 512 EXAM I STAT 512 Name (7 pts) Problem Points Score 1 40 2 25 3 28 USE YOUR TIME WISELY SHOW YOUR WORK TO RECEIVE PARTIAL CREDIT WRITE LEGIBLY. ANYTHING UNREADABLE WILL NOT BE GRADED GOOD LUCK!!!!
More informationAdvanced analytics at your hands
2.4 Advanced analytics at your hands Today, most organizations are stuck at lower-value descriptive analytics. But more sophisticated analysis can bring great business value. TARGET APPLICATIONS Business
More informationHadoop Course Content
Hadoop Course Content Hadoop Course Content Hadoop Overview, Architecture Considerations, Infrastructure, Platforms and Automation Use case walkthrough ETL Log Analytics Real Time Analytics Hbase for Developers
More informationData Analytics with MATLAB Adam Filion Application Engineer MathWorks
Data Analytics with Adam Filion Application Engineer MathWorks 2015 The MathWorks, Inc. 1 Case Study: Day-Ahead Load Forecasting Goal: Implement a tool for easy and accurate computation of dayahead system
More informationEFFICACY OF ROBUST REGRESSION APPLIED TO FRACTIONAL FACTORIAL TREATMENT STRUCTURES MICHAEL MCCANTS
EFFICACY OF ROBUST REGRESSION APPLIED TO FRACTIONAL FACTORIAL TREATMENT STRUCTURES by MICHAEL MCCANTS B.A., WINONA STATE UNIVERSITY, 2007 B.S., WINONA STATE UNIVERSITY, 2008 A THESIS submitted in partial
More informationSpreadsheets for Accounting
Osborne Books Tutor Zone Spreadsheets for Accounting Practice material 1 Osborne Books Limited, 2016 2 s p r e a d s h e e t s f o r a c c o u n t i n g t u t o r z o n e S P R E A D S H E E T S F O R
More information1. A/an is a mathematical statement that calculates a value. 2. Create a cell reference in a formula by typing in the cell name or
Question 1 of 20 : Select the best answer for the question. 1. A/an is a mathematical statement that calculates a value. A. argument B. function C. order of operations D. formula Question 2 of 20 : Select
More informationBusiness Analytics using R
Business Analytics using R 64 hours 12 weekends 3 hours/day www.greycampus.com Program Overview The future to decision making is data and no industry will remain untouched by it. However, data comes with
More informationSolving Transportation Logistics Problems Using Advanced Evolutionary Optimization
Solving Transportation Logistics Problems Using Advanced Evolutionary Optimization Transportation logistics problems and many analogous problems are usually too complicated and difficult for standard Linear
More informationISE480 Sequencing and Scheduling
ISE480 Sequencing and Scheduling INTRODUCTION ISE480 Sequencing and Scheduling 2012 2013 Spring term What is Scheduling About? Planning (deciding what to do) and scheduling (setting an order and time for
More informationConnected Mathematics 2, 7th Grade Units 2009 Correlated to: Washington Mathematics Standards for Grade 7
Grade 7 7.1. Core Content: Rational numbers and linear equations (Numbers, Operations, Algebra) 7.1.A Compare and order rational numbers using the number line, lists, and the symbols , or =. 7.1.B
More informationChapter 10 Regression Analysis
Chapter 10 Regression Analysis Goal: To become familiar with how to use Excel 2007/2010 for Correlation and Regression. Instructions: You will be using CORREL, FORECAST and Regression. CORREL and FORECAST
More informationGetting Started with HLM 5. For Windows
For Windows Updated: August 2012 Table of Contents Section 1: Overview... 3 1.1 About this Document... 3 1.2 Introduction to HLM... 3 1.3 Accessing HLM... 3 1.4 Getting Help with HLM... 3 Section 2: Accessing
More informationConnected Mathematics 2, 7th Grade Units 2009 Correlated to: Washington Mathematics Standards for Grade 7
Grade 7 Connected Mathematics 2, 7th Grade Units 2009 7.1. Core Content: Rational numbers and linear equations (Numbers, Operations, Algebra) 7.1.A Compare and order rational numbers using the number line,
More informationGUIDED PRACTICE: SPREADSHEET FORMATTING
Guided Practice: Spreadsheet Formatting Student Activity Student Name: Period: GUIDED PRACTICE: SPREADSHEET FORMATTING Directions: In this exercise, you will follow along with your teacher to enter and
More informationStatistics 201 Summary of Tools and Techniques
Statistics 201 Summary of Tools and Techniques This document summarizes the many tools and techniques that you will be exposed to in STAT 201. The details of how to do these procedures is intentionally
More informationSolutions Implementation Guide
Solutions Implementation Guide Salesforce, Winter 18 @salesforcedocs Last updated: November 30, 2017 Copyright 2000 2017 salesforce.com, inc. All rights reserved. Salesforce is a registered trademark of
More informationTable. XTMIXED Procedure in STATA with Output Systolic Blood Pressure, use "k:mydirectory,
Table XTMIXED Procedure in STATA with Output Systolic Blood Pressure, 2001. use "k:mydirectory,. xtmixed sbp nage20 nage30 nage40 nage50 nage70 nage80 nage90 winter male dept2 edu_bachelor median_household_income
More informationCreating Simple Report from Excel
Creating Simple Report from Excel 1.1 Connect to Excel workbook 1. Select Connect Microsoft Excel. In the Open File dialog box, select the 2015 Sales.xlsx file. 2. The file will be loaded to Tableau, and
More informationRiskyProject Lite 7. Getting Started Guide. Intaver Institute Inc. Project Risk Management Software.
RiskyProject Lite 7 Project Risk Management Software Getting Started Guide Intaver Institute Inc. www.intaver.com email: info@intaver.com Chapter 1: Introduction to RiskyProject 3 What is RiskyProject?
More informationSTA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 2 Organizing Data
STA 2023 Module 2 Organizing Data Rev.F08 1 Learning Objectives Upon completing this module, you should be able to: 1. Classify variables and data as either qualitative or quantitative. 2. Distinguish
More informationSuper-marketing. A Data Investigation. A note to teachers:
Super-marketing A Data Investigation A note to teachers: This is a simple data investigation requiring interpretation of data, completion of stem and leaf plots, generation of box plots and analysis of
More informationRiskyProject Lite 7. User s Guide. Intaver Institute Inc. Project Risk Management Software.
RiskyProject Lite 7 Project Risk Management Software User s Guide Intaver Institute Inc. www.intaver.com email: info@intaver.com 2 COPYRIGHT Copyright 2017 Intaver Institute. All rights reserved. The information
More informationAccelerating Microsoft Office Excel 2010 with Windows HPC Server 2008 R2
Accelerating Microsoft Office Excel 2010 with Windows HPC Server 2008 R2 Technical Overview Published January 2010 Abstract Microsoft Office Excel is a critical tool for business. As calculations and models
More informationWho Are My Best Customers?
Technical report Who Are My Best Customers? Using SPSS to get greater value from your customer database Table of contents Introduction..............................................................2 Exploring
More informationExpressMaintenance Release Notes
ExpressMaintenance Release Notes ExpressMaintenance Release 9 introduces a wealth exciting features. It includes many enhancements to the overall interface as well as powerful new features and options
More informationoptislang 5 David Schneider Product manager
optislang 5 David Schneider Product manager 1. Postprocessing 2. optislang for ANSYS 6. Algorithms 3. Customization 5. SPDM 4. Workflows 2 Update optislang Dynardo GmbH Postprocessing Predefined modes
More informationECDL / ICDL Spreadsheets Sample Part-Tests
The following are sample part-tests for. This sample part-test contains 16 questions giving a total of 16 marks. The actual certification test contains 32 questions giving a total of 32 marks. The candidate
More informationSupporting Online Material for
www.sciencemag.org/cgi/content/full/318/5853/1107/dc1 Supporting Online Material for Hurricane Katrina s Carbon Footprint on U.S. Gulf Coast Forests Jeffrey Q. Chambers,* Jeremy I. Fisher, Hongcheng Zeng,
More informationSAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other
The New Era of Credit Risk Modeling and Validation Presenter Boaz Galinson, V.P, Credit Risk Manager, Leumi Bank Boaz Galinson is Head of the Group Credit Risk Modeling and Measurement in the Risk Management
More informationHardness Example: Tukey s Multiple Comparisons
Math 3080 1. Treibergs Hardness Example: Tukey s Multiple Comparisons Name: Example January 25, 2016 This R c program explores Tukey s Honest Significant Differences test. This data was taken from An investigation
More informationBrian Macdonald Big Data & Analytics Specialist - Oracle
Brian Macdonald Big Data & Analytics Specialist - Oracle Improving Predictive Model Development Time with R and Oracle Big Data Discovery brian.macdonald@oracle.com Copyright 2015, Oracle and/or its affiliates.
More informationmbusa Instagram account report Jan 01, May 28, 2016
mbusa Instagram account report Jan 01, 2016 - May 28, 2016 mbusa / Audience Total Followers Count 1,116,679 25.37% Followers Change +245,517 Max. Followers Change 19,132 Jan 04 Avg. Followers Change +11,691.29
More informationTransforming Analytics with Cloudera Data Science WorkBench
Transforming Analytics with Cloudera Data Science WorkBench Process data, develop and serve predictive models. 1 Age of Machine Learning Data volume NO Machine Learning Machine Learning 1950s 1960s 1970s
More informationPredictive Modeling Using SAS Visual Statistics: Beyond the Prediction
Paper SAS1774-2015 Predictive Modeling Using SAS Visual Statistics: Beyond the Prediction ABSTRACT Xiangxiang Meng, Wayne Thompson, and Jennifer Ames, SAS Institute Inc. Predictions, including regressions
More informationIf you want to flag a question for later review, select the "Mark for review" button.
Exam Number: 584002RR Lesson Name: Microsoft Excel 2016 Exam Guidelines: This exam is now available only in the online assessment system. If your study guide still contains an exam, that exam is no longer
More informationA is used to answer questions about the quantity of what is being measured. A quantitative variable is comprised of numeric values.
Stats: Modeling the World Chapter 2 Chapter 2: Data What are data? In order to determine the context of data, consider the W s Who What (and in what units) When Where Why How There are two major ways to
More informationACCOUNTING SOFTWARE Certification Program
FINAL CERTIFICATION AWARDED BY IMRTC - USA ACCOUNTING SOFTWARE Certification Program This training program is highly specialized with the duration of 72 Credit hours, where the program covers all the major
More informationData Description U.S. Air Pollution Dataset: data("usairpollution", package = "HSAUR2")
MÀSTER DE SUPPLY CHAIN, TRANSPORT I MOBILITAT (UPC). CURS 14-15 Q1 FINAL EXAM Anàlisi de Dades de Transport i Logística (ADTL). (Data: 8/7/2015 17:00-19:00 h Lloc: Aula H2.5 Professor responsable: Lídia
More informationSage 300 ERP Sage 300 ERP Intelligence Release Notes
Sage 300 ERP Intelligence Release Notes The software described in this document is protected by copyright, and may not be copied on any medium except as specifically authorized in the license or non disclosure
More informationUnderstanding Total Dissolved Solids in Selected Streams of the Independence Mountains of Nevada
Understanding Total Dissolved Solids in Selected Streams of the Independence Mountains of Nevada Richard B. Shepard, Ph.D. January 12, 211 President, Applied Ecosystem Services, Inc., Troutdale, OR Summary
More informationBiostat Exam 10/7/03 Coverage: StatPrimer 1 4
Biostat Exam 10/7/03 Coverage: StatPrimer 1 4 Part A (Closed Book) INSTRUCTIONS Write your name in the usual location (back of last page, near the staple), and nowhere else. Turn in your Lab Workbook at
More informationIntegrating Market and Credit Risk Measures using SAS Risk Dimensions software
Integrating Market and Credit Risk Measures using SAS Risk Dimensions software Sam Harris, SAS Institute Inc., Cary, NC Abstract Measures of market risk project the possible loss in value of a portfolio
More informationGRADE 7 MATHEMATICS CURRICULUM GUIDE
GRADE 7 MATHEMATICS CURRICULUM GUIDE Loudoun County Public Schools 2010-2011 Complete scope, sequence, pacing and resources are available on the CD and will be available on the LCPS Intranet. Grade 7 Mathematics
More informationTIM/UNEX 270, Spring 2012 Homework 1
TIM/UNEX 270, Spring 2012 Homework 1 Prof. Kevin Ross, kross@soe.ucsc.edu April 11, 2012 Goals: Homework 1 starts out with a review of essential concepts from statistics in Problem 1 we will build on these
More informationAdd Sophisticated Analytics to Your Repertoire with Data Mining, Advanced Analytics and R
Add Sophisticated Analytics to Your Repertoire with Data Mining, Advanced Analytics and R Why Advanced Analytics Companies that inject big data and analytics into their operations show productivity rates
More informationData Viz Helping Your Staff Become a Data Wiz. Charlie Farah Director Market Development, Healthcare & Public Sector, Qlik APAC August 2017
Data Viz Helping Your Staff Become a Data Wiz Charlie Farah Director Market Development, Healthcare & Public Sector, Qlik APAC August 2017 65%?% Organizations where managers would follow their gut feel
More informationPractices of Business Intelligence
Tamkang University Practices of Business Intelligence II Tamkang University (Descriptive Analytics II: Business Intelligence and Data Warehousing) 1071BI05 MI4 (M2084) (2888) Wed, 7, 8 (14:10-16:00) (B217)
More informationSTA Module 2A Organizing Data and Comparing Distributions (Part I)
STA 2023 Module 2A Organizing Data and Comparing Distributions (Part I) 1 Learning Objectives Upon completing this module, you should be able to: 1. Classify variables and data as either qualitative or
More informationLearn What s New. Statistical Software
Statistical Software Learn What s New Upgrade now to access new and improved statistical features and other enhancements that make it even easier to analyze your data. The Assistant Let Minitab s Assistant
More informationICTCM 28th International Conference on Technology in Collegiate Mathematics
MULTIVARIABLE SPREADSHEET MODELING AND SCIENTIFIC THINKING VIA STACKING BRICKS Scott A. Sinex Department of Physical Sciences & Engineering Prince George s Community College Largo, MD 20774 ssinex@pgcc.edu
More informationDATA ANALYTICS WITH R, EXCEL & TABLEAU
Learn. Do. Earn. DATA ANALYTICS WITH R, EXCEL & TABLEAU COURSE DETAILS centers@acadgild.com www.acadgild.com 90360 10796 Brief About this Course Data is the foundation for technology-driven digital age.
More informationPOST GRADUATE PROGRAM IN DATA SCIENCE & MACHINE LEARNING (PGPDM)
OUTLINE FOR THE POST GRADUATE PROGRAM IN DATA SCIENCE & MACHINE LEARNING (PGPDM) Module Subject Topics Learning outcomes Delivered by Exploratory & Visualization Framework Exploratory Data Collection and
More informationForecast Scheduling Microsoft Project Online 2018
Forecast Scheduling Microsoft Project Online 2018 Best Practices for Real-World Projects Eric Uyttewaal, PMP President ProjectPro Corporation Published by ProjectPro Corporation Preface Publishing Notice
More informationOracle Hyperion Planning for Interactive Users ( )
Oracle University Contact Us: Local: 1800 103 4775 Intl: +91 80 40291196 Oracle Hyperion Planning 11.1.2 for Interactive Users (11.1.2.3) Duration: 3 Days What you will learn This course will teach you
More informationCEE3710: Uncertainty Analysis in Engineering
CEE3710: Uncertainty Analysis in Engineering Lecture 1 September 6, 2017 Why do we need Probability and Statistics?? What is Uncertainty Analysis?? Ex. Consider the average (mean) height of females by
More informationOnline appendix for THE RESPONSE OF CONSUMER SPENDING TO CHANGES IN GASOLINE PRICES *
Online appendix for THE RESPONSE OF CONSUMER SPENDING TO CHANGES IN GASOLINE PRICES * Michael Gelman a, Yuriy Gorodnichenko b,c, Shachar Kariv b, Dmitri Koustas b, Matthew D. Shapiro c,d, Dan Silverman
More information