Copyr i g ht 2012, SAS Ins titut e Inc. All rights res er ve d. ENTERPRISE MINER: ANALYTICAL MODEL DEVELOPMENT

Similar documents
Introducing Analytics with SAS Enterprise Miner. Matthew Stainer Business Analytics Consultant SAS Analytics & Innovation practice

Deep Dive into High Performance Machine Learning Procedures. Tuba Islam, Analytics CoE, SAS UK

PROVEN PRACTICES FOR PREDICTIVE MODELING

Brian Macdonald Big Data & Analytics Specialist - Oracle

Building the In-Demand Skills for Analytics and Data Science Course Outline

Sascha Schubert Product Manager Data Mining SAS EMEA Copyright 2005, SAS Institute Inc. All rights reserved.

Predictive Modeling using SAS. Principles and Best Practices CAROLYN OLSEN & DANIEL FUHRMANN

From Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques. Full book available for purchase here.

Salford Predictive Modeler. Powerful machine learning software for developing predictive, descriptive, and analytical models.

SPM 8.2. Salford Predictive Modeler

DATA SCIENCE: HYPE AND REALITY PATRICK HALL

USING R IN SAS ENTERPRISE MINER EDMONTON USER GROUP

SAS Machine Learning and other Analytics: Trends and Roadmap. Sascha Schubert Sberbank 8 Sep 2017

IBM SPSS Modeler Personal

IBM SPSS Modeler Personal

SAS Enterprise Miner 5.3 for Desktop

Applying Regression Techniques For Predictive Analytics Paviya George Chemparathy

IBM SPSS & Apache Spark

Approaching an Analytical Project. Tuba Islam, Analytics CoE, SAS UK

From Profit Driven Business Analytics. Full book available for purchase here.

Predictive Analytics

The real life of Data Scientists. or why we created the Data Science Studio

WELCOME TO SAS FOR MARKETING

Preface to the third edition Preface to the first edition Acknowledgments

Effective CRM Using. Predictive Analytics. Antonios Chorianopoulos

OILFIELD ANALYTICS: OPTIMIZE EXPLORATION AND PRODUCTION WITH DATA-DRIVEN MODELS

USING SAS HIGH PERFORMANCE STATISTICS FOR PREDICTIVE MODELLING

Data Science Training Course

Ensemble Modeling. Toronto Data Mining Forum November 2017 Helen Ngo

New Features in Enterprise Miner

Data Analytics with MATLAB Adam Filion Application Engineer MathWorks

How to build and deploy machine learning projects

Achieve Better Insight and Prediction with Data Mining

Customer Relationship Management in marketing programs: A machine learning approach for decision. Fernanda Alcantara

1/27 MLR & KNN M48, MLR and KNN, More Simple Generalities Handout, KJ Ch5&Sec. 6.2&7.4, JWHT Sec. 3.5&6.1, HTF Sec. 2.3

Machine Learning 101

DATA ANALYTICS WITH R, EXCEL & TABLEAU

2015 The MathWorks, Inc. 1

GSAW 2018 Machine Learning

Professor Dr. Gholamreza Nakhaeizadeh. Professor Dr. Gholamreza Nakhaeizadeh

Analytics for Banks. September 19, 2017

Data Analytics for Semiconductor Manufacturing The MathWorks, Inc. 1

SAS Business Knowledge Series

SAP Predictive Analytics Suite

Predictive Analytics Using Support Vector Machine

Data Science in a pricing process

A Study of Financial Distress Prediction based on Discernibility Matrix and ANN Xin-Zhong BAO 1,a,*, Xiu-Zhuan MENG 1, Hong-Yu FU 1

SmartCare. SPSS Workshop. Rick Durham - North American Advanced Analytics Channel Team IBM Corporation. Date: 5/28/2014

Data mining and Renewable energy. Cindi Thompson

Machine Learning Based Prescriptive Analytics for Data Center Networks Hariharan Krishnaswamy DELL

KnowledgeSTUDIO. Advanced Modeling for Better Decisions. Data Preparation, Data Profiling and Exploration

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other

Enabling Foresight. Skills for Predictive Analytics

Big Data Analytics met Hadoop

Predictive Modelling for Customer Targeting A Banking Example

Data Science End to End

26wcas 10 th CarLab Advisory Board Meeting after the storm

DASI: Analytics in Practice and Academic Analytics Preparation

KnowledgeSEEKER POWERFUL SEGMENTATION, STRATEGY DESIGN AND VISUALIZATION SOFTWARE

Data Mining Applications with R

2016 INFORMS International The Analytics Tool Kit: A Case Study with JMP Pro

Business Analytics & Data Mining Modeling Using R Dr. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee

Neural Connection s four powerful neural networks give you better performing models

BIG DATA SKILLS: CHALLENGES FOR THE UNIVERSITY WORLD CREATING A NEW GENERATION OF DATA SCIENTISTS. Massimiliano Marcellino Bocconi University

MATLAB and the Internet of Things (IoT): Collecting and Analysing IoT Data Gabriele Bunkheila Technical Marketing, MathWorks

Model-Based Fiber Network Expansion Using SAS Enterprise Miner and SAS Visual Analytics

CSE 255 Lecture 3. Data Mining and Predictive Analytics. Supervised learning Classification

Copyright 2013, SAS Institute Inc. All rights reserved.

GOVERNMENT ANALYTICS LEADERSHIP FORUM SAS Canada & The Institute of Public Administration of Canada. April 26 The Shaw Centre

SAP Predictive Analytics Hands-On. Andreas Forster December 2015

In silico prediction of novel therapeutic targets using gene disease association data

Churn Prediction with Support Vector Machine

Analytics in Action transforming the way we use and consume information

Predicting Reddit Post Popularity Via Initial Commentary by Andrei Terentiev and Alanna Tempest

W H I T E P A P E R. How to Credit Score with Predictive Analytics

POST GRADUATE PROGRAM IN DATA SCIENCE & MACHINE LEARNING (PGPDM)

Azure ML Studio. Overview for Data Engineers & Data Scientists

Knowledge Solution for Credit Scoring

Integrated Predictive Maintenance Platform Reduces Unscheduled Downtime and Improves Asset Utilization

SAS BIG DATA ANALYTICS INCREASING YOUR COMPETITIVE EDGE

Who Is Likely to Succeed: Predictive Modeling of the Journey from H-1B to Permanent US Work Visa

Using SAS Enterprise Guide, SAS Enterprise Miner, and SAS Marketing Automation to Make a Collection Campaign Smarter


IBM SPSS Statistics. Editions. Get the analytical power you need for better decision making. Why use IBM SPSS Statistics? IBM Analytics Solution Brief

Data Mining and Applications in Genomics

Sales prediction in online banking

Ask the Expert Model Selection Techniques in SAS Enterprise Guide and SAS Enterprise Miner

STATE OF THE ART ANALYTICS

One Year Executive Program in Applied Business Analytics

SAP Predictive Analysis

VDMML Enablement Session Data Science Jam Sessions

New restaurants fail at a surprisingly

Predictive Modeling Using SAS Visual Statistics: Beyond the Prediction

Predicting Corporate Influence Cascades In Health Care Communities

When to Book: Predicting Flight Pricing

Microsoft Developer Day

A Decision Making Model for Human Resource Management in Organizations using Data Mining and Predictive Analytics

Business Analytics using R

DATA MINING AND BUSINESS ANALYTICS WITH R

Evaluation of Machine Learning Algorithms for Satellite Operations Support

Transcription:

ENTERPRISE MINER: ANALYTICAL MODEL DEVELOPMENT

ANALYTICAL MODEL DEVELOPMENT AGENDA Enterprise Miner: Analytical Model Development The session looks at: - Supervised and Unsupervised Modelling - Classification and Prediction Techniques - Tree Based Learners

ROLE OF SAS ENTERPRISE MINER

THE ANALYTICS LIFECYCLE PREDICTIVE ANALYTICS AND DATA MINING BUSINESS MANAGER Domain Expert Makes Decisions Evaluates Processes and ROI EVALUATE / MONITOR RESULTS IDENTIFY / FORMULATE PROBLEM DATA PREPARATION BUSINESS ANALYST Data Exploration Data Visualization Report Creation DEPLOY MODEL DATA EXPLORATION IT SYSTEMS / MANAGEMENT Model Validation Model Deployment Model Monitoring Data Preparation VALIDATE MODEL BUILD MODEL TRANSFORM & SELECT DATA MINER / STATISTICIAN Exploratory Analysis Descriptive Segmentation Predictive Modeling

THE ANALYTICS LIFECYCLE PREDICTIVE ANALYTICS AND DATA MINING BUSINESS MANAGER Domain Expert Makes Decisions Evaluates Processes and ROI EVALUATE / MONITOR RESULTS IDENTIFY / FORMULATE PROBLEM DATA PREPARATION BUSINESS ANALYST Data Exploration Data Visualization Report Creation DEPLOY MODEL DATA EXPLORATION IT SYSTEMS / MANAGEMENT Model Validation Model Deployment Model Monitoring Data Preparation VALIDATE MODEL BUILD MODEL TRANSFORM & SELECT DATA MINER / STATISTICIAN Exploratory Analysis Descriptive Segmentation Predictive Modeling

THE ANALYTICS LIFECYCLE PREDICTIVE ANALYTICS AND DATA MINING BUSINESS MANAGER Domain Expert Makes Decisions Evaluates Processes and ROI EVALUATE / MONITOR RESULTS IDENTIFY / FORMULATE PROBLEM DATA PREPARATION BUSINESS ANALYST Data Exploration Data Visualization Report Creation DEPLOY MODEL DATA EXPLORATION IT SYSTEMS / MANAGEMENT Model Validation Model Deployment Model Monitoring Data Preparation VALIDATE MODEL BUILD MODEL TRANSFORM & SELECT DATA MINER / STATISTICIAN Exploratory Analysis Descriptive Segmentation Predictive Modeling

SAS ENTERPRISE MINER SAS ENTERPRISE MINER Modern, collaborative, easy-to-use data mining workbench Sophisticated set of data preparation and exploration tools Modern suite of modeling techniques and methods Interactive model comparison, testing and validation Automated scoring process delivers faster results Open, extensible design for ultimate flexibility

SAS ENTERPRISE MINER SAS ENTERPRISE MINER MODEL DEVELOPMENT PROCESS Sample Explore Modify Model Assess

SAS ENTERPRISE MINER Utility SAS ENTERPRISE MINER MODEL DEVELOPMENT PROCESS Apps. Time Series HPDM Credit Scoring

SAS ENTERPRISE MINER SAS ENTERPRISE MINER SEMMA IN ACTION REPEATABLE PROCESS

SUPERVISED AND UNSUPERVISED MODELLING

SUPERVISED AND UNSUPERVISED MODELLING Machine Learning TAXONOMY Supervised Unsupervised Classification Prediction Clustering Affinity Analysis

SUPERVISED AND UNSUPERVISED MODELLING LEARNING METHODS Supervised: Discover patterns in the data that relate attributes to labels. Patterns are used to predict the values of the label in future data instances. Unsupervised: The data have no label attribute. Goal is to explore the data to find some intrinsic structures in them.

SUPERVISED AND UNSUPERVISED MODELLING SUPERVISED LEARNING (CLASSIFICATION & PREDICTION) Logistic Regression Decision Trees, CART Decision Trees, CHAID Gradient Boosting Random Forests kth Nearest Neighbor Neural Networks Nonlinear SVMs Regression, least square Generalized Linear Models LASSO, LAR Splines, MARS

SUPERVISED AND UNSUPERVISED MODELLING K-means Fuzzy K-means UNSUPERVISED LEARNING Multidimensional Scaling Principal Components Assocations, Apriori Hierarchical Clustering Nonnegative Matrix Factorization Vector Quantization

CLASSIFICATION AND PREDICTION TECHNIQUES

CLASSIFICATION AND PREDICTION TECHNIQUES SAS ENTERPRISE MINER SEMMA PROCESS Sample Explore Modify Model HPDM

CLASSIFICATION AND PREDICTION TECHNIQUES Regression MODELING ALGORITHMS Linear Logistic Computes a forward stepwise least-squares regression Optionally computes all 2-way interactions of classification variables Optionally uses AOV16 variables to identify non-linear relationships between interval variables and the target variable. Optionally uses group variables to reduce the number of levels of classification variables.

CLASSIFICATION AND PREDICTION TECHNIQUES Generalized Linear Models MODELING ALGORITHMS Uses the high-performance HPGENSELECT procedure to fit a generalized linear model in a threaded or distributed computing environment. Several response probability distributions and link functions are available. Provides model selection methods.

CLASSIFICATION AND PREDICTION TECHNIQUES Neural Networks MODELING ALGORITHMS x1 x2 x3 h 1 h 2 y Non-linear relationship between inputs and output Prediction more important than ease of explaining model Requires a lot of training data

CLASSIFICATION AND PREDICTION TECHNIQUES Support Vector Machines MODELING ALGORITHMS Enables the creation of linear and non-linear support vector machine models. Constructs separating hyperplanes that maximize the margin between two classes. Enables the use a variety of kernels: linear, polynomial, radial basis function, and sigmoid function. The node also provides Interior point and active set optimization methods.

CLASSIFICATION AND PREDICTION TECHNIQUES Ensemble MODELING ALGORITHMS Creates new models by combining the posterior probabilities (for class targets) or the predicted values (for interval targets) from multiple predecessor models. 3 Methods Average Maximum Voting

CLASSIFICATION AND PREDICTION TECHNIQUES Model Import MODEL IMPORT Reads all model details from Metadata Repository Applies models to new data and generates all fit statistics Compatible with model selection tools Useful for sharing models with other users Useful testing old models with updated data Importing already scored records/cases Importing registered SAS Model Package Importing SAS Score Code

TREE BASED LEARNERS

TREE BASED LEARNERS SAS EM TREE ALGORITHMS 3 key tree based learning algorithms: 1. Decision Trees 2. Gradient Boosting 3. Random Forests

DECISION TREES

Decision Tree TREE BASED LEARNERS MODELING ALGORITHMS Classify observations based on the values of nominal, binary, or ordinal targets Predict outcomes for interval targets Easy to interpret Interactive Trees available CART, CHAID, C4.5 approximate

GRADIENT BOOSTING

TREE BASED LEARNERS Gradient Boosting MODELING ALGORITHMS Sequential ensemble of many trees Extremely good predictions Very effective at variable selection

TREE BASED LEARNERS GRADIENT BOOSTING Approach that resamples the analysis data set several times to generate results that form a weighted average of the re-sampled data set. Tree boosting creates a series of decision trees which together form a single predictive model. A tree in the series is fit to the residual of the prediction from the earlier trees in the series. The residual is defined in terms of the derivative of a loss function. The successive samples are adjusted to accommodate previously computed inaccuracies.

RANDOM FORESTS

TREE BASED LEARNERS RANDOM FOREST NODE What is a Random Forest?

TREE BASED LEARNERS HPFOREST HP node provides increased processing speed Random Forest ensemble methodology Samples without replacement Random selection of variables for each tree Uses measures of association to select variable Creates a prediction that is aggregated across the value in the leaf of each tree

TREE DEMONSTRATION

SUMMARY

DATA EXPLORATION AND VISUALISATION SUMMARY EM supports a variety of both supervised and unsupervised modelling algorithms Linear / Non-Linear modelling Benefits from Tree based learning algorithms include: Interoperability Model performance Outliers/ Missing Values

QUESTIONS AND ANSWERS www.sas.com