USING R IN SAS ENTERPRISE MINER
EDMONTON USER GROUP
INTRODUCTION
PAT VALENTE, MA
Solution Specialist, Data Sciences at SAS
- Training in economics and statistics.
- 20 years of experience in business areas including finance, marketing, and logistics.
- Well versed in the analytics and data challenges that exist throughout large organizations.
pat.valente@sas.com
AGENDA
SAS AND OPEN SOURCE
- Open source analytics in business
- Open Source Integration node
- Output modes
- Workflow examples to incorporate R models
- Careful considerations
- Questions
OPEN SOURCE INTEGRATION
THIS IS ACHIEVED WITH SAS ANALYTICS IN ACTION
SAS Analytics in Action combines three stages:
- Data: gathering data from the different sources and locations, unifying it, and making it ready for modeling.
- Discovery: having the flexibility to prototype analytical models to uncover business value.
- Deployment: engineering enterprise-level solutions from those prototypes, with governance measures to ensure quality.
OPEN SOURCE INTEGRATION
SAS DOES IT BY INTEGRATING AND EXTENDING IT
- Integrate: where do we integrate?
- Extend: where do we extend?
USING R IN SAS ENTERPRISE MINER
THE OPEN SOURCE INTEGRATION NODE
- Enables the execution of R code within an Enterprise Miner workflow.
- Transfers data, metadata, and results automatically between Enterprise Miner and R.
USING R IN SAS ENTERPRISE MINER
THE OPEN SOURCE INTEGRATION NODE
- Facilitates multitasking in R.
- Generates text and graphical output from R.
- Integrates both supervised and unsupervised learning tasks.
USING R IN SAS ENTERPRISE MINER
PMML OUTPUT
Predictive Model Markup Language (PMML) is an open standard that enables certain R models to be translated into SAS DATA step code.
Currently supported R models include:
- Linear models (lm)
- Multinomial log-linear models (multinom (nnet))
- Generalized linear models (glm (stats))
- Decision trees (rpart)
- Neural networks (nnet)
- k-means clustering (kmeans (stats))
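To make "translated into SAS DATA step code" concrete, the sketch below renders a fitted linear model as a single assignment statement of the kind a DATA step could run. It is an illustration of the idea only, written in Python, and is not how Enterprise Miner or a PMML converter actually works. The function name, the sample coefficients, and the `P_` prefix on the output variable are all assumptions made for this example.

```python
def linear_model_to_data_step(intercept, coefficients, target):
    """Render a linear model as one DATA step style assignment (hypothetical helper)."""
    # Start with the intercept, then add one "(beta * variable)" term per input.
    terms = [repr(float(intercept))]
    for name, beta in coefficients.items():
        terms.append(f"({float(beta)!r} * {name})")
    # The P_ prefix for the predicted value is an assumed naming choice.
    return f"P_{target} = " + " + ".join(terms) + ";"


# Made-up coefficients, as if read from a fitted lm() object.
statement = linear_model_to_data_step(1.5, {"age": 0.25, "income": -0.001}, "spend")
print(statement)  # P_spend = 1.5 + (0.25 * age) + (-0.001 * income);
```

The point of the sketch is that a supported model reduces to plain arithmetic on input columns, which is why it can be scored without R being present.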
USING R IN SAS ENTERPRISE MINER PMML MODE
USING R IN SAS ENTERPRISE MINER
MERGE OUTPUT MODE
- Merge output mode enables integration with thousands of R packages that are not supported in PMML output mode.
- Variables created in R are merged with SAS Enterprise Miner data sources by the user.
- SAS DATA step code is not created.
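The merge step described above amounts to joining new columns onto the existing rows. The sketch below shows a generic left join in Python under stated assumptions: it is not the node's own mechanism, and the key column `id`, the column `r_score`, and the sample rows are hypothetical.

```python
def merge_on_key(base_rows, new_rows, key):
    """Left-join new_rows onto base_rows by key; unmatched rows get None."""
    # Every non-key column that appears in the new data becomes an output column.
    new_columns = sorted({col for row in new_rows for col in row if col != key})
    lookup = {row[key]: row for row in new_rows}
    merged = []
    for row in base_rows:
        match = lookup.get(row[key], {})
        out = dict(row)
        for col in new_columns:
            out[col] = match.get(col)  # None when R produced no value for this row
        merged.append(out)
    return merged


# Hypothetical data: rows held by Enterprise Miner and scores produced in R.
em_data = [{"id": 1, "age": 34}, {"id": 2, "age": 51}, {"id": 3, "age": 27}]
r_output = [{"id": 1, "r_score": 0.82}, {"id": 3, "r_score": 0.17}]
merged = merge_on_key(em_data, r_output, "id")
```

Keeping every base row, even when R returned nothing for it, makes dropped observations visible as missing values rather than silently shrinking the data.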
USING R IN SAS ENTERPRISE MINER MERGE MODE
USING R IN SAS ENTERPRISE MINER
SOME PRECAUTIONS
Some items to consider when running R models in the Open Source Integration node:
- Missing values may be an issue.
- Ensure categorical variables are not high in cardinality.
- Memory can become a constraint.
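The first two precautions can be checked before any data reaches R. The sketch below is one possible pre-flight profile, written in Python for illustration: it counts missing values per column and flags text columns with more distinct levels than a threshold. The function name, the `max_levels` threshold, and the sample rows are assumptions, and `None` stands in for a missing value.

```python
def profile_columns(rows, max_levels=20):
    """Count missing values and distinct text levels for each column."""
    columns = sorted({col for row in rows for col in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        missing = sum(1 for v in values if v is None)
        # Only string values are treated as categorical levels in this sketch.
        levels = {v for v in values if isinstance(v, str)}
        report[col] = {
            "missing": missing,
            "levels": len(levels),
            "high_cardinality": len(levels) > max_levels,
        }
    return report


# Hypothetical input rows; the threshold is set low to trigger the flag.
rows = [
    {"age": 34, "city": "Edmonton"},
    {"age": None, "city": "Calgary"},
    {"age": 27, "city": "Red Deer"},
    {"age": 45, "city": "Edmonton"},
]
report = profile_columns(rows, max_levels=2)
```

A column flagged here is a candidate for imputation or level grouping in upstream nodes before the R model is fitted.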
USE SAS TO INTEGRATE R
INTEGRATE
Diagram labels: R models, SAS models
Why?
- Model comparison
- Leverage R for new algorithms
- Ensemble modelling
- Generate score code
- Deploy R models
Copyright 2016, SAS Institute Inc. All rights reserved.
WHY BRING OPEN SOURCE TO SAS?
EXTEND
- Model comparisons
QUESTIONS sas.com
USING R IN SAS ENTERPRISE MINER
SUMMARY OF BENEFITS
Model building in SAS Enterprise Miner
- Use the latest R packages for model building and comparison.
Multi-threaded processing of workflows
- SAS Enterprise Miner handles multi-threaded execution.
- Use the Open Source Integration node in several flows simultaneously.
Collaboration
- Many users can access the same Enterprise Miner diagram.
Reusable data processing and pre-analysis
- Use Enterprise Miner functionality in the nodes that precede R models (e.g. data preparation, pre-processing).
Scoring
- Create supported models in R that can be converted into scoring code for operational deployment (e.g. in-database).