SAP Predictive Analytics Suite
Tania Pérez Asensio
Where is the Evolution of Business Analytics Heading?
  Organizations Are Maturing Their Approaches to Solving Business Problems
    Reactive: Wait until a problem is obvious and then solve it without addressing the underlying causes.
    Visualize: Determine warning signs and KPIs that can signal areas to investigate.
    Hypothesize: Continuously observe the situation and react when pre-determined criteria are met.
    Predict: Apply advanced analytics to operational and business data to identify issues, root causes, and potential solutions.
  Companies are moving from a reactive to a proactive approach to problem solving.
    Improve the Business: How can I improve my service quality while reducing costs?
    Improve Customer Satisfaction: How can I choose the best course of action at the right time?
    Sense and Respond Quicker: How can I prioritize costly activities while reducing operational risks?
Transforming Enterprise Data Into Business Value
  Machines Aiding Humans in Decision Making
  Traditional Analytics (BI) Approach
    Steps: Aggregate, Visualize
    Users manually analyze aggregated data with visualization tools and then must choose the best course of action without additional help from the system.
  Predictive Analytics / Machine Learning Approach
    Steps: Prepare Data, Train Model, Apply Model, Monitor
    Machines automatically detect conditions through continuous analysis and can prescribe contextually relevant actions directly into applications and processes.
SAP's Approach to Solving Problems with Predictive Analytics
  An End-to-End Platform to Meet the Needs of All Stakeholders
  01 Managers (Executive Sponsors)
    Driver of the business & its KPIs
    Defines strategy, problems, success metrics
    Evaluating longer-term solution viability
  02 Business Analysts & Data Scientists (Producers)
    Domain experts focusing on solving business problems
    Target users: looking for solutions to help with the problem
    Evaluating automation to help scale to more problems
  03 Data / IT Administrators (Enablers)
    Controls for access, integrations, and enterprise landscape
    Looking to link solutions together to meet user requirements
    Evaluating enterprise-wide applicability
  04 Business Users (Consumers)
    Downstream consumers of business insights for decision-making
    Expect insights directly in existing apps/workflows
    Evaluating ease of consumption & integration of final results
Various components of SAP Predictive Analytics
  SAP Predictive Analytics 3.2: Data Manager, Expert Analytics, Automated Analytics, Predictive Factory
  Predictive Service: SDK / API
  Diagram labels: Business Problem; Ad-Hoc Developments & Embed ML assets
SAP Predictive Analytics Core workflows
  Prepare Data with Data Manager
    Build analytical datasets with clicks, not code
    Create thousands of derived features to increase predictive accuracy
    Automate dataset production & create reusable transformations
    Identify which variables are changing over time with timestamped populations
    Automatically generate reusable SQL code with associated documentation
  Build robust Predictive Models quickly with Automated Modeler
    Automate Predictive Modelling with Classification, Regression, Clustering, Time Series Forecast, Association Rules
    Automatically identify key contributing variables on very wide datasets
    Automate executive and operational reports
    In-Database Execution: Automated Predictive Library (APL) on SAP HANA & Native Spark Modelling on Hadoop
  Build complex Predictive Pipelines with Expert Analytics
    Easy to Use: drag-and-drop data selection, preparation and predictive modelling
    Use the predictive models in SAP HANA such as Unified Demand Forecast (UDF), Predictive Analytics Library (PAL) & APL
    Leverage 8000+ existing R functions and libraries
    Embed the models in external SAP applications
SAP Predictive Analytics Core workflows
  Link Analysis & Recommendation
    Extract variables for enhanced link analysis and prediction
    Identify communities amongst your customers
    Find influencers within communities to focus efforts where they count the most
    Create personalized recommendations for each visitor
  Scoring
    In-database scoring using SQL
    Interface with business applications using scoring equations and code: SQL, Java, PMML, SAS, C, C++
    Real Time Scoring on SAP HANA and Spark Streaming environments
  Operationalization with Predictive Factory
    Manage the lifecycle of thousands of models in parallel, whatever their origin (Automated Modeler & Expert Analytics)
    Schedule automated application of models to new data
    Detect data deviation & retrain the model automatically when required
    Event and time based scheduling
    Segmented Time Series Modelling
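To illustrate what "in-database scoring using SQL" with an exported scoring equation can look like, here is a minimal Python sketch that renders a logistic scoring equation as a SQL statement. It is not the code that SAP Predictive Analytics generates: the function name `scoring_sql`, the column names, the `customer_id` key, and the coefficients are all hypothetical, and a logistic form is assumed only for illustration.

```python
def scoring_sql(intercept, weights, table):
    """Render a logistic scoring equation as a SQL SELECT (illustrative only).

    intercept and weights are hypothetical model coefficients;
    table and the column names in weights are hypothetical as well.
    """
    # One "(weight * column)" term per input variable.
    terms = " + ".join(f"({w!r} * {col})" for col, w in weights.items())
    linear = f"{intercept!r} + {terms}"
    # Logistic link: score = 1 / (1 + exp(-linear)).
    return (
        f"SELECT customer_id, 1.0 / (1.0 + EXP(-({linear}))) AS score "
        f"FROM {table}"
    )


if __name__ == "__main__":
    print(scoring_sql(-1.5, {"age": 0.02, "tenure": -0.1}, "customers"))
```

Because the whole equation lives in one SQL expression, the database can score rows where the data already sits, which is the point of the in-database option on the slide.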
SAP AUTOMATED ALGORITHMS: A FOUNDATION FOR AUTOMATED MODELER
  The automated algorithm is not only a powerful mathematical algorithm; it also automates:
  Variable Selection
  Data Preparation
    Variable Encoding
    Missing Value Handling
    Outlier Handling
    Binning and banding
  Regression/Classification
    Structural Risk Minimization (SRM)
    Vapnik Theory
    Non-parametric approach
  Model Testing and best model selection
    Optimal balance between a simple (under-trained) model and a complex (over-trained) model
  https://en.wikipedia.org/wiki/Vapnik%E2%80%93Chervonenkis_theory
Performance Indicator: Predictive Power (KI)
  Description: Quality indicator, estimator of the training error. How close is the model to the perfect model?
  Calculation:
    $$\mathrm{KI} = \frac{\text{area of model performance on validation data (blue bump)}}{\text{area of wizard model performance (green triangle)}}$$
  Range: 0 to 1
  Interpretation: Higher values indicate higher quality. KI > 0 is better than a random model.
  To improve: Add predictors.
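The ratio above can be sketched in a few lines of Python. This is an illustrative reading of the slide, not SAP's implementation: it assumes that "model performance" is a cumulative gain curve (share of positive cases captured when rows are ranked by descending score), that areas are measured above the random diagonal, and that the wizard model is a perfect ranking. The function names are hypothetical, and tied scores are ordered arbitrarily.

```python
def gain_curve(scores, labels):
    """Share of positives captured after each row, ranked by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(labels)
    captured, curve = 0, [0.0]
    for i in order:
        captured += labels[i]
        curve.append(captured / total_pos)
    return curve


def area_above_random(curve):
    """Trapezoid area under the gain curve on [0, 1], minus the random diagonal (0.5)."""
    n = len(curve) - 1
    under = sum((curve[k] + curve[k + 1]) / 2 for k in range(n)) / n
    return under - 0.5


def predictive_power(scores, labels):
    """KI sketch: model area divided by wizard (perfect ranking) area."""
    model = area_above_random(gain_curve(scores, labels))
    wizard = area_above_random(gain_curve(labels, labels))  # positives ranked first
    return model / wizard


if __name__ == "__main__":
    # Hypothetical validation scores and 0/1 targets.
    print(predictive_power([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

Under these assumptions a perfect ranking gives 1, and a ranking no better than random gives a value near 0, which matches the interpretation on the slide.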
Performance Indicator: Predictive Confidence (KR)
  Description: Robustness indicator, estimator of the generalization error. How similar is performance on estimation and validation data?
  Calculation:
    $$\mathrm{KR} = 1 - \frac{\text{area between model performance on estimation and validation data}}{\text{area of wizard model performance (green triangle)}}$$
  Range: 0 to 1
  Interpretation: Higher values indicate higher robustness. KR > 0.95 is considered a robust model.
  To improve: Add rows of data with positive cases.
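A minimal Python sketch of the KR calculation follows. It is one reading of the slide, not SAP's implementation, and it makes several assumptions: performance is a cumulative gain curve, the two curves are compared on a common grid by linear interpolation, and the wizard area is taken from the validation data. All function names and the `grid` parameter are hypothetical.

```python
def gain_curve(scores, labels):
    """Share of positives captured after each row, ranked by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(labels)
    captured, curve = 0, [0.0]
    for i in order:
        captured += labels[i]
        curve.append(captured / total_pos)
    return curve


def curve_at(curve, x):
    """Linearly interpolate a gain curve (points evenly spaced on [0, 1]) at x."""
    n = len(curve) - 1
    pos = x * n
    k = min(int(pos), n - 1)
    return curve[k] + (pos - k) * (curve[k + 1] - curve[k])


def trapezoid(values, grid):
    """Trapezoid rule for values sampled at grid + 1 evenly spaced points on [0, 1]."""
    return sum((values[k] + values[k + 1]) / 2 for k in range(grid)) / grid


def predictive_confidence(est_scores, est_labels, val_scores, val_labels, grid=1000):
    """KR sketch: 1 minus (area between estimation and validation curves) / wizard area."""
    est = gain_curve(est_scores, est_labels)
    val = gain_curve(val_scores, val_labels)
    wiz = gain_curve(val_labels, val_labels)  # perfect ranking (assumed wizard model)
    xs = [k / grid for k in range(grid + 1)]
    gap = [abs(curve_at(est, x) - curve_at(val, x)) for x in xs]
    wizard_area = trapezoid([curve_at(wiz, x) for x in xs], grid) - 0.5
    return 1.0 - trapezoid(gap, grid) / wizard_area


if __name__ == "__main__":
    # Hypothetical estimation and validation samples.
    print(predictive_confidence([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0],
                                [0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

When the estimation and validation curves coincide, the gap area is zero and the sketch returns 1; the further apart the curves, the lower the value.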
SAP Predictive Analytics: Automated Analytics
  Traditional approach: 3 months
    Steps: Data aggregation, Sampling, Predictive model creation, Testing, Data preprocessing, Interpretation, Application to business
  Automated and simplified by SAP Predictive Analytics: 1 week
    Diagram labels: Automated; Optimal model selected automatically; Simple GUI; Simplified; Application to business
  Real Life Example: 1 person x 7 days = 400 models vs. 6 people x 8 weeks = 20 models
Flexibility and power of Expert Analytics
  Rich Pre-built Modelling Functionality
    Classification, Regression, Anomaly Detection, Association Rules, Clustering, Time Series Analysis
    Data preparation functions
  Advanced Visualization
    Direct access to Advanced Visualizations
    Superset solution includes SAP Lumira library
    Stunning visualizations
  Ease of Use
    Drag & drop data selection, preparation, processing
    Easy sharing / collaboration of findings
    Built for business analysts
  Reusable models
    In-HANA models
    With R language
    Share with colleagues
    Use in external applications
  Extensions using R scripts
    Native installer included
    ~12 R algorithms included
    5000+ R Model library and growing
    Custom R components
    Easily share custom R components
  Integration
    Native integration with SAP HANA (PAL & APL)
    Analyse data from Universes and BW
    Publish actionable results to mobile & BI clients
SAP HANA as a machine learning platform
  Data Preparation: Binning, Filter, Normalisation, Partition, Sample, Scaling Range
  Sentiment Analysis
  Classification: C4.5, CHAID, ABC Analysis, Auto classification, KNN, Naïve Bayes, Support Vector Machine, Weighted Score Analysis
  Regression: Auto Regression, Exponential Regression, Geometric Regression, Logarithmic Regression, Logistic Regression, Multiple Linear Regression, Polynomial Regression
  Clustering: Auto clustering, DBSCAN, K-Means, K-Medoids, Self-Organizing Maps, Hierarchical clustering
  Time series algorithms: Single Exponential Smoothing, Double Exponential Smoothing, Triple Exponential Smoothing, ARIMA, Demand Forecasting
  Association: Apriori, FP-Growth
  Outlier Detection: Anomaly Detection, Inter-quartile Range test, Variance Test
  Model Performance Comparison: Model Compare, Model Statistics
  Optimizations
Predictive Factory
  Full predictive lifecycle: data preparation, model building/rebuilding, model evaluation, model deployment and monitoring, versioning
  Modeling automation
    Ability to automate models for multiple segments by creating the first one and letting Predictive Analytics complete the task
    Manage and monitor all of your Automated and Expert Models from a single interface
    No Coding, Just Configuration!
    Multi-user collaborative experience
  Designed for operations
    Notifications, alerts
    Historical reports on model performance
  IT-Governed
    Enterprise-grade platform meets IT needs for governance, ease of use, security and complex deployment
    Single install and configuration
SAP Predictive Factory tasks
  Secure model performance & accuracy
  A combination of SAP Predictive Factory tasks automatically retrains a propensity score and applies it to a new customer dataset, in order to target the best customers and improve the ROI of the marketing action.
  Schedule: every 15th of the month at 1:00am
  Branches: if there is NO deviation / if there are deviations
  No coding!
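Predictive Factory configures this chain without code, but the control flow it describes can be sketched in Python for clarity. This is an assumption-laden illustration, not a Predictive Factory API: the function names and the callables passed in are hypothetical, and it assumes that a detected deviation triggers retraining before the model is applied, while no deviation means the current model is applied directly.

```python
from datetime import datetime


def is_scheduled_run(now: datetime) -> bool:
    """True on the 15th of the month at 1:00am (the schedule on the slide)."""
    return now.day == 15 and now.hour == 1 and now.minute == 0


def run_propensity_job(model, new_customers, has_deviation, retrain, apply_model):
    """Run the task chain once; all callables are hypothetical stand-ins.

    Returns the list of executed steps and the resulting scores.
    """
    steps = []
    if has_deviation(model, new_customers):
        # Assumed branch: deviations found, so retrain before scoring.
        model = retrain(model)
        steps.append("retrain")
    # Both branches end by applying the (possibly retrained) model.
    scores = apply_model(model, new_customers)
    steps.append("apply")
    return steps, scores


if __name__ == "__main__":
    if is_scheduled_run(datetime(2018, 3, 15, 1, 0)):
        steps, scores = run_propensity_job(
            model={"version": 1},
            new_customers=[{"id": 1}, {"id": 2}],
            has_deviation=lambda m, data: True,
            retrain=lambda m: {"version": m["version"] + 1},
            apply_model=lambda m, data: [m["version"] for _ in data],
        )
        print(steps, scores)
```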
Data Science & Machine Learning Portfolio
  Audiences: Data Scientists & Citizen Data Scientists; Line of Business User; Developers and Data Scientists
  SAP Analytics Cloud
  SAP Predictive Analytics: Data Manager, Automated Modeler, Expert Modeler (Visual Composition Framework), Predictive Factory
  SAP Fraud Management
  SAP Applications
  On Premise: DB2, Oracle, Teradata, etc.; Hadoop / Spark; Vora
  HANA Predictive & Machine Learning: Spatial, Text Analytics, Streaming Analytics, Graph, Series Data
  SAP Cloud Platform: HANA DBaaS Predictive (PAL/APL); Big Data Services; Leonardo Machine Learning Foundation (Functional services, Business services); DB, Hadoop, SAP HANA
SAP Predictive Analytics and Big Data
  1. Support for end-to-end operational predictive lifecycle on Hadoop
  2. Business Analyst Friendly
    No coding required with Automated Analytics
  3. Data Scientist Friendly
    Hive/Vora Connectivity
  4. Spark Specific
    Push the data-intensive modeling workload to Native Spark
    SQL for Analytical Dataset definition
    Real Time Scoring via Spark Streaming API
  Architecture diagram labels:
    Predictive Analytics: Data Manager; Modeler (Training); Scorer; Model Lifecycle Manager (Factory)
    Advanced Analytics Execution Layer: Native Spark; In-DB scoring (Spark / Hive QL); Spark Streaming (Java Export)
    Analytics Dataset Definition Layer: Hive (SQL); Spark SQL; SAP Vora
    Hadoop Cluster: HDFS
Big Data in SAP Predictive Analytics End-to-End
Predictive Flows using Spark MLlib
  Planned Innovation
  Data Scientists can build Expert Models on Hadoop using the Spark ML library (like R)
  Complex pipelines can be built and executed on Spark
  Predictive models will be stored and managed within Hadoop
  Many open source tools exist today, but end-to-end operationalization will be our key differentiator
SAP Predictive with HANA Vora: Training on Vora & Scoring on HANA
  SAP Predictive Analytics
    Scoring in HANA using IDBA or SQL with PAL/APL, using HANA connectivity
    Native Spark Modeling for training, using SparkSQL connectivity
  Architecture diagram labels:
    SAP HANA Platform: In-Memory Store, Application Services, Processing Services, Database Services, Integration Services
    HANA Smart Data Access, UDFs, Others
    Spark Data-source API enhancement
    Vora and Spark nodes on YARN; files on HDFS
  *Note: Currently not feasible due to technical limitations in connecting to Vora.
Predictive Analytics in SCP Big Data Services
  Benefits: Fast time to value; Easier, faster scalability; Operations support; Lower TCO
  On Premise (Client / Server Mode): PA Client with Data Manager, Modeler, Scorer, Predictive Factory
  Cloud (SAP Cloud Platform Big Data Services): PA Server, Spark Executor, Native Spark Modeling, Workbench
  Automated Machine learning
SAP Predictive Factory Vision
  Machine Learning for operations through a collaborative enterprise solution
  Machine Learning for Operations
    Model deployment & lifecycle management is the pillar of Predictive Factory workflows
    More models, more accurate, through automation of model creation and lifecycle management
  Collaborative Enterprise Solution
    Open the Machine Learning process to non-experts through guided workflows
    Improve collaboration between Data Scientists, Business Analysts, and Data / IT Administrators
    Automate as much as possible
    No coding!
Thanks for attending this session.
Contact information: tania.perez.asensio@sap.com