SAP Machine Learning for Hadoop Customer
SAP BusinessObjects Predictive Analytics and Big Data

1. Support for end-to-end operational predictive lifecycle on Hadoop
2. Business Analyst Friendly
   - No coding required with Automated Analytics
3. Data Scientist Friendly
   - Hive Connectivity
   - Support for Hive
   - Support for wide datasets
   - Supports data preparation & scoring directly in Hadoop - no data transfer
4. Spark Specific
   - Push the data-intensive modelling workload to Native Spark
   - SQL still used for Analytical Dataset definition, but with minimal data transfer requirement
   - Real-time scoring via Spark Streaming API

[Architecture diagram. Labels: Advanced Analytics Execution Layer; Analytics Dataset Definition Layer; Hive (SQL); Model Manager (Predictive Factory); Modeler - Training; Native Spark; In-Database Scoring (Spark SQL/Hive QL); Spark SQL; Scorer; Predictive Analytics Data Manager; HDFS (Hadoop Distributed File System); Spark Streaming (Java Export); Direct to HDFS]

2016 SAP SE or an SAP affiliate company. All rights reserved.
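The in-database scoring idea on this slide (the model is applied as Spark SQL / Hive QL, so rows never leave the cluster) can be illustrated with a minimal sketch. It is not the product's generated SQL: the linear form of the equation, the function name `scoring_sql`, the table `customers`, and all weights are assumptions made for illustration, and an in-memory SQLite database stands in for Hive or Spark SQL so the example runs anywhere.

```python
import sqlite3


def scoring_sql(table, key, intercept, weights):
    """Build a SELECT that evaluates a linear scoring equation in the database.

    Hypothetical helper: identifiers are interpolated as-is, so it is only
    suitable for trusted table and column names.
    """
    parts = [repr(float(intercept))]
    parts += [f"({float(w)!r} * {col})" for col, w in weights.items()]
    expression = " + ".join(parts)
    return f"SELECT {key}, {expression} AS score FROM {table} ORDER BY {key}"


# SQLite is only a stand-in for Hive / Spark SQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, age REAL, income REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 40.0, 8.0), (2, 30.0, 4.0)],
)

# Hypothetical model: score = 1.0 + 0.5 * age + 0.25 * income
query = scoring_sql("customers", "id", 1.0, {"age": 0.5, "income": 0.25})
scores = conn.execute(query).fetchall()
print(query)
print(scores)
```

Only the query text travels to the data platform and only the scores come back, which is the point of the "no data transfer" bullet above.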
Model Lifecycle in Automated Analytics - Hadoop ODBC/CSV Connectivity Support

Simplified Modelling Process
1. Prepare Data
2. Generate Model
3. Visualize Model
4. Apply Model (Scoring)

Automation - Industrial data mining

[Diagram. Labels: Automated Analytics Engine (C++ Kernel); Data Connectivity Layer (ODBC, csv); Hadoop. Hadoop connectivity = Hive or Spark SQL over ODBC]
With Native Spark Modeling - Predictive Modeling Steps pushed down to Spark

- SAP Automated Analytics Modeler (orchestrates + processes results)
- Automated Analytics Modeler <<Desktop or AA Server>>
- Spark Modeling Service
- Data Connectivity (ODBC and )

Steps run in Spark (Scala); diagram label: "Run in parallel"
   step1    Cross Statistics
   step2    Encoding data
   step3    Matrix (MLlib)
   step4    Scoring Equations
   step5.1  Cross Statistics
   step5.2  Performance

Spark results cache shared between steps

[Diagram arrow labels: SQL; odbc; JSON; sql; hdfs->table; JSON]
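The first two steps named on this slide, cross statistics and data encoding, can be sketched in plain Python. This is only an illustration of the general idea, not the Spark Modeling Service's Scala code: the slide does not say how the product encodes variables, so target-rate encoding is used here as an assumed stand-in, and the row layout and the names `cross_statistics` and `encode` are hypothetical.

```python
import json
from collections import defaultdict


def cross_statistics(rows, variable, target):
    """step1 (sketch): per category of `variable`, count rows and positive targets."""
    stats = defaultdict(lambda: {"count": 0, "positives": 0})
    for row in rows:
        cell = stats[row[variable]]
        cell["count"] += 1
        cell["positives"] += row[target]
    return dict(stats)


def encode(rows, variable, stats):
    """step2 (sketch, assumed scheme): replace each category by its target rate."""
    rate = {cat: s["positives"] / s["count"] for cat, s in stats.items()}
    return [rate[row[variable]] for row in rows]


# Hypothetical training rows: one categorical variable and a binary target.
rows = [
    {"region": "north", "bought": 1},
    {"region": "north", "bought": 0},
    {"region": "south", "bought": 1},
    {"region": "south", "bought": 1},
]

stats = cross_statistics(rows, "region", "bought")
encoded = encode(rows, "region", stats)

# The statistics are small and serialize to JSON, matching the JSON labels
# on the diagram; the encoded column would feed the later matrix step.
print(json.dumps(stats, sort_keys=True))
print(encoded)
```

Reusing `stats` in `encode` mirrors, in a very loose way, the slide's note that results are cached and shared between steps.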
Traditional Tiered Architecture vs. Native Spark Modelling

Traditional Tiered Architecture
   - Full dataset brought to application for processing
   - Limited performance and scalability
   - Limited data processing on a single server
   - SAP BusinessObjects Predictive Analytics - Automated

Native Spark Modelling
   - Data processing beside data
   - Performance and scalability built in
   - 1000s of nodes designed for cost-effective data processing
   - SAP BusinessObjects Predictive Analytics - Automated

[Diagram labels: FULL Data Transfer; No Data Transfer; JSON Stats; Native Spark Modeling; SQL; Database]
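The contrast on this slide, shipping the full dataset versus returning only statistics, can be made concrete with a small sketch. It assumes a simple count aggregation computed per partition and then merged, which is a common pattern in distributed processing but is not taken from the product; the partitions, the `segment` column, and the function names are hypothetical.

```python
import json
from collections import Counter


def partial_counts(partition, variable):
    """Aggregate one partition where it lives ("data processing beside data")."""
    return Counter(row[variable] for row in partition)


def merge(partials):
    """Combine the small per-partition results into one set of statistics."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Two hypothetical partitions of 1000 rows each.
partition_1 = [{"segment": "a", "value": i} for i in range(1000)]
partition_2 = [{"segment": "b" if i % 2 else "a", "value": i} for i in range(1000)]

merged = merge(partial_counts(p, "segment") for p in (partition_1, partition_2))

# Payload sizes in characters: all rows versus the aggregated statistics only.
full_payload = len(json.dumps(partition_1 + partition_2))
stats_payload = len(json.dumps(dict(merged)))
print(dict(merged), full_payload, stats_payload)
```

The statistics payload stays the same size however many rows each partition holds, whereas the full-transfer payload grows with the data.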
Case Study: Demo Company - Find New Customers for Financial Services
From 7 hours to 26 mins - with Native Spark modelling for training

Business Challenges
   - Each model takes 1 day to train
   - Configuration is single threaded; limited to 20 models per month
   - Very hard to perform manipulations when not using ADS in SAP PA
   - Interactive data manipulation - after connectivity to Hadoop in Data Manager

Technical Requirements
   - Training on 500K rows x 2000 vars, scoring on 4 mil rows
   - From xxx DB to Hadoop-Spark
   - 40 node Hadoop cluster, plenty of free space today

Benefits
   - Model training now available on Spark
   - No data transfer from data source platform
   - High scalability and better performance

Ultimately higher profitability and growth - learning more models, hence better insights - targeting the right customers
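As a quick check on the headline figures of this case study, the arithmetic can be worked through directly. The sketch uses only the numbers quoted on the slide (7 hours, 26 minutes, 500K rows, 2000 variables); the variable and function names are illustrative.

```python
def speedup(before_minutes, after_minutes):
    """Ratio of training time before and after the change."""
    return before_minutes / after_minutes


before_minutes = 7 * 60          # "From 7 hours"
after_minutes = 26               # "to 26 mins"
training_cells = 500_000 * 2_000 # rows x variables in the training set

ratio = speedup(before_minutes, after_minutes)
print(round(ratio, 1))   # 16.2
print(training_cells)    # 1000000000
```

So the quoted improvement is roughly a 16x reduction in training time, on a training set of about one billion cells.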
Roadmap

[Roadmap columns: Today; Coming Soon; Future]
2016 SAP SE or an SAP affiliate company. All rights reserved. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company. SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors. National product specifications may vary. These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty. In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP SE's or its affiliated companies' strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality.
All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.