Creating Customer Value

1 Creating Customer Value with Cloud Native Apps & Analytics

2 Creating Customer Value with Cloud Native Apps & Analytics. September 2017. Martin Keller. Copyright 2017 Pivotal Software, Inc. All rights reserved.

3 DIGITALIZATION CHANGES EVERYTHING. WHEN WILL POLITICS CHANGE?

4 Delivering information in context, personalized, in real time. Companies need to learn how to catch people or things in the act of doing something and affect the outcome.

5 Great software companies leverage analytics and insights. How do they accomplish that? Parallel Processing; Data Science and Machine Learning; Loosely Coupled Microservices; Cloud Native; Continuous Delivery; Open Source Innovation.

6 Smart Data-Driven Apps: Logistics

7 Important Capabilities. Ability to store and integrate volumes of data from multiple sources. Moving beyond basic business intelligence and reporting to more sophisticated data science and predictive modeling techniques. The system must deliver insights about likely next actions in ways advisors can consume and act on. Results of these actions must be fed back into the system to continually improve the predictive models.

8 Data Architecture (diagram labels): Data Feeds; Data Sources; Analytic Apps; Fast Ingest / Pipelining (pipelines to consume streaming and batch data from various endpoints); Distributed Memory-based Processing; Realtime Data Insights; Statistical Tools; Expert System; Machine Learning; Advanced Analytics / MPP; Raw Data Landing Zone.

9 Pivotal Data Suite (diagram labels): In-Memory Data Grid; High Speed Ingestion; Massively Parallel Architecture; Hadoop Data Lakes; Parallel Configurable Data Load; In-DB Predictive Analytics; Predefined Libraries; GPText; Programmatic Parallel Data Load and External Tables; Analytical Data to Cache; Public Cloud Data Lakes; Pivotal GemFire; Pivotal Greenplum (Data Warehouse). Data temperature axis: Cold, Warm, Hot.

10 Pivotal Data Suite: open source data management portfolio. Pivotal Greenplum: data warehouse database based on the open source Greenplum Database. Pivotal GemFire: open source application and transaction data grid based on Apache Geode. Complete Platform; Mission Critical; Deployment Options; Open Source; Flexible Licensing; Advanced Data Analytics.

11 Pivotal Data Suite: open data management portfolio. Complete platform; based on open source; OSS support: Spring XD & Spring Cloud Data Flow; OSS support: PostgreSQL; deployment options; Hadoop-native SQL; flexible licensing; advanced data services.

12 Playground: Connected Cars use case example

13 Connected Car Demo (YouTube link). Connected car: predict the destination, predict the range.

14 Real-time car telematics. Driving data from the in-car OBD2 port. In-depth view on driving. Framework to train models on batch data and use them for real-time prediction. Predict journey destination and fuel consumption. Build app in collaboration with Pivotal Labs. (Diagram labels: Roads 1 to many; Cars many to 1.)
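
The "train on batch data, predict in real time" idea from this slide can be illustrated with a deliberately small sketch. It is not the model used in the demo: the frequency-count approach, the function names `train` and `predict`, the six-hour time buckets, and the trip records are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict

def train(trips):
    """Batch step: count destinations seen for each (start, time bucket)."""
    model = defaultdict(Counter)
    for start, hour, destination in trips:
        # Group hours into four six-hour buckets (an arbitrary choice).
        model[(start, hour // 6)][destination] += 1
    return model

def predict(model, start, hour):
    """Real-time step: return the most frequent destination, or None."""
    counts = model.get((start, hour // 6))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical historical trips: (start location, hour of day, destination).
history = [
    ("home", 8, "office"),
    ("home", 7, "office"),
    ("home", 9, "gym"),
    ("office", 18, "home"),
]

model = train(history)
print(predict(model, "home", 8))
```

The batch step can be rerun on accumulated history as often as needed, while the lookup in `predict` stays cheap enough to call per incoming telemetry event.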

15 Pivotal Offerings. Companies need to learn how to catch people or things in the act of doing something and affect the outcome.
Data Suite: Spring Cloud Data Flow (open source data management); GemFire (In-Memory Data Grid); Greenplum (Data Warehouse).
Pivotal Cloud Foundry (PCF): Industry's Leading Cloud-Native Platform. Pivotal Container Service (PKS): Production-Grade Kubernetes.
Spring Boot, Cloud and Data Flow: Modern Java microservices framework.
Pivotal Labs & Data Science: Build a smart app end-to-end; focus on a specific analytical model / data microservice.

17 Pivotal Cloud Cache Explained: in-memory caching as an on-demand, managed service on PCF

18 Pivotal Cloud Cache: In-Memory Performance. In-memory performance with cloud-native scalability and availability. Blazing fast reads: x faster than disk. High volume transactions. High availability across application and caching layers. Horizontally scalable architecture.

19 Microservices Need Performance and Scalability. Microservices with large, frequently accessed data sets need a cache layer. (Diagram labels: App Instance; In-memory cloud-native cache.) Performance and scalability of data: add servers to a shared Pivotal Cloud Cache cluster. Reduces the pressure to scale rigid backing stores. Enables availability and resilience.
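
A cache layer in front of a backing store is often used in a look-aside fashion: read from the cache first and go to the store only on a miss. The sketch below shows that pattern with a plain in-process dictionary. It does not use the Pivotal Cloud Cache or GemFire client API; the class `LookAsideCache`, the function `slow_store`, and the sample keys are hypothetical.

```python
class LookAsideCache:
    """Serve repeated reads from memory; hit the backing store only on a miss."""

    def __init__(self, load_from_store):
        self._load = load_from_store   # callable that reads the backing store
        self._entries = {}             # in-memory copy of loaded values
        self.store_reads = 0           # how often the backing store was hit

    def get(self, key):
        if key in self._entries:
            return self._entries[key]  # cache hit: no store access
        value = self._load(key)        # cache miss: read through to the store
        self.store_reads += 1
        self._entries[key] = value
        return value

def slow_store(key):
    # Stand-in for a database or mainframe lookup (made-up data).
    return {"sku-1": "bicycle", "sku-2": "helmet"}[key]

cache = LookAsideCache(slow_store)
first = cache.get("sku-1")   # miss, loads from the store
second = cache.get("sku-1")  # hit, served from memory
print(first, second, cache.store_reads)
```

Because the second read never reaches `slow_store`, the backing store sees one request instead of two, which is the "reduces the pressure to scale rigid backing stores" effect in miniature.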

20 Prepackaged for Simple Consumption. Developers get self-service access to Pivotal Cloud Cache on Pivotal Cloud Foundry. Easy accessibility through Marketplace; instant provisioning; bind to apps through an easy-to-use interface; common access control and audit trails across services. Services Marketplace (diagram labels): Pivotal Cloud Cache, RabbitMQ, Single Sign-On, Circuit Breaker, Service Directory, Config Server, MySQL, Signal Sciences, New Relic, Crunchy PostgreSQL, Redis, and more.

21 (Diagram: two sites, Rio and São Paulo. Each site shows a Web Application, a GemFire Cluster, Oracle RAC and a Mainframe; the two GemFire clusters are linked by WAN Data Sync.)

22 GemFire: distributed, in-memory NoSQL data grid for big data apps that need: scale-out performance; consistent database operations across globally distributed nodes; high availability, resilience, and global scale; powerful, standards-based developer features; easy administration of distributed nodes. Based on Apache Geode (incubating).

23 Pivotal GemFire Usage by the Numbers
China Railways: 5,700 train stations; 4.6 million tickets per day; 20 million daily users; 3 TB operational data in memory; 40,000 visits per second; >1,500,000,000 hits per day.
Indian Railways: 7,000 stations; 23 million passengers daily; 120,000 concurrent users; 10,000 transactions per minute; >1,200,000,000 hits per day.
World: ~7,349,000,000; ~37% of the world population.

24 Pivotal Greenplum: Run Anywhere, Mature, OSS, Analytical MPP

25 WHAT IS GREENPLUM? An open source data warehouse; battle-tested in production; built for diverse analytical use cases; available anywhere you need it.

26 The Pivotal Greenplum Database is:
A Highly Scalable, Shared-Nothing Database: leading MPP architecture, including a patented next-generation optimizer; optimized architecture and features for loading and queries; start small, scale as needed; polymorphic storage, compression, partitioning.
A Platform for Advanced Analytics on Any (and All) Data: rich ecosystem (SAS, R, BI & ETL tools); in-DB analytics (MADlib, custom; languages: R, Java, Python, Perl, C, C++); high degree of SQL completeness so analysts can use a language they know; domain: geospatial, text processing (GPText).
An Enterprise-Ready Platform Capable of Flexing With Your Needs: available as needed either as an appliance or software; secures data in place, in flight, and with authentication to suit; capable of managing a variety of mixed workloads.
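
The shared-nothing MPP idea can be sketched in a few lines: rows are spread across segments by hashing a distribution key, each segment aggregates only its own rows, and a coordinator merges the partial results. This is a conceptual toy, not Greenplum's implementation; the CRC32 hash, the function names, and the sample rows are assumptions chosen for illustration.

```python
import zlib

def segment_for(key, num_segments):
    """Pick a segment by hashing the distribution key (CRC32 is an arbitrary choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_segments

def distribute(rows, num_segments):
    """Place every row on exactly one segment; segments share no data."""
    segments = [[] for _ in range(num_segments)]
    for key, amount in rows:
        segments[segment_for(key, num_segments)].append((key, amount))
    return segments

def partial_sums(segment):
    """Work done independently on one segment: SUM(amount) GROUP BY key."""
    totals = {}
    for key, amount in segment:
        totals[key] = totals.get(key, 0) + amount
    return totals

def gather(partials):
    """Coordinator step: merge per-segment results."""
    result = {}
    for totals in partials:
        # All rows for a key live on one segment, so keys never collide here.
        result.update(totals)
    return result

# Hypothetical rows: (distribution key, amount).
rows = [("car-1", 10), ("car-2", 5), ("car-1", 7), ("car-3", 1)]
segments = distribute(rows, num_segments=3)
totals = gather(partial_sums(s) for s in segments)
print(totals)
```

Because the grouping key equals the distribution key, no data has to move between segments before aggregating, which is the case where a shared-nothing layout scales most cleanly.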

27 Functions (Jan 2017)
Generalized Linear Models: Linear Regression; Logistic Regression; Multinomial Logistic Regression; Ordinal Regression; Cox Proportional Hazards Regression; Elastic Net Regularization; Robust Variance (Huber-White), Clustered Variance, Marginal Effects.
Matrix Factorization: Singular Value Decomposition (SVD); Low Rank.
Linear Systems: Sparse and Dense Solvers; Linear Algebra.
Graph: Single Source Shortest Path; Page Rank.
Other Machine Learning Algorithms: Principal Component Analysis (PCA); Association Rules (Apriori); Topic Modeling (Parallel LDA); Decision Trees; Random Forest; Conditional Random Field (CRF); Clustering (K-means); Cross Validation; Naïve Bayes; Support Vector Machines (SVM); Prediction Metrics; K-Nearest Neighbors.
Time Series: ARIMA.
Path Functions: Operations on Pattern Matches.
Descriptive Statistics: Sketch-Based Estimators (CountMin (Cormode-Muth.), FM (Flajolet-Martin), MFV (Most Frequent Values)); Correlation and Covariance; Summary.
Inferential Statistics: Hypothesis Tests.
Utility Modules: Array and Matrix Operations; Sparse Vectors; Random Sampling; Probability Functions; Data Preparation; PMML Export; Conjugate Gradient; Stemming; Sessionization; Pivot.
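
As a reminder of what the simplest entry in this list computes, here is a minimal sketch of one-variable linear regression by ordinary least squares. It is plain Python, not the in-database library call; the function name `fit_line` and the speed and fuel figures are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: average speed (km/h) vs. fuel use (l/100 km).
speeds = [30.0, 50.0, 70.0, 90.0]
fuel = [5.0, 6.0, 7.0, 8.0]

slope, intercept = fit_line(speeds, fuel)
print(slope, intercept)
```

An in-database implementation computes the same sums, but does so in parallel next to the data instead of pulling rows into a client process.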

28 Procedural Languages: User Defined Types; User Defined Functions; User Defined Aggregates; import of libraries from open source.

29 Greenplum Geospatial Big Data. Current key features: points, lines, polygons; perimeter, area, intersection, contains, distance; long/lat; raster support; round-earth calculations; spatial indexes & bounding boxes.
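
"Round-earth calculations" means distances are measured along the curved surface rather than on a flat plane. A common way to do this is the haversine formula, sketched below in plain Python. This is not the database's geospatial API; the function name `haversine_km` and the spherical model with a mean radius of 6371 km are assumptions of the sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; the sphere is an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/long points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# One degree of longitude along the equator.
one_degree = haversine_km(0.0, 0.0, 0.0, 1.0)
print(round(one_degree, 1))
```

On this spherical model one degree of longitude at the equator comes out at roughly 111 km; the same one-degree step at higher latitudes yields a shorter distance, which a flat-plane calculation would miss.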

30 Integrated Text Analytics. GPText: SQL warehousing + text. Leveraging Apache Solr and GPDB; 5 years commercial production experience; MADlib integration for machine learning on text data; PL/Python and PL/Java integration for natural language processing. Use cases: communications compliance and monitoring; customer sentiment analysis; document search and query; social media processing, etc.

31 THANK YOU! Questions?

32 Backup

33 Greenplum Hadoop & Cloud Connectors (diagram labels). Warm to hot: Operational Analytics & SQL; SLA Driven & Iterative. Parallel High Speed SQL Transfer. Warm to cold: Data Lake & Cold Storage; Batch & AdHoc; Public & Private Data Lakes. Axis: Data Temperature.

34 GemFire Greenplum Connector (GGC) (diagram labels). Hot, transactional side: App 1, App 2, Custom Apps; Push Updates; Native API; REST / HTTP; Transactional data. Between the two: Write behind; Parallel Configurable Data Load; Analytical Data to cache. Warm, analytical side: ANSI SQL; Data science, analytics & ML. Axis: Data Temperature.
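
The "write behind" label describes a pattern in which the application writes to the in-memory tier and the change is forwarded to the analytical store later, in batches. The sketch below shows only the pattern. It is not the connector's API; the class `WriteBehindCache`, the batch size, and the trip records are hypothetical, and a Python list stands in for the warehouse.

```python
from collections import deque

class WriteBehindCache:
    """Accept writes in memory and forward them to a slower store in batches."""

    def __init__(self, write_batch, batch_size=2):
        self._write_batch = write_batch  # callable that persists a list of rows
        self._batch_size = batch_size
        self._entries = {}               # in-memory view used by readers
        self._pending = deque()          # changes not yet written behind

    def put(self, key, value):
        self._entries[key] = value       # visible to readers immediately
        self._pending.append((key, value))

    def get(self, key):
        return self._entries.get(key)

    def flush(self):
        """Drain the queue in batches; returns the number of rows written."""
        written = 0
        while self._pending:
            batch = []
            while self._pending and len(batch) < self._batch_size:
                batch.append(self._pending.popleft())
            self._write_batch(batch)
            written += len(batch)
        return written

warehouse = []  # stand-in for the analytical database
cache = WriteBehindCache(warehouse.extend, batch_size=2)
cache.put("trip-1", {"km": 12})
cache.put("trip-2", {"km": 30})
cache.put("trip-3", {"km": 4})

visible_before_flush = cache.get("trip-3")  # readable right after the put
rows_before_flush = len(warehouse)          # nothing persisted yet
written = cache.flush()
print(rows_before_flush, written, len(warehouse))
```

The trade-off is that writes return quickly and the warehouse receives fewer, larger loads, at the cost of a window in which the two tiers are not yet in sync.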