Analyze Big Data Faster and Store It Cheaper
Dominick Huang, CenterPoint Energy
Russell Hull, SAP



ABOUT CENTERPOINT ENERGY, INC.
- Publicly traded on the New York Stock Exchange
- Headquartered in Houston, Texas
- Over 5,000 square miles of electric transmission and distribution service area
- Assets total more than $22 billion
- More than 8,700 employees
- CNP and its predecessor companies have been in business for over 130 years

Domestic Energy Delivery: Operate, Serve, and Grow
- Smart Grid enabled
- Twenty-eight-state geography
- Over five million metered customers
- 2.3 million smart meters
- 4,000 miles of transmission
- 47,000 miles of distribution

- Electric Transmission & Distribution
- Natural Gas Distribution
- Competitive Natural Gas Sales and Services

CenterPoint Energy Proprietary and Confidential

AGENDA
- Key Drivers and Strategy of HANA Initiative
- Use Case: Smart Meter Big Data Analytics
- Technology Overview
- POC Results
- Value and Comparison

KEY DRIVERS FOR HANA INITIATIVES
- SAP HANA as CNP's strategic platform for critical transactional applications and analytics
- Cost-effective solution to manage and contain data storage growth
- Analytics platform simplification and consolidation onto HANA
- Key technology enabler for future business solutions
- Maximize CNP's investment in the HANA license (40TB)
- Enable business resiliency implementation for CRM/ECC/BPC
- Leverage HANA in-memory capability for real-time analytics

STRATEGY: 3-YEAR HANA ROADMAP

Technical Migration and Consolidation
- Migrate critical business applications (SAP and mainframe)
- Consolidate analytics solutions (BW, ISAS, ema, etc.) onto HANA

HANA Platform Optimization
- Enhance performance of core business processes and mass business functions
- Enable real-time reporting from the HANA (in-memory) database

HANA Platform Innovation
- Innovative solutions to align with long-term business strategy and roadmap
- Simple Finance, Predictive Asset Health Analytics, Situational Awareness, Internet of Things, Predictive Analytics for customer services, etc.

USE CASE SMART METER BIG DATA ANALYTICS

BUSINESS CHALLENGE: 1+ PB OF SMART METER DATA
- 2.3 million smart meters take a reading every 15 minutes, creating about 225 million readings per day, or over 80 billion readings a year (roughly 800 billion over the 10-year retention period).
- Regulations require historical readings to remain available for 10 years.
- Uncompressed data grows by 8TB per month, or over 1PB in a 10-year period.
- Current DW technology is approaching end of life.
- Massive amounts of data stored in a proprietary vendor solution are hard to manage and carry a high total cost of ownership.
- A cost-effective solution is needed for today's analytics, regulatory requirements and preparation for future use cases.
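The volume figures on this slide can be sanity-checked from its stated inputs. The sketch below is only back-of-the-envelope arithmetic; it assumes exactly one reading per meter every 15 minutes, a 365-day year, and a constant 8TB of uncompressed growth per month (the variable names are illustrative, not from the presentation).

```python
# Inputs taken from the slide.
METERS = 2_300_000            # smart meters
READ_INTERVAL_MIN = 15        # one reading every 15 minutes
RETENTION_YEARS = 10          # regulatory retention period
GROWTH_TB_PER_MONTH = 8       # uncompressed growth

# Assumption: no missed or extra readings, 365-day year.
readings_per_meter_per_day = (24 * 60) // READ_INTERVAL_MIN
readings_per_day = METERS * readings_per_meter_per_day
readings_per_year = readings_per_day * 365
readings_over_retention = readings_per_year * RETENTION_YEARS

# Assumption: growth stays flat at 8TB per month for the whole period.
raw_tb_over_retention = GROWTH_TB_PER_MONTH * 12 * RETENTION_YEARS

print(f"readings per day:        {readings_per_day:,}")
print(f"readings per year:       {readings_per_year:,}")
print(f"readings over 10 years:  {readings_over_retention:,}")
print(f"uncompressed TB, 10 yrs: {raw_tb_over_retention:,}")
```

Under these assumptions the inputs give about 221 million readings a day, about 81 billion a year, and about 806 billion over the ten-year retention window, with roughly 960TB of uncompressed data.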

DATA TIER SOLUTION
DATA VOLUME MANAGEMENT: MULTI-TEMPERATURE DATA APPROACH

Hot
- Data is read and/or written frequently
- In memory
- No restrictions, all features available

Warm (Non-Active Data Concept)
- Infrequent access
- On disk, no need to keep in memory all the time
- No restrictions, all features available

Cold (NLS Management for read-only data)
- Sporadic access
- Not stored in HANA DB; stored in near-line storage (NLS)
- Restricted to NLS capabilities

Providing lower TCO through optimized data volume management

BUSINESS CASE: CAPEX & OPEX SAVINGS
[Chart: Projected Data Capacity (TB), 2016 to 2025. Data labels: 280, 380, 480, 580, 680, 780, 880, 980, 1080, 1180.]
[Chart: Projected Total Spend (Cumulative & Estimated), in $ millions, 2016 to 2026. Series: HANA O&M, HANA Capital, NZ O&M, NZ Capital, Capex Saving, O&M Saving. Compares business as usual with the move to HANA/Hadoop.]
- Projected savings: 75% Capex and Opex saving
- Smart meter data grows by more than 100TB per year, 1PB+ in 10 years
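The capacity chart's data labels rise by a constant 100TB from 280 to 1180. A minimal sketch of that projection follows, assuming the ten labels map in order to the years 2016 through 2025 and that growth is strictly linear; the function names are hypothetical.

```python
BASE_YEAR = 2016           # assumption: first data label belongs to 2016
BASE_CAPACITY_TB = 280     # first data label on the chart
GROWTH_TB_PER_YEAR = 100   # step between consecutive labels


def projected_capacity_tb(year: int) -> int:
    """Linear projection of stored capacity (TB) for a given year."""
    if year < BASE_YEAR:
        raise ValueError("projection starts at the base year")
    return BASE_CAPACITY_TB + GROWTH_TB_PER_YEAR * (year - BASE_YEAR)


def first_year_at_or_above(threshold_tb: int) -> int:
    """First year in which the projection reaches the threshold."""
    year = BASE_YEAR
    while projected_capacity_tb(year) < threshold_tb:
        year += 1
    return year


for y in range(2016, 2026):
    print(y, projected_capacity_tb(y))
```

With these assumptions the projection first passes 1,000TB in 2024, which is consistent with the slide's "1PB+ in 10 years".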

SOLUTION BENEFITS
- Cost-effective HOT+WARM+COLD data management strategy leveraging HANA data compression and data tiering technology
- Simplified Big Data ownership by combining SAP HANA, Dynamic Tiering and Hadoop into a single landscape
- Single database experience: query execution uses SDA and automatically accesses data stored in HANA, Dynamic Tiering and Hadoop/Vora, depending on where the data is located
- Data movement between storage tiers is automated by the Data Lifecycle Manager (DLM)
- Foundation for advanced predictive analytics and future business capabilities
- Instant real-time analytics via HANA
- 75% savings in storage cost compared to the current solution
- Data tiering technology (Dynamic Tiering, Hadoop) to manage data size and growth
- Seamless Hadoop integration lets data scientists use the HANA toolset to access and manage Hadoop data
- Ability to charge the business based on the data being stored and its performance requirements

TECHNOLOGY REVIEW

SAP Big Data Platform

NEW SMART METER ANALYTICS ARCHITECTURE
[Diagram: Current Architecture vs. Planned Architecture]
- Application: Business Objects / SAS / Custom Application
- Current architecture: Netezza, z/OS
- Planned architecture, storage tiers by cost and performance (aggregation and aging handled by DLM):
  1. Tier 0 (memory), speed layer: HANA EDW, 36TB. 13 months of data are stored in HANA for fast analytics.
  2. Tier 1 (SAN, ...): Dynamic Tiering Extended Storage, 50TB. 26 months of data are stored in DT (Sybase IQ).
  3. Tier 2 (Hadoop), batch layer: Hadoop (Vora), 750TB. 10 years of meter data are stored in Hadoop. The plan is to use SAP HANA Vora to access the data.
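The age-based tiering described on this slide can be illustrated with a small routing function. This is a sketch under stated assumptions, not the DLM rule set used at CenterPoint: it treats the 13-month, 26-month and 10-year figures as cumulative age cut-offs measured in whole calendar months, and the tier names, including `BEYOND_RETENTION`, are hypothetical labels.

```python
from datetime import date

# Assumed cumulative age cut-offs in months (from the slide's 13 / 26 / 10-year figures).
HOT_MONTHS = 13        # HANA, in memory
WARM_MONTHS = 26       # Dynamic Tiering extended storage
COLD_MONTHS = 10 * 12  # Hadoop, regulatory retention


def age_in_months(reading_date: date, today: date) -> int:
    """Whole calendar months between the reading and today."""
    return (today.year - reading_date.year) * 12 + (today.month - reading_date.month)


def tier_for(reading_date: date, today: date) -> str:
    """Return the hottest tier assumed to hold a reading of this age."""
    age = age_in_months(reading_date, today)
    if age < 0:
        raise ValueError("reading date is in the future")
    if age < HOT_MONTHS:
        return "HANA"
    if age < WARM_MONTHS:
        return "DYNAMIC_TIERING"
    if age < COLD_MONTHS:
        return "HADOOP"
    return "BEYOND_RETENTION"  # hypothetical label for data older than 10 years


today = date(2016, 6, 1)
for d in (date(2016, 1, 15), date(2015, 1, 1), date(2013, 1, 1), date(2005, 1, 1)):
    print(d, tier_for(d, today))
```

In the presented architecture this decision is made by DLM relocation rules and queries are routed through SDA, so application code would not normally choose a tier itself; the function only makes the aging policy concrete.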

DYNAMIC TIERING
SAP Dynamic Tiering is a warm-store, traditional disk-based database system fully integrated into HANA.
- Based on Sybase IQ: column store and disk-based
- Reduced TCO by lowering the HANA memory footprint
- All HANA functions are available: read, write, update
- Single database experience: all DB access requests are managed through the HANA platform
- Centralized operation control: all administration tasks are handled through the HANA interface

SAP HANA DYNAMIC TIERING DISK-BACKED COLUMN STORE EXTENSION TO HANA FOR WARM DATA MANAGEMENT

WHAT IS APACHE HADOOP?

HADOOP TECHNICAL ARCHITECTURE HADOOP CLUSTER

SAP VORA - HANA/HADOOP INTEGRATION
WHAT'S INSIDE AND WHAT DOES IT DO?
SAP HANA Vora is an in-memory query engine that leverages and extends the Apache Spark execution framework to provide enriched interactive analytics on Hadoop.
- What's inside: drill-downs on HDFS, mashup API enhancements, compiled queries, HANA-Spark adapter, unified landscape, open programming
- What it does: make precision decisions, democratize data access, simplify Big Data ownership
- Any Hadoop clusters

SAP DATA LIFECYCLE MANAGER (DLM)

SAP DATA TIERING ARCHITECTURE
[Diagram. Components and labels shown:]
- HANA: Index Server, XS Engine, Data Lifecycle Manager (DLM), Dynamic Tiering (Extended Storage), SDA (virtual table)
- Hadoop: HANA Spark Controller, Spark processing engines, Spark SQL, Vora in-memory stores, HDFS files
- Flow labels: "DLM reads data from HANA", "DLM writes data", "DLM writes data to ORC file", "Upload table into Vora"

POC REVIEW

POC OBJECTIVES
- Research and test SAP HANA data tiering technology, i.e. DLM (Data Lifecycle Manager), Dynamic Tiering, Vora Hadoop integration
- Evaluate Hadoop technology; understand the Hadoop ecosystem and TCO
- Test SAP Vora, the HANA and Hadoop integration technology
- Develop and validate solution options for several critical 2016 projects: Smart Meter Analytics, customer document repository for Mainframe Migration
- Build CNP in-house expertise in Hadoop and SAP HANA/Hadoop integration technology
- Identify use cases and innovation opportunities at CNP

POC ENVIRONMENT AND TEST CASES
POC team: CenterPoint Energy (lead and architects); SAP (CoE, PE, Global ITP); HP (hardware); IBM (IBM Hadoop and cloud)

Environment
- Hardware: HP Lab (12-node Hadoop cluster, CS500 HANA, HANA Dynamic Tiering node); IBM BigInsights Cloud
- Software: SAP HANA SPS 10, DLM, Dynamic Tiering, Vora; Hortonworks HDP Hadoop, Red Hat Linux; IBM Apache Hadoop with BigSQL

Test cases
- Data load: extract 800GB (7 billion smart meter records) from Netezza and ISAS and load the data into HANA (meter data scrambled to protect data security)
- DLM: use the DLM tool to move data from HANA to Dynamic Tiering Extended Storage and Hadoop
- Run queries across all data tiers and measure performance
- Load, query and display 19 million PDFs of customer bills (dummy PDF files used, no real customer data)

POC SUCCESS CRITERIA
- Data tiering: move data among the different tiers, including HANA, DT and Hadoop; run SQL queries within and across data tiers
- Performance: measure response time for each data tier
- Data compression: evaluate the compression ratio of HANA, DT and Hadoop
- SAP DLM: use the tool to move data from the hot tier to the warm and cold tiers
- Customer document storage: store and retrieve PDF documents within one second
- Comparison of storage costs: HANA, DT (Dynamic Tiering Extended Storage) and Hadoop

POC TEST RESULTS

Hadoop
- HDP customer bill store and retrieval: 40ms response time to search and display a document from 19 million PDFs
- HDP batch data load via Sqoop into Hadoop: 4 min 24 s to load 2.5 million records (single thread); 1 min 10 s (10 threads)
- Data load from HANA to HDP Hadoop via Vora: a total of 6.2GB of ORC files stored in HDFS against an original size of 172GB. Compression rate: 9 (3 copies in HDFS)

HANA / DT / Spark / Vora
- Aggregation query run across SAP HANA, HDP Hadoop and DT (~4 billion records)
- [Chart: Query Response Time [s], axis 0 to 400. Data labels: 0.2, 2.6, 360, 19.]

DLM
- Move data from HANA to DT: 289 million records moved, 670K records per minute
- Move data from HANA to Hadoop via Vora into HDFS: 1.57 billion records moved, 22 million records per minute
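Some of the reported numbers can be cross-checked with simple arithmetic. The sketch below assumes that the quoted compression rate of 9 counts all three HDFS replicas against the original 172GB (one reading that reproduces the slide's figure), and it derives the Sqoop throughput from the stated load times; the helper names are illustrative.

```python
def compression_ratio(original_gb: float, stored_gb: float, replicas: int = 1) -> float:
    """Original size divided by the space actually occupied, including replicas."""
    return original_gb / (stored_gb * replicas)


def records_per_minute(records: int, seconds: float) -> float:
    """Average load throughput in records per minute."""
    return records * 60 / seconds


# ORC files in HDFS: 6.2GB stored against 172GB original.
per_copy = compression_ratio(172, 6.2)                  # single copy
with_replicas = compression_ratio(172, 6.2, replicas=3)  # assumed 3x HDFS replication

# Sqoop batch load of 2.5 million records.
single_thread = records_per_minute(2_500_000, 4 * 60 + 24)  # 4 min 24 s
ten_threads = records_per_minute(2_500_000, 1 * 60 + 10)    # 1 min 10 s
speedup = (4 * 60 + 24) / (1 * 60 + 10)

print(f"compression, one copy:     {per_copy:.1f}x")
print(f"compression, three copies: {with_replicas:.1f}x")
print(f"Sqoop, 1 thread:   {single_thread:,.0f} records/min")
print(f"Sqoop, 10 threads: {ten_threads:,.0f} records/min")
print(f"speedup from 10 threads: {speedup:.2f}x")
```

On these assumptions a single ORC copy compresses about 28 to 1, three replicas bring the effective ratio to about 9 to 1, and ten Sqoop threads give a speedup of a little under 4x rather than 10x.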

VALUE AND COMPARISON BETWEEN DATA TIERS

COMPARISON BETWEEN DATA TIERS
(Columns: Component, Performance, Cost Factor, Volume, Processing)

HANA
- Cost factor: $$$$
- Volume: up to 10s of TB (no technical limit)
- Processing: ACID compliant; SQL, SQLScript, graph, time series, spatial, text, ...

Dynamic Tiering or Sybase IQ
- Cost factor: $$
- Volume: 100s of TB integrated in HANA; several PB with Sybase IQ
- Processing: ACID compliant; SQL

Hadoop Spark/Vora
- Cost factor: $ (15 times less expensive than T1 storage)
- Volume: 100s of PB or more
- Processing: ANSI SQL compliant; read-only SQL when used from HANA via SDA; transformations and actions
- Performance can be improved significantly by adding compute nodes and using SSDs, at higher cost

Hadoop Vora in Memory
- Cost factor: $$
- Volume: 100s of TB (depending on available memory in the Hadoop cluster)
- Processing: data loaded in memory to achieve better performance; read-only SQL when used from HANA via SDA

RECOMMENDED USE CASES: SHORT TERM

HANA
- Managing up to several TB of high-value data
- Very high processing performance required
- SAP HANA native processing features (PAL, ...) required
- OLTP with many fine-granular updates needed

Dynamic Tiering
- Managing up to several PB of data at T2/T3 storage cost
- High performance for complex queries required
- Deep SAP HANA integration required (single database experience)
- Updates and deletes required

Hadoop - Spark
- Managing up to 100s of PB of data at T4 storage cost, 15 times less expensive than T1 storage
- Read-only sufficient (bulk load, no fine-granular writes)
- Comparatively low-cost storage important
- Loose integration of administration and life-cycle management acceptable

Hadoop - Vora
- High OLAP query performance on Hadoop
- Additional query features (hierarchies)

THANK YOU
Contact information:
- Dominick Huang, Sr. Manager, Enterprise Technology & Architecture, CenterPoint Energy. Yong.huang@centerpointenergy.com, Tel 713-207-6659
- Russell Hull, Chief Support Architect, SAP America. Russell.hull@sap.com

FOLLOW US
Thank you for your time. Follow us at @ASUG365.

APPENDIX

CNP HANA LANDSCAPE - ANALYTICS (BW + OW)
[Diagram: HANA node layout for Analytics (BW + OW)]
- Systems: HIP (PRD), 36TB (memory); HIQ (QA); HID (DEV); HIS (SBX)
- Node sizes shown: 0.5TB and 0.25TB blades
- Legend: existing blade; new HP node (2TB); failover blade; ES = Extended Storage (NLS/DT/Hadoop)
- Situation Awareness, MfM Testing & other apps: 4.5TB

HADOOP ARCHITECTURE