Accelerating Your Big Data Analytics. Jeff Healey, Director Product Marketing, HPE Vertica

Similar documents
ETL on Hadoop What is Required

Microsoft Azure Essentials

Copyright - Diyotta, Inc. - All Rights Reserved. Page 2

E-guide Hadoop Big Data Platforms Buyer s Guide part 1

Operational Hadoop and the Lambda Architecture for Streaming Data

Microsoft Big Data. Solution Brief

Oracle Big Data Cloud Service

Hybrid Data Management

Bringing the Power of SAS to Hadoop Title

THE MAGIC OF DATA INTEGRATION IN THE ENTERPRISE WITH TIPS AND TRICKS

Top 5 Challenges for Hadoop MapReduce in the Enterprise. Whitepaper - May /9/11

T H E B O T T O M L I N E

Apache Spark 2.0 GA. The General Engine for Modern Analytic Use Cases. Cloudera, Inc. All rights reserved.

DLT AnalyticsStack. Powering big data, analytics and data science strategies for government agencies

Data Analytics and CERN IT Hadoop Service. CERN openlab Technical Workshop CERN, December 2016 Luca Canali, IT-DB

Hadoop and Analytics at CERN IT CERN IT-DB

Oracle Big Data Discovery The Visual Face of Big Data

Table of Contents: FOREWORD EXECUTIVE SUMMARY EVOLUTION OF ETL ARCHITECTURE: CONCLUSION CUSTOMER CASE STUDIES:

Integrating MATLAB Analytics into Enterprise Applications

ENABLING GLOBAL HADOOP WITH DELL EMC S ELASTIC CLOUD STORAGE (ECS)

Big Data Job Descriptions. Software Engineer - Algorithms

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

Simplifying Your Modern Data Architecture Footprint

Big Data The Big Story

Oracle Engineered Systems and WeDo Technologies

Cloud Based Analytics for SAP

Oracle Autonomous Data Warehouse Cloud

ARCHITECTURES ADVANCED ANALYTICS & IOT. Presented by: Orion Gebremedhin. Marc Lobree. Director of Technology, Data & Analytics

Leveraging Oracle Big Data Discovery to Master CERN s Data. Manuel Martín Márquez Oracle Business Analytics Innovation 12 October- Stockholm, Sweden

E-guide Hadoop Big Data Platforms Buyer s Guide part 3

HP SummerSchool TechTalks Kenneth Donau Presale Technical Consulting, HP SW

Oracle Autonomous Data Warehouse Cloud

In-Memory Analytics: Get Faster, Better Insights from Big Data

Berkeley Data Analytics Stack (BDAS) Overview

Fast Start Business Analytics with Power BI

Simplifying the Process of Uploading and Extracting Data from Apache Hadoop

Deloitte School of Analytics. Demystifying Data Science: Leveraging this phenomenon to drive your organisation forward

Analyze Big Data Faster and Store it Cheaper. Dominick Huang CenterPoint Energy Russell Hull - SAP

Oracle's Cloud Strategie für den Geschäftserfolg Alles Neue von der OOW

How to Build Your Data Ecosystem with Tableau on AWS

Microsoft Dynamics 365 and Columbus

SAS & HADOOP ANALYTICS ON BIG DATA

Evolution to Revolution: Big Data 2.0

Building Your Big Data Team

Oracle Big Data Discovery Cloud Service

From Information to Insight: The Big Value of Big Data. Faire Ann Co Marketing Manager, Information Management Software, ASEAN

Corporate Overview CRM. the Cloud

InfoSphere Warehousing 9.5

Going beyond today: Extending the platform for cloud, mobile and analytics

How to sell Azure to SMB customers. Paul Bowkett Microsoft NZ

IBM Big Data Summit 2012

Sanoma Big Data Migration Sander Kieft

IBM SPSS & Apache Spark

Architecture Overview for Data Analytics Deployments

MIGRATING AND MANAGING MICROSOFT WORKLOADS ON AWS WITH DATAPIPE DATAPIPE.COM

The Mainframe s Relevance in the Digital World

Communications in the Cloud:

Hadoop in the Cloud. Ryan Lippert, Cloudera Product Cloudera, Inc. All rights reserved.

IBM Systems for Oracle Fusion Middleware

Increase Value and Reduce Total Cost of Ownership and Complexity with Oracle PaaS

DELL EMC HADOOP SOLUTIONS

Data Lake or Data Swamp?

Session 30 Powerful Ways to Use Hadoop in your Healthcare Big Data Strategy

Transforming Big Data to Business Benefits

Embracing Disruption: Financial Services and the Microsoft Cloud

Real-Time Streaming: IMS to Apache Kafka and Hadoop

1. Intoduction to Hadoop

Delivering Data Warehousing as a Cloud Service

An Overview of the AWS Cloud Adoption Framework

Building a Data Lake with Spark and Cassandra Brendon Smith & Mayur Ladwa

Cask Data Application Platform (CDAP) The Integrated Platform for Developers and Organizations to Build, Deploy, and Manage Data Applications

InfoSphere Warehouse. Flexible. Reliable. Simple. IBM Software Group

Ventana Research Big Data and Information Management Research in 2017

Sr. Sergio Rodríguez de Guzmán CTO PUE

Let s distribute.. NOW: Modern Data Platform as Basis for Transformation and new Services

THE DATA WAREHOUSE EVOLVED: A FOUNDATION FOR ANALYTICAL EXCELLENCE

Hortonworks Powering the Future of Data

IoT Enabled Technologies Delivering Actionable Insights for the Telecom Industry

The Evolution of Big Data

GE Intelligent Platforms. Proficy Historian HD

Data Engineer. Purpose of the position. Organisational position / Virtual Team. Direct Reports: Date Created: July 2017

EMC ATMOS. Managing big data in the cloud A PROVEN WAY TO INCORPORATE CLOUD BENEFITS INTO YOUR BUSINESS ATMOS FEATURES ESSENTIALS

EBOOK: Cloudwick Powering the Digital Enterprise

Big Data: Essential Elements to a Successful Modernization Strategy

Machina Research White Paper for ABO DATA. Data aware platforms deliver a differentiated service in M2M, IoT and Big Data

A Cloud Migration Checklist

SAP Big Data. Markus Tempel SAP Big Data and Cloud Analytics Services

: Boosting Business Returns with Faster and Smarter Data Lakes

Data Center Transformation. Human Centric Innovation In Action

Smart Mortgage Lending

WHITE PAPER SPLUNK SOFTWARE AS A SIEM

WebFOCUS: Business Intelligence and Analytics Platform

Brian Macdonald Big Data & Analytics Specialist - Oracle

Stuck with Power BI? Get Pyramid Starting at $0/month. Start Moving with the Analytics OS

More information for FREE VS ENTERPRISE LICENCE :

Microsoft FastTrack For Azure Service Level Description

IBM Analytics Six reasons to upgrade your database

NFLABS SIMPLIFYING BIG DATA. Real &me, interac&ve data analy&cs pla4orm for Hadoop

Application Migration to Cloud Best Practices Guide

Transcription:

Accelerating Your Big Data Analytics Jeff Healey, Director Product Marketing, HPE Vertica

Recent Waves of Disruption IT Infrastructu re for Analytics Data Warehouse Modernization Big Data/ Hadoop Cloud and Hybrid Machine Learning Internet of Things

Replacing Legacy Solutions that Can Better Scale Increasing data volumes More concurrent users High cost of legacy solutions Analytics demand increase

Legacy Analytical Platform Raw data Meaningful Information Connected Systems Data Warehouse Information Consumers Benefits: It s mature and it works, as long as I manage data and users I know it well and have staff on hand to run it The analytical functions are mature It would be costly to replace Operational Data Stores Limitations: It s expensive and becomes more so as I increase my data It has real limitations on data size, number of users My users want more access to analytics, but resources limited I spend a lot on consultants who are tuning it

Modernized Analytical Platform Data Warehouse Connected Systems Information Consumers Benefits: Lowers cost by delivering analytics with matching SLA Faster analytics and more users supported with MPP Data science supported MPP Database Data Lake

HPE Vertica Analytics Platform Fast Boost performance by 500% or more Scalable Handles huge workloads at high speeds Standard No need to learn new languages or add complexity Costs Significantly lower cost over legacy platforms 6

Freedom to Deploy Anywhere On Commodity Hardware, Across Multiple Cloud Platforms, and Natively on Hadoop At the Edge On Premise In the Clouds On Hadoop

Secrets to Achieving Performance Increases Columnar Storage Compression MPP Scale-Out Distributed Query Projections Speeds Query Time by Reading Only Necessary Data Lowers costly I/O to boost overall performance Provides high scalability on clusters with no name node or other single point of failure Any node can initiate the queries and use other nodes for work. No single point of failure Combine high availability with special optimizations for query performance A B D C E A CPU Memory Disk 8

The Appeal of Vertica Requirement Extreme Optimization Proof Columnar design for high performance analytics Aggressive compression Scalable to petabyte scale Total Cost of Ownership Simply and predictable pricing No penalty for additional hardware or connected users Ready for your Enterprise Open and Compatible SQL compliant to 100% of the TPC-DS benchmark queries Secure and ACID compliant No single point of failure Open platform Standards compliant SQL, Python, Java Working with open source community on Spark, Hadoop, Kafka, etc. 9

10 Advanced, In-Database Analytics SQL 99 SQL Extensions SDKs In-database Analytics Aggregate Analytical Window functions Graph Monte Carlo Statistical Geospatial Allows for: Standard functionality that performs at scale Pattern matching Event series joins Time series Event-based windows Allows for: Sessionization Conversion analysis Fraud detection Fast Aggregates (LAP) Analytics Java C++ R Connection ODBC/JDBC HIVE Hadoop Flex zone Allows for: Machine learning Custom data mining Specialized parsers Regression testing K-means Statistical modeling Classification algorithms Page rank Text mining Geospatial Allows for: Statistical modeling Cluster analysis Predictive analytics

Building Machine Learning into the Core of Vertica Node 1 Node 2. Node n - Machine Learning algorithms, such as k-means and regression, built into the core of Vertica - Advanced predictive modeling runs within the database eliminating all data duplication typically required of alternative vendor offerings Machine learning functions run in parallel across hundreds of nodes in a Vertica cluster - Traditional approaches can t handle many data points forcing data scientists to down-sample leading to less accurate predictions - A single system for SQL analytics and Machine Learning Confidential 11

Why It s an Easy Transition for Analysts Languages Used ETL Used Data Visualization Used SQL Python Java

Recent Waves of Disruption IT Infrastructu re for Analytics Data Warehouse Modernization Big Data/ Hadoop Cloud and Hybrid Machine Learning Internet of Things

Lowering Costs of Traditional Architectures with Open Source More types of data Unstructured data Storing data of mixed value The Hadoop bandwagon Open source economics

Hadoop Hype 2008 There s this new thing called Hadoop 2011 Can Hadoop replace data warehouse? 2013 Can Hadoop replace ETL? 2017 Has Hadoop failed to deliver the goods? https://www.gartner.com/document/3388326

Hadoop Status A smoking heap of cost and complexity not the technology base the world will be built on going forward. https://www.datanami.com/2017/03/13/hadoop-failed-us-tech-experts-say/

What Went Wrong? You still need a traditional data warehouse Concurrency Completeness of analytics Standards not followed: ACID, SQL Hadoop isn't fast for analytics Hadoop is difficult to use, requires new staffing https://www.syntouch.nl/node/66

Do You Need Hadoop for Big Data? LOAD One high-tech company had a service level agreement which called for: 60 TB per hour QUERY & CONCURRENCY 2,000+ Users

Reaching Across the Data Lake MPP Database Important-to-the-business data with tight SLAs Hadoop Data Lake (HDFS) Cooler Data. Unknown Value

Embracing an Open Source Architecture Vertica performs optimized load from Spark Load data from Vertica to Spark Kafka Share data between applications that support Kafka Data streaming into Vertica Spark Hadoop Read native formats like ORC and Parquet Any Hadoop Run ON the Hadoop cluster or ON Vertica cluster

The Challenge Data storage requirements increasing exponentially, but customers expecting analytic response times in seconds, not minutes or hours With their previous Oracle system, enlarging storage was complicated and time consuming. Delivering predictive network analytics for Telecommunications companies The Solution Vertica has provided Anritsu with the technology necessary to implement predictive analytics solutions that have only been theoretical until now Realized rapid ROI after implementing Vertica in place of a legacy Oracle solution: 351% ROI with a payback of just 4 months ROI: 351% Payback: 4 months Annual Benefit: $3+ million 21

The Challenge Perform analytics on over 1.5 petabytes of customer product and performance metadata to fine-tune and continue leading-edge product development and evolution Using customer analytics to eliminate level 1 and level 2 support The Solution Leverage Vertica for operational analytics to engage in ongoing communications and with customers about storage environments and optimizations Use analytics to understand distribution of customers' workloads and how customers access storage, which helps it design storage solutions that match its customers' use patterns ROI: 287% Payback: 6 months Annual Benefit: $13.6 million 22

Try Vertica Today Run Vertica in the Clouds or OnPremise 3 Node, 1 TB Community Edition Bring your own license (BYOL) Single or a multi-node cluster Choice of many Instance Types RHEL 7.0 CloudFormation template for AWS Take Test Drives on AWS & Azure Clickstream Analytics SQL on Hadoop Predictive Maintenance www.vertica.com/try Thank You! jeff.a.healey@hpe.c om 23