REDEFINE BIG DATA. Zvi Brunner CTO. Copyright 2015 EMC Corporation. All rights reserved.

2020: A NEW DIGITAL WORLD
30B devices
7B people
Millions of new businesses
Source: Gartner Group, 2014

DIGITIZATION IS ALREADY BEGINNING
Precision farming
Dress that displays how we feel
Thermostat that knows you're away
Glasses that direct us where to go
Contact lens that controls blood sugar
Fitness band that measures activity level
Drones that deliver our groceries

THE DATA DIVIDE: BIG DATA CHASM
70% of data generated by customers
80% of data stored
3% prepared for analysis
0.5% being analyzed
<0.5% being operationalized
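
To make the size of the chasm concrete, the percentages above can be applied to a data volume. This is only a sketch: it assumes each figure is a share of all data generated, which the slide does not state, and it uses a hypothetical total of 1,000 TB. The names `STAGE_SHARE_PCT` and `funnel` are illustrative.

```python
# Shares quoted on the slide, read here as percent of ALL data generated.
# That reading is an assumption; the slide does not state the base.
STAGE_SHARE_PCT = {
    "stored": 80.0,
    "prepared for analysis": 3.0,
    "analyzed": 0.5,
}


def funnel(total_tb: float) -> dict:
    """Return the volume (in TB) that reaches each stage."""
    return {stage: total_tb * pct / 100.0 for stage, pct in STAGE_SHARE_PCT.items()}


volumes = funnel(1000.0)  # hypothetical total of 1,000 TB
for stage, tb in volumes.items():
    print(f"{stage}: {tb:g} TB")

# Share of the stored data that is actually analyzed.
analyzed_of_stored_pct = 100.0 * volumes["analyzed"] / volumes["stored"]
print(f"analyzed / stored: {analyzed_of_stored_pct:.3f}%")
```

Under that reading, 800 TB of every 1,000 TB generated is stored but only 5 TB is analyzed, well under 1% of what is kept.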

FIVE TRENDS ENABLING BIG DATA
Data growth
Limitless compute
DevOps
Cheap storage
Real-time technologies

Big Data Is Not Just a Lot of Data
Summary, limited data versus expansive, full data sets
Backward looking versus predicting the future
Pre-planned versus iterative, agile
Reports & meetings versus applications & products

STEPS TO HARNESS BIG DATA
Build new applications, products, & business models
Leverage new analytics to predict the future
Gather as much data as possible

BRINGING IT ALL TOGETHER IS HARD
Map analytics use case, analytics platforms, & storage
Customer sentiment analysis
Product performance & reliability
Supply chain optimization
Product recommendation engine
Competitive war games

HOW CAN WE SIMPLIFY THIS?

EMC BUSINESS DATA LAKE: THE INDUSTRY'S FIRST FULLY ENGINEERED, ENTERPRISE-GRADE DATA LAKE SOLUTION

WHAT IS A DATA LAKE?
Just collecting and analyzing data is not a data lake
A data lake is a complex ecosystem of tools
Security isn't secondary, it's core functionality
Data is discoverable and its usage is traceable
It should provide access to all business users
It should follow business rules and policies

EVOLVE TO THE DATA LAKE
Choice, extensible ecosystem
Data lake foundations?
EMC Analytics Starter Platform

RELIABLE INFRASTRUCTURE: SCALABILITY
Must be able to handle increased traffic demand at any time
Easily process big data workloads

EMC BUSINESS DATA LAKE PLATFORM (architecture diagram)
Labels shown: Data Services; Data & Analytics Catalog; Management; Data Manager; Analytics Toolbox; Data Governor (third-party applications); Advanced Analytics; Pivotal Big Data Suite; Greenplum Database; HAWQ; Apps at Scale; Redis; GemFire; Ingest; Data Processing; Spring XD; Spark; Pivotal HD; RabbitMQ; BDS on Pivotal; Policy Mgmt; Hadoop; Index & Search; Open Data Platform; Security & Access Control; Virtualization; Pivotal Cloud Foundry; EMC II Storage
Data lake foundation: Isilon, ECS, VCE Vblock, XtremIO

FOCUS ON BUSINESS OUTCOMES, NOT ON LOW-LEVEL TECHNOLOGY DECISIONS
INGEST: Capture data from a wide range of sources, traditional and new
STORE: Store everything in one environment for cross-data analysis
ANALYZE: Use advanced algorithms to discover new, predictive patterns
SURFACE: Share insights with business domain experts
ACT: Build data-driven applications to meet business needs

WHAT IS BIG DATA ANALYTICS?
(Chart: value of analytics ($) plotted against complexity)
Descriptive analytics: What happened?
Diagnostic analytics: Why did it happen?
Predictive analytics: What will happen?
Prescriptive analytics: How can we make it happen?
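
The difference between the lower and upper ends of this scale can be shown in a few lines. The sketch below contrasts a descriptive answer (an average of past values) with a predictive one (a least-squares trend extended one step). The sales history and the helper names `mean` and `fit_line` are hypothetical, and a straight-line fit is just one simple stand-in for predictive analytics.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical history: units sold in months 1 to 4.
months = [1, 2, 3, 4]
units_sold = [10, 12, 14, 16]

# Descriptive: what happened?
average_sales = mean(units_sold)

# Predictive: what will happen in month 5 if the trend continues?
slope, intercept = fit_line(months, units_sold)
forecast_month_5 = slope * 5 + intercept

print(f"average so far: {average_sales:g}")
print(f"forecast for month 5: {forecast_month_5:g}")
```

Diagnostic and prescriptive analytics sit between and beyond these two: the first explains the trend, the second chooses an action based on the forecast.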

ADDRESS GAPS IN A TYPICAL DATA LAKE: KEY CAPABILITIES
FAST & EASY DEPLOYMENT in as little as one week versus months
SEMANTIC CONSISTENCY and governed metadata
SECURITY: access control and governance
AUTOMATIC instantiation of data, analytics and applications
SELF-SERVICE for all of the various users across the organization

PIVOTAL BIG DATA SUITE

WORLD'S FIRST OPEN SOURCED BIG DATA PORTFOLIO
Building on success of Cloud Foundry Foundation
Open sourcing all Pivotal Big Data Suite components, including:
Pivotal GemFire (Apache Geode)
Pivotal Greenplum Database
Pivotal HDB (Apache HAWQ)
Built for enterprises

OPEN
Common core for Hadoop ecosystem
Rapidly accelerated certifications, ecosystem development and enterprise-grade quality
OpenDataPlatform.org

DATA-DRIVEN ENTERPRISE JOURNEY
STORE: Structured; Unstructured; High Volume; High Velocity
ANALYZE: Predictive Analytics; Machine Learning; Advanced Data Science; Realtime Analytics
DEVELOP: Advanced Analytic Pipelines; Realtime Analytical Applications; Global Scale Data-Driven Applications; Enterprise, Consumer, and Mobile
INNOVATE: Agile Dev Expertise; DevOps; Microservices; Continuous Delivery; Closed Loop Applications
Themes: Big Data; Predictive Analytics; Cloud Native Platform; Agile Development

DATA-DRIVEN ENTERPRISE JOURNEY WITH PIVOTAL BIG DATA SUITE
STORE (Data Engineering): Structured; Unstructured; High Volume; High Velocity
ANALYZE (Data Science): Predictive Analytics; Machine Learning; Advanced Data Science; Realtime Analytics
DEVELOP: Advanced Analytic Pipelines; Realtime Analytical Applications; Global Scale Data-Driven Applications; Enterprise, Consumer, IoT, and Mobile
INNOVATE: Agile Dev Expertise; DevOps; Pivotal Labs; Microservices; Continuous Delivery; Closed Loop Applications
Products: Spring XD; Spring Cloud; Spark; Pivotal HD & Open Data Platform; Pivotal Greenplum Database; Pivotal HDB; Pivotal GemFire; RabbitMQ; Pivotal BDS on PCF; Pivotal Cloud Foundry
Themes: Big Data; Predictive Analytics; Cloud Native Platform; Agile Development

INTERNET OF THINGS IN MANUFACTURING
A pipeline of sensors and opportunities for optimizing output
Process steps: Input materials, Mix, Incubate, Filter, Centrifuge, Final product
(Figure panels: automated raw materials mixing; high-content screens; sensors. Axes: temperature, absorbance and velocity against time and elution volume.)

ADVANCED ANALYTICS: REQUIREMENTS AND BENEFITS
Requirement: Massive stream processing. Benefits: Internet of Things use cases; rapid time to insights
Requirement: SQL-compliant batch and interactive queries. Benefits: Leverage existing skills and tools; rapid time to insights
Requirement: Machine learning and advanced analytics. Benefits: Solve business problems; predictive insights: proactive execution

TO PROACTIVE, SELF-IMPROVING MACHINE LEARNING SYSTEMS
Data stream pipeline: multiple data sources
In-memory real-time data: real-time processing
Data lake (HDFS): store everything
Expert system / machine learning: continuous learning, continuous improvement, continuous adapting

DATA STREAM NEEDS AN AGILE, SCALABLE AND FAST SOLUTION
Spring XD: Ingest, Transform, Sink
Also shown: GemFire, Data Lake, HAWQ, GPDB
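
The ingest, transform, sink flow named above can be sketched with plain Python generators. This is not Spring XD code and does not use its DSL; it only illustrates the three-stage shape. The record format ("sensor,value"), the limit, and the function names are assumptions made for the example, and the list stands in for a real store such as a data lake.

```python
from typing import Any, Dict, Iterable, Iterator, List

Event = Dict[str, Any]


def ingest(lines: Iterable[str]) -> Iterator[Event]:
    """Parse raw 'sensor,value' lines into events (hypothetical format)."""
    for line in lines:
        sensor, raw_value = line.split(",")
        yield {"sensor": sensor.strip(), "value": float(raw_value)}


def transform(events: Iterable[Event], limit: float) -> Iterator[Event]:
    """Tag each event with whether its value exceeds the limit."""
    for event in events:
        yield {**event, "over_limit": event["value"] > limit}


def sink(events: Iterable[Event], store: List[Event]) -> int:
    """Append events to the store and return how many were written."""
    written = 0
    for event in events:
        store.append(event)
        written += 1
    return written


raw_lines = ["temp, 21.5", "temp, 30.2", "velocity, 4.0"]
store: List[Event] = []  # stand-in for a real sink target
count = sink(transform(ingest(raw_lines), limit=25.0), store)
print(count, store)
```

Because each stage is a generator, events flow through one at a time rather than being collected between stages, which is the property a streaming pipeline relies on.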

SPRING XD: STATE-OF-THE-ART DATA PIPELINE AUTOMATION
No coding required
INGEST / SINK: Dozens of built-in connectors; seamless integration with Kafka, Sqoop; create new connectors easily using Spring
PROCESS: Call Spark, Reactor or RxJava; built-in configurable filtering, splitting and transformation; out-of-box configurable jobs for batch processing
ANALYZE: Import and invoke PMML jobs easily; call Python, R, MADlib and other tools; built-in configurable counters and gauges
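
For readers unfamiliar with the last item, a counter and a gauge are two basic stream metrics as the terms are commonly used: a counter tracks how many events were seen, and a gauge holds the most recent value. The sketch below is a plain Python illustration of that idea, not Spring XD's API; the `Metrics` class and the sample readings are hypothetical.

```python
from collections import Counter


class Metrics:
    """Minimal per-name counters and gauges for a stream of readings."""

    def __init__(self) -> None:
        self.counters = Counter()  # name -> number of events seen
        self.gauges = {}           # name -> most recent value

    def observe(self, name: str, value: float) -> None:
        self.counters[name] += 1
        self.gauges[name] = value


metrics = Metrics()
readings = [("temp", 21.5), ("temp", 30.2), ("velocity", 4.0)]
for name, value in readings:
    metrics.observe(name, value)

print(dict(metrics.counters))
print(metrics.gauges)
```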

DELIVER A SEAMLESS EXPERIENCE FOR EVERYONE: BIG DATA IS A TEAM SPORT
Business users
Business analysts
Application developers
Data programmers
Infrastructure developers
Data scientists

PIVOTAL HDB: HADOOP NATIVE SQL
Exceptional Hadoop native SQL performance
No compatibility risks to SQL developers or SQL BI tools and applications
Support for query roll-ups, dynamic partitions and joins
Massive MPP scalability to petabytes
On premises or in the cloud
Scale your cluster out, not up
World-class parallel loading and unloading
Fast performance for complex and advanced data analytics
Integrated with MADlib for advanced machine learning
Powerful cost-based query optimizer

WHERE ARE YOU?
Let EMC help you discover YOUR KILLER USE CASE with a Big Data Vision Workshop
Work with EMC and your team to IMPROVE YOUR BUSINESS with an 8-12 week Proof-of-Value project
Order and START TODAY by deploying the EMC Business Data Lake

IN SUMMARY: BIG DATA WILL REDEFINE EVERY BUSINESS
