Big Data Trends to Watch

Similar documents
Optimal Infrastructure for Big Data

Angat Pinoy. Angat Negosyo. Angat Pilipinas.

COMP9321 Web Application Engineering

From Information to Insight: The Big Value of Big Data. Faire Ann Co Marketing Manager, Information Management Software, ASEAN

BIG DATA ANALYTICS WITH HADOOP. 40 Hour Course

Business Intelligence, 4e (Sharda/Delen/Turban) Chapter 1 An Overview of Business Intelligence, Analytics, and Data Science

By: Shrikant Gawande (Cloudera Certified )

Nouvelle Génération de l infrastructure Data Warehouse et d Analyses

Realising Value from Data

MapR: Converged Data Pla3orm and Quick Start Solu;ons. Robin Fong Regional Director South East Asia


E-guide Hadoop Big Data Platforms Buyer s Guide part 1

Experience Managing Data at Scale

Special thanks to Chad Diaz II, Jason Montgomery & Micah Torres

Intro to Big Data and Hadoop

Copyright - Diyotta, Inc. - All Rights Reserved. Page 2

EMC Big Data: Become Data-Driven

SAP Big Data. Markus Tempel SAP Big Data and Cloud Analytics Services

Enterprise Database Systems for Big Data, Big Data Processing, and Big Data Analytics. Sunnie Chung Cleveland State University 1

Cognitive Data Warehouse and Analytics

5th Annual. Cloudera, Inc. All rights reserved.

Redefine Big Data: EMC Data Lake in Action. Andrea Prosperi Systems Engineer

The Importance of good data management and Power BI

Analytics in Action transforming the way we use and consume information

Aurélie Pericchi SSP APS Laurent Marzouk Data Insight & Cloud Architect

Cloud Computing with Exadata & Exalogic

Table of Contents. Are You Ready for Digital Transformation? page 04. Take Advantage of This Big Data Opportunity with Cisco and Hortonworks page 06

HP SummerSchool TechTalks Kenneth Donau Presale Technical Consulting, HP SW

E-Guide THE EVOLUTION OF IOT ANALYTICS AND BIG DATA

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

Jason Virtue Business Intelligence Technical Professional

Spark, Hadoop, and Friends

Architecture Overview for Data Analytics Deployments

MapR Pentaho Business Solutions

BIG DATA TRANSFORMS BUSINESS

Ensuring Trust in Big Data with SAP EIM Solutions. Scott Barrett Senior Director, Information Management Database & Technology Centre of Excellence

Big Data in Cloud. 堵俊平 Apache Hadoop Committer Staff Engineer, VMware

Fast Innovation requires Fast IT

Exploring Big Data and Data Analytics with Hadoop and IDOL. Brochure. You are experiencing transformational changes in the computing arena.

The Sysprog s Guide to the Customer Facing Mainframe: Cloud / Mobile / Social / Big Data

Common Customer Use Cases in FSI

APAC Big Data & Cloud Summit 2013

Creating an Enterprise-class Hadoop Platform Joey Jablonski Practice Director, Analytic Services DataDirect Networks, Inc. (DDN)

Software Defined is the new Black. Craig McKenna Director, Cloud & Cognitive Data Solutions IBM Systems, Asia Pacific

Analyze Big Data Faster and Store it Cheaper. Dominick Huang CenterPoint Energy Russell Hull - SAP

GE Intelligent Platforms. Proficy Historian HD

The Internet of Everything and the Research on Big Data. Angelo E. M. Ciarlini Research Head, Brazil R&D Center

EXECUTIVE BRIEF. Successful Data Warehouse Approaches to Meet Today s Analytics Demands. In this Paper

Welcome to. enterprise-class big data and financial a. Putting big data and advanced analytics to work in financial services.

Accelerating Your Big Data Analytics. Jeff Healey, Director Product Marketing, HPE Vertica

Tech Trends. Big Data, IOT, Security, Machine Learning, Search engines... Anant Asthana: github.com/anantasty

Microsoft Azure Essentials

Business is being transformed by three trends

Sunnie Chung. Cleveland State University

Hybrid Data Management

Big data is hard. Top 3 Challenges To Adopting Big Data

How In-Memory Computing can Maximize the Performance of Modern Payments

2012 SNIA Analytics and Big Data Summit. Insert Your Company Name. All Rights Reserved.

BIG Data Analytics AWS Training

Outline of Hadoop. Background, Core Services, and Components. David Schwab Synchronic Analytics Nov.

Accelerating Computing - Enhance Big Data & in Memory Analytics. Michael Viray Product Manager, Power Systems

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

SAP Real-time Data Platform 9 th October Matteo Losi Head of Presales and Business Development Italy Italy EMEA

ISILON SCALE-OUT NAS OVERVIEW AND FUTURE DIRECTIONS

From GIS to Location Intelligence

BIG DATA PROCESSING A DEEP DIVE IN HADOOP/SPARK & AZURE SQL DW

Enterprise Architecture for Digital Business

White paper A Reference Model for High Performance Data Analytics(HPDA) using an HPC infrastructure

WELCOME TO. Cloud Data Services: The Art of the Possible

Store. Analyze. Preserve. Big Data Assets

What s making SAP HANA the most powerful platform? Andrew Tao, SAP July 26, 2016

Simplifying Hadoop. Sponsored by. July >> Computing View Point

Simplifying the Process of Uploading and Extracting Data from Apache Hadoop

Knowledge Discovery and Data Mining

SAP HANA MADE SIMPLE WITH VALIDATED SOLUTIONS & CONVERGED SYSTEMS. Joakim Zetterblad, Director SAP Practice, EMEA

Cloud Computing in the Resources Sector. John Bradley Industry Development Manager Manufacturing & Resources

SAP experience Day SAP BW/4HANA. 21 marzo 2018

Hadoop Integration Deep Dive

Connect the unconnected

<Insert Picture Here> Oracle Exalogic Elastic Cloud: Revolutionizing the Datacenter

Cloud Integration and the Big Data Journey - Common Use-Case Patterns

Datametica. The Modern Data Platform Enterprise Data Hub Implementations. Why is workload moving to Cloud

Copyright 2013 Oracle and/or its affiliates. All rights reserved.

ADVANCED ANALYTICS & IOT ARCHITECTURES

Capturing Value from Big Data & Analytics. Dave Bhattacharjee Director, Cisco Consulting Services

1. Intoduction to Hadoop

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Guide to Modernize Your Enterprise Data Warehouse How to Migrate to a Hadoop-based Big Data Lake

High-Performance Computing (HPC) Up-close

AZURE HDINSIGHT. Azure Machine Learning Track Marek Chmel

LEVERAGING DATA ANALYTICS TO GAIN COMPETITIVE ADVANTAGE IN YOUR INDUSTRY

Disrupt or Be Disrupted in the API Economy

Commercial openpdc/openhistorian and Cloud Computing

Big and Fast Data: The Path To New Business Value

Data Informatics. Seon Ho Kim, Ph.D.

Blueprints for Big Data Success. Succeeding with four common scenarios

High Performance Data Management The Impact on Oil & Gas Integrated Operations

Rapid Start with Big Data Appliance X6-2 Technical & Operational Overview

GPU ACCELERATED BIG DATA ARCHITECTURE

Cloud Based Analytics for SAP

Transcription:

Big Data Trends to Watch Bill Peterson NetApp September, 2012 1

Bill Peterson @thebillp

What I hope to accomplish today...

...and avoid this.

What is Big Data? Big Data refers to datasets whose volume, speed and complexity is beyond the ability of typical tools to capture, store, manage and analyze. Coined by Francis Diebold, professor of economics at the University of PA in 2000, when Big meant Gigabytes / day 1 Complexity Speed Volume 5

Quantifying The Big Data Challenge 5 Billion smart phones 30 Billion pieces of new content to Facebook per month 60 Zettabytes Estimated size of the digital universe in 2020 Growth Over the Next Decade: Servers (Phys/VM): 10x Data/Information: 50x #Files: 75x IT Professionals: <1.5x Source: Gantz, John and Reinsel, David, Extracting Value from Chaos, IDC IVIEW, June 2011, page 4. Sensors Video Music Location Weblogs 6

The Big Data Push High Tier 1 BP OLTP Data a Structur re Sat Ground Stations No SQL Columnar DBs FMV DVS DSS/DW Collaboration App Dev IT Infra Web Infra Tier-2 BP OLTP Low HPC Large Block, Sequential I/O 100s GB/sec Tech Comp Content Repositories Home Dirs Performance Small Block, Random I/O (100s KIOPS) 7

What Does This Mean to You? Busine ess or Missio on Velocity Inflection Point Information Becomes a Propellant to the Organzation Data Becomes a Burden to IT Infrastructure 2010 2020 You are also at an Inflection Point: You also have a decision to make, as business as usual may not cut it! 8

Dispelling the Misconceptions About Big Data 9

Big Data Is NOT New 30 PB of New Data Annually 10

Big Data = Big Analytics = Hadoop? That s What The Media Hype Implies, but it is NOT true! Traditional analytics (BI/DSS/DW) dominates the analytics market Like other technologies vying to gain broad adoption in Enterprise IT (e.g., Traditional Analytics, HPC & Cloud), it shows promise Enterprise IT $3.6T BPaaS Hadoop Analytics $87 B $77 M $35 B HPC Cloud $29 B $23 B 11

12

Why Decision Support Systems are important? Customers A Business Products & Services Management $$$$ OLTP DW DSS enables businesses to run Closed Loop, ultimately improving their business through the use of feedback mechanisms.

Big Analytics An Emerging Market NoSQL / Column DBs Legacy DBs Middleware & Apps Open Source Distributors Cloud & Cyber Integration Services Compute Network Storage 14

Analytics & Enterprise Apps Environment Reporting/Dashboard/Visualization Applications OLAP Analytics ETL Data Management ETL OLAP OLTP Storage File Systems Mobile Devices Location/GPS Logs Sensors Applications Other Data Sources Content Repositories Storage Shared Storage Infrastructure (All other storage, i.e. internal DAS) Storage Data Management 15

What Does Hadoop Look Like Today? Runs on a collection of cheap, commodity servers, in a distributed, shared nothing architecture Two key components HDFS Hadoop Distributed File System MapReduce Programming g model for processing and generating large datasets Map HDFS NameNode Secondary NameNode Reduce DataNodes / TaskTracker JobTracker : DataNodes / TaskTracker 16

100000 10000 1000 100 10 Ethernet s Relentless March Data will be growing by 50x, but bandwidth only by 10x! MB/Sec 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 iscsi FCIP iwarp FCoE pnfs 1 01 0.1 2016 Time SCSI/FCP Infiniband ATM FDDI Ethernet 17

Why Should You Care? It s the Value of your data 5 Billion Records Anywhere, Anytime Faster time to market 50% Increase in Revenue Top line revenue Leverage their data assets into business advantage Bottom Line savings Lower the cost of Over 1PB of data compliance Growth of 175% YOY 90 days of data within 24 hours of a failure y Manage ever growing data efficiently 18

AutoSupport: Hadoop Use Case at NetApp Call home service for all NetApp systems Foundation of NetApp proactive support strategies Machine generated data doubles every 16 months CHALLENGE NETAPP SOLUTION BENEFITS 4 weeks to run a query on 24 billion unstructured red records Impossible to run a query: 240 billion unstructured records 10-node Hadoop Cluster 10 Node w/ Hadoop shared Cluster Storage Time reduced from 4 weeks to 10.5 hours Previously impossible, now achievable in just 18 hours NetApp ASUP is a mission-critical application 19

Analytics of Tomorrow Traditional & Big Analytics side-by-side for years to come Hadoop moves to shared, virtualized infrastructure, for better efficiency and ease of management: Hadoop remains logically distributed, shared nothing, but runs on a virtualized shared everything architecture (e.g., FlexPod for Vmware + eseries) Same as above, except Hadoop becomes logically shared everything, as HDFS is replaced by a parallel file system (e.g., Lustre Cluster, StorNext or GPFS) Enterprise class resiliency (no SPoF) and reliability with HPC-like performance (no need for ti triplicas) Use of a single copy of data for the map phase (higher storage utilization) Natural intersection with Cloud (Analytics as a Service) 20

Summary Despite the hype, Big Data is not new and is more than just analytics! (Many agencies and private companies have struggled with Big Data for decades) d Analytics: Traditional BI/DSS analytics still dominate. Importance of newer NoSQL & Columnar DB applications, enabled by MapReduce will grow with the growth of multistructured data Big Data applications, such as Hadoop, will need to adopt shared, virtualized infrastructure (and its management benefits) if they are to be widely adopted by Enterprise IT 21

YOU VE GOT I VE GOT rambling responses that sound like @thebillp or william.peterson@netapp.com