Creating an Enterprise-class Hadoop Platform Joey Jablonski Practice Director, Analytic Services DataDirect Networks, Inc. (DDN)

Who am I?
- Practice Director, Analytic Services at DataDirect Networks, Inc.
- 3+ years with Hadoop, 12+ with HPC
- Contact details: @jrjablo, jjablonski@ddn.com / jrjablo@gmail.com, www.linkedin.com/in/joeyjablonski

Why Hadoop?
- Scalable Performance & Capacity
- Growing Ecosystem (Flexibility)
- Established APIs & Interfaces
- Location on the adoption curve
- Proven base to create Analytical Platforms

What is Enterprise Class?
- Scalable OPEX & CAPEX
- Manageable
- Integration with existing tools
- Flexible
- Workflow Process Integration
- No Rip & Replace
- Metrics to manage towards
- Business Driven, Technological Capabilities

The Big Data Challenge
The Big Data Equation: Volume + Velocity + Variety
- Volume: Petabytes of Data, Trillions of Objects
- Velocity: GB/s, TB/s, Millions of IO/s, Object Operations
- Variety: Structured, Unstructured, Streams & Batches

Analytics
- Looking for Actionable Information
- Billions of Data Points to Consider
- Consumer purchasing trends
- Product perception
- Drug Discovery
- Genomics
- Surveillance
- Financial Analysis

Data Gravity
(Diagram labels: Applications, Data, Services)

Why is Data Analytics so hard?
(Diagram labels: Technical, Business, Hacking Skills, Business Acumen, Math & Statistics Knowledge, Data Science, Traditional Research, Substantive Expertise, Communications, Analytics, Poor Decisioning, Curiosity)

What is Hadoop missing today?
- Active-active high availability
- Established management tools
- Enterprise integration mindset
- Enterprise-class hardware
- Consistent version compatibility & deployment
- Efficient CAPEX & OPEX scaling
- Resource management/SLAs/QoS
- Security

Hadoop Operational Considerations
- Deploy, Upgrade, Manage, Respond, Monitor
- Software Platform, Hardware Platform

Today's Enterprise Picture
(Diagram label: The Cloud)

Getting there
(Diagram labels: Improved Results, Insight, Modify Behavior)

Hadoop Architectural Considerations

Planning for Growth
(Chart labels: Adoption, Higher is Better, Goal for Human Costs, Capacity, Performance, Scalability, User Growth)

Shared v. Commodity
Shared Component Approach
- Lower operational costs
- Efficient operational resource scaling
- Shared resources with other IT platforms
- Efficiency in computing, connectivity & service placement
Commodity Server Approach
- Lower entry costs
- Shorter MTBF
- Inefficient scaling of tools and processes
- Mismatch with traditional IT operations models

Ethernet v. InfiniBand
InfiniBand
- 100% storage management offload
- End-to-end InfiniBand networking with RDMA acceleration
- Real-time data delivery to provide MapReduce process consistency
- Smaller compute, compact storage to minimize data center impact
Ethernet
- Compatibility, ensured connectivity
- Limitations in traffic types and bandwidth availability
- High CPU/overhead cost
- Minimal options for offloading with Linux environments

Analytic User Types
- Empowered Users
- Aware Users
- Enabled Users

Hadoop Enterprise Integration
- Monitoring & Response
- Extract, Transform, Load
- APIs
- Integration
- Data, Information, Insight, Results

And finally, Hadoop is more than just hardware. It is
- about an ecosystem of hardware & software.
- about integrating with existing systems.
- a toolkit to build Analytical Platforms.
- a component of the larger corporate processes and mandates.
- a component of the wider business KPIs.

Q&A