Next-Generation Data Warehouse and Analytics Infrastructure

Next-Generation Data Warehouse and Analytics Infrastructure. November 2011. André Münger, andre.muenger@emc.com, +41 79 708 85 99

Agenda: Big Data Challenges; Greenplum Overview; Use Cases; Summary; Q&A

Big Data Challenges

BIG DATA

Data Sources Are Expanding: The growth ratio of structured data to unstructured data will be approximately 1:8. Source: 2011 IDC Digital Universe Study

Perspective(s): Business; IT; Governance; Resources; Data Scientist ...

Databases Need to Adapt to Big Data
Traditional RDBMS is not optimized for Big Data analytics. 50% of TDWI survey respondents will replace their DW platform in the next 3 years because:
Cannot do advanced analysis: poor query response; can't support advanced analytics (chart values: 40%, 45%)
Cannot handle big data volumes: inadequate data load speed; can't scale up to large data volumes; cost of scaling up is too expensive (chart values: 33%, 37%, 39%)
Poorly suited to real-time or on-demand workloads: 29%
Source: TDWI Next Gen Database Study, 2010

The Big Data Challenge
Increased: volume of data; number of formats and sources; business demand
Decreased: time window; budgets; resources

It took us roughly 100 years from ...

... to space tourism

20+ Years of Evolution: Data Warehouse; Data Models; BI Tools; Consulting

20+ Years of Evolution: Data Mining; OLAP / BI; Applications, Verticals; Data Warehouse; ROLAP, MOLAP, HOLAP; Data Models; BI Tools; Consulting

20+ Years of Evolution: OLAP / BI; Data Mining; Applications, Verticals; Data Warehouse; ROLAP, MOLAP, HOLAP; Data Models; BI Tools; Consulting
Big Data: Transition of traditional relational databases to MPP (Massively Parallel Processing). Turn unstructured data into actionable information.

Greenplum Overview

Big Data UAP: Unified Analytics Platform
3rd-party/partner BI and analytics tools
Greenplum Chorus: enterprise collaboration platform for data
Greenplum Data Computing Appliances: purpose-built for Big Data analytics
Greenplum Database (Enterprise & Community Editions): world's most scalable MPP database platform
Greenplum HD, Hadoop (Enterprise & Community Editions): enterprise analytics platform for unstructured data

Greenplum DB: MPP Shared-Nothing Architecture
Greenplum's MPP database has extreme performance on commodity infrastructure.
Optimized for BI and analytics.
Provides automatic parallelization: just load and query like any database; tables are automatically distributed across nodes; no need for manual partitioning or tuning.
Extremely scalable and I/O optimized: all nodes can scan and process in parallel; linear scalability by adding nodes.
Diagram labels: Interconnect; Loading
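The automatic distribution of tables across nodes can be illustrated with a small sketch. It assumes hash distribution on a chosen key column, with CRC32 standing in for whatever hash the database actually uses; the function names `segment_for` and `distribute` and the sample `orders` rows are hypothetical and are not Greenplum APIs.

```python
import zlib


def segment_for(key, num_segments):
    """Map a distribution key to a segment number.

    CRC32 is an assumption made for this sketch; it is stable across runs,
    so the same key always lands on the same segment.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_segments


def distribute(rows, key_column, num_segments):
    """Spread rows over segments by hashing the distribution key."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        segments[segment_for(row[key_column], num_segments)].append(row)
    return segments


if __name__ == "__main__":
    # Hypothetical sample table with twelve rows, spread over four segments.
    orders = [{"order_id": i, "amount": 10 * i} for i in range(1, 13)]
    for seg_id, seg_rows in enumerate(distribute(orders, "order_id", 4)):
        print(seg_id, [r["order_id"] for r in seg_rows])
```

Because each row is assigned from its key alone, no node needs to know about rows held by any other node, which is the property a shared-nothing design relies on.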

Massively Parallel Processing and Linear Performance Scalability
Greenplum 4.0: Database Architecture
Interfaces: SQL; MapReduce
Master Servers: query planning & dispatch
Network Interconnect
Segment Servers: query processing & data storage
External Sources: loading, streaming, etc.
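The master/segment split on this slide follows a scatter-gather pattern, sketched below under stated assumptions. A "master" function dispatches the same aggregation to every segment, each segment computes a partial result over its own rows, and the master merges the partials. The names `segment_partial_sum` and `master_group_sum` are hypothetical, and Python threads stand in for separate segment servers purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def segment_partial_sum(segment_rows, group_col, value_col):
    """Work done on one segment: aggregate only the locally stored rows."""
    partial = {}
    for row in segment_rows:
        key = row[group_col]
        partial[key] = partial.get(key, 0) + row[value_col]
    return partial


def master_group_sum(segments, group_col, value_col):
    """Work done on the master: dispatch to all segments, then merge."""
    workers = max(1, len(segments))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(
            pool.map(
                lambda rows: segment_partial_sum(rows, group_col, value_col),
                segments,
            )
        )
    merged = {}
    for partial in partials:
        for key, value in partial.items():
            merged[key] = merged.get(key, 0) + value
    return merged


if __name__ == "__main__":
    # Hypothetical data already spread over three segments.
    segments = [
        [{"region": "EU", "amount": 10}, {"region": "US", "amount": 5}],
        [{"region": "EU", "amount": 7}],
        [],
    ]
    print(master_group_sum(segments, "region", "amount"))
```

Only the small per-segment dictionaries travel back to the master, which is why adding segments can shorten scan time without the master becoming a data bottleneck for this kind of query.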

Greenplum HD Enterprise Edition: Enterprise-Ready Hadoop Platform for Unstructured Data
Reliable: high availability; mirroring
Easier to use: NFS mountable; system management
Faster: 2 to 5x faster than Apache Hadoop

Greenplum: Not Just About Technology
Data Science teams will become the driving force for success with big data analytics.
University data science program collaboration with Stanford and UC Berkeley.
Greenplum's Data Science practice with leading PhDs and analytic tools expertise.
Community investment including the Greenplum Analytic Workbench, Community edition software, and Data Science Summits.

Powerful Customer and Partner Ecosystem

Use Cases

Easynet / Retail: Real-Time Scoring at POC; Cross-selling; Up-selling; Customer Reward Program

USE CASE: Optimize Marketing Campaigns With Big Data
Big Data Analytics Enables Better Customer Interactions
Chart axis: Likelihood of Conversion (LOW to HIGH)
Stages: Legacy System; Greenplum In-Database Analytics; Greenplum Big Data Analytics
Callouts: Clicks become users targeted to predicted outcomes. Social media, blog and press, and competitor website behavior are leveraged to refine predictions.

USE CASE: Increase Revenue With Big Data Analytics
Big Data Analytics Enables Increased Per-Customer Profit for Retail Banking Firm
Chart axis: Customer Profit (LOW to HIGH)
Stages: Legacy System; Greenplum Database BI Reporting; Greenplum In-Database Analytics; Greenplum Big Data Analytics
Callouts: Agent best guess. Branch-level reporting. Enabling profit-based recommendations. Market basket analysis and customer lifetime value computations. Enabling user-based recommendations. Data enriched with unstructured activity logs to identify at-risk customers.
Bands: Traditional data leveraged; Big data leveraged
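As a rough illustration of the market basket analysis named on this slide, the sketch below computes support and confidence for product pairs. The slide does not say which algorithm or tool was used, so this is only a minimal pairwise version; the function name `pair_support_confidence` and the sample banking products are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_support_confidence(baskets):
    """Return {(a, b): (support, confidence)} for the rule "a implies b".

    support    = share of baskets containing both a and b
    confidence = share of baskets containing a that also contain b
    """
    total = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / total
        rules[(a, b)] = (support, count / item_counts[a])
        rules[(b, a)] = (support, count / item_counts[b])
    return rules


if __name__ == "__main__":
    # Hypothetical product holdings per customer.
    baskets = [
        ["loan", "card"],
        ["card", "savings"],
        ["loan", "card", "savings"],
        ["savings"],
    ]
    for rule, (support, confidence) in sorted(pair_support_confidence(baskets).items()):
        print(rule, round(support, 2), round(confidence, 2))
```

A rule with high confidence and reasonable support is a candidate for the kind of cross-sell recommendation the slide refers to.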

USE CASE: Reduce Risk With Big Data Analytics
Big Data Analytics Enables Accurate Decisions for National Mortgage Underwriter
Chart axis: Underwriting Risk (HIGH to LOW)
Stages: Legacy System; Greenplum Database BI Reporting; Greenplum In-Database Analytics; Greenplum Big Data Analytics
Callouts: Unstructured data sources enrich the data. Delivering in minutes what was days. K-means clustering and decision tree scoring improve accuracy. Monthly risk model updates. Daily risk model updates.
Bands: Traditional data leveraged; Big data leveraged
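The k-means clustering mentioned on this slide can be sketched in a few lines. This is a plain Lloyd-style iteration with caller-supplied starting centroids, not the in-database implementation the slide refers to; the function names and the two-feature sample points are hypothetical.

```python
def nearest(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, initial_centroids, max_iterations=20):
    """Alternate assignment and centroid update until centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest(point, centroids)].append(point)

        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                dims = range(len(old))
                updated.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in dims)
                )
            else:
                updated.append(old)  # keep an empty cluster's centroid in place

        if updated == centroids:
            break
        centroids = updated

    assignments = [nearest(point, centroids) for point in points]
    return centroids, assignments


if __name__ == "__main__":
    # Hypothetical applicants described by two scaled features.
    points = [(1, 1), (1, 2), (9, 9), (10, 9)]
    print(kmeans(points, [(0, 0), (10, 10)]))
```

In a scoring workflow, the cluster index returned for each record could then serve as one input to a downstream model such as a decision tree.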

USE CASE: Innovate With Big Data Analytics
Big Data Analytics Accelerates Health Care 2.0 for Evidence-based Care Provider
Chart axis: Quality of Care (LOW to HIGH)
Stages: Legacy System; Greenplum Database BI Reporting; Greenplum DB In-Database Analytics; Greenplum Big Data Analytics
Callouts: Delivering 10 years of data in seconds. Association rule mining and user clustering improve pathways. External data sources enable personalized medicine. Treatment pathways on summary data. Treatment pathways on all the data.
Bands: Traditional data leveraged; Big data leveraged

Summary

Possible Client Issues: Performance; Cost / TCO; Value (Load / or Enduser)

If you face any of these:
Performance: external POC on defined challenges.
Value: Analytic Labs with world-leading data scientists (SAS / Greenplum).
Cost: cost assessment and value proposition.
Combined: internal POC including infrastructure and support.

Summary
In five years' time, MPP architectures will simply be the standard for analytics and data warehouse infrastructures.
In five years' time, the ability to compete on analytics, on structured AND unstructured data, will be the most important differentiator between companies.

Q & A