SAS & HADOOP ANALYTICS ON BIG DATA


WHY HADOOP?
Open source
Massive scale
Fast processing
Commodity computing
Data redundancy
Distributed

WHY HADOOP?
Hadoop will soon become a complement, not a replacement, to: business intelligence; data warehousing; data integration; analytics.
Hadoop in production? (Survey chart; response categories: yes, < 12 months, < 24 months, < 36 months, 3+ years, never. The only value that survives in the extracted text is 10%.)
#1 reason to go for Hadoop: analytics (71%)
Challenges to Hadoop adoption:
  Hadoop has no analytic functions built in
  Cost: hefty payroll due to intensive hand coding
Source: 10 Myths About Hadoop - TDWI Best Practices Report

WHY SAS?
Analytics
In-memory
High-performance
Data management
Business intelligence
Data visualization

WHY SAS? ANALYTICAL DECISION MAKING
(Chart: competitive advantage plotted against degree of intelligence; stages listed from lowest to highest)
Raw data
Clean data
Standard reports: What happened?
Ad hoc reports: How many, how often, where?
Query drill down: Where exactly is the problem?
Alerts: What actions are needed?
Statistical analysis: Why is this happening?
Forecast: What if these trends continue?
Predict: What will happen next?
Optimize: What is the best that can happen?
Differentiators: predict, prescribe, optimize

AN ERA OF ABUNDANCE: WHERE WE ARE NOW
(Timeline figure, 2005 to 2013, built up over four slides)
Big data: lots of data
Hadoop: processing power
Analytics: intelligence

SAS & HADOOP: THE BUSINESS REASONING
What organizations are looking for:
Accuracy: bring superior analytics to Hadoop for more precise insights.
Scalability: provide comprehensive support from data to decision to maximize the value of Hadoop across the enterprise.
Governance: integrate and manage data in order to promote broad reuse and to comply with IT policies and procedures.
Economics: drive bottom-line benefits by boosting the value of analytics infrastructure while reducing TCO.

SAS & HADOOP: WHY THE MARRIAGE?
High-performance advanced analytics
Business intelligence and data visualization
At massive scale, on distributed, commodity hardware

SAS & HADOOP: HOW?
SAS and Hadoop intersect in three ways:
FROM: SAS can treat Hadoop just like any other data source, pulling data from Hadoop when that is most convenient.
WITH: SAS can work with Hadoop, lifting data into a purpose-built in-memory environment for advanced analytics.
IN: SAS can work directly in Hadoop, leveraging Hadoop's distributed processing capabilities.

SAS & HADOOP: SAS FROM HADOOP
SAS accesses and extracts data from Hadoop to a SAS server for processing, and writes results back.
Bridge to traditional SAS environments
Hadoop treated as just another data source
Performance limited to single-pipe bandwidth
Ideal when not all data is to be found in Hadoop, or when an established process cannot run in Hadoop
(Diagram label: DATA MOVEMENT)
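
To make the single-pipe limit concrete, here is a back-of-the-envelope sketch in Python. It is not taken from the slides or from any SAS tool: the function name, the 1 TB data size, and the 10 Gbit/s link speed are assumptions chosen for illustration, and protocol overhead and disk speed are ignored.

```python
def transfer_seconds(data_gb: float, link_gbit_per_s: float) -> float:
    """Seconds needed to move data_gb gigabytes over one link.

    Assumes 1 gigabyte = 8 gigabits and a link that runs at full speed.
    """
    if link_gbit_per_s <= 0:
        raise ValueError("link speed must be positive")
    return data_gb * 8 / link_gbit_per_s


# Hypothetical example: 1 TB (1000 GB) pulled over a single 10 Gbit/s connection.
print(transfer_seconds(1000, 10))  # 800.0 seconds, a little over 13 minutes
```

Doubling the data doubles the time, because a single connection does not scale with the size of the cluster. The WITH and IN approaches are aimed at avoiding that bottleneck.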

SAS & HADOOP: SAS WITH HADOOP
SAS accesses and processes Hadoop data on SAS servers while keeping the data and computations massively parallel.
Provides capabilities Hadoop cannot do well
Supports advanced analytics via shared computing
Allows data storage and analytics to be scaled separately
Ideal when analytical rigor, sophistication and governance are required
(Diagram label: DATA LIFT INTO MEMORY)
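
The idea of keeping data and computation parallel after the lift into memory can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not SAS code: threads stand in for worker nodes, the three blocks stand in for data distributed across those nodes, and `partial_stats` and `parallel_mean` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(block):
    """Runs where the block lives: return (count, sum) for one in-memory block."""
    return len(block), sum(block)


def parallel_mean(blocks):
    """Combine the small per-block results into one global mean."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_stats, blocks))
    row_count = sum(count for count, _ in partials)
    total = sum(block_sum for _, block_sum in partials)
    return total / row_count


# Hypothetical data, split into three blocks as if spread over three nodes.
blocks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
print(parallel_mean(blocks))  # 3.5
```

Only the per-block counts and sums are brought together, so the work grows with the number of workers rather than with one server's capacity.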

SAS & HADOOP: SAS IN HADOOP
SAS processes data directly in the Hadoop cluster.
SAS Embedded Process enables scalable SAS compute in Hadoop
SAS compute is orchestrated via Hadoop technology
Data manipulation, data quality, and scoring support
Ideal when all data is landing in Hadoop, and Hadoop is the proper place for processing
(Diagram label: SAS LOGIC)
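
The "logic moves to the data" pattern behind in-cluster scoring can be sketched as follows. This is a Python illustration only, not the SAS Embedded Process: the logistic model, its coefficients, the column names, and the two partitions are all made up for the example.

```python
import math

# Hypothetical score code: a logistic model with made-up coefficients.
COEFFICIENTS = {"intercept": -1.0, "visits": 0.5, "tenure_years": -0.2}


def score_row(row):
    """Return the predicted probability for one record."""
    z = (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["visits"] * row["visits"]
        + COEFFICIENTS["tenure_years"] * row["tenure_years"]
    )
    return 1.0 / (1.0 + math.exp(-z))


def score_partition(rows):
    """Runs next to the data: add a score to every row of one partition."""
    return [{**row, "score": score_row(row)} for row in rows]


# Two partitions, as if stored on two different nodes (hypothetical records).
partitions = [
    [{"id": 1, "visits": 2, "tenure_years": 0}],
    [{"id": 2, "visits": 0, "tenure_years": 5}],
]
scored = [score_partition(partition) for partition in partitions]
print(scored[0][0]["score"])  # 0.5
```

The small score function is shipped to each partition and the rows never leave it, which is the reverse of the FROM approach, where the data travels to the code.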


SAS & HADOOP: THE PRAGMATIC APPROACH
Prepare data IN Hadoop for analytics
Move data FROM Hadoop into a SAS environment
Lift data into memory for analytics at scale
Explore data at scale, in-memory, WITH data visualization
Model data at scale, in-memory, WITH advanced modeling tools
Deploy and manage model score code IN Hadoop
Use the right approach for what needs to be done!
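
The "ideal when" guidance from the FROM, WITH and IN slides can be condensed into a small decision helper. The Python sketch below is one possible reading of those slides, not an official SAS rule: the function name, the three yes/no inputs, and the order in which the conditions are checked are assumptions.

```python
def choose_approach(all_data_in_hadoop: bool,
                    process_runs_in_hadoop: bool,
                    needs_advanced_analytics: bool) -> str:
    """Suggest FROM, WITH or IN based on the 'ideal when' notes in the slides."""
    # FROM: not all data is in Hadoop, or an established process cannot run there.
    if not all_data_in_hadoop or not process_runs_in_hadoop:
        return "FROM"
    # WITH: analytical rigor and sophistication call for the in-memory environment.
    if needs_advanced_analytics:
        return "WITH"
    # IN: all data lands in Hadoop and Hadoop is the proper place for processing.
    return "IN"


print(choose_approach(True, True, False))   # IN
print(choose_approach(True, True, True))    # WITH
print(choose_approach(False, True, True))   # FROM
```

In practice the approaches are combined across one project, for example preparing data IN Hadoop and then modeling WITH the in-memory tools, so the helper is best read as a per-step choice.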

SAS & HADOOP: KEY POINTS
SAS is the only vendor to work from + with + in Hadoop throughout the analytics life cycle.
All three approaches can be combined and coordinated, complementing each other for each situation.
Each approach can evolve, mature and/or morph into the other.
Metadata management across the whole analytics life cycle, crossing all Hadoop interactions, is key to success.
SAS can help realize the value of Hadoop and bring production analytics to the platform.

ROGERS MEDIA
Data visualization and high-performance analytics
Processing data on 12 million customers
40 million records per month in Hortonworks
More than 600 relevant web characteristics
"Several of us from Rogers in the room looked at each other, and said, 'That is really wicked; that's cool.'"
Chris Dingle, Senior Director of Audience Solutions, Rogers Communications

MACY'S
20% reduction in churn
$500,000 annual savings
Customer lifetime value analysis
More accurate response prediction
Optimized promotions
"... they can look at data and spend more time analyzing it and become internal consultants who provide more of the insight behind the data."
Kerem Tomak, Vice President of Analytics

SAS/ACCESS TO HADOOP
Uses existing SAS interfaces
Standard LIBNAME syntax
PROC HADOOP
DATA step and PROC SQL translated to Hive
FILENAME support
Execute Pig scripts and MapReduce
Push-down of certain procedures
Custom SerDe support
SPDE formats
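
What push-down buys can be shown with a small Python sketch. It is not SAS/ACCESS and does not show how SAS translates code to Hive: Python's built-in `sqlite3` module stands in for the SQL engine, and the `sales` table, its columns, and its rows are hypothetical.

```python
import sqlite3

# sqlite3 stands in for the SQL engine (Hive in the slide); the data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 5.0), ("west", 7.5), ("west", 2.5), ("north", 1.0)],
)


def totals_client_side(connection):
    """Pull every row to the client, then aggregate there."""
    rows = connection.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals, len(rows)


def totals_pushed_down(connection):
    """Send the aggregation to the engine; only summary rows come back."""
    rows = connection.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


client_totals, client_rows = totals_client_side(conn)
pushed_totals, pushed_rows = totals_pushed_down(conn)
print(client_totals == pushed_totals)  # True
print(client_rows, pushed_rows)        # 5 3
```

Both paths give the same totals, but the pushed-down query returns three summary rows instead of all five detail rows. On a real cluster that gap is the difference between moving a table and moving a result.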

SAS/ACCESS TO IMPALA
Massively parallel processing (MPP) query engine
SQL queries against the Hadoop Distributed File System (HDFS)
Optimized for interactive queries
Similar to Hive in function, but different in implementation
Extraordinary performance

SAS/ACCESS TO HAWQ
Direct, transparent access to the Pivotal HAWQ SQL engine
SQL pass-through
Enables you to interact with HBase
Bulk loading, which is faster than row-by-row inserts

SAS VISUAL ANALYTICS - EXPLORER
Data exploration at massive scale
Intuitive visual analytics

SAS VISUAL STATISTICS
Descriptive and predictive modeling
Model comparison
Dynamic group-by processing

SAS IN-MEMORY STATISTICS FOR HADOOP
Interactive programming interface for SAS model development

SAS VISUAL ANALYTICS REPORT DESIGNER
Visual Analytics Designer and Viewer: reporting and analysis for broad audiences

SAS VISUAL ANALYTICS VIEWER FOR MOBILE
Mobile BI for reporting

SAS HIGH-PERFORMANCE DATA MINING
High-performance procedure nodes in SAS Enterprise Miner

SAS DATA LOADER FOR HADOOP
SAS Code Accelerator (DS2)
Embedded Process and Hive
Parallel data loading
No data movement
Data profiling
Data Quality Accelerator

SAS DATA MANAGEMENT CAN USE ALL THREE
(Diagram: three boxes labeled EP, the SAS Embedded Process)

SAS 9.4 SUPPORTED HADOOP DISTRIBUTIONS
Cloudera CDH (Kerberos supported)
Hortonworks HDP (Kerberos supported)
MapR (Kerberos not verified)
Pivotal HD (Kerberos supported)
IBM BigInsights (Kerberos not verified)