Apache Hadoop in the Datacenter and Cloud

1 Apache Hadoop in the Datacenter and Cloud

2 The Shift to the Connected Data Architecture
Digital transformation fueled by Big Data analytics and IoT
[Diagram labels: System centric (Mainframe, Client/Server, Web and SaaS); User centric (Modern Applications, Connected Data Architecture); IDMS, Relational Database, Data in Motion, Data at Rest; Cloud and Data Center; Powered by Open Source; Actionable Intelligence.]
Transformational use cases: predictive retail, factory automation, connected cars, predictive analytics, artificial intelligence
Hortonworks Inc All Rights Reserved

3 Hadoop in the Data Center
Create and manage central data lakes
Support all types of data
Provide flexible processing and access methods
Reduce architecture costs by 80% or more
Drive transformational new use cases

4 Hadoop in the Cloud
Fast on-ramp for new users
Elastic compute and storage capabilities
Zero-configuration access engine capabilities (HDInsight)
Eliminate hardware purchases
Facilitate certain modern data applications through cloud connectivity

5 Transformational Applications Require Connected Data
[Diagram labels: Cloud (Edge Data, Data in Motion, Data at Rest); Data Center (Edge Data, Data in Motion, Data at Rest); Edge Analytics, Machine Learning, Stream Analytics, Deep Historical Analysis.]

6 Our Focus: Enable Modern Applications on Connected Data Platforms
Continuous Insights: deliver insights from ALL data, origin to rest
Enterprise Ready: management, security, governance
Any Delivery Model: data center, cloud, hybrid
Open Innovation: architecture, community, ecosystem

7 A Look at Hadoop in the Data Center

8 Actionable Intelligence from Connected Data Platforms
Capturing perishable insights from data in motion
Ensuring rich, historical insights on data at rest
Necessary for modern data applications
[Diagram labels: Modern Data Applications; Data in Motion (Hortonworks DataFlow); Actionable Intelligence; Data at Rest (Hortonworks Data Platform).]

9 Hortonworks Data Platform for Data at Rest
Powered by Open Enterprise Hadoop
Open, Central, Interoperable, Ready

10 Hortonworks Data Platform 2.5 Highlights
Dynamic Security: Apache Atlas + Ranger integration
Enterprise Spark at Scale: Apache Zeppelin notebook for Spark
Real-Time Applications: Storm and HBase/Phoenix
Streamlined Operations: Apache Ambari
Interactive Query in Seconds: Hive with LLAP (Technical Preview)

11 Apache Atlas + Ranger: More Powerful Together

12 Introducing Tag Based Security: Apache Atlas and Ranger Integration
Basic tag policy: Access and entitlements can be based on attributes. For example, Personally Identifiable Information (PII) is a tag that can be used to protect sensitive personal data.
Geo based policy: Access policy based on location. For example, a user might be able to access data in North America but be restricted from access in EMEA due to privacy compliance.
Time based policy: Access policy based on time windows. For example, a user might be able to access data only between 8 AM and 5 PM (common in SOX regulations).
Prohibitions: Restrictions on combining two data sets that are each compliant on their own but not when combined (for example, SSNs and names).
Key benefits:
New scalable, metadata based security paradigm
Dynamic, real-time policy
Automatic updates in response to changes in metadata
Centralized and simple-to-manage policy
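The four policy types on this slide can be illustrated with a small, self-contained Python sketch. It is not the Ranger or Atlas API: the `TagPolicy` class, the `is_allowed` and `violates_prohibition` helpers, and the sample users, regions, and tags are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import time
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class TagPolicy:
    """Hypothetical policy attached to a metadata tag (not a Ranger object)."""
    tag: str
    allowed_users: FrozenSet[str]
    allowed_regions: Optional[FrozenSet[str]] = None  # None means any region
    window: Optional[Tuple[time, time]] = None        # None means any time


def is_allowed(policy, user, region, at):
    """Return True if the user may read data carrying policy.tag."""
    if user not in policy.allowed_users:            # basic tag policy
        return False
    if policy.allowed_regions is not None and region not in policy.allowed_regions:
        return False                                # geo based policy
    if policy.window is not None:
        start, end = policy.window
        if not (start <= at <= end):                # time based policy
            return False
    return True


def violates_prohibition(tags_in_query, prohibited_combinations):
    """Return True if the query touches every tag of a prohibited combination."""
    tags = set(tags_in_query)
    return any(combo <= tags for combo in prohibited_combinations)


# Hypothetical example: PII readable by alice, from North America, 8 AM to 5 PM.
pii_policy = TagPolicy(
    tag="PII",
    allowed_users=frozenset({"alice"}),
    allowed_regions=frozenset({"NA"}),
    window=(time(8, 0), time(17, 0)),
)
# Hypothetical prohibition: SSN and NAME must not appear in the same query.
prohibited = [frozenset({"SSN", "NAME"})]
```

Because the checks key off the tag rather than a specific table or column, retagging a data set changes what applies to it without editing the policy, which is the idea behind the "automatic updates" benefit listed above.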

13 Apache Atlas Powers Cross Component Data Lineage
As part of HDP 2.5, users can track lineage across the following components using Atlas:
Apache Sqoop: Import from and export to relational databases, plus an additional package that leverages Sqoop
Hive: Dataset lineage with entity versioning (including schema changes)
Apache Kafka/Storm: IoT event level processing, such as syslogs or sensor data
Falcon: Data lifecycle at Feed and Process entity level for replication and repeating workflows. Tracks periodicity, throttling, eviction. (ATLAS 69, FALCON 1570)
Key benefits:
Enterprises need open solutions, not a single-application vendor
More native connectors than any other vendor
Hardened metadata infrastructure
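Cross-component lineage amounts to a graph in which each process links its input data sets to its output. The Python sketch below shows that idea only; it does not use the Atlas API, and the `LineageGraph` class and the data set and process names are hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Hypothetical in-memory lineage store: output -> {(process, input)}."""

    def __init__(self):
        self._inputs = defaultdict(set)

    def record(self, process, inputs, output):
        """Record that `process` read `inputs` and wrote `output`."""
        for source in inputs:
            self._inputs[output].add((process, source))

    def upstream(self, dataset):
        """Return every data set that `dataset` was directly or indirectly derived from."""
        seen = set()
        queue = deque([dataset])
        while queue:
            current = queue.popleft()
            for _process, source in self._inputs.get(current, ()):
                if source not in seen:
                    seen.add(source)
                    queue.append(source)
        return seen


# Hypothetical pipeline: a Sqoop import followed by a Hive transformation.
graph = LineageGraph()
graph.record("sqoop_import", ["rdbms.orders"], "hive.orders_raw")
graph.record("hive_ctas", ["hive.orders_raw", "hive.customers"], "hive.orders_enriched")
```

Calling `graph.upstream("hive.orders_enriched")` walks back through both steps and reaches the relational source, which is the kind of question a lineage view answers.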

14 Expanded Native Connector: Dataset Lineage
[Diagram labels: Teradata Connector, Apache Kafka, RDBMS, Sqoop, Custom Activity Reporter, Metadata Repository.]

15 Apache Atlas Enables Business Catalog for Ease of Use
Organize data assets along business terms:
Authoritative: hierarchical business taxonomy creation
Agile modeling: model conceptual, logical, and physical assets
Definition and assignment of tags such as PII (Personally Identifiable Information)
Comprehensive features for compliance:
Multiple user profiles, including Data Steward and Business Analyst
Object auditing to track who did it
Metadata versioning to track what they did
Faster insight:
Data Quality tab for profiling and sampling
User comments
Key benefits:
Easy way to create a business taxonomy
Useful for multiple user types, including Data Stewards and Business Analysts
Comprehensive features for compliance

16 Business Catalog
Model and explore metadata via the new Business Catalog in Apache Atlas
[Screenshot label: Data Steward]

17 Streamlining Operations, Three Phase Plan
Focused strategic investments in our core products to give customers more unique tooling to quickly understand the cluster's health, how business users are using it, and where to focus efforts when issues arise.
Capabilities:
Phase 1: Advanced performance and health metrics dashboards with Ambari
Phase 2: Consolidated cluster activity reporting with SmartSense (new)
Phase 3: Centralized and contextual log search with Ambari (Tech Preview)
Core technologies: Apache Ambari, Ambari Metrics System, Apache Solr, Hortonworks SmartSense, Grafana
[Diagram labels: Ambari, Ambari Metrics System, Grafana, Solr, Log Search, SmartSense, Dedicated UIs.]

18 Streamlined Operations Phase 1: Advanced Metrics Visualization & Dashboarding
Goal: Quickly understand cluster health metrics and key performance indicators
Capabilities: centralized dashboarding focused on component health and performance; ad hoc graph creation; pre-built dashboards for HDFS, YARN, and HBase
Core technologies: Ambari Metrics System, Grafana

19 Ambari now includes pre-built dashboards for visualizing cluster health

20 Streamlined Operations Phase 2: Consolidated Cluster Activity Reporting
Goal: Quickly visualize and report on how business users and tenants are using the cluster: top 10 queues, users, and most time-consuming jobs
Capabilities: Top K activity reporting; chargeback
Services covered: YARN, MapReduce, Hive/Tez, Spark, HDFS
Core technologies: Hortonworks SmartSense, Apache Zeppelin
[Diagram labels: Ambari, Ambari Metrics System, SmartSense, Apache Zeppelin.]
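Top K reporting and chargeback both reduce to aggregating per-job resource usage by some key. The following Python sketch shows that aggregation under stated assumptions: the job records, the `memory_mb_seconds` metric, and the rate are hypothetical, and this is not how SmartSense computes its reports.

```python
from collections import Counter


def usage_by(jobs, key, metric):
    """Sum `metric` over all jobs, grouped by `key` (e.g. "user" or "queue")."""
    totals = Counter()
    for job in jobs:
        totals[job[key]] += job[metric]
    return totals


def top_k(jobs, key, metric, k=10):
    """Return the k heaviest consumers as (name, total) pairs, largest first."""
    return usage_by(jobs, key, metric).most_common(k)


def chargeback(jobs, rate_per_unit, key="queue", metric="memory_mb_seconds"):
    """Convert aggregated usage into a cost per tenant at a flat rate."""
    return {name: used * rate_per_unit
            for name, used in usage_by(jobs, key, metric).items()}


# Hypothetical job records, made up for illustration.
jobs = [
    {"user": "ana", "queue": "etl", "memory_mb_seconds": 4000},
    {"user": "ben", "queue": "adhoc", "memory_mb_seconds": 1500},
    {"user": "ana", "queue": "adhoc", "memory_mb_seconds": 500},
]
```

With this data, `top_k(jobs, "user", "memory_mb_seconds", k=1)` reports `ana` as the heaviest user, and `chargeback(jobs, rate_per_unit=2)` bills each queue in proportion to the memory it consumed.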

21 Activity Explorer: Cluster Utilization Reporting

22 Preview: Streamlined Operations Investments, Phase 3: Centralized & Contextual Log Search
Goal: When issues arise, quickly find them across all HDP components
Capabilities: rapid search of all HDP component logs; search across time ranges, log levels, and keywords
Core technologies: Apache Ambari, Apache Solr, Apache Ambari Log Search
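The search capabilities listed on this slide (time range, log level, keyword) can be shown with a plain in-memory filter. This Python sketch is illustrative only: it does not call Solr or Ambari Log Search, and the `LogEntry` type, `search_logs` function, and sample entries are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEntry:
    timestamp: datetime
    component: str  # e.g. "HDFS", "YARN"
    level: str      # e.g. "INFO", "WARN", "ERROR"
    message: str


def search_logs(entries, start=None, end=None, levels=None, keyword=None):
    """Filter entries by time range, log level, and keyword; oldest first."""
    matches = []
    for entry in entries:
        if start is not None and entry.timestamp < start:
            continue
        if end is not None and entry.timestamp > end:
            continue
        if levels is not None and entry.level not in levels:
            continue
        if keyword is not None and keyword.lower() not in entry.message.lower():
            continue
        matches.append(entry)
    return sorted(matches, key=lambda e: e.timestamp)


# Hypothetical sample data, made up for illustration.
entries = [
    LogEntry(datetime(2016, 9, 1, 10, 5), "YARN", "ERROR", "Container killed: out of memory"),
    LogEntry(datetime(2016, 9, 1, 9, 0), "HDFS", "INFO", "Block report processed"),
    LogEntry(datetime(2016, 9, 1, 10, 1), "HDFS", "ERROR", "DataNode heartbeat timeout"),
]

# All ERROR lines mentioning "timeout" between 10:00 and 11:00, across components.
hits = search_logs(
    entries,
    start=datetime(2016, 9, 1, 10, 0),
    end=datetime(2016, 9, 1, 11, 0),
    levels={"ERROR"},
    keyword="timeout",
)
```

A real deployment would push these filters into an index rather than scanning entries in memory; the sketch only shows how the three criteria combine.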

23 Tune the log collection system with Guided Smart Configurations

24 View a comprehensive inventory of operational logs for each host

25 Hive 2 with LLAP: Enable Interactive Query in Seconds
Developer productivity: interactive query in seconds
Ease of use and adoption: 100% compatible with Hive SQL
Enterprise readiness: linear scaling at terabyte data volumes
Streamlined operations: LLAP integration with Ambari, with automated dashboards

26 Hive 2 with LLAP: Preliminary Numbers
[Chart: Hive2.0 and LLAP, TPC-DS at 10 TB scale, 18 nodes. Legend labels: Hive2.0, Tez, LLAP. Queries shown: q3, q7, q12, q13, q19, q21, q26, q27, q42, q43, q45, q52, q55, q60, q73, q84, q89, q91, q98. Minimum query time: query 55, 2.38 s.]

27 A Look at Hadoop in the Cloud

28 Traditional Hadoop Clusters

29 Why Cloud?
IT and business agility
No upfront hardware costs
Ephemeral and long-running
Unlimited elastic scale

30 How Do We Approach the Cloud Market?
Hybrid segment: today's enterprise customers
Seamless Connected Data Architecture across cloud and data center.
Always-on enterprise use cases are common.
Azure HDInsight, HDP, and HDF are our Premier offerings.
Customer journey to future-state architecture, cloud operation, and consumption model.
Cloud on-ramp: new users via digital engagement, or existing customers exploring cloud options
Elasticity, automation, pay as you go, one-click start.
Ephemeral use cases are a common starting point.
Azure HDInsight is our Premier offering.
Focused offerings for AWS that enable us to engage and position our Premier offerings.
Cloud-first approach to product design, development, testing, and delivery

31 Outlook: Cloud and the Big Data Market
Public cloud adoption (AWS, Azure, Google) will continue to accelerate
Many customers will go Cloud First to simplify and speed adoption
Customers deploying in public cloud expect a pay-as-you-go (PAYG) pricing model
Hourly pricing is the default; reserved pricing optimizes annual spend; spot pricing optimizes hourly spend
Customers are interested in running workloads in the cloud in addition to on-premises clusters
Customers are familiar with native cloud tooling
This heightens the importance of product packaging and user experience tuned to the cloud
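The trade-off between hourly and reserved pricing is a break-even calculation. The Python sketch below works through it with made-up numbers: the hourly rate, the reserved fee, and the usage patterns are hypothetical and do not reflect any provider's actual prices. Spot pricing is left out because it depends on a fluctuating market rate.

```python
def hourly_cost(hours_used, hourly_rate):
    """Pay-as-you-go cost: pay only for the hours the cluster runs."""
    return hours_used * hourly_rate


def break_even_hours(annual_reserved_fee, hourly_rate):
    """Hours per year above which a flat reserved fee beats hourly pricing."""
    return annual_reserved_fee / hourly_rate


def cheaper_option(hours_used, hourly_rate, annual_reserved_fee):
    """Return "hourly" or "reserved", whichever costs less for this usage."""
    if hourly_cost(hours_used, hourly_rate) <= annual_reserved_fee:
        return "hourly"
    return "reserved"


# Hypothetical prices in arbitrary currency units.
HOURLY_RATE = 1
ANNUAL_RESERVED_FEE = 5000

# Hypothetical usage patterns over one year.
EPHEMERAL_HOURS = 4 * 365   # a cluster spun up four hours a day
ALWAYS_ON_HOURS = 24 * 365  # a cluster that never shuts down
```

Under these assumed prices the break-even point is 5,000 hours a year, so the ephemeral cluster is cheaper on hourly pricing while the always-on cluster is cheaper reserved.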

32 Cloud IaaS and Hadoop as a Service
Running Hadoop on cloud IaaS
Using Hadoop as a cloud service
Public cloud service providers

33 Microsoft Azure HDInsight: Powered by Hortonworks Data Platform
Seamless access to the public cloud for Spark, Hive, HBase, and other mission-critical workloads
Unmatched economics, combining HDInsight's elasticity in the cloud with HDP's cost efficiencies at scale
Enterprise readiness, with robust security, governance, and operations in the cloud, powered by Hortonworks Data Platform

34 Connected Data Architecture with Azure HDInsight
HDInsight cluster types and ideal use cases:
Data prep, query, and analysis (Hadoop, Hive, Pig)
Iterative in-memory analysis (Spark)
Advanced statistics, modeling, machine learning (R Server on Spark)
NoSQL data storage (HBase)
Real-time event processing (Storm)
[Diagram labels: Cloud: Azure HDInsight (Cloud Data Processing), HDF (Data Flow Management); Data Center: HDP (Enterprise Data Lake).]

35 Runs in more datacenters than anyone else
Central US: Iowa
North Central US: Illinois
West Europe: Netherlands
China North *: Beijing
West US: California
South Central US: Texas
East US: Virginia
East US 2: Virginia
North Europe: Ireland
India Central: Pune
China South *: Shanghai
Japan East: Tokyo, Saitama
Japan West: Osaka
East Asia: Hong Kong
SE Asia: Singapore
Australia East: New South Wales
Brazil South: Sao Paulo State
Australia South East: Victoria
Azure doubling compute and storage every 6 months

36 Microsoft Azure HDInsight and Apache Projects in the Cloud
Standard Hadoop projects: Hive, YARN, HDFS, MapReduce, Pig, Tez, Sqoop, Oozie, ZooKeeper, Mahout, Phoenix
Comprehensive list of emerging projects: Spark, Storm, HBase, and R
Ability to add projects: add various projects to the cloud
[Diagram labels: YARN Data Operating System; Batch, Interactive, Streaming, Machine Learning, Search; Storage, Governance, Operations, Security.]

37 Forrester Wave: Big Data Hadoop Cloud Solutions, Q
Elasticity, Automation, And Pay As You Go Compel Enterprise Adoption Of Hadoop In The Cloud

38 Connected Data Architecture with HDC for AWS (Tech Preview)
Ideal use cases:
Data science and exploration (Spark, Zeppelin)
ETL and data preparation (Hive, Spark)
Analytics and reporting (Hive 2 with LLAP, Zeppelin)
[Diagram labels: Cloud: HDF (Data Flow Management), HDC for AWS (Cloud Data Processing); Data Center: HDP (Enterprise Data Lake).]

39 Hortonworks Data Cloud for AWS: Cluster Types (Tech Preview)

40 Prescriptive On-Demand Ephemeral Workloads (Tech Preview)
** Planned list of available cluster types

41 Why Hortonworks Cloud Solutions?
Choice of cloud
Rich set of capabilities and security
Zero-configuration access engine capabilities (HDInsight)
S3 integrations on AWS (Tech Preview)
Award-winning Hadoop expertise

42 Connected Data Platforms Integrate Cloud and Data Center Deployments
[Diagram labels: Cloud (Edge Data, Data in Motion, Data at Rest); Data Center (Edge Data, Data in Motion, Data at Rest); Edge Analytics, Machine Learning, Stream Analytics, Deep Historical Analysis.]

43 Thank You

Insights to HDInsight

Insights to HDInsight Insights to HDInsight Why Hadoop in the Cloud? No hardware costs Unlimited Scale Pay for What You Need Deployed in minutes Azure HDInsight Big Data made easy Enterprise Ready Easier and more productive

More information

Hortonworks Connected Data Platforms

Hortonworks Connected Data Platforms Hortonworks Connected Data Platforms MASTER THE VALUE OF DATA EVERY BUSINESS IS A DATA BUSINESS EMBRACE AN OPEN APPROACH 2 Hortonworks Inc. 2011 2016. All Rights Reserved Data Drives the Connected Car

More information

Hortonworks Data Platform

Hortonworks Data Platform Hortonworks Data Platform An open-architecture platform to manage data in motion and at rest Highlights Addresses a range of data-at-rest use cases Powers real-time customer applications Delivers robust

More information

20775A: Performing Data Engineering on Microsoft HD Insight

20775A: Performing Data Engineering on Microsoft HD Insight 20775A: Performing Data Engineering on Microsoft HD Insight Duration: 5 days; Instructor-led Implement Spark Streaming Using the DStream API. Develop Big Data Real-Time Processing Solutions with Apache

More information

Course Content. The main purpose of the course is to give students the ability plan and implement big data workflows on HDInsight.

Course Content. The main purpose of the course is to give students the ability plan and implement big data workflows on HDInsight. Course Content Course Description: The main purpose of the course is to give students the ability plan and implement big data workflows on HDInsight. At Course Completion: After competing this course,

More information

SOLUTION SHEET Hortonworks DataFlow (HDF ) End-to-end data flow management and streaming analytics platform

SOLUTION SHEET Hortonworks DataFlow (HDF ) End-to-end data flow management and streaming analytics platform SOLUTION SHEET Hortonworks DataFlow (HDF ) End-to-end data flow management and streaming analytics platform CREATE STREAMING ANALYTICS APPLICATIONS IN MINUTES WITHOUT WRITING CODE The increasing growth

More information

20775: Performing Data Engineering on Microsoft HD Insight

20775: Performing Data Engineering on Microsoft HD Insight Let s Reach For Excellence! TAN DUC INFORMATION TECHNOLOGY SCHOOL JSC Address: 103 Pasteur, Dist.1, HCMC Tel: 08 38245819; 38239761 Email: traincert@tdt-tanduc.com Website: www.tdt-tanduc.com; www.tanducits.com

More information

AZURE HDINSIGHT. Azure Machine Learning Track Marek Chmel

AZURE HDINSIGHT. Azure Machine Learning Track Marek Chmel AZURE HDINSIGHT Azure Machine Learning Track Marek Chmel SESSION AGENDA Understanding different scenarios of Hadoop Building an end to end pipeline using HDInsight Using in-memory techniques to analyze

More information

20775A: Performing Data Engineering on Microsoft HD Insight

20775A: Performing Data Engineering on Microsoft HD Insight 20775A: Performing Data Engineering on Microsoft HD Insight Course Details Course Code: Duration: Notes: 20775A 5 days This course syllabus should be used to determine whether the course is appropriate

More information

Digital transformation is the next industrial revolution

Digital transformation is the next industrial revolution Digital transformation is the next industrial revolution Steam, water, mechanical production equipment Division of labor, electricity, mass production Electronics, IT, automated production Blurring the

More information

20775 Performing Data Engineering on Microsoft HD Insight

20775 Performing Data Engineering on Microsoft HD Insight Duración del curso: 5 Días Acerca de este curso The main purpose of the course is to give students the ability plan and implement big data workflows on HD. Perfil de público The primary audience for this

More information

SOLUTION SHEET End to End Data Flow Management and Streaming Analytics Platform

SOLUTION SHEET End to End Data Flow Management and Streaming Analytics Platform SOLUTION SHEET End to End Data Flow Management and Streaming Analytics Platform CREATE STREAMING ANALYTICS APPLICATIONS IN MINUTES WITHOUT WRITING CODE The increasing growth of data, especially data-in-motion,

More information

Microsoft Azure Essentials

Microsoft Azure Essentials Microsoft Azure Essentials Azure Essentials Track Summary Data Analytics Explore the Data Analytics services in Azure to help you analyze both structured and unstructured data. Azure can help with large,

More information

Big data is hard. Top 3 Challenges To Adopting Big Data

Big data is hard. Top 3 Challenges To Adopting Big Data Big data is hard Top 3 Challenges To Adopting Big Data Traditionally, analytics have been over pre-defined structures Data characteristics: Sales Questions answered with BI and visualizations: Customer

More information

Digitalisieren Sie Ihr Unternehmen mit dem Internet der Dinge Michael Epprecht Microsoft GBB IoT

Digitalisieren Sie Ihr Unternehmen mit dem Internet der Dinge Michael Epprecht Microsoft GBB IoT Digicomp Microsoft Evolution Day 2015 1 Digitalisieren Sie Ihr Unternehmen mit dem Internet der Dinge Michael Epprecht Microsoft GBB IoT michael.epprecht@microsoft.com @fastflame Partner: Becoming a digital

More information

Sr. Sergio Rodríguez de Guzmán CTO PUE

Sr. Sergio Rodríguez de Guzmán CTO PUE PRODUCT LATEST NEWS Sr. Sergio Rodríguez de Guzmán CTO PUE www.pue.es Hadoop & Why Cloudera Sergio Rodríguez Systems Engineer sergio@pue.es 3 Industry-Leading Consulting and Training PUE is the first Spanish

More information

Business is being transformed by three trends

Business is being transformed by three trends Business is being transformed by three trends Big Cloud Intelligence Stay ahead of the curve with Cortana Intelligence Suite Business apps People Custom apps Apps Sensors and devices Cortana Intelligence

More information

1 Hortonworks Inc All Rights Reserved

1 Hortonworks Inc All Rights Reserved Round Table Discussion Security & Governance Session#1: Merck, ING, Clearsense, Charter, HCSC Session#2: Discover, Universal, Expedia, Honeywell, SunLife, Geisinger, Bloomberg Customer Advisory Board June

More information

EXAMPLE SOLUTIONS Hadoop in Azure HBase as a columnar NoSQL transactional database running on Azure Blobs Storm as a streaming service for near real time processing Hadoop 2.4 support for 100x query gains

More information

Analytics Platform System

Analytics Platform System Analytics Platform System Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com Ofc 425-538-0044, Cell 303-324-2860 Sean Mikha, DW & Big Data Architect semikha@microsoft.com

More information

ENABLING GLOBAL HADOOP WITH DELL EMC S ELASTIC CLOUD STORAGE (ECS)

ENABLING GLOBAL HADOOP WITH DELL EMC S ELASTIC CLOUD STORAGE (ECS) ENABLING GLOBAL HADOOP WITH DELL EMC S ELASTIC CLOUD STORAGE (ECS) Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how Dell EMC Elastic Cloud Storage (ECS ) can be used to streamline

More information

Hortonworks Data Platform for Enterprise Data Lakes delivers robust, big data analytics that accelerate decision making and innovation

Hortonworks Data Platform for Enterprise Data Lakes delivers robust, big data analytics that accelerate decision making and innovation IBM United States Software Announcement 218-187, dated March 20, 2018 Hortonworks Data Platform for Enterprise Data Lakes delivers robust, big data analytics that accelerate decision making and innovation

More information

Big Data & Advanced Analytics - "managed Services on Azure

Big Data & Advanced Analytics - managed Services on Azure Big Data & Advanced Analytics - "managed Services on Azure Guido Jacobs Global Black Belt TSP Big Data Guido.Jacobs@microsoft.com Microsoft Deutschland GmbH Hyper scale Infrastruktur das ist Azure! US

More information

MapR: Solution for Customer Production Success

MapR: Solution for Customer Production Success 2015 MapR Technologies 2015 MapR Technologies 1 MapR: Solution for Customer Production Success Big Data High Growth 700+ Customers Cloud Leaders Riding the Wave with Hadoop The Big Data Platform of Choice

More information

Microsoft Big Data. Solution Brief

Microsoft Big Data. Solution Brief Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,

More information

Introduction to Big Data(Hadoop) Eco-System The Modern Data Platform for Innovation and Business Transformation

Introduction to Big Data(Hadoop) Eco-System The Modern Data Platform for Innovation and Business Transformation Introduction to Big Data(Hadoop) Eco-System The Modern Data Platform for Innovation and Business Transformation Roger Ding Cloudera February 3rd, 2018 1 Agenda Hadoop History Introduction to Apache Hadoop

More information

E-guide Hadoop Big Data Platforms Buyer s Guide part 1

E-guide Hadoop Big Data Platforms Buyer s Guide part 1 Hadoop Big Data Platforms Buyer s Guide part 1 Your expert guide to Hadoop big data platforms for managing big data David Loshin, Knowledge Integrity Inc. Companies of all sizes can use Hadoop, as vendors

More information

Oracle Autonomous Data Warehouse Cloud

Oracle Autonomous Data Warehouse Cloud Oracle Autonomous Data Warehouse Cloud 1 Lower Cost, Increase Reliability and Performance to Extract More Value from Your Data With Oracle Autonomous Data Warehouse Cloud Today s leading-edge organizations

More information

Azure ML Data Camp. Ivan Kosyakov MTC Architect, Ph.D. Microsoft Technology Centers Microsoft Technology Centers. Experience the Microsoft Cloud

Azure ML Data Camp. Ivan Kosyakov MTC Architect, Ph.D. Microsoft Technology Centers Microsoft Technology Centers. Experience the Microsoft Cloud Microsoft Technology Centers Microsoft Technology Centers Experience the Microsoft Cloud Experience the Microsoft Cloud ML Data Camp Ivan Kosyakov MTC Architect, Ph.D. Top Manager IT Analyst Big Data Strategic

More information

Cloud Based Analytics for SAP

Cloud Based Analytics for SAP Cloud Based Analytics for SAP Gary Patterson, Global Lead for Big Data About Virtustream A Dell Technologies Business 2,300+ employees 20+ data centers Major operations in 10 countries One of the fastest

More information

Hadoop Stories. Tim Marston. Director, Regional Alliances Page 1. Hortonworks Inc All Rights Reserved

Hadoop Stories. Tim Marston. Director, Regional Alliances Page 1. Hortonworks Inc All Rights Reserved Hadoop Stories Tim Marston Director, Regional Alliances EMEA Page 1 @timmarston Page 2 Plans for Hadoop Adoption (Gartner, May 2015) Start within 1 year 11% Start within 2 years 7% Already doing 27% No

More information

Aurélie Pericchi SSP APS Laurent Marzouk Data Insight & Cloud Architect

Aurélie Pericchi SSP APS Laurent Marzouk Data Insight & Cloud Architect Aurélie Pericchi SSP APS Laurent Marzouk Data Insight & Cloud Architect 2005 Concert de Coldplay 2014 Concert de Coldplay 90% of the world s data has been created over the last two years alone 1 1. Source

More information

Depending on who you ask, IoT is either:

Depending on who you ask, IoT is either: Depending on who you ask, IoT is either: Nothing new We ve been doing this for 40 years A unicorn Magic, and will soon change everything. Connect devices and monitor telemetry Things Monitor and track

More information

Pentaho 8.0 and Beyond. Matt Howard Pentaho Sr. Director of Product Management, Hitachi Vantara

Pentaho 8.0 and Beyond. Matt Howard Pentaho Sr. Director of Product Management, Hitachi Vantara Pentaho 8.0 and Beyond Matt Howard Pentaho Sr. Director of Product Management, Hitachi Vantara Safe Harbor Statement The forward-looking statements contained in this document represent an outline of our

More information

Confidential

Confidential June 2017 1. Is your EDW becoming too expensive to maintain because of hardware upgrades and increasing data volumes? 2. Is your EDW becoming a monolith, which is too slow to adapt to business s analytical

More information

Two offerings which interoperate really well

Two offerings which interoperate really well Microsoft Two offerings which interoperate really well On-premises Cortana Intelligence Suite SQL Server 2016 Cloud IAAS Enterprise PAAS Cloud Storage Service 9 SQL Server 2016: Everything built-in built-in

More information

Modernizing Your Data Warehouse with Azure

Modernizing Your Data Warehouse with Azure Modernizing Your Data Warehouse with Azure Big data. Small data. All data. Christian Coté S P O N S O R S The traditional BI Environment The traditional data warehouse data warehousing has reached the

More information

Simplifying the Process of Uploading and Extracting Data from Apache Hadoop

Simplifying the Process of Uploading and Extracting Data from Apache Hadoop Simplifying the Process of Uploading and Extracting Data from Apache Hadoop Rohit Bakhshi, Solution Architect, Hortonworks Jim Walker, Director Product Marketing, Talend Page 1 About Us Rohit Bakhshi Solution

More information

Architecting an Open Data Lake for the Enterprise

Architecting an Open Data Lake for the Enterprise Architecting an Open Data Lake for the Enterprise 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Today s Presenters Daniel Geske, Solutions Architect, Amazon Web Services Armin

More information

Cask Data Application Platform (CDAP)

Cask Data Application Platform (CDAP) Cask Data Application Platform (CDAP) CDAP is an open source, Apache 2.0 licensed, distributed, application framework for delivering Hadoop solutions. It integrates and abstracts the underlying Hadoop

More information

Application Performance Management for Microsoft Azure and HDInsight

Application Performance Management for Microsoft Azure and HDInsight ebook Application Performance Management for Microsoft Azure and HDInsight How to build fast and reliable Big Data Apps on Microsoft Azure HDInsight and Azure Analytical Services Microsoft Azure makes

More information

Big Data Introduction

Big Data Introduction Big Data Introduction Who we are Experts At Your Service Over 50 specialists in IT infrastructure Certified, experienced, passionate Based In Switzerland 100% self-financed Swiss company Over CHF8 mio.

More information

Make Business Intelligence Work on Big Data

Make Business Intelligence Work on Big Data Make Business Intelligence Work on Big Data Speed. Scale. Simplicity. Put the Power of Big Data in the Hands of Business Users Connect your BI tools directly to your big data without compromising scale,

More information

Optimal Infrastructure for Big Data

Optimal Infrastructure for Big Data Optimal Infrastructure for Big Data Big Data 2014 Managing Government Information Kevin Leong January 22, 2014 2014 VMware Inc. All rights reserved. The Right Big Data Tools for the Right Job Real-time

More information

Pentaho 8.0 Overview. Pedro Alves

Pentaho 8.0 Overview. Pedro Alves Pentaho 8.0 Overview Pedro Alves Safe Harbor Statement The forward-looking statements contained in this document represent an outline of our current intended product direction. It is provided for information

More information

BIG DATA AND HADOOP DEVELOPER

BIG DATA AND HADOOP DEVELOPER BIG DATA AND HADOOP DEVELOPER Approximate Duration - 60 Hrs Classes + 30 hrs Lab work + 20 hrs Assessment = 110 Hrs + 50 hrs Project Total duration of course = 160 hrs Lesson 00 - Course Introduction 0.1

More information

5th Annual. Cloudera, Inc. All rights reserved.

5th Annual. Cloudera, Inc. All rights reserved. 5th Annual 1 The Essentials of Apache Hadoop The What, Why and How to Meet Agency Objectives Sarah Sproehnle, Vice President, Customer Success 2 Introduction 3 What is Apache Hadoop? Hadoop is a software

More information

Datametica. The Modern Data Platform Enterprise Data Hub Implementations. Why is workload moving to Cloud

Datametica. The Modern Data Platform Enterprise Data Hub Implementations. Why is workload moving to Cloud Datametica The Modern Data Platform Enterprise Data Hub Implementations Why is workload moving to Cloud 1 What we used do Enterprise Data Hub & Analytics What is Changing Why it is Changing Enterprise

More information

E-guide Hadoop Big Data Platforms Buyer s Guide part 3

E-guide Hadoop Big Data Platforms Buyer s Guide part 3 Big Data Platforms Buyer s Guide part 3 Your expert guide to big platforms enterprise MapReduce cloud-based Abie Reifer, DecisionWorx The Amazon Elastic MapReduce Web service offers a managed framework

More information

Analytics for All Your Data: Cloud Essentials. Pervasive Insight in the World of Cloud

Analytics for All Your Data: Cloud Essentials. Pervasive Insight in the World of Cloud Analytics for All Your Data: Cloud Essentials Pervasive Insight in the World of Cloud The Opportunity We re living in a world where just about everything we see, do, hear, feel, and experience is captured

More information

Spark and Hadoop Perfect Together

Spark and Hadoop Perfect Together Spark and Hadoop Perfect Together Arun Murthy Hortonworks Co-Founder @acmurthy Data Operating System Enable all data and applications TO BE accessible and shared BY any end-users Data Operating System

More information

Building a Data Lake on AWS EBOOK: BUILDING A DATA LAKE ON AWS 1

Building a Data Lake on AWS EBOOK: BUILDING A DATA LAKE ON AWS 1 Building a Data Lake on AWS EBOOK: BUILDING A DATA LAKE ON AWS 1 Contents Introduction The Big Data Challenge Benefits of a Data Lake Building a Data Lake on AWS Featured Data Lake Partner Bronze Drum

More information

KnowledgeENTERPRISE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK. Advanced Analytics on Spark BROCHURE

KnowledgeENTERPRISE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK. Advanced Analytics on Spark BROCHURE FAST TRACK YOUR ACCESS TO BIG DATA WITH ANGOSS ADVANCED ANALYTICS ON SPARK Are you drowning in Big Data? Do you lack access to your data? Are you having a hard time managing Big Data processing requirements?

More information

Cask Data Application Platform (CDAP) Extensions

Cask Data Application Platform (CDAP) Extensions Cask Data Application Platform (CDAP) Extensions CDAP Extensions provide additional capabilities and user interfaces to CDAP. They are use-case specific applications designed to solve common and critical

BIG DATA PROCESSING A DEEP DIVE IN HADOOP/SPARK & AZURE SQL DW

Topics covered: Fundamentals of Big Data Platforms; Major Big Data Tools. Scaling Up vs. Out: scale up (SMP), scale out (MPP)

Analytics in Action transforming the way we use and consume information

Big Data Ecosystem: The Data, Traditional Data, Big Data, Repositories, MPP Appliances, Internet, Hadoop, Data Streaming

How In-Memory Computing can Maximize the Performance of Modern Payments

2018. The mobile payments market is expected to grow to over a trillion dollars by 2019. How can in-memory computing maximize the performance

Databricks Cloud. A Primer

Who is Databricks? Databricks was founded by the team behind Apache Spark, the most active open source project in the big data ecosystem today. Our mission at Databricks is to

Accelerating Your Big Data Analytics. Jeff Healey, Director Product Marketing, HPE Vertica

Recent Waves of Disruption: IT Infrastructure for Analytics, Data Warehouse Modernization, Big Data/Hadoop, Cloud

Apache Spark 2.0 GA. The General Engine for Modern Analytic Use Cases. Cloudera, Inc. All rights reserved.

Apache Spark Drives Business Innovation. Apache Spark is driving new business value that is being harnessed by technology-forward organizations.

Azure: Microsoft Cloud. Microsoft Cloud End-to-end solutions

Azure is an open cloud: DevOps, Clients, Management, Applications, PaaS & DevOps, App Frameworks & Tools, Databases & Middleware, Infrastructure, Hyper

Guide to Modernize Your Enterprise Data Warehouse How to Migrate to a Hadoop-based Big Data Lake

White Paper. Motivation for Modernization: It is now a well-documented realization among Fortune 500 companies

TechArch Day Digital Decoupling. Oscar Renalias. Accenture

TechArch Day 2018. oscar.renalias@acenture.com, @oscarrenalias, https://www.linkedin.com/in/oscarrenalias/, https://github.com/accenture. THE ERA OF THE BIG

Redefine Big Data: EMC Data Lake in Action. Andrea Prosperi Systems Engineer

Agenda: Data Analytics Today; Big data; Hadoop & HDFS; Different types of analytics; Data lakes; EMC Solutions for Data Lakes

Industrial IoT Solution Architecture Design From Connectivity to Data

Cheryl Hsu, Program Manager, Strategic Engagement & Industrial IoT, Microsoft. IoT Enables a Digital Feedback Loop. The benefits are profound

WELCOME TO. Cloud Data Services: The Art of the Possible

Goals for Today: Share the cloud-based data management and analytics technologies that are enabling rapid development of new mobile applications. Discuss

Hadoop Course Content

Hadoop Overview, Architecture Considerations, Infrastructure, Platforms and Automation. Use case walkthrough: ETL, Log Analytics, Real Time Analytics. HBase for Developers

Investor Presentation. Fourth Quarter 2015

Note to Investors: Certain non-GAAP financial information regarding operating results may be discussed during this presentation. Reconciliations of the differences

Big Data Cloud. Simple, Secure, Integrated and Performant Big Data Platform for the Cloud

Big Data Platform engineered for the data-driven enterprise. Oracle's Big Data Cloud delivers a Big Data Platform

Common Customer Use Cases in FSI

Marketing Optimization. 2014 MapR Technologies. Fortune 100 Financial Services Company: 104M card members. Financial Services: Recommendation Engine

EXECUTIVE BRIEF. Successful Data Warehouse Approaches to Meet Today's Analytics Demands. In this Paper

In this Paper: Organizations are adopting increasingly sophisticated analytics methods. Analytics usage

Modern Data Architecture with Apache Hadoop

Automating Data Transfer with Attunity Replicate. Presented by Hortonworks and Attunity. Executive Summary: Apache Hadoop didn't disrupt the datacenter, the data

MapR: Converged Data Platform and Quick Start Solutions. Robin Fong Regional Director South East Asia

Who is MapR? MapR is the creator of the top ranked Hadoop, NoSQL Database, SQL-on-Hadoop, Real time streaming

Amsterdam. (technical) Updates & demonstration. Robert Voermans Governance architect

Please note: IBM's statements regarding its plans, directions, and intent are subject to change or withdrawal without notice

Spotlight Sessions. Nik Rouda. Director of Product Marketing Cloudera, Inc. All rights reserved. 1

Nik Rouda, Director of Product Marketing, Cloudera, @nrouda. Spotlight: Protecting Your Data

C3 Products + Services Overview

AI, Cloud, Predictive Analytics, IoT. Table of Contents: C3 is a Computer Software Company; C3 PaaS Products; C3 SaaS Products; C3 Product Trials; C3 Center of Excellence

Oracle Autonomous Data Warehouse Cloud

Lower Cost, Increase Reliability and Performance to Extract More Value from Your Data With Oracle Autonomous Database Cloud Service for Data Warehouse. Today's leading-edge

Angat Pinoy. Angat Negosyo. Angat Pilipinas.

Four megatrends will dominate the next decade: Mobility, Social, Cloud, Big data. 91% of organizations expect to spend on mobile devices in 2012. In 2012, mobile

Embracing the Hybrid Cloud using Power BI in CSP. Name Role Group

Agenda: Cloud Vision & Opportunity; What is Power BI; Power BI in CSP; Power BI in Action; Summary. Microsoft vision for new era: Unified platform

Outline of Hadoop. Background, Core Services, and Components. David Schwab Synchronic Analytics Nov.

David Schwab, Synchronic Analytics, https://synchronicanalytics.com, Nov. 1, 2018. Hadoop's Purpose and Origin; Hadoop's Architecture; Minimum Configuration

Managing explosion of data. Cloudera, Inc. All rights reserved.

Customer experience expectations are converging on the brand, not channel: consistent across all channels and lines of business; contextualized to present location and circumstances

Contents at a Glance. Introduction. Part I: Getting Started with Big Data

Introduction. Part I: Getting Started with Big Data. Chapter 1: Grasping the Fundamentals of Big Data. Chapter 2: Examining Big Data Types. Chapter 3: Old Meets New:

Investor Presentation. Second Quarter 2016

Note to Investors: Certain non-GAAP financial information regarding operating results may be discussed during this presentation. Reconciliations of the differences

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

Oracle Data Integrator Enterprise Edition delivers high-performance data movement and transformation among enterprise platforms with its open and integrated E-LT

Taking Advantage of Cloud Elasticity and Flexibility

Fred Koopmans, Sr. Director of Product Management. Public cloud adoption is surging. Cloudera customers are leading the way. Hadoop was born for the

Oracle Autonomous Data Warehouse Cloud

Lower Cost, Increase Reliability and Performance to Extract More Value from Your Data With Oracle Autonomous Data Warehouse Cloud. Today's leading-edge organizations

ADVANCED ANALYTICS & IOT ARCHITECTURES

Presented by: Orion Gebremedhin, Director of Technology, Data & Analytics; Marc Lobree, National Architect, Advanced Analytics. EDW THE RIGHT TOOL FOR THE RIGHT WORKLOAD

SAP Cloud Platform Big Data Services: From Data to Insight

FULL-SERVICE BIG DATA IN THE CLOUD, a fully managed Apache Hadoop and Apache Spark cloud offering, form the cornerstone of many successful Big Data implementations. Enterprises harness the performance

New Big Data Solutions and Opportunities for DB Workloads

Hadoop and Spark Ecosystem for Data Analytics, Experience and Outlook. Luca Canali, IT-DB Hadoop and Spark Service. WLCG, GDB meeting, CERN, September

Stateful Services on DC/OS. Santa Clara, California, April 23rd-25th, 2018

Who Am I? Shafique Hassan, Solutions Architect @ Mesosphere, Operator. Agenda: DC/OS Introduction and Recap; Why Stateful Services

HDInsight - Hadoop for the Commoner Matt Stenzel Data Platform Technical Specialist

10-1-2016. SQL Saturday #557. Thank you Sponsors! Please visit the sponsors and enter their end-of-day raffles. Event After

IBM Analytics Unleash the power of data with Apache Spark

Agility, speed and simplicity define the analytics operating system of the future. Use Spark to create value from data-driven insights. Lower

Oracle Big Data Cloud Service

Delivering Hadoop, Spark and Data Science with Oracle Security and Cloud Simplicity. Oracle Big Data Cloud Service is an automated service that provides a high-powered environment

DLT AnalyticsStack. Powering big data, analytics and data science strategies for government agencies

Now, government agencies can have a scalable reference model for success with Big Data, Advanced and Data Science

Analytics for All Data

How Oracle Analytics Helps Agencies Improve Their Effectiveness. FORCES 2017. Jim Penn, Sr Manager, Public Sector, Oracle Analytics & Big Data. Agenda: Oracle's Analytics Platform Overview

EBOOK: Cloudwick Powering the Digital Enterprise

Contents: What is a Data Lake? Benefits of a Data Lake on AWS. Building a Data Lake on AWS. Cloudwick Case Study. About Cloudwick. Getting Started.

Analytics in the Cloud, Cross Functional Teams, and Apache Hadoop is not a Thing Ryan Packer, Bank of New Zealand

Paper 2698-2018. ABSTRACT: Digital analytics is no longer just about tracking the number

Data Analytics and CERN IT Hadoop Service. CERN openlab Technical Workshop CERN, December 2016 Luca Canali, IT-DB

Data Analytics at Scale: The Challenge. When you cannot fit your workload in a desktop

MIGRATING AND MANAGING MICROSOFT WORKLOADS ON AWS WITH DATAPIPE DATAPIPE.COM

INTRODUCTION. About Microsoft on AWS: Amazon Web Services helps you build, deploy, scale, and manage Microsoft applications quickly,

Building a Data Lake on AWS

Partner Network EBOOK. Contents: What is a Data Lake? Benefits of a Data Lake on AWS. Building a Data Lake On AWS. Featured Data Lake Partner: Bronze Drum Consulting. Case Study: Rosetta
