Cloudera Hadoop & Industrie 4.0: Where to Put the Data Stream? Bernard Doering, Regional Sales Director, Central Europe
Cloudera Hadoop: Scalable, Flexible, Open, Cost-Effective. 2014 Cloudera, Inc. All rights reserved.
Hadoop vs Relational Databases. Schema-on-Write: schema must be created before any data can be loaded; reads are fast; standards and governance. Schema-on-Read: data is simply copied to the file store, no transformation is needed; loads are fast; flexibility and agility.
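The schema-on-read side of this comparison can be illustrated with a minimal sketch in plain Python. It is not Hadoop or Cloudera code; the names `raw_store`, `load`, and `read` are hypothetical and chosen only for this example. Loading appends the raw line untouched, and a schema (field names plus casts) is applied only when the data is read.

```python
import json

# Raw events are kept exactly as they arrive (schema-on-read):
# loading is a plain append, with no validation or transformation.
raw_store = []

def load(line: str) -> None:
    """Store the raw line as-is; this is why loads are fast."""
    raw_store.append(line)

def read(fields: dict) -> list:
    """Apply a schema at query time: pick and cast the requested fields."""
    rows = []
    for line in raw_store:
        record = json.loads(line)
        rows.append({
            name: cast(record[name]) if name in record else None
            for name, cast in fields.items()
        })
    return rows

# Two records with different shapes can sit side by side in the store.
load('{"sensor": "a1", "temp": "21.5", "unit": "C"}')
load('{"sensor": "b2", "temp": "19.0", "firmware": "2.3"}')

# The schema is chosen by the reader, not fixed at load time.
readings = read({"sensor": str, "temp": float})
```

A second reader could call `read({"firmware": str})` on the same store without reloading anything, which is the flexibility the slide attributes to schema-on-read. The cost is that parsing and casting happen on every read.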
Cloudera Company Snapshot. Founded: 2008, by former employees of (company logos missing from the source). Funding: >$1B invested in opportunity, ~$670M primary. Employees today: ~740. World-class support: more than 70 global product support staff, 24x7; proactive and predictive support programs using our EDH. Mission-critical: production deployments in run-the-business applications worldwide (financial services, retail, telecom, media, health care, energy, government, manufacturing). The largest ecosystem: more than 1,000 partners. Cloudera University: over 40,000 IT engineers trained. Open-source leaders: Cloudera employees are leading developers and contributors to the complete Apache Hadoop ecosystem of projects.
Leading the Way in Data Management, Powered by Hadoop. Milestones, 2008 to 2014, in slide order: Cloudera founded by Mike Olson, Amr Awadallah & Jeff Hammerbacher; Cloudera releases CDH, the first commercial Apache Hadoop distribution; Cloudera reaches 100 production customers; Cloudera Enterprise 4, the standard for Hadoop in the enterprise; Cloudera Impala, Cloudera Navigator, Cloudera Search; the enterprise data hub launched, CDH 5. Further milestones, 2009 to 2013, in slide order: Hadoop creator Doug Cutting joins Cloudera; Cloudera Manager, the first management application for Hadoop; Cloudera University expands to 140 countries; Cloudera Connect reaches 300 partners; Tom Reilly joins as CEO; over 800 partners in Cloudera Connect. Ask Bigger Questions.
Hadoop and the Enterprise Data Hub: An Open-Source Data Engine at the Core, Built for the Modern Enterprise. Cloudera's Enterprise Data Hub: batch processing (MapReduce); analytic SQL (Impala); search engine (Solr); machine learning (Spark); stream processing (Spark Streaming); 3rd-party apps; workload management (YARN); data management (Cloudera Navigator); storage for any type of data, unified, elastic, resilient, secure (Sentry), with filesystem (HDFS) and online NoSQL (HBase); system management (Cloudera Manager). Key attributes: secure, governed, and compliant; unified and managed; open architecture and scalable; open-source and cost-effective.
Cloudera & The Intel Alliance
Big Deal: Cloudera + Intel. Intel invests $740M in Cloudera: Intel's largest data center venture capital investment, representing Intel's commitment to the Internet of Things and Big Data; supports Cloudera's ability to remain independent. Intel & Cloudera drive innovation through open source: accelerate the evolution of Hadoop by joining forces on foundational technologies; enable open source developers to innovate in and on top of the Hadoop platform. Intel enables CDH to run best on Intel Architecture (performance optimisation): enables Cloudera to make best use of Intel data center technologies; provides datacenter infrastructure for Cloudera development & benchmarking at scale. Intel Confidential
Big Goal: Converge on one open source platform. Most stable, compatible, and mature Hadoop distribution; the only distribution with performance and security enhanced from the silicon up; leading SQL functionality & performance (Impala); deepest management and governance capabilities; 50 Hadoop developers and 12 committers; 150 Hadoop developers, 100 open source committers; leading security capabilities including encryption, access control, and auditing; long-standing commitment to open source with 1000 developers working on Linux, KVM, Xen, Java, OpenStack, Hadoop.
Cloudera for Big Data
Data drives innovation. Internet of Things: intelligent things (richer data from devices); intelligent cloud (richer data to analyze); smart clients (richer user experiences). 2.8 zettabytes of data generated worldwide in 2012 (1); 40 zettabytes of data will be generated worldwide in 2020 (1). Sources: (1) IDC Digital Universe 2020, (2) IDC
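As a quick piece of arithmetic on the two IDC figures quoted on this slide, the implied compound annual growth rate can be worked out as below. The only inputs are the slide's own numbers (2.8 ZB in 2012, 40 ZB in 2020); the assumption that growth compounds evenly year over year is mine, not the slide's.

```python
# Figures taken from the slide (attributed there to IDC), in zettabytes per year.
zb_2012 = 2.8
zb_2020 = 40.0
years = 2020 - 2012

# Overall multiple across the whole period.
growth_factor = zb_2020 / zb_2012

# Implied compound annual growth rate, assuming even year-over-year growth.
cagr = growth_factor ** (1 / years) - 1

print(f"{growth_factor:.1f}x overall, {cagr:.1%} per year")
```

Under that assumption the volume grows roughly fourteen-fold over eight years, which corresponds to a little under 40% per year.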
Big Data is All Data and All Paradigms. Transactional & Application Data: volume, structured, throughput. Machine Data: velocity, semi-structured, ingestion. Social Data: variety, highly unstructured, veracity. Enterprise Content: variety, highly unstructured, volume.
Expanding Data Requires A New Approach. 1980s: Bring Data to Compute. Now: Bring Compute to Data. Process-centric businesses use: structured data mainly; internal data only; important data only. Information-centric businesses use all data: multi-structured, internal & external data of all types. (Diagram: relative size & complexity of compute and data.)
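The "bring compute to data" idea can be sketched in a few lines of plain Python. This is an illustration only, not Hadoop code: each inner list stands in for a data block held on a different node, and the function names `local_summary` and `merge` are hypothetical. The point is that the function travels to each partition and only a small summary travels back.

```python
# Each partition stands in for a data block stored on a different node.
partitions = [
    [("pump", 3.1), ("valve", 0.4), ("pump", 2.9)],
    [("valve", 0.6), ("pump", 3.3)],
    [("motor", 7.0), ("valve", 0.5)],
]

def local_summary(rows):
    """Runs where the data lives; returns a small (sum, count) per key."""
    out = {}
    for key, value in rows:
        total, count = out.get(key, (0.0, 0))
        out[key] = (total + value, count + 1)
    return out

def merge(summaries):
    """Combines the small per-partition summaries into global means."""
    merged = {}
    for summary in summaries:
        for key, (total, count) in summary.items():
            t, c = merged.get(key, (0.0, 0))
            merged[key] = (t + total, c + count)
    return {key: total / count for key, (total, count) in merged.items()}

# Only the summaries cross the "network", never the raw rows.
means = merge(local_summary(p) for p in partitions)
```

In the older "bring data to compute" model, all seven raw rows would be copied to one machine first. Here at most one `(sum, count)` pair per key leaves each partition, so the amount of data moved depends on the number of keys rather than the number of rows.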
The Old Way: Moving Data to Compute. Huge Investment in Specialized Systems that Treat Data as a Commodity. Major challenges: Missing Data (leaving data behind; risk and compliance; high cost of storage). Time to Data (up-front modeling; transforms slow; transforms lose data). Cost of Analytics (existing systems strained; no agility; BI backlog). Complex Architecture (many special-purpose systems; moving data around; no complete views). Systems: EDWs, marts, servers, documents, storage, search, archive. Data sources: ERP, CRM, RDBMS, machines; files, images, videos, logs, clickstreams; external data sources.
The Old Way: Siloed Business Functions. Lack of Coordination Increases Opportunity Costs and Decreases Data Availability. Major challenges: poor visibility; inefficiency; extreme cost; complexity. Diagram labels: marketing, risk, transactional, lending, credit cards, investment, back office; transactions, logs, market data, research, customer data.
The New Way: Bringing Compute to Data. Maximize Benefit from All Your Data for Mission-Critical Jobs and Innovation. Major benefits: Active Compliance Archive (full-fidelity original data; indefinite time, any source; lowest-cost storage). Persistent Storage (one source of data for all analytics; persist state of transformed data; significantly faster & cheaper). Self-Service Exploratory BI (simple search + BI tools; schema-on-read agility; reduce BI user backlog requests). Diverse Analytic Platform (bring applications to data; combine different workloads on common data, e.g. SQL + Search; true analytic agility). Systems: servers, marts, EDWs, documents, storage, search, archive. Data sources: ERP, CRM, RDBMS, machines; files, images, videos, logs, clickstreams; external data sources.
The New Way: Bring Business Functions to Data. Consolidate Relevant Services and Data in a Multi-tenant Environment. Major benefits: compliant; centralized; self-service; multiple workloads. Diagram labels: 360° view; marketing, back office, risk, investment, transactional, lending, credit cards; logs, research, market, transactions, customer.
The Modern Information Architecture. Roles: data architects, system operators, engineers, data scientists, analysts, business users. Tools: metadata / ETL tools, Cloudera Manager, converged applications, machine learning, BI / analytics, enterprise reporting. Platforms: enterprise data hub, enterprise data warehouse, online serving system. Sources: sys logs, web logs, files, RDBMS, web/mobile applications. Customers & end users.
Sample Use Cases
Insurance Use Case. 360° view: differentiate coverage options by customizing plans based on information collected about customers' lifestyle, health patterns, habits, and preferences. Problem, can't scale for sensor data: current systems cannot integrate telemetric and sensor data delivered in real time with historical data to tailor policies and incentive plans to the user. Solution, stream processing: Spark Streaming is used to calculate pricing occasions in real time based on live, unstructured data-in-motion from sensors, mobile devices, nanotechnology, etc.
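The idea of recalculating a price as sensor readings arrive can be sketched with a sliding window in plain Python. This does not use Spark Streaming and is not the insurer's method; the class `SlidingWindowPricer`, the speed threshold, and the surcharge rule are all hypothetical values invented for illustration.

```python
from collections import deque

class SlidingWindowPricer:
    """Keeps the last `size` speed readings and derives a premium factor."""

    def __init__(self, size=3, speed_limit=130.0, surcharge=0.10):
        self.window = deque(maxlen=size)   # old readings fall out automatically
        self.speed_limit = speed_limit     # hypothetical threshold, km/h
        self.surcharge = surcharge         # hypothetical surcharge per violation

    def update(self, speed_kmh):
        """Consume one reading and return the current premium factor."""
        self.window.append(speed_kmh)
        violations = sum(1 for s in self.window if s > self.speed_limit)
        return 1.0 + self.surcharge * violations

# Feed a short stream of readings one at a time, as a stream processor would.
pricer = SlidingWindowPricer()
factors = [pricer.update(s) for s in [100, 140, 150, 120, 110, 90]]
```

The factor rises while speeding readings are inside the window and falls back to 1.0 once they age out. A production system would additionally join each window with historical customer data, which is the integration step the slide says older systems could not scale to.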
Streamlining drivers' customer experience (auto manufacturer). Challenge: each vehicle comprises thousands or millions of components, many streaming machine data; want to build loyalty by minimizing maintenance issues. Solution: improved customer loyalty through proactive care; Cloudera correlates manufacturing data with customer information; predictive analytics & machine learning enable dynamic customer profiles & personalization.
Manufacturing IoT Trends: Connected Car and Smart Meter Grids. Value-added services & apps: customer micro-segmentation and loyalty; alerts; proactive maintenance; quality improvement; operator services; performance optimisation, e.g. fuel or power consumption.
Customer Success Across Industries: Financial & Business Services; Telecom & Technology; Healthcare & Life Sciences; Media & Information; Retail & Consumer; Energy & Public Sector.
Enabling The App Store of Big Data: BI and Analytics Partners; SI, Cloud, MSP Partners; Database Partners; Resellers; Data Integration Partners; Hardware Partners.
Thank You! Bernard Doering, bdoering@cloudera.com, Tel. +49 172 692 9837