Managing Data in Motion with the Connected Data Architecture

Transcription:

Managing Data in Motion with the Connected Data Architecture. Dmitry Baev, Director, Solutions Engineering. Doing It Right Symposium: 4th Big Data & Business Analytics Symposium, March 23-24, 2017. Hortonworks Inc. 2011-2016. All Rights Reserved.

Disclaimer: This document may contain product features and technology directions that are under development, may be under development in the future, or may ultimately not be developed. Project capabilities are based on information that is publicly available within the Apache Software Foundation project websites ("Apache"). Progress of the project capabilities can be tracked from inception to release through Apache; however, technical feasibility, market demand, user feedback and the overarching Apache Software Foundation community development process can all affect timing and final delivery. This document's description of these features and technology directions does not represent a contractual commitment, promise or obligation from Hortonworks to deliver these features in any generally available product. Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Since this document contains an outline of general product development plans, customers should not rely upon it when making purchasing decisions.

Age of Data. Internet of Anything. Open source is the norm, and Apache is the center of gravity (founded: 1999).

Connected Data Defines the New Way of Business: systems of insight, personalized experiences, balanced supply chains, new products and services, operational efficiencies.

Harnessing Data in Motion. Tiers: sources, regional infrastructure, core infrastructure. At the source end: constrained, high-latency, localized context. At the core end: hybrid cloud / on-premises, low-latency, global context.

Hortonworks DataFlow: Data in Motion Platform. A platform to build dataflow management and streaming analytics solutions that collect, curate, analyze and act on data in motion across the data center and cloud.

Hortonworks DataFlow for Data in Motion. Powered by Apache NiFi, Kafka, and Storm. Flow management: an easy, secure, reliable way to get the data you need, from the edge to anywhere with intelligence. Stream processing: immediate and continuous insights, because acting on perishable insights in real time maximizes value. Enterprise services: provisioning, management, monitoring, security, audit, compliance, governance, because it all has to work together in an enterprise environment.

Hortonworks DataFlow Manages Data in Motion. Tiers: sources, regional infrastructure, core infrastructure. At the source end: constrained, high-latency, localized context. At the core end: hybrid cloud / on-premises, low-latency, global context.

Simplistic View of Dataflows: Easy, Definitive. Acquire data, flow, process, analyze, store.

Realistic View of Dataflows: Complex, Convoluted. [Diagram: many acquire and store points connected through the dataflow, with processing and analysis in the middle.]

Requirements for Data in Motion: prioritization, connectivity, scalability, extensibility, real-time, adaptability, security, provenance, perishable insights.

Apache NiFi: Designed for 8 challenges of global enterprise dataflow. Visual command and control: for agile and immediate creation, configuration and control of dataflows. Data lineage (provenance): ensures trust of your data. Data prioritization: because not all data is of equal importance. Data buffering/back-pressure: since not all senders, receivers and connections work perfectly all the time. Control of latency vs. throughput: adapt to different situations with different requirements. Secure control plane / data plane: security of data, and of data access. Scale-out clustering: scalability. Extensibility: ecosystem flexibility and growth.
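
The prioritization and back-pressure ideas on this slide can be illustrated with a small, generic sketch. This is not NiFi code or NiFi's API; the class name BackPressureQueue, its methods, and the max_items threshold are hypothetical names chosen for illustration. The queue refuses new items once it is full (so the sender has to slow down or retry) and always delivers the most important item first.

```python
import heapq
import itertools


class BackPressureQueue:
    """Bounded priority queue; a lower priority number is delivered first."""

    def __init__(self, max_items):
        self.max_items = max_items        # hypothetical back-pressure threshold
        self._heap = []
        self._order = itertools.count()   # tie-breaker: FIFO within a priority

    def offer(self, priority, item):
        """Accept the item, or return False to signal back-pressure when full."""
        if len(self._heap) >= self.max_items:
            return False
        heapq.heappush(self._heap, (priority, next(self._order), item))
        return True

    def poll(self):
        """Return the highest-priority item, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = BackPressureQueue(max_items=2)
accepted = [
    queue.offer(5, "routine telemetry"),
    queue.offer(1, "critical alarm"),
    queue.offer(3, "status report"),      # refused: the queue is already full
]
first = queue.poll()                      # the critical alarm jumps the line
```

A real flow manager would typically also bound the queue by total data size and let the operator change the prioritization rule at run time; the sketch keeps only the two core behaviors.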

Hortonworks DataFlow, powered by Apache NiFi

Connecting Data Between Ecosystems Without Coding: 180+ Processors. FTP, SFTP, HL7, UDP, XML, Hash, Merge, Extract, Duplicate, Encrypt, Tail, Evaluate, Execute, GeoEnrich, Scan, Replace, Translate, Split, Convert, HTTP, Email, HTML, Image, Syslog, AMQP, MQTT, Route Text, Route Content, Route Context, Control Rate, Distribute Load, Generate Table Fetch, Jolt Transform JSON, Prioritized Delivery. All Apache project logos are trademarks of the ASF and the respective projects.

Edge Intelligence with Apache MiNiFi. Key features: guaranteed delivery; data buffering (backpressure, pressure release); prioritized queuing; flow-specific QoS (latency vs. throughput, loss tolerance); data provenance; recovery / recording a rolling log of fine-grained history; designed for extension. Different from Apache NiFi: design and deploy; warm re-deploys.

Hortonworks DataFlow Manages Data in Motion. Capabilities: data acquisition; edge intelligence; bi-directional communication; data acquisition and movement; intelligent routing and filtering; parse, enrich and delivery; detect complex patterns in real time; descriptive, prescriptive and predictive analytics; join/split streams; Streaming Analytics Manager. Tiers: edge/sources, regional infrastructure, core infrastructure. At the edge: constrained, high-latency, localized context. At the core: hybrid cloud/on-premises, low-latency, global context.

Stream Processing using Streaming Analytics Manager

Data in Motion Needs Flow Management and Stream Processing. Data in motion using flow management alone: data ingestion, edge intelligence, the first mile problem, physical movement of data, and simple event processing such as route, filter, enrich, transform, etc. Data in motion using flow management and stream processing: flow management delivers data for stream processing; stream processing handles complex pattern matching on unbounded streams of data.
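
As a rough illustration of what "simple event processing" means here, the sketch below applies filter, enrich, transform and route to one event at a time, with no state carried between events. It is plain Python, not NiFi processors; the event fields (device_id, value), the registry lookup, the destination names and the 100.0 routing cutoff are all assumptions made for the example.

```python
def filter_event(event):
    """Filter: keep only events that actually carry a reading."""
    return event.get("value") is not None


def enrich(event, device_registry):
    """Enrich: attach contextual fields looked up by device id."""
    context = device_registry.get(event["device_id"], {})
    return {**event, **context}


def transform(event):
    """Transform: normalize the reading to a float."""
    return {**event, "value": float(event["value"])}


def route(event):
    """Route: choose a destination from the event content (cutoff is hypothetical)."""
    return "alerts" if event["value"] > 100.0 else "archive"


def run_flow(events, device_registry):
    """Apply filter, enrich, transform and route to each event independently."""
    destinations = {"alerts": [], "archive": []}
    for event in events:
        if not filter_event(event):
            continue
        processed = transform(enrich(event, device_registry))
        destinations[route(processed)].append(processed)
    return destinations


registry = {"d1": {"site": "plant-a"}}
events = [
    {"device_id": "d1", "value": "120"},
    {"device_id": "d2", "value": None},
    {"device_id": "d2", "value": "42.5"},
]
result = run_flow(events, registry)
```

Because each step looks at a single event in isolation, this kind of work fits flow management; pattern matching across many events over time is what the slide assigns to stream processing.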

Hortonworks DataFlow: Data-in-Motion Platform. Components: Streaming Analytics Manager, Kafka, NiFi, Storm, others. Flow management: data acquisition, edge processing. Flow management + stream processing: real-time stream analytics, rapid application development. Enterprise services: Ambari, Ranger, other services.

Building a Real-Time Insights Application. Example: Company X provides alerting services when a user's resting heart rate is higher than a threshold. Diagram steps: acquire data in Company X cloud instances 1, 2 and 3; acquire data across cloud instances into the core data center; parse, filter, validate, enrich and route; analytics/pattern match; store; alerts; dashboards/visualization. Legend: flow management, stream processing.

Data in Motion Needs Flow Management and Stream Processing. Flow management: Acquire data from various wearable devices' cloud instances. Move data from customer cloud instances to the on-premises instance. Route and filter data intelligently; the routing and filtering rules will often be changed at run time. Deliver the data to various downstream systems; new downstream apps will be added over time. Parse the device data into a standardized format that downstream systems can understand. Enrich the data with contextual information, including patient/customer info (age, gender, etc.). Stream processing: Recognize the pattern when the resting heart rate exceeds a certain threshold (the insight), and then create an alert/notification. Run an outlier detection model on the streaming heart rate data that comes in; if the score is above a certain threshold, alert on the heart rate.
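
The two stream processing steps above can be sketched for a single user's stream as follows. The slides do not say which outlier model or thresholds are used, so this example substitutes a simple z-score over a rolling window; the constants RESTING_THRESHOLD_BPM, OUTLIER_SCORE_THRESHOLD and WINDOW_SIZE, and the function name detect_alerts, are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev

RESTING_THRESHOLD_BPM = 100      # hypothetical alert threshold
OUTLIER_SCORE_THRESHOLD = 3.0    # hypothetical z-score cutoff
WINDOW_SIZE = 5                  # hypothetical rolling window length


def detect_alerts(heart_rates):
    """Return (bpm, reason) alerts for one user's stream of resting heart rates."""
    window = deque(maxlen=WINDOW_SIZE)
    alerts = []
    for bpm in heart_rates:
        # Pattern 1: the reading exceeds a fixed threshold.
        if bpm > RESTING_THRESHOLD_BPM:
            alerts.append((bpm, "threshold"))
        # Pattern 2: the reading is an outlier relative to recent history.
        if len(window) == WINDOW_SIZE:
            spread = pstdev(window)
            score = abs(bpm - mean(window)) / spread if spread > 0 else 0.0
            if score > OUTLIER_SCORE_THRESHOLD:
                alerts.append((bpm, "outlier"))
        window.append(bpm)
    return alerts


alerts = detect_alerts([60, 62, 61, 63, 62, 90, 105])
```

Unlike the per-event steps handled by flow management, the outlier check needs state (the recent window), which is why the slide places it on the stream processing side.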

Thank You

Managing Data in Motion with the Connected Data Architecture. Actionable intelligence: perishable insights and historical insights. Capture streaming data; deliver perishable insights; combine new and old data; store data forever; access a multi-tenant data lake; model with artificial intelligence. Data in motion (Hortonworks DataFlow). Data at rest (Hortonworks Data Platform).

Hortonworks DataFlow: Data in Motion Platform. HDF provides the flow management, stream processing, and enterprise services needed to collect, curate, analyze and act on data in motion across the data center and cloud.

Real-Time, Visual Control of Dataflows between the Data Center and the Cloud. With Hortonworks DataFlow: add and adjust data sources to maximize the opportunity that you capture from perishable insights; visually trace the data path to manage the what, who, where and how around data in motion; dynamically adjust the pipeline to match the dataflow with your bandwidth.

Integrated Processes and Control for Dataflows and Streaming Analytics. Common architecture without Hortonworks DataFlow: ingest, scripts, messaging. With Hortonworks DataFlow: Hortonworks DataFlow, scripts. Optimize the architecture: reduce cost and complexity with the most efficient data collection technologies. Assure efficient operations: via real-time control of data inputs, outputs, transportation and transformations. Rely on a common foundation: eliminating dependence on multiple customized systems.

Secure Dataflows with Chain of Custody and Data Provenance (Hortonworks DataFlow). End-to-end security: apply security rules to encrypt, decrypt, filter and replace data from the point of collection at the jagged edge to its final destination. Granular control and sharing: move beyond role-based access and dynamically share an entire dataflow. Real-time traceability: rich metadata and contextual detail help troubleshoot security issues and inform timely decisions.

Streaming Dataflows with Adaptive Control. Automated: bi-directional communication between source and destination adapts dataflows automatically, according to current priorities. On-demand: operational control to adapt to changing conditions and requirements. Scalable: by incorporating data from any device, from small machine sensors to enterprise data centers, HDF connects you to the broadest set of disparate data sources.

Data Acquisition & Delivery: First Mile Problem - Architecture. Consumer devices (DVR, home security) running MiNiFi and client libraries; regional data center with a NiFi server cluster; core data center with a NiFi server cluster.

End-to-End Data in Motion Architecture. Tier 1: regional data center (~1000), MiNiFi. Tier 2: distribution center (~100), MiNiFi. Tier 3: national data center (3-6). Diagram components: router, switch, gateway, storage, networking device; client libraries on DVR and home security devices; NiFi server clusters; Kafka; Storm/Spark/Flink/Apex; Streaming Analytics Manager; Splunk.