Pentaho 8.0 and Beyond
Matt Howard, Pentaho Sr. Director of Product Management, Hitachi Vantara

Safe Harbor Statement
The forward-looking statements contained in this document represent an outline of our current intended product direction. It is provided for information purposes only and is not a commitment to deliver any new or enhanced product or functionality, or that we will pursue the product direction described. Facts and circumstances may occur which may impact current plans, resulting in changes to the information in this presentation. This information is current only as of the date it is made and should not be relied upon in making purchasing decisions. The development, release (if at all), and timing of any features or functionality described for the Pentaho products remains at the sole discretion of Pentaho.

Pentaho 8.0 and Beyond
1. Product Vision
2. Pentaho 8.0
3. Product Roadmap

Product Vision

The Power of Three
HITACHI DATA SYSTEMS
> Content platform
> Storage solutions
PENTAHO
> Data Integration
> Business Analytics
HITACHI INSIGHT GROUP
> Lumada IoT

Pentaho Business Analytics Platform
Users: Data Engineer, Data Analyst / Data Scientist, Business Analyst, Consumer
- Production Reporting
- Interactive Query and Analysis
- Custom and Self-Service Dashboards
- Pentaho Data Integration
- Data Preparation
- Integrated Machine Learning
OPEN AND EMBEDDABLE
Data sources: Operational Data, Big Data, Data Stream, Public/Private Clouds

Future Vision: A Single Consistent Experience
Data Engineering, Data Prep, Analytics
Ingestion, Processing, Blending, Data Delivery, Data Discovery / Analysis, Analysis & Dashboards
Administration, Security, Lifecycle Management, Data Provenance, Dynamic Data Pipeline, Monitoring, Automation

Pentaho 8.0

Introducing Pentaho 8.0
Challenge #1: Data volumes and velocity are growing exponentially
Pentaho 8.0 broadens connectivity to streaming data sources
- Connect to Kafka streams
- Stream processing with Spark
- Big data security with Knox
Challenge #2: Processing and storage resources are constrained
Pentaho 8.0 optimizes processing resources
- Enhanced Adaptive Execution (AEL)
- Native Avro and Parquet handling
- Worker nodes for scale-out
Challenge #3: Shortage of Big Data talent and lack of productivity
Pentaho 8.0 boosts team productivity across the pipeline
- Data Explorer filters
- Improved repository UX
- Extended operations mart

Streaming for Time Sensitive Insight
Enable use cases that require real-time processing, monitoring and aggregation
- Real-time device monitoring
- Log-file aggregation
- Notifications
- And more
NEW in Pentaho 8.0
- Kafka Producer Step
- Kafka Consumer Step
- Get records from stream Step
- Spark streaming via AEL
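The monitoring and aggregation use cases on this slide can be illustrated with a small sketch. It is plain Python, not PDI, Kafka, or Spark code: the record shape `(timestamp_seconds, device_id)`, the function names, and the threshold are all hypothetical, chosen only to show fixed-window aggregation followed by a notification check.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape for illustration: (timestamp_seconds, device_id).
# This is not a Pentaho, Kafka, or Spark API.
Record = Tuple[float, str]


def tumbling_window_counts(
    records: Iterable[Record], window_seconds: float
) -> Dict[int, Counter]:
    """Count records per device inside fixed, non-overlapping time windows."""
    windows: Dict[int, Counter] = defaultdict(Counter)
    for timestamp, device_id in records:
        # Every record falls into exactly one window, identified by its index.
        window_index = int(timestamp // window_seconds)
        windows[window_index][device_id] += 1
    return dict(windows)


def devices_over_threshold(
    windows: Dict[int, Counter], threshold: int
) -> List[Tuple[int, str]]:
    """Return (window_index, device_id) pairs whose count exceeds the threshold."""
    return sorted(
        (window_index, device_id)
        for window_index, counts in windows.items()
        for device_id, count in counts.items()
        if count > threshold
    )


if __name__ == "__main__":
    sample = [(0.5, "pump-1"), (1.2, "pump-1"), (1.9, "pump-2"), (10.1, "pump-1")]
    per_window = tumbling_window_counts(sample, window_seconds=10)
    print(per_window)
    print(devices_over_threshold(per_window, threshold=1))
```

In a real pipeline the records would arrive continuously from a stream consumer rather than from a list, and windows would be emitted as they close; the sketch only shows the grouping logic.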

Pentaho 7.1 Adaptive Execution for Spark
- No Coding
- Build Once
- Execute on Any* Engine
*Currently Available Engines
Diagram labels: PDI, Pentaho Kettle

Enhanced Adaptive Execution
Simplified setup
- Eliminated Zookeeper component
- Reduced number of setup steps
Hardened deployment
- Fail-over at the edge
- Kerberos impersonation for client
More flexible
- Support multiple run configurations
- Customize cluster settings per job type
Diagram labels: PDI Client; AEL-Spark Daemon on Edge Nodes; AEL-Spark Engine (Spark Driver); HADOOP CLUSTER; Spark/Hadoop Processing Nodes; Spark Executors; Hadoop/Spark Compatible Storage Cluster (HDFS, Azure Storage, Amazon S3, etc.)

Worker Nodes for Scaling Out
- Scale work items across multiple nodes (containers)
- Easily add and remove resources as required
- Monitor and balance changing workloads
- Deploy on premise, cloud and hybrid
Distribute and Scale: Worker Node (a), Worker Node (b), Worker Node (c)
NEW in Pentaho 8.0
- Container framework
- Orchestration framework
- Node monitoring
- Enhanced HA implementation
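The idea of spreading work items across worker nodes and balancing load can be sketched with a simple greedy scheduler. This is an illustration only and does not reflect how the Pentaho orchestration framework actually schedules work: the function name, the node names, and the per-item cost estimates are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def assign_work_items(
    items: List[Tuple[str, int]], node_names: List[str]
) -> Dict[str, List[str]]:
    """Greedy balancing: each item goes to the node with the smallest load so far.

    `items` holds (item_name, estimated_cost) pairs; the costs are assumed inputs.
    """
    # Min-heap of (current_load, node_name); the least-loaded node is popped first.
    heap = [(0, name) for name in node_names]
    heapq.heapify(heap)
    assignment: Dict[str, List[str]] = {name: [] for name in node_names}

    # Placing the most expensive items first tends to give a more even spread.
    for item_name, cost in sorted(items, key=lambda item: -item[1]):
        load, name = heapq.heappop(heap)
        assignment[name].append(item_name)
        heapq.heappush(heap, (load + cost, name))
    return assignment


if __name__ == "__main__":
    work = [("job-a.kjb", 5), ("trans-b.ktr", 3), ("trans-c.ktr", 2), ("trans-d.ktr", 2)]
    print(assign_work_items(work, ["wn1", "wn2"]))
    # Adding a node is just a longer node list; the items are redistributed.
    print(assign_work_items(work, ["wn1", "wn2", "wn3"]))
```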

Worker Nodes Architecture
Diagram labels: Pentaho Clients; Pentaho Server; Pentaho Repository; WORKER NODES; Orchestration Framework; Orchestration (Scheduler, monitoring, security, etc.); Master (Working); Master (Standby); Master (Standby); Controller (HA); Container Framework; WN 1 (e.g. KJB); WN 2 (e.g. KTR); WN n; Executor; Powered by

Pentaho 7.0 Data Explorer
Access visualizations during data prep for inspection and prototyping

Data Explorer Filters
Enhanced data inspection in PDI
- Identify data to be cleaned or removed
- Deliver data to the business more quickly
ENHANCED in Pentaho 8.0
- Numeric filters
- String filters
- Include/Exclude data points

Pentaho 8.0 Complete
Data Integration
- Filters in Data Explorer for enhanced data inspection during prep
- New PDI Repository Dialogs for better usability
- Run Configurations for Jobs for seamless user experience
Big Data
- Stream Data Processing to simplify near real time integration with Kafka
- Enhanced AEL for reliability, performance, and security
- Big Data File Formats to support crucial Hadoop use cases
- Big Data Security with HDP Knox Gateway
- VFS Improvements for named Hadoop clusters
Enterprise Platform
- Worker Nodes Scale-Out to drive superior agility and TCO for enterprises
- Ruby Theme new platform branding
Additional Items
- Ops Mart for Oracle, MySQL, SQL Server
- Big Data Sandbox VM updates
- Platform password security improvements
- PDI Mavenization for infra alignment
- Documentation improvements on help.pentaho.com

Product Roadmap

Roadmap Initiatives
PENTAHO FOUNDATIONAL INVESTMENT AREAS
Visual Data Experience
- Data Exploration
- Visual Data Prep
- Embedded Analytics
- Data Catalog
Big Data Processing
- Adaptive Execution
- Spark Execution
- Stream Processing
- Machine Learning
Enterprise Platform
- Scale-out Deployment
- Metadata Management
- Operations Management
- Cloud Deployment
EMERGING TRENDS AND TECHNOLOGY
- Advanced Analytics
- Real-time

Strengthening the Bridge Between Data and Insight
DATA EXPLORER
- Visual data inspection
- Intuitive data prep
- Advanced visualization
CATALOG
- Governed access
- Searchable metadata
- Collaboration
Diagram labels: Source 1, Source 2, Source 3, Source 4, Source 5

Inline Data Prep Vision
Intuitive, Excel-like transformation design
- Inline Model
- Inline Transformation
- Integrated Profiling
Field Statistics
  Field Type: Integer
  Records: 10,000
  Cardinality: 273
  Min <count>: 1
  Max <count>: 23
  Bin Size (%): Quintile
Merge Fields
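The field statistics shown in this mock-up can be approximated with the Python standard library. The sketch below is an interpretation, not the product's profiling logic: it assumes that "Min <count>" and "Max <count>" mean the fewest and most occurrences of any distinct value, and that the quintile bin size corresponds to four cut points splitting the values into five bins. The function name and the sample data are hypothetical.

```python
from collections import Counter
from statistics import quantiles
from typing import Any, Dict, List


def field_statistics(values: List[int]) -> Dict[str, Any]:
    """Profile one integer field: record count, cardinality, and value frequencies."""
    counts = Counter(values)
    return {
        "records": len(values),
        "cardinality": len(counts),            # number of distinct values
        "min_count": min(counts.values()),     # assumed meaning of "Min <count>"
        "max_count": max(counts.values()),     # assumed meaning of "Max <count>"
        # Four cut points divide the values into five bins (quintiles).
        "quintile_cut_points": quantiles(values, n=5),
    }


if __name__ == "__main__":
    sample = [1, 1, 2, 3, 3, 3, 4, 5, 5, 9]
    for name, value in field_statistics(sample).items():
        print(f"{name}: {value}")
```

`statistics.quantiles` needs at least two data points, so a real profiler would have to handle empty or single-value fields separately.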

Pentaho Machine Learning Orchestration
Roadmap projects that serve emerging needs of data scientists.
- Catalog
- Data Explorer
- Notebook Integrations
- Adaptive Execution
- Native Algorithms

Pentaho Roadmap
Features and dates are subject to change.
Timeline columns: Nov 2017, 1H18 (8.1), Future
VISUAL DATA EXPERIENCE: Data Explorer Filters Catalog I Visual Profiling Catalog Search Data Prep from DET Layout Manager New User Console Data Science Viz Real-time Viz
(BIG) DATA PROCESSING: Kafka Interface Spark Streaming Parquet and Avro Enhanced AEL Streaming II Enhanced JSON/XML/ORC AEL - extend distros Advanced Profiling Rules Validator Native ML algorithms AEL Flink Thin Kettle (Composer) Web Designer Data Operations Mgr. AEL Next
ENTERPRISE PLATFORM: Scale-out Framework Foundry Integration Unified Monitoring Harden Metadata Bridges Vantara Integrations Enhanced Upgrade Enhanced Security New Content Lifecycle Vantara Integrations Metadata Manager Business Glossary Multi-tenancy Vantara Integrations
ECOSYSTEM: AEL HDP, MapR Google Cloud Platform Cassandra/NoSQL Update Multi-cloud Orchestration Cloud App Connectors Mainframe Enhanced SAP and SFDC

Hitachi Vantara Portfolio
Diagram labels: Application Framework Studio Dashboards Visualization Notifications App Development Edge Processing Asset Management Data Data Integration Analytics Asset registry Data catalog Metadata management Modeling and lineage Governance Data connectors Transformation engines Profiling and quality Data blending Data preparation Business analytics Content analytics Artificial intelligence Batch and stream Search Foundry Software Service Platform Workflow Scheduling Security Clustering Repository Monitoring Flash Storage Storage Converged Infrastructure Automated Management Data Protection

IoT Solutions from Edge to Outcomes
Solution areas: SMART DATA CENTER, SMART BUSINESS, SMART INDUSTRY, SMART CITY
Layers: Edge, Fog Layer, Core, Insights, Outcomes
Sources: Sensors, Things, People
Pipeline stages: Telemetry, Edge Filtering, Asset Registry, Stream Queues, Ingest, Process, Visualize, Model, Predict, Notify
Components: Lumada IoT Data Pipeline, IoT Analytic Processor

Unlock the Business Value in YOUR Data
YOUR STRATEGY: Need for Better Insights To Achieve Better Outcomes
YOUR INSIGHTS: Big Data Analytics, Content Exploration (Pentaho, Hitachi Content Intelligence, Hitachi Content Platform)
YOUR DATA: Transactional Data; Email and Documents; Video, Image and Audio; Social Media; IT, Sensor and Machine Logs

The Power of Three
HITACHI DATA SYSTEMS
> Content platform
> Storage solutions
PENTAHO
> Data Integration
> Business Analytics
HITACHI INSIGHT GROUP
> Lumada IoT

Summary

What we covered today:
- Product Vision
- Pentaho 8.0 Release
- Product Roadmap

Next Steps
Want to learn more about Pentaho 8.0 and product roadmap?
Other recommended breakout sessions:
- Processing Big Data with Pentaho: Rakesh Saha
- Operating Pentaho at Scale: Jens Bleul
Solution Expo:
- Pentaho 8.0 and Beyond
- Lumada IoT Platform
- Hitachi Content Platform
- Spark Processing
- And more.

Pentaho 8.1 Preview
Some Candidate Projects
- Enhanced Streaming
- Enhanced Profiling
- Google Cloud Platform
- Unified Monitoring and Logging
- Enhanced Metadata Handling
Pentaho 8.1 Expected Availability: Q2 2018