Pentaho Technical Overview. Max Felber Solution Engineer September 22, 2016

Transcription:

Pentaho Technical Overview Max Felber Solution Engineer mfelber@pentaho.com September 22, 2016

Industry Leader in Self-Service Big Data Preparation

Gartner recently completed a study on 36 self-service data providers [Gartner Report]. According to Gartner, a vendor should fulfill the following 4 pillars of self-service data preparation:
1. Stand-Alone Self-Service Data Preparation
2. Integrated With Existing Data Integration Platforms
3. Integrated With Modern BI&A Platforms
4. Integrated With Advanced Analytics/Data Science Platforms
Only 3 vendors met each pillar: Oracle, IBM and Pentaho.

How often have you heard the term "Data Lake"? James Dixon, our founder, invented the term.

Maybe you've heard of Weka for machine learning? Mark Hall, the man who leads our data mining services, not only developed it, he wrote the book on it.

So why do customers choose Pentaho?
1. Metadata Injection. By utilizing dynamic rather than static bindings, we can reduce the number of data transformations by 80-90 percent. YouTube Demo by Matt Casters, Chief Architect of Data Integration and Kettle Project Founder at Pentaho.
2. Visual MapReduce. Graphically build Hadoop data transformations without coding. This enables you to reduce your Hadoop development time by over 85 percent. Additionally, Pentaho automatically manages the deployment and execution of Hadoop transformations with YARN. YouTube Demo by Doug Moran, Product Manager for Big Data Technologies and Co-Founder of Pentaho.
3. Embedded Analytics. White label our reports, visualizations and dashboards directly into your web applications. Use Java and REST APIs to access Pentaho data transformations and reports. YouTube Demo by Anthony de Shazor, SVP of Customer Care and Principal Architect.
4. Beyond ETL. Pentaho supports Enterprise Information Integration (EII), also known as data federation. Now you can create ETL jobs that blend data from structured and big data sources and invoke them via JDBC with Teiid.
5. Weka Scoring, Forecasting and R script execution with our Data Science Pack. This package helps data scientists reduce data preparation times by 60-80 percent.
6. Deep Integration with your Cloudera, Hortonworks or MapR Hadoop ecosystem. Pentaho offers real control over YARN jobs, Spark execution, Oozie, Sqoop and more.
7. 2,000 Commercial Customers and 20,000 Production Deployments.
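The "dynamic vs. static bindings" idea behind Metadata Injection can be illustrated with a small sketch in plain Python. This is not Pentaho's API or how Kettle implements the feature; the function name run_template, the metadata layout and the two sample feeds are all hypothetical, chosen only to show one generic template being reused for differently shaped sources instead of writing one hard-wired transformation per source.

```python
import csv
import io


def run_template(source_text, metadata):
    """Generic 'template transformation' (hypothetical, not a Pentaho API).

    Nothing about the source layout is hard-coded here: the delimiter,
    the field names and the target types are all supplied at run time
    through the metadata dictionary.
    """
    reader = csv.DictReader(io.StringIO(source_text),
                            delimiter=metadata["delimiter"])
    rows = []
    for record in reader:
        row = {}
        for field in metadata["fields"]:
            # Rename the source column and convert it to the target type.
            row[field["target"]] = field["type"](record[field["source"]])
        rows.append(row)
    return rows


# Two made-up feeds with different delimiters and columns.
CUSTOMERS_CSV = "id;name\n1;Ada\n2;Grace\n"
ORDERS_CSV = "order_no,total\nA-7,19.90\n"

# One metadata description per feed replaces one static transformation per feed.
CUSTOMERS_META = {
    "delimiter": ";",
    "fields": [
        {"source": "id", "target": "customer_id", "type": int},
        {"source": "name", "target": "customer_name", "type": str},
    ],
}
ORDERS_META = {
    "delimiter": ",",
    "fields": [
        {"source": "order_no", "target": "order_id", "type": str},
        {"source": "total", "target": "amount", "type": float},
    ],
}

customers = run_template(CUSTOMERS_CSV, CUSTOMERS_META)
orders = run_template(ORDERS_CSV, ORDERS_META)
```

Under this assumption, onboarding a new feed means adding a metadata entry rather than another transformation, which is the kind of reduction the slide refers to.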

Best of Breed vs. Best Platform

It's all about Balance. Focus on the Integration and Workflow.

Pentaho: Part of Hitachi Group Companies
Digital Operations, Digital Enterprise, Digital Customer Experience & Smart Infrastructure
Turnkey BI and Big Data Solutions
Digital Transformation & Social Innovation
Big Data Integration & Analytics. Embedded or SaaS model
Cloud
Big Data since 2009
Traditional DI & BI since 2004
SSO & Java Spring
DB per Group/Tenant
Row-level Multi-tenancy
Object Multi-tenancy
UI Multi-tenancy
Scale-out Architecture
Unified Compute Platform (UCP)
Hyper-Scale-out Platform (HSP)

End-to-End Big Data Integration, Embedded Analytics & Dashboards
Self-Service Analytics
Big Data Sources
Built for Business Analysts and Developers
Structured Data
Streamlined Data Refinery
Blended Big Data / Legacy
Data Federation
Extract, Transform & Load
Enterprise Information Integration
Hadoop, NoSQL, Streaming
Self-Service Analytics
Dashboards, Mobile
Analyzer, Interactive Reporting
Automatic Schema Generation
Web-based, Open-Platform
Embeddable, Open Standards
Weka & R Support
Dozens of Visualizations
Dashboards in Minutes
Mobile Reporting
Web & Streaming

End-to-End Development Solution
Data Engineering
Data Preparation
Analytics
Managing and Automating the Pipeline
Administration
Security
Lifecycle Management
Data Provenance
Dynamic Data Pipeline
Monitoring
Automation

So why do customers buy Pentaho?
1. Metadata Injection. By utilizing dynamic vs. static bindings, we can reduce the number of data transformations by 80-90 percent. YouTube Demo by Matt Casters, Chief Architect of Data Integration and Kettle Project Founder at Pentaho.
2. Visual MapReduce. Graphically build Hadoop data transformations without coding. This enables you to reduce your Hadoop development time by over 85 percent. Additionally, Pentaho automatically manages the deployment and execution of Hadoop transformations with YARN. YouTube Demo by Doug Moran, Product Manager for Big Data Technologies and Co-Founder of Pentaho.
3. Embedded Analytics. White label our reports, visualizations and dashboards directly into your web applications. YouTube Demo by Anthony de Shazor, SVP of Customer Care and Principal Architect.
4. Current Big Data Projects Struggling or Failing
5. Internet of Things (IoT) Initiatives
6. On-Boarding Initiatives (Consolidation, Data Warehousing, SaaS)
7. 360 Degree View (Blending Traditional & Big Data Sources)
8. Predictive Analytics (R) & Machine Learning (Weka)
9. Embedding (White-Labeling) Reporting & Analytics
10. ERP Migration
11. Cloud Deployments
12. Data Federation (a.k.a. Enterprise Information Integration)
13. Blended Application and Data Layer Integration (SOA, Web Services, etc.)

Thank You