Building a Data Pipeline with Pentaho From Ingest to Analytics

1 Building a Data Pipeline with Pentaho From Ingest to Analytics
Bruce Berry
Senior Training and Development Specialist, Global Learning
September 2018

2 Agenda
  Evolution of Business Intelligence
  Pentaho Tools
  Hands-on Demonstration: Data Source to Dashboard

3 Evolution of Business Intelligence

4 Evolution of Business Intelligence Part I
Challenge: Combining data from applications or source systems, databases, and files for reporting purposes
Solution:
  ETL (Extract, Transform, Load)
  Data warehouse/lake
  Reporting
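The extract, transform, load pattern named on this slide can be sketched in a few lines of Python. This is only an illustration of the pattern, not how PDI implements it: the two sources, the `energy` table, its column names, and all values are hypothetical placeholders, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical sources with placeholder values: a CSV file export and
# rows pulled from an application, each using its own field names.
CSV_SOURCE = "country,year,generation_gwh\nUnited States,2017,100\nCanada,2017,200\n"
APP_ROWS = [{"Country": " germany ", "Year": "2017", "GenerationGWh": "300"}]


def extract():
    """Extract: read every source and map it onto one common set of field names."""
    yield from csv.DictReader(io.StringIO(CSV_SOURCE))
    for row in APP_ROWS:
        yield {
            "country": row["Country"],
            "year": row["Year"],
            "generation_gwh": row["GenerationGWh"],
        }


def transform(rows):
    """Transform: clean up names and convert text fields to typed values."""
    for row in rows:
        yield (
            row["country"].strip().title(),
            int(row["year"]),
            float(row["generation_gwh"]),
        )


def load(rows, conn):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS energy "
        "(country TEXT, year INTEGER, generation_gwh REAL)"
    )
    conn.executemany("INSERT INTO energy VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline():
    """Run extract, transform, and load against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract()), conn)
    return conn
```

Once the data is loaded, the reporting step is an ordinary query against the warehouse table, for example `run_pipeline().execute("SELECT country, generation_gwh FROM energy").fetchall()`.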

5 Evolution of Business Intelligence Part II
Challenge: Provide reporting tools to non-technical business users
Solution:
  ETL (Extract, Transform, Load)
  Data warehouse/lake
  Data model (known reporting needs)
  Self-service reporting

6 Evolution of Business Intelligence Part III
Challenge: Multi-dimensional analytics and visualizing data
Solution:
  ETL
  Data warehouse/lake
  OLAP cube (known reporting needs)
  Self-service analytics, dashboarding

7 Online Analytical Processing (OLAP)
OLAP provides users with a multi-dimensional, aggregated view of data
OLAP cube
  Measures (sales; quantity)
  Dimensions
    Hierarchies (geography; time)
    Levels (country>state>city; year>quarter>month)
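To make the cube vocabulary concrete, here is a minimal Python sketch of rolling measures up to a chosen set of levels. It is a plain-Python illustration, not Pentaho or Mondrian code: the `FACTS` rows, their values, and the `roll_up` helper are hypothetical, with field names taken from the example measures and levels on this slide.

```python
from collections import defaultdict

# Hypothetical fact rows. Each row carries its dimension levels
# (geography: country > state > city; time: year > quarter)
# and its measures (sales, quantity). All values are placeholders.
FACTS = [
    {"country": "USA", "state": "CA", "city": "San Diego",
     "year": 2018, "quarter": "Q1", "sales": 100.0, "quantity": 4},
    {"country": "USA", "state": "CA", "city": "Los Angeles",
     "year": 2018, "quarter": "Q1", "sales": 250.0, "quantity": 10},
    {"country": "USA", "state": "NY", "city": "New York",
     "year": 2018, "quarter": "Q2", "sales": 300.0, "quantity": 12},
    {"country": "Canada", "state": "ON", "city": "Toronto",
     "year": 2018, "quarter": "Q1", "sales": 150.0, "quantity": 6},
]


def roll_up(facts, levels, measures=("sales", "quantity")):
    """Sum each measure for every distinct combination of the given levels."""
    totals = defaultdict(lambda: dict.fromkeys(measures, 0))
    for row in facts:
        key = tuple(row[level] for level in levels)
        for measure in measures:
            totals[key][measure] += row[measure]
    return dict(totals)


# The same facts viewed at different levels of the two hierarchies.
by_country = roll_up(FACTS, ["country"])
by_state = roll_up(FACTS, ["country", "state"])
by_quarter = roll_up(FACTS, ["year", "quarter"])
```

Moving from `by_country` to `by_state` is a drill-down along the geography hierarchy; `by_quarter` looks at the same measures along the time hierarchy instead.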

8 Evolution of Business Intelligence Part IV
Challenge: Reporting needs are not known or predefined
Solution:
  ETL
  OLAP cube (on the fly)
  Analytics, dashboarding

9 Evolution of Business Intelligence Future
Challenge: Incorporate cloud-based and streaming data in reporting and analytics
Solution:
  ETL
  Blend data
  Data service
  OLAP cube (on the fly)
  Analytics, dashboarding
  Machine learning, predictive analytics
The future is now!

10 Pentaho Tools

11 Pentaho Tools
Pentaho Data Integration (PDI)
  ETL
  Blend data
  Data service
  Data model or OLAP cube (on the fly)
  Machine learning
  Predictive analytics

12 Pentaho Tools (Continued)
Schema Workbench
  OLAP cube
Pentaho Analyzer
  Self-service analytics, visualizations
CTools
  Community Data Access
  Community Dashboard Editor
  Dashboarding

13 Pentaho Tools (Continued)
Metadata Editor
  Data model
Interactive Reports
  Self-service reporting

14 Hands-on Demonstration: Data Source to Dashboard

15 Hands-on Demonstration Overview
In this guided demonstration, we will:
  Review a Pentaho Data Integration (PDI) transformation that obtains data on energy generation and usage around the world, prepares the data for analytics by building a data model (cube), and publishes the data to the repository as a data service
  Review a PDI job that runs the transformation and publishes the cube to the repository so it can be used for analytics
  Use Analyzer to analyze and visualize the data
  View an interactive dashboard that presents several views of the data

16 Hands-on Demonstration (Continued)
ACCESS All Enterprise Data Sources
STREAMLINE Information Delivery
VISUALIZE & Report Information In Any Style
DELIVER When & Where Users Need It
CLOUD: Obtain worldwide energy data
PDI-Transformation: Prepare and publish data service
ANALYZER: Analyze and visualize the data
CTOOLS: Deliver the data in an interactive dashboard
PDI-Job: Run transformation; create and publish data model/cube

17 Resources

18 Resources
Hitachi Vantara Web Site
  Innovate with Data and Analytics
  Pentaho Data Integration
  Pentaho Business Analytics

19 Resources (Continued)
Training
Pentaho Data Integration
  Pentaho Data Integration Fundamentals (DI1000)
  Pentaho Data Integration Advanced (DI1500)
Business Analytics
  Business Analytics User Console (BA1000)
  Business Analytics Report Designer (BA2000)
  Business Analytics Data Modeling (BA3000)

20 Resources (Continued)
Training
CTools
  CTools Fundamentals (CT1000)
  CTools Advanced (CT1500)

21 Please Complete the Survey
1. From the Schedule screen on the app, select your session.
2. Open the Training Survey.

22 Thank You
