Designing your BI Architecture

IBM Software Group
Designing your BI Architecture: Exploiting your Data Warehouse
David Cope, EDW Architect, Asia Pacific
2007 IBM Corporation

The Analytical Evolution
Easy Mining and Alphablox enable insights to be delivered throughout the enterprise (the IBM differentiator).
Stages, in order of increasing business value and decision empowerment:
- Reports: static, repetitive queries about past results.
- Ad hoc query, analysis and OLAP: empowering analysts to test hypotheses for better decision making.
- Insight: discovering previously unknown and unsuspected information.
- Action

IBM DB2 Warehouse Software
- Embedded analytics
- Modeling and design
- Data mining and visualization
- In-line analytics
- Data movement and transformation
- Data partitioning
- Performance optimization
- Workload control
- Deep compression
- Administration and control
- Database management

DWE OLAP Model
[Diagram: a cube model (dimensions, hierarchies, levels, attributes, facts, measures, joins) and the cubes defined on it (cube dimensions, cube hierarchies, cube levels, cube facts), mapped onto a fact table and dimension tables held as relational tables in DB2.]

Model-Based Optimization
[Diagram: the Performance Advisor. Inputs: the administrator's time and space constraints, query types, model information (OLAP metadata), statistics from the catalog tables, and data samples from the base tables. Output: MQTs.]
Benefits:
- Smart aggregate selection
- Smart index selection
- SQL generation
- DB2 exploitation

OLAP Metadata Interchange
[Diagram: metadata bridges (including MITI) exchange OLAP metadata between the DB2 Data Warehouse (data, RDBMS metadata, DDL, DML), model and ETL tool metadata, and BI tool metadata. BI tools shown: DB2 Alphablox, Hyperion, Business Objects, QMF for Windows, QlikTech, ArcPlan.]

Alphablox
- Platform for customized analytic applications and inline analytics
- Pre-built components (Blox) for analytic functionality
- Allows you to create customized analytic components that are embedded into existing business processes and web applications

Alphablox
For end users:
- A web application, portal or dashboard with embedded analytics in an easy-to-use interactive interface
For application developers:
- A J2EE application for analysis-oriented interaction
- A set of analytic-focused extensions to the application server
Alphablox with DWE:
- SQL generated by DWE Design Studio can be pasted into Alphablox pages for warehouse-based embedded analytics

Alphablox Architecture
[Diagram, top to bottom:]
- Web browser: DHTML-based client, similar to AJAX (XMLHttpRequest)
- Application server: WebLogic, WebSphere or Tomcat
- Alphablox UI model: GridBlox, ChartBlox, PresentBlox
- Services: calculations, bookmarks, alerts, comments
- DataBlox
- Data sources: OLAP (Essbase / MSAS / SAP BW), the Alphablox Cubing Engine (ROLAP), relational databases, MQ

Relational Cubing Engine & OLAP Optimization
[Diagram, by tier:]
- Customer tier: HTTP server
- Application server tier: DB2 Alphablox application (PresentBlox, GridBlox, ChartBlox, DataBlox) sends MDX to the DB2 Alphablox Server, whose Relational Cubing Engine holds the cube definition and the relational cube (cubelets) and performs metadata import, dimension data retrieval and fact data retrieval
- Database server tier: OLAP metadata (DB2 Cube Views), DB2 MQTs, star schema

Versatile Architecture Support
DB2 Warehouse supports versatile analytics architectures. BI applications and tools can direct analytics against:
- The EDW
- External marts
- Internal marts
- Virtual marts

DWE Easy Mining: Mining without a Statistician
Realize the benefits of mining by enabling analysts, rather than relying on statisticians, for your data mining needs.
[Diagram: reporting tool on top of DB2 Data Warehouse Edition]

Two Types of Data Mining: Discovery & Predictive
Discovery:
- Automatically find trends and patterns
- Answer unasked questions
- Relatively undirected analysis
- Tool reports on findings
- In a word: easier
- Useful for non-statisticians
Predictive:
- Specific question
- Probability associated with outcomes
- Directed analysis
- Iterative process: train, test, apply
- Apply model in database at customer touch points

DWE Easy Mining Algorithms
[Diagram: Enterprise Data Warehouse -> Select -> selected data -> Transform -> Mine -> extracted information -> Assimilate -> assimilated information. Roles: business analyst, DWE partner, statistician and data mining workbench.]
Discovery methods (finding useful patterns and relationships):
- Associations: Which item affinities ("rules") are in my data? [Beer] => [Diapers] (single transaction)
- Sequences: Which sequential patterns are in my data? [Love] => [Marriage] => [Baby Products] (sequential)
- Clustering: Which interesting groups are in my data? Customer profiles, store profiles
Predictive methods (predicting values):
- Classification: How to predict categorical values in my data? Will the patient be cured, harmed, or unaffected by treatment?
- Regression: How to predict numerical values in my data? How likely is a customer to respond to the promotion? How much will each customer spend this year?
Score data directly in DB2, scalable and in real time.

How to Recognize a Data Mining Need
- What do my customers look like?
- Which customers should I target in a promotion?
- Which products should I use for the promotion?
- How should I lay out my new stores?
- Which products should I replenish in anticipation of a promotion?
- Which of my customers are most likely to churn?
- How can I improve customer loyalty?
- What is the most likely item that a customer will purchase next?
- Who is most likely to have another heart attack?
- What is the likelihood of a part failure?
- When one part fails, what other part(s) are most likely to fail soon?
- How can I identify high-potential prospects (lead generation)?
- How can I detect potential fraud?

High-Level View of the Data Mining Process
[Diagram: Business problem -> Data warehouse -> Extract & transform data -> Build model ("a minor miracle occurs") -> Validate, refine -> Deploy insight]

The Data Mining Process
This is an iterative process!
[Diagram: Business problem -> Data warehouse -> Select data -> Data preparation -> Data mining -> Apply results, with a feedback loop "Revise data & refine model".]
- Data preparation: select, transform (ETL)
- Data mining (MINING): mine, report, visualize, analyze, understand; discover and interpret information, with a model of the form Y = f(x, z)
- Apply results (DEPLOY): score data, embed in application

Associations
Discovery technique to find associations or affinities among items (or conditions, outcomes, etc.) in a single transaction. Constructs statements ("rules") that quantify the relationships among items that tend to occur together in transactions.
Example: In a supermarket, Cola is bought in 20% of all purchases. Cola is bought in 60% of the purchases involving Orange juice. 3.7% of all purchases involve both Cola and Orange juice.
The rule [Orange juice] => [Cola] has the following properties:
- Support = 3.7%: Cola and OJ are present together in 3.7% of all baskets.
- Confidence = 60%: Cola is present in 60% of the baskets containing OJ.
- Lift = 60% / 20% = 3: Cola is 3 times as likely to be in the basket when OJ is also.
Scoring: Given the item(s) purchased (rule body), what item (rule head) is most likely to be purchased as well?
Common uses: promotional or cross-sell offers, disease management, part failure
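The three rule measures can be computed directly from basket counts. The following is a minimal sketch in plain Python, not the DWE mining API; the function name `rule_metrics` and the 1,000 baskets are made up for illustration, with counts chosen to land close to the slide's figures (support comes out at 3.6% rather than 3.7%).

```python
from fractions import Fraction


def rule_metrics(baskets, body, head):
    """Return (support, confidence, lift) for the rule [body] => [head]."""
    n = len(baskets)
    n_body = sum(1 for b in baskets if body in b)
    n_head = sum(1 for b in baskets if head in b)
    n_both = sum(1 for b in baskets if body in b and head in b)
    support = Fraction(n_both, n)          # share of all baskets with both items
    confidence = Fraction(n_both, n_body)  # share of body baskets that also hold head
    lift = confidence / Fraction(n_head, n)  # confidence relative to head's base rate
    return support, confidence, lift


# Hypothetical data: 1,000 baskets, 200 with Cola, 60 with OJ, 36 with both.
baskets = (
    [{"OJ", "Cola"}] * 36
    + [{"OJ"}] * 24
    + [{"Cola"}] * 164
    + [{"Bread"}] * 776
)

support, confidence, lift = rule_metrics(baskets, body="OJ", head="Cola")
print(float(support), float(confidence), float(lift))
```

A lift above 1 means the head item appears more often in baskets containing the body item than it does overall, which is why lift is the usual filter for promotion candidates.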

Sequences
Discovery technique to find affinities among items (or conditions, outcomes, etc.) across multiple transactions over time. Quantifies relationships ("sequences") to identify the most likely item in the next transaction.
Example (sequence diagram; layout lost in extraction): C G, B ---- C ---- X B ---- A ---- Y X Y ---- D ---- C --- B ---- X
- 100% of the customers who get C will get X at a later time
- 67% of the customers who get B will get X at a later time
Scoring: Given the item(s) purchased previously (rule body), what item (rule head) is most likely to be purchased in a subsequent transaction within a certain time frame?
Common uses: fraud detection, promotional offers, disease management, part failure
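The 100% and 67% figures are sequence confidences: of the customers who bought the body item, the share who bought the head item in a later transaction. The sketch below is plain Python, not the DWE API. The three customer histories are an assumed reading of the slide's diagram, chosen because they reproduce both percentages, and `sequence_confidence` is a hypothetical helper name.

```python
def sequence_confidence(histories, body, head):
    """Share of customers with `body` who get `head` in a later transaction."""
    with_body = 0
    with_head_later = 0
    for visits in histories:
        # Index of the first transaction that contains the body item, if any.
        first = next((i for i, items in enumerate(visits) if body in items), None)
        if first is None:
            continue
        with_body += 1
        # Only transactions strictly after that one count as "later".
        if any(head in items for items in visits[first + 1:]):
            with_head_later += 1
    return with_head_later / with_body if with_body else 0.0


# Assumed histories: one list per customer, one set of items per transaction.
histories = [
    [{"G", "B"}, {"C"}, {"X"}],
    [{"B"}, {"A"}, {"Y"}],
    [{"Y"}, {"D"}, {"C"}, {"B"}, {"X"}],
]

c_then_x = sequence_confidence(histories, "C", "X")
b_then_x = sequence_confidence(histories, "B", "X")
print(c_then_x, round(b_then_x * 100))
```

This version ignores the time frame mentioned under scoring; a time window would add a date to each transaction and a cutoff on the gap between body and head.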

Clustering
Discovery technique to find clusters having distinct behaviors and characteristics.
- Gain insights into customers, stores, insurance claims, etc.
- Generate distinct behavioral/demographic profiles
- Understand the most important attributes of each cluster
- Create a model to assign individuals to best-fit clusters
- Apply model to assign new individuals or re-assign existing individuals
- Design business actions tailored to different characteristic profiles
Scoring:
- Apply model to assign each record to its best-fit cluster
- Apply appropriate business action for each record based on its assigned cluster
Common uses: customer segmentation, store profiling, deviation detection
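Scoring against a clustering model amounts to finding the best-fit cluster for each record. The slides do not say which clustering algorithm or fit measure DWE uses, so the sketch below swaps in the simplest possible stand-in: nearest centroid by Euclidean distance. The cluster names, centroids and features (shopping days per year, annual spend) are hypothetical.

```python
import math


def assign_cluster(record, centroids):
    """Return the name of the centroid closest to `record` (Euclidean distance)."""
    return min(centroids, key=lambda name: math.dist(record, centroids[name]))


# Hypothetical cluster centers: (shopping days per year, annual spend).
centroids = {
    "frequent": (30.0, 2000.0),
    "occasional": (3.0, 150.0),
}

new_customers = {"c1": (25.0, 1800.0), "c2": (4.0, 200.0)}
assignments = {cid: assign_cluster(rec, centroids) for cid, rec in new_customers.items()}
print(assignments)
```

With raw values, the spend feature dominates the distance; in practice features would be scaled to comparable ranges before assignment.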

Classification
Prediction technique to classify individuals by outcome.
- Classify by a categorical class variable (e.g., YES-NO-MAYBE response)
- Understand the most important factors (predictors) leading to each outcome
Modeling:
- Create a model to classify individuals according to expected outcome
- Design business action based on most important predictors
Scoring:
- Apply model to predict the outcome for each individual: new prospects (expected behavior) and existing individuals (changes in behavior)
- Identify target individuals for business action
Common uses: customer attrition (churn), part failure

Regression
Set of predictive techniques to predict a dependent variable.
- Predict a continuous value or a binary numeric value
  - Continuous: e.g., revenue (prediction represents amount of revenue)
  - Binary: e.g., 0=No, 1=Yes (prediction represents probability of Yes)
- Understand the most important predictors of the dependent variable
- Techniques: transform regression, linear regression, polynomial regression
Modeling:
- Create a model to predict the dependent variable
- Design business action (e.g., predict likelihood of default for a loan application, in real time)
Scoring:
- Apply model to generate a prediction for each individual (e.g., probability of part failure)
- Identify target individuals for business action
Common uses: predict revenue/cost/profitability, predict risk of loan default
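As an illustration of the continuous case, the sketch below fits a one-predictor linear regression by ordinary least squares and then scores a new individual. It is plain Python rather than the DWE regression functions, and the predictor (store visits), the revenue figures and the function names are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Score one individual with a fitted (slope, intercept) model."""
    slope, intercept = model
    return intercept + slope * x


# Hypothetical training data: store visits vs. revenue per customer.
visits = [1, 2, 3, 4]
revenue = [50.0, 70.0, 90.0, 110.0]

model = fit_line(visits, revenue)
forecast = predict(model, 5)
print(model, forecast)
```

For the binary case the same modeling and scoring steps apply, but the prediction is read as the probability of the Yes outcome.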

Data Exploration
DWE enables you to explore the data: check data quality (prior to performing ETL for data preparation) and gain a general understanding of the data.
Design Studio provides four tools to inspect data:
- Table sampling
- Univariate distributions
- Bivariate distributions
- Multivariate distributions
All these tools are accessible by right-clicking a table/view/alias/nickname in the database explorer:
- Data, for table sampling/editing
- Value Distributions, for multivariate/univariate/bivariate distributions
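To make the idea of a univariate distribution concrete, the sketch below computes the relative frequency of each value in one column, including missing values, which is the kind of data-quality check the slide describes. It is a plain-Python stand-in, not what Design Studio runs; the rows, the `region` column and the function name are hypothetical.

```python
from collections import Counter


def univariate_distribution(rows, column):
    """Relative frequency of each value of `column`; missing values count as None."""
    values = [row.get(column) for row in rows]
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


# Hypothetical sample of a customer table.
rows = [
    {"customer_id": 1, "region": "N"},
    {"customer_id": 2, "region": "N"},
    {"customer_id": 3, "region": "S"},
    {"customer_id": 4, "region": None},
]

distribution = univariate_distribution(rows, "region")
print(distribution)
```

A large share under `None` or an unexpected category is a signal to fix the data before ETL.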

Leveraging Mining and Alphablox: DWE Miningblox
Create web applications that provide access to DWE Data Mining. Miningblox extends the DB2 Alphablox API with mining-specific functionality.
With Miningblox, you can perform the following tasks:
- Selecting input data
- Processing input data
- Displaying mining results graphically in a web browser, for example, the characteristics of a customer segment
- Administering or managing mining runs
Typically, a web application using Miningblox tags is integrated into a business application or an intranet portal.

Why use Miningblox?
- Provide access to data mining for a group of business analysts.
- Create a Miningblox web application that provides access to mining functionality through the web browser, with no need to install software on the clients' machines.
- Analysts can execute mining runs and view results in a customized web application without extensive knowledge of mining software.
- With the Miningblox Application wizard in the DWE Design Studio, you can create web applications by selecting sample templates, or you can extend Alphablox applications with mining functionality.

Deployment through Alphablox, application example: MBA application console [screenshot]

Deployment through Alphablox, application example: MBA execution [screenshot]

Deployment through Alphablox, application example: MBA completion [screenshot]

Deployment through Alphablox, application example: MBA results report [screenshot]

Case Study: Retail Department Store Analytics with Data Mining and Alphablox
David Cope, EDW Architect, Asia Pacific

Retail Department Store Chain
Business requirements:
- Perform a data mining POC (really a pilot project) to support the original DWE decision, ensure success, and highlight DWE capabilities for further uptake
Define business problem:
- Boost storewide sales (across other departments) based on women's shoes
Define analytical approach and ETL procedure:
- Extract all transactions of customers who have purchased women's shoes
- Transform transactional data into one record per customer, for customer segmentation
- Perform market basket analysis (MBA) for high-potential customers who have purchased women's shoes
Challenges:
- Engagement sponsored by IT with limited access to business users (LOB)
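The "one record per customer" transformation is an aggregation of the transaction table by customer. The sketch below shows the shape of that step in plain Python; in the case study it would be done in the warehouse's ETL. The field names (`customer_id`, `date`, `amount`) and the three derived attributes are assumptions for illustration, since the deck does not list the actual segmentation inputs.

```python
from collections import defaultdict


def customer_profiles(transactions):
    """Collapse line-item transactions into one summary record per customer."""
    acc = defaultdict(lambda: {"days": set(), "spend": 0.0, "items": 0})
    for t in transactions:
        entry = acc[t["customer_id"]]
        entry["days"].add(t["date"])   # distinct shopping days
        entry["spend"] += t["amount"]  # total spend
        entry["items"] += 1            # number of line items
    return {
        cid: {
            "visit_days": len(e["days"]),
            "total_spend": e["spend"],
            "items": e["items"],
        }
        for cid, e in acc.items()
    }


# Hypothetical transactions for two customers.
transactions = [
    {"customer_id": "A", "date": "2007-01-05", "amount": 120.0},
    {"customer_id": "A", "date": "2007-01-05", "amount": 30.0},
    {"customer_id": "A", "date": "2007-02-11", "amount": 50.0},
    {"customer_id": "B", "date": "2007-01-20", "amount": 80.0},
]

profiles = customer_profiles(transactions)
print(profiles)
```

Each resulting record is one input row for clustering, which is why the transformation precedes customer segmentation in the approach above.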

Solution Overview
Prepare data for mining by:
- Pulling transactions for women's shoe customers
- Creating data for customer segmentation
Use DB2 Mining to perform:
- Clustering: identify high-potential customer segments
- Market basket analysis for high-potential segments: identify associated items and next-most-likely purchases
Deploy mining results in Alphablox:
- Integrate data mining information into the dashboard and as part of the guided analysis
Build a dashboard in Alphablox:
- Provide critical information and metrics in an Alphablox dashboard to merchandising and marketing
- Integrate powerful visualization to make it easier to identify problem areas
[Diagram: DB2 Data Warehouse with mining models and services (clustering, associations and sequences, scoring services) -> Data Mining Visualizer / Alphablox Data Mining API and Alphablox Cubing Engine -> analytical dashboard, heat maps and other visualization]

Business Scenario for Mining
Business requirements for POC:
- Focus on customers who have purchased women's shoes in the past 12 months
- Boost storewide sales (across other departments) based on women's shoes
- Increase wallet share from high-potential customers
Business questions to be answered:
- What do my women's shoes customers look like?
- Which of these customers should I target in a promotion?
- Which products should I use for the promotion?
- Which products should I replenish in anticipation of a promotion?
- How can I improve customer loyalty?
- What is the most likely item that a women's shoes customer will purchase next?

Step 1: Identify High-Potential Shoe Customers

Result: 16 Distinct Clusters Created

Cluster 1: Those Who Act Like VIPs
- Frequent shoppers
- Big spenders
- VIPs
- Active shoppers
- Respond to discounts
- High returns
- High-potential customers!

Cluster 6: Frequent Good Shoppers
- Shop here 30 days/yr
- Above-average purchases
- Above-average spending
- Respond to discounts
- Average returns
- High-potential customers!

Step 2: Identify Associated Items for Clusters 1 & 6
- Extracted transactions for those clusters of customers
- Performed market basket analysis and interpreted results
- Associations (items purchased together in one visit)

Identify Purchased Together for Clusters 1 & 6

Results: Associations for Clusters 1 & 6

Step 3: Identify Next Likely Purchase for Clusters 1 & 6
- Extracted transactions for those clusters of customers
- Performed market basket analysis and interpreted results
- Sequences (next most likely purchase in a future visit)

Identify Next Likely Purchases for Clusters 1 & 6

Results: Sequences for Customers in Clusters 1 & 6

Results and Future Ideas
Deployment of customer segmentation and MBA:
- End-user application with Alphablox
- Create and refresh mining models
- Identify high-potential customer segments
- Refresh assignment of each customer to best-fit cluster
- Target selected customer segments for promotions
- Batch scoring to identify best offer(s) for each customer/segment
- Merchandising now has a view of their customers, not just products
Future ideas:
- Score a customer at the checkout register in real time
- MBA scoring (associations, sequences)
- Focused MBA scoring for known customers, based on best-fit cluster
- Make an offer to induce customers to visit other departments before leaving the store