Lecture Outline. Learning Objectives. Building Prototype BIS with Excel & Access Short demo & Model-Driven Business Intelligence Systems: Part I


Building Prototype BIS with Excel & Access: Short Demo & Model-Driven Business Intelligence Systems: Part I
Week 8
Dr. Jocelyn San Pedro
School of Information Management & Systems, Monash University
IMS3001 BUSINESS INTELLIGENCE SYSTEMS SEM 1, 2004

Lecture Outline
- Building Prototype System with Excel & Access
  - Short demo
  - Some background: tutes in weeks 5-6
  - The Assignment and Assignment Marking Guide
- Model-Driven BIS Part I
  - Descriptive Statistics
  - Time Series Forecasting Models

Learning Objectives
At the end of this lecture, students should have gained:
- an overall picture of how the tutes in weeks 5-6 could help in completing Part II of the Assignment
- an understanding of the Assignment Marking Guide
- introductory knowledge of model-driven BI systems
- an understanding of the meaning of descriptive statistics and of some time series forecasting models

Building Prototype System with Excel & Access
Detecting fraud in healthcare
- Background
- Data Analysis Requirements
  - Dimensional analysis
- BI Requirements
  - Identifying BI opportunities through interpretation of summarised reports (pivot tables and charts)
- System Requirements
  - Data mart? Data warehouse? Front-end tool?

Building Prototype System with Excel & Access
- YourPatients.csv (data source)
- Conduct dimensional analysis; extract, transform and load data
- MyPatients.xls (ETL tool)
- Populate fact and dimension tables
- MyPatients.mdb (data mart)
- Implement star schema
- SampleBIS.xls (front-end tool)
- Generate pivot tables/charts and GUI

Dimensional Analysis
1. Who was the patient with the highest monthly average payments?
- The fact table should include NetPayments, TimeID and PatientID
  - Patient Claims (fact): PatientID, TimeID, NetPayments
- Time dimension: TimeID, Month (ServiceDate grouped into months)
- Patients dimension: PatientID, PatientName

Dimensional Analysis
2. When did the patient incur the highest net payment?
- Grain/granularity? Detailed instead of summarised, to allow drill-down to the actual service date
- Time dimension changes from (TimeID, Month) to (TimeID, ServiceDate)

Dimensional Analysis
3. How many procedures were paid on the day in (2)?
- The fact table should also include ProcedureCode
- Procedures dimension: ProcedureCode
- Patient Claims (fact): PatientID, TimeID, ProcedureCode, NetPayments

Dimensional Analysis
4. If there was more than one procedure on the day in (3), were they performed by the same provider?
- The fact table should also include ProviderID
- Providers dimension: ProviderID, ProviderName
- Patient Claims (fact): PatientID, TimeID, ProcedureCode, ProviderID, NetPayments
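The star schema built up by questions 1 to 4 can be sketched in code. The lecture itself uses Access and Excel; the version below is only an illustration using Python's built-in `sqlite3` module. The table and column names follow the slides, except that the Time dimension is called `TimeDim` here. All rows (patients, dates, codes, amounts) are hypothetical sample data, not values from YourPatients.csv.

```python
import sqlite3

# Fact and dimension tables as named on the slides (Time is called TimeDim here).
SCHEMA = """
CREATE TABLE Patients   (PatientID INTEGER PRIMARY KEY, PatientName TEXT);
CREATE TABLE TimeDim    (TimeID INTEGER PRIMARY KEY, ServiceDate TEXT);
CREATE TABLE Procedures (ProcedureCode TEXT PRIMARY KEY);
CREATE TABLE Providers  (ProviderID INTEGER PRIMARY KEY, ProviderName TEXT);
CREATE TABLE PatientClaims (
    PatientID     INTEGER REFERENCES Patients(PatientID),
    TimeID        INTEGER REFERENCES TimeDim(TimeID),
    ProcedureCode TEXT    REFERENCES Procedures(ProcedureCode),
    ProviderID    INTEGER REFERENCES Providers(ProviderID),
    NetPayments   REAL
);
"""


def build_demo_mart():
    """Create an in-memory data mart and load hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Patients VALUES (?, ?)",
                     [(1, "Alice"), (2, "Bob")])
    conn.executemany("INSERT INTO TimeDim VALUES (?, ?)",
                     [(1, "2004-01-05"), (2, "2004-01-20"), (3, "2004-02-10")])
    conn.executemany("INSERT INTO Procedures VALUES (?)",
                     [("P100",), ("P200",)])
    conn.executemany("INSERT INTO Providers VALUES (?, ?)",
                     [(1, "Clinic A"), (2, "Clinic B")])
    conn.executemany("INSERT INTO PatientClaims VALUES (?, ?, ?, ?, ?)", [
        (1, 1, "P100", 1, 100.0),
        (1, 2, "P200", 1, 50.0),
        (1, 3, "P100", 2, 50.0),
        (2, 1, "P100", 1, 110.0),
        (2, 3, "P200", 2, 130.0),
    ])
    return conn


def highest_monthly_average(conn):
    """Question 1: patient with the highest average of monthly payment totals."""
    return conn.execute("""
        SELECT p.PatientName, AVG(m.MonthTotal) AS AvgMonthly
        FROM (
            SELECT c.PatientID AS PatientID,
                   substr(t.ServiceDate, 1, 7) AS Month,
                   SUM(c.NetPayments) AS MonthTotal
            FROM PatientClaims c
            JOIN TimeDim t ON t.TimeID = c.TimeID
            GROUP BY c.PatientID, substr(t.ServiceDate, 1, 7)
        ) m
        JOIN Patients p ON p.PatientID = m.PatientID
        GROUP BY p.PatientID, p.PatientName
        ORDER BY AvgMonthly DESC
        LIMIT 1
    """).fetchone()


def highest_payment_date(conn, patient_id):
    """Question 2: service date of the patient's single highest net payment."""
    return conn.execute("""
        SELECT t.ServiceDate, c.NetPayments
        FROM PatientClaims c
        JOIN TimeDim t ON t.TimeID = c.TimeID
        WHERE c.PatientID = ?
        ORDER BY c.NetPayments DESC
        LIMIT 1
    """, (patient_id,)).fetchone()


conn = build_demo_mart()
top_patient = highest_monthly_average(conn)
top_date = highest_payment_date(conn, 1)
```

Keeping ServiceDate (rather than Month) in the Time dimension is what lets question 2 drill down to the day, while question 1 can still group the same dates into months.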

Dimensional Analysis
[Figure: YourPatients.csv (operational data) and the star schema (data model)]

Prototype Building: Short Demo
- ETL process
- Implementing star schema in Access
- Querying and reporting (pivot tables/charts)
- Developing front-end tool in Excel

Model-Driven Business Intelligence Systems: Part I

Model-Driven BIS
- Information systems that provide BI through access to and manipulation of models (statistical, analytical, conceptual, etc.)
- Simple statistical and analytical tools provide the most elementary functionality (Power, 2001)

Example of Model-Driven BIS: FMC Financial Models
FMCModel improves the decision-making process through portfolio modeling and rebalancing. Decisions are checked through pre-trade compliance and are then sent to your trade order management system.
- Total modeling flexibility through what-if analysis
- Versatile asset allocation
- Advanced rebalancing
- Pre-trade compliance

Sample Model-Driven BIS: FMCModel
[Slide content not captured in transcript]

Sample Model-Driven BIS: FMCModel
[Slide content not captured in transcript]

Sample Model-Driven BIS: SAS 9
[Slide content not captured in transcript]

Sample Model-Driven BIS: SAS Risk Dimensions
- Data management, risk analysis, and risk reporting
- Powerful and unique modeling tools for more reliable risk measures
  - It is possible to specify non-normal models for risk factors and to fit these models to your data.
  - Apart from common models, users can specify their own models.
  - The resulting risk measures are much more reliable than those based on normal approximations.

Descriptive Statistics

Descriptive Statistics: What are they? How do they provide BI?
- Descriptive statistics are used to describe the basic features of the data in a study.
- They provide simple summaries about the sample and the measures.
- Together with simple graphics analysis, they form the basis of virtually every quantitative analysis of data.

Descriptive Statistics: Measures of Central Tendency
The tendency of observed values to centre around a particular value.
- Mean: add together all observed values and divide by the number of observations, e.g., a mean temperature of 25°
- Median: the middle score of an ordered set of scores
  - If the number of scores is odd, the median is the middle score; e.g., for 11 scores, the median is the 6th score
  - If the number of scores is even, the median is halfway between the two middle scores; e.g., for 10 scores, the median is the average of the 5th and 6th scores
- Mode: the score that appears the greatest number of times
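As a small illustration of the three measures of central tendency, the sketch below applies the median rule by hand and checks it against Python's standard `statistics` module. The scores are made-up sample data, and the helper name `median_by_rule` is hypothetical.

```python
import statistics


def median_by_rule(scores):
    """Median following the odd/even rule for an ordered set of scores."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the middle score (e.g. the 6th of 11 scores).
        return ordered[middle]
    # Even count: halfway between the two middle scores (e.g. 5th and 6th of 10).
    return (ordered[middle - 1] + ordered[middle]) / 2


scores_11 = [3, 7, 7, 2, 9, 4, 7, 1, 5, 8, 6]   # hypothetical scores, odd count
scores_10 = scores_11[:-1]                      # same scores without the last, even count

mean_11 = sum(scores_11) / len(scores_11)       # add all values, divide by the count
median_11 = median_by_rule(scores_11)
median_10 = median_by_rule(scores_10)
mode_11 = statistics.mode(scores_11)            # most frequent score
```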

Example: Mean, Median, Mode
[Slide content not captured in transcript]

Measures of Dispersion
Measures of the variability of observed values.
- Range: the difference between the largest and smallest observations (max score - min score)
- Variance: the mean of squared deviations from the mean
  - Why squared? So that positive and negative deviations do not cancel each other out.
- Standard deviation: a way of indicating an average amount by which all the values deviate from the mean
  - Square root of the variance

Example: Mean, Median, Mode
[Slide content not captured in transcript]
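The dispersion measures can be worked through on a small data set. This is a minimal sketch on hypothetical values. It uses the definition given on the slide (variance as the mean of squared deviations, i.e. the population variance), which corresponds to `statistics.pvariance` rather than the sample variance `statistics.variance`.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

value_range = max(values) - min(values)              # largest minus smallest
mean = sum(values) / len(values)
deviations = [v - mean for v in values]

# Raw deviations always sum to zero, which is why they are squared first.
sum_of_deviations = sum(deviations)

variance = sum(d ** 2 for d in deviations) / len(values)   # mean of squared deviations
std_dev = math.sqrt(variance)                              # square root of the variance
```

For these values the mean is 5, the variance is 4 and the standard deviation is 2, so a typical observation lies about 2 units from the mean.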

Time Series Forecasting Models

Time Series
- Time series: a sequence of evenly spaced (e.g., weekly, monthly, quarterly) data points
- Components of a time series
  - Trend: upward or downward movement of the data over time
  - Seasonality: a pattern of demand fluctuations above or below the average demand that occurs every year
  - Cycles: a pattern that occurs over several years
  - Random variations: blips in the data caused by chance and unusual situations; they follow no pattern and cannot be captured or used to forecast values

Time Series
[Figure: demand for product over Year1 to Year4, showing seasonal peaks, the trend component, and average demand over 4 years]

Time Series
[Figure: demand for product over Year1 to Year8, showing a cycle repeating every 4 years]

Excel Modules
- Excel Add-In: Render, B., Stair, R. and Balakrishnan, N. (2003) Managerial Decision Modeling, Prentice Hall.

Moving Average
- Useful if we can assume that the item we are forecasting will stay fairly steady over time
- Example: to calculate a 3-month moving average
  - Take the average of the 3 previous months
  - This 3-month moving average serves as the forecast for the next month
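The 3-month moving average can be sketched in a few lines of Python. The lecture uses an Excel add-in for this; the function name `moving_average_forecasts` and the monthly values below are hypothetical and serve only to illustrate the calculation.

```python
def moving_average_forecasts(values, n=3):
    """Return forecasts for periods n+1, n+2, ..., len(values)+1.

    Each forecast is the plain average of the n previous actual values.
    """
    if n <= 0 or len(values) < n:
        raise ValueError("need at least n actual values and n > 0")
    return [sum(values[i - n:i]) / n for i in range(n, len(values) + 1)]


monthly_demand = [10, 12, 13, 16, 19, 23]            # hypothetical months 1 to 6
ma_forecasts = moving_average_forecasts(monthly_demand, n=3)
next_month_forecast = ma_forecasts[-1]               # forecast for month 7
```

With six months of data the function yields four forecasts: for months 4, 5 and 6 (which can be compared against the actual values) and for month 7 (the genuine forecast). Because the sample data trend upward, each forecast lags behind the actual value, which reflects the slide's caveat that the method suits fairly steady series.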

Moving Average
[Figure: actual values vs. moving average forecasts over time]

Weighted Moving Average
- Useful when recent periods may be more important than earlier periods
- Example: to calculate a 3-month weighted moving average
  - Multiply each of the 3 previous months' values by its corresponding weight
  - Take the sum of these products
  - This 3-month weighted moving average serves as the forecast for the next month

Weighted Moving Averages
[Worked example: only the weight 0.2 survives in this transcript]
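A weighted moving average can be sketched as follows. The weights (0.2, 0.3, 0.5) and the monthly values are assumptions chosen for illustration; they are not recovered from the slide's worked example. The function divides by the sum of the weights, so weights that do not add up to 1 still give a proper average.

```python
def weighted_moving_average_forecasts(values, weights):
    """Return forecasts for periods n+1 ... len(values)+1, where n = len(weights).

    weights are ordered from the oldest to the most recent of the n previous periods.
    """
    n = len(weights)
    total_weight = sum(weights)
    if n == 0 or len(values) < n or total_weight == 0:
        raise ValueError("need non-empty weights with non-zero sum and enough values")
    forecasts = []
    for i in range(n, len(values) + 1):
        window = values[i - n:i]                       # the n previous periods
        weighted_sum = sum(w * v for w, v in zip(weights, window))
        forecasts.append(weighted_sum / total_weight)
    return forecasts


monthly_demand = [10, 12, 13, 16, 19, 23]   # hypothetical months 1 to 6
weights = (0.2, 0.3, 0.5)                   # hypothetical; most recent month weighted highest
wma_forecasts = weighted_moving_average_forecasts(monthly_demand, weights)
```

For month 4 the forecast is 0.2 × 10 + 0.3 × 12 + 0.5 × 13 = 12.1, which is closer to the most recent value than the unweighted average of the same three months.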

Exponential Smoothing
- Both moving averages and weighted moving averages are effective in smoothing out fluctuations to provide stable estimates
  - Using more periods gives a better estimate but requires extensive historical data
- Exponential smoothing
  - Also a moving average technique
  - Requires less historical data

Exponential Smoothing
- The forecast for period $t+1$ is the weighted average of the actual values in periods $t$, $t-1$, $t-2$, $t-3$, and so on, with weights $\alpha$, $\alpha(1-\alpha)$, $\alpha(1-\alpha)^2$, $\alpha(1-\alpha)^3$, ... respectively
- Weights decrease exponentially over time
- Example: if $\alpha = 0.2$, then the weights are

  Period   Weight                  Value
  t        $\alpha$                0.2
  t-1      $\alpha(1-\alpha)$      0.16
  t-2      $\alpha(1-\alpha)^2$    0.128
  t-3      $\alpha(1-\alpha)^3$    0.1024

Exponential Smoothing
Forecast for period $t+1$ = forecast for period $t$ + $\alpha$ × (actual value in period $t$ - forecast for period $t$):

$$F_{t+1} = F_t + \alpha (A_t - F_t)$$

or, to reflect weights decreasing exponentially,

$$F_{t+1} = \alpha A_t + \alpha(1-\alpha) A_{t-1} + \alpha(1-\alpha)^2 A_{t-2} + \alpha(1-\alpha)^3 A_{t-3} + \cdots$$
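The recursive form of exponential smoothing is short enough to sketch directly. The slides do not say how the first forecast is chosen; the sketch below assumes the common convention of setting the first forecast equal to the first actual value. The data values and function names are hypothetical.

```python
def exponential_smoothing_forecasts(actuals, alpha):
    """Return forecasts F_1 ... F_{n+1} for actual values A_1 ... A_n.

    Uses F_{t+1} = F_t + alpha * (A_t - F_t).
    Assumption: the first forecast F_1 is set to the first actual value A_1.
    """
    if not actuals or not 0 <= alpha <= 1:
        raise ValueError("need at least one actual value and 0 <= alpha <= 1")
    forecasts = [actuals[0]]
    for actual in actuals:
        previous = forecasts[-1]
        forecasts.append(previous + alpha * (actual - previous))
    return forecasts


def smoothing_weights(alpha, count):
    """Weights alpha * (1 - alpha)**k for periods t, t-1, ..., t-count+1."""
    return [alpha * (1 - alpha) ** k for k in range(count)]


alpha = 0.2
actuals = [10, 12, 13]                      # hypothetical A_1, A_2, A_3
es_forecasts = exponential_smoothing_forecasts(actuals, alpha)   # F_1 ... F_4
weights = smoothing_weights(alpha, 4)       # approximately 0.2, 0.16, 0.128, 0.1024

# The same F_4 written as a weighted sum of past actual values; the leftover
# weight (1 - alpha)**3 falls on the starting forecast F_1.
f4_expanded = (weights[0] * actuals[2] + weights[1] * actuals[1]
               + weights[2] * actuals[0] + (1 - alpha) ** 3 * es_forecasts[0])
```

Only the previous forecast and the latest actual value are needed to update the forecast, which is why the method requires less historical data than a long moving average.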

Exponential Smoothing
[Slide content not captured in transcript]

Exponential Smoothing: Actual Values vs. Forecasts
[Figure: actual values by month, plotted against 3-month MA, 3-month WMA and exponential smoothing forecasts]

Measuring Forecast Error
- Forecast error = actual value - forecast value
- Mean Absolute Deviation (MAD): average of the absolute values of the individual forecast errors
- Mean Squared Error (MSE): average of the squared values of the individual forecast errors
- Mean Absolute Percent Error (MAPE): average of the absolute differences between the forecast and actual values, expressed as a percentage of the actual values
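The three error measures follow directly from their definitions. The sketch below is illustrative only: the function name `forecast_error_measures` and the actual and forecast values are hypothetical, and MAPE is undefined if any actual value is zero.

```python
def forecast_error_measures(actuals, forecasts):
    """Return (MAD, MSE, MAPE in percent) for paired actual and forecast values."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("need equally long, non-empty sequences")
    errors = [a - f for a, f in zip(actuals, forecasts)]   # actual minus forecast
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    # Each absolute error as a fraction of its actual value, averaged, times 100.
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actuals)) / n
    return mad, mse, mape


actuals = [10, 12, 13, 16]      # hypothetical actual values
forecasts = [11, 11, 15, 14]    # hypothetical forecasts for the same periods
mad, mse, mape = forecast_error_measures(actuals, forecasts)
```

Running the same function over the forecasts of each model (moving average, weighted moving average, exponential smoothing) gives one MAD, MSE and MAPE per model, so the model with the smallest errors on the historical data can be identified.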

Forecast Error Analysis
[Slide content not captured in transcript]

Summary of Forecast Error Analysis
[Table: MAD, MSE and MAPE (%) for Moving Averages, Weighted Moving Averages and Exponential Smoothing; values not captured in transcript]

Summary
- Prototype building with Access and Excel
- Model-driven BIS: access to and manipulation of models
- Descriptive statistics: describe central tendency and dispersion of data
- Time series forecasting models: estimating future values based on historical data
  - Moving averages: fairly stable data
  - Weighted averages: some data points are more important than others
  - Exponential smoothing: weights decrease exponentially with time
- Forecast errors: measure the accuracy of forecasts

References
- Langley, R. (1970) Practical Statistics Simply Explained, Dover Publications, NY.
- Render, B., Stair, R. and Balakrishnan, N. (2003) Managerial Decision Modeling, Prentice Hall.
- Rowntree, D. (1981) Statistics Without Tears: A Primer for Non-mathematicians, Penguin Books.

Questions?
Jocelyn.sanpedro@sims.monash.edu.au
School of Information Management and Systems, Monash University
T1.28, T Block, Caulfield Campus