Siim Karus 2011 Fall
1 Siim Karus 2011 Fall
2 Business Intelligence: Data Acquisition, Data Analysis, Results Presentation
3 Definition; Relation to Data Mining; Themes of BI, history; Applications of BI
4 "The ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal." Hans Peter Luhn (1958, IBM)
5 Improving Business Insight: "A broad category of applications and technologies for gathering, storing, analyzing, sharing and providing access to data to help enterprise users make better business decisions." Gartner
6 all
7
8
9 Seek Profitable Customers; Identify Problem Areas; Correct Data During ETL; Understand Customer Needs; Descriptive Analysis; Detect and Prevent Fraud; Predictive Analysis; Anticipate Customer Churn; Performance Monitoring; Business Activity Monitoring; Build Effective Marketing Campaigns; Predict Sales & Inventory
10 Define Data; Identify Task; Get Results
11 Sources (choice of features, process or content centric); Extract, transform and load (ETL vs ELT); Storage (Hadoop, StreamInsight, BigData)
12 Measurement tools: Thermometer, Clock. Storage: Tablets, Books, Database Systems. Extraction tools: SQL queries, ETL systems
13
14
15
16
17
18
19
20 Generating a unified model (Data Warehouse); Cleaning data; Merging data; Applying aggregations; Evaluating data; Splitting data; Data Transformations (ETL vs ELT)
21 Hadoop, BigData, StreamInsight
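The cleaning, merging and aggregation steps named on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `sales` and `customers` data, the field names and the three helper functions are hypothetical and do not come from the lecture. Dropping rows with missing amounts is a simplification chosen for brevity; a real pipeline may need to model missing data instead.

```python
from collections import defaultdict

# Hypothetical raw extracts from two sources (illustrative data only).
sales = [
    {"customer_id": "c1", "amount": "100.0"},
    {"customer_id": "c2", "amount": ""},        # missing value to clean
    {"customer_id": "c1", "amount": "50.5"},
]
customers = {"c1": "Tartu", "c2": "Tallinn"}


def clean(rows):
    """Drop rows whose amount is missing and convert amounts to float."""
    return [
        {"customer_id": r["customer_id"], "amount": float(r["amount"])}
        for r in rows
        if r["amount"].strip()
    ]


def merge(rows, city_by_customer):
    """Attach the customer's city to every sales row."""
    return [dict(r, city=city_by_customer[r["customer_id"]]) for r in rows]


def aggregate(rows):
    """Sum amounts per city."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["city"]] += r["amount"]
    return dict(totals)


totals_by_city = aggregate(merge(clean(sales), customers))
print(totals_by_city)
```

Running the transformations before loading the result into storage corresponds to ETL; loading the raw rows first and transforming them inside the store would be ELT.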
22 Know the right questions to ask; Beware of feedback loops
26 Know the right questions to ask; Beware of feedback loops; Do not forget to model missing data
27 Visual Mining (cubes, dimensions, partitions); Data Mining (choice of algorithms); Visual Data Mining (learning from user interactions with results); Social BI; Interpretation
28 Visual Learning: Tablets, Pivot. Machine Learning: Excel, Statistics Suites, Analysis Suites. Visual BI: Analysis Suites. Social BI: People
29
30
31 Regular (Simple) Dimensions (Star schema); Referenced Dimensions (Snowflake schema); Fact (Many-to-Many) Dimensions
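The difference between a regular and a referenced dimension can be shown with a small lookup sketch. All table names, keys and values below are hypothetical: the fact rows point directly at a product dimension (as in a star schema), while the category dimension is reached only through the product dimension (as in a snowflake schema).

```python
# Hypothetical tables; all names and values are illustrative.
fact_sales = [
    {"product_id": 1, "amount": 10.0},
    {"product_id": 2, "amount": 20.0},
    {"product_id": 1, "amount": 5.0},
]

# Regular dimension: the fact row references it directly (star schema).
dim_product = {
    1: {"name": "apple", "category_id": 10},
    2: {"name": "bread", "category_id": 20},
}

# Referenced dimension: reached through another dimension (snowflake schema).
dim_category = {10: "fruit", 20: "bakery"}


def sales_by_category(facts, products, categories):
    """Total the fact amounts per category by following both references."""
    totals = {}
    for row in facts:
        product = products[row["product_id"]]          # fact -> dimension
        category = categories[product["category_id"]]  # dimension -> dimension
        totals[category] = totals.get(category, 0.0) + row["amount"]
    return totals


category_totals = sales_by_category(fact_sales, dim_product, dim_category)
print(category_totals)
```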
32
33 Simple aggregations: Min, Max, Average, Sum, Count. Complex aggregations: Difference with previous period, Conditional sum or count. Calculation types: Precomputed vs computed during runtime, Over visible nodes vs over all nodes
34 Split cube by dimension values. Partitioning: Different data sources; Different storage policies (e.g. operative non-cached data in a ROLAP partition and historic cached data in a MOLAP partition); Read-only vs. read-write partitions
35 Subsets of cubes (not necessarily subcubes). Purpose: show cube data relevant to different stakeholders
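A minimal sketch of the simple and complex aggregations listed on this slide, computed at runtime over a hypothetical series of monthly sales. The data, the variable names and the threshold of 100 are illustrative assumptions, not values from the lecture.

```python
# Hypothetical measure values per period (illustrative data only).
monthly_sales = [("2011-01", 120.0), ("2011-02", 90.0), ("2011-03", 150.0)]
values = [value for _, value in monthly_sales]

# Simple aggregations.
simple = {
    "min": min(values),
    "max": max(values),
    "sum": sum(values),
    "count": len(values),
    "average": sum(values) / len(values),
}

# Complex aggregation: difference with the previous period.
differences = [
    (period, current - previous)
    for (_, previous), (period, current) in zip(monthly_sales, monthly_sales[1:])
]

# Complex aggregation: conditional sum and count (only values above a threshold).
threshold = 100.0
conditional_sum = sum(v for v in values if v > threshold)
conditional_count = sum(1 for v in values if v > threshold)

print(simple, differences, conditional_sum, conditional_count)
```

In a cube these results could also be precomputed and stored; the sketch only shows the runtime variant.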
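Splitting a cube by a dimension value can be sketched as routing fact rows into named partitions. The example below assumes a hypothetical `year` dimension and two partition names, `current` and `historic`; it shows only the split itself and does not model ROLAP or MOLAP storage policies.

```python
def partition_by_year(rows, current_year):
    """Route each fact row to the 'current' or 'historic' partition by year."""
    partitions = {"historic": [], "current": []}
    for row in rows:
        key = "current" if row["year"] == current_year else "historic"
        partitions[key].append(row)
    return partitions


# Hypothetical fact rows (illustrative data only).
facts = [
    {"year": 2009, "amount": 10.0},
    {"year": 2010, "amount": 20.0},
    {"year": 2011, "amount": 30.0},
]

partitions = partition_by_year(facts, current_year=2011)
print({name: len(rows) for name, rows in partitions.items()})
```

Each partition could then be given its own data source, caching policy or read-write setting, as the slide lists.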
36
37
38 Learning from users' analysis-decision patterns. [Diagram labels: Raw data, Data Representation, Domain Expert, High dimensionality data, Domain knowledge, User Feedback, Dimensionality Reduction, User, Low dimensionality data, Visualisation evaluation, Visualisation]
39 reCAPTCHA; Prediction markets; Social media sites as data source
40 We do not model causality. We only model dependence.
41 Choice of relevant baseline: Random guess, Mean, Most common class, No change. Measures of performance: Precision and recall, Lift, Cumulative Gain, Mean Relative Error, Mean Absolute Error. Interpretation mistakes: Post hoc ergo propter hoc, Cum hoc ergo propter hoc, Affirming the consequent, Confirmation bias, Confounding, Uncorrelated does not imply independent, Third-cause fallacy, Selection bias, Sampling bias
42 Basic reporting (what, how, data overload); Events, reactions; Estimation fallacy (self-reference); Decision Making; Process Improvement
43 Visualisation: Tablets, Drawings, Reporting Suites. Process Automation: Ticket Systems, Tracking Systems. Notification: Alerts, Business Process Management Solutions
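Dependence is a broader notion than linear correlation. As a small worked example (the data and the helper `pearson` are illustrative, not from the lecture), the values below are fully determined by one another, yet their Pearson correlation is zero.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]   # y depends on x deterministically

r = pearson(xs, ys)
print(r)
```

The symmetric relation cancels out in the covariance, so a model that only looks at correlation would miss this dependence entirely.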
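The role of a baseline can be illustrated with precision, recall and a "most common class" predictor. The labels and the model predictions below are made up for the illustration, and the helper names are hypothetical.

```python
from collections import Counter


def precision_recall(actual, predicted, positive=1):
    """Precision and recall for the given positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def accuracy(actual, predicted):
    """Share of predictions that match the actual labels."""
    return sum(1 for a, p in zip(actual, predicted) if a == p) / len(actual)


def most_common_class_baseline(actual):
    """Baseline that always predicts the most frequent label."""
    label, _ = Counter(actual).most_common(1)[0]
    return [label] * len(actual)


# Hypothetical labels and model output (illustrative data only).
actual = [1, 0, 0, 0, 1, 0, 0, 0]
model = [1, 0, 0, 1, 0, 0, 0, 0]
baseline = most_common_class_baseline(actual)

model_accuracy = accuracy(actual, model)
baseline_accuracy = accuracy(actual, baseline)
model_precision, model_recall = precision_recall(actual, model)
baseline_precision, baseline_recall = precision_recall(actual, baseline)
print(model_accuracy, baseline_accuracy, model_precision, model_recall)
```

In this made-up case the model's accuracy equals that of the trivial baseline, and only precision and recall reveal that the model finds some positives while the baseline finds none.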
44
45
46 Report types: Forms, Tables, Charts, Gauges, Scorecards. Interactions: Drill down, Drill through, Linked views, Writeback. Delivery: On-demand, Runtime, Cached, Subscription based, Published. Security: Dimension filters, Consumer specific reports
47 Upper management: Scorecards, Drill through, Publishing. Middle management: Dashboards, Drill down, On-demand. Lower management: Tables, Forms, Runtime
48 KPI based: Value, Goal, Trend, Status. Event based: Threshold, Event. Delivery: Dashboard, Mail, SMS, Push notifications
49 Dashboards; Scorecards; Scenario analysis; Bottleneck identification. Getting the relevant data to people who need it when they need it.
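The four KPI properties (value, goal, trend, status) and a threshold event can be sketched as follows. The `Kpi` class, the "higher is better" assumption, the 0.9 warning ratio and the status labels are all hypothetical choices made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Kpi:
    """A KPI where higher values are assumed to be better."""
    name: str
    value: float
    goal: float
    previous_value: float

    @property
    def trend(self) -> str:
        """Direction of change compared to the previous period."""
        if self.value > self.previous_value:
            return "up"
        if self.value < self.previous_value:
            return "down"
        return "flat"

    def status(self, warning_ratio: float = 0.9) -> str:
        """Compare the value to the goal: 'good', 'warning' or 'bad'."""
        ratio = self.value / self.goal
        if ratio >= 1.0:
            return "good"
        if ratio >= warning_ratio:
            return "warning"
        return "bad"


def threshold_event(kpi: Kpi, threshold: float) -> Optional[str]:
    """Return an alert message when the value is below the threshold, else None."""
    if kpi.value < threshold:
        return f"{kpi.name} is below {threshold}: {kpi.value}"
    return None


revenue = Kpi(name="monthly revenue", value=95.0, goal=100.0, previous_value=90.0)
print(revenue.trend, revenue.status(), threshold_event(revenue, 96.0))
```

The returned message would then be handed to whichever delivery channel applies (dashboard, mail, SMS or push notification); delivery itself is not modeled here.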
50
51
52
53
54 Confusing correlation with causality
55 Confusing correlation with causality; Showing too little or too much data; Forgetting about drill actions; Ignoring analysis results; Overreacting or misinterpreting
56
57
58 Name one BI task you solve daily. Describe your: Data sources, Analytic process, Decision support
59 Armor your aircraft: Download the datasheet with the past aircraft battle damage report. Choose which aircraft parts to armor ( Enter your specification to the simulator (you will get to run it and re-design your craft at practice). Keep in mind that: Armor increases weight and reduces mobility; Armor increases a part's durability approximately 3 times
60 Complete self-referential aptitude test:
61 /en/html/lu2_learningobject3.html