1 Data Warehousing and Data Mining
2 Data warehousing: Introduction
- A data warehouse is a collection of data designed to support decision-making.
- The term data warehousing generally refers to the combination of different databases across an entire organization.
- Data stored for business analysis can be accessed most effectively by separating it from the data in operational systems.
- Business analysis is done using batch reports that run during off-peak hours.
3 Primary goals of data warehousing
- Provide access to the data of an organization
- Keep data consistent
- Make it possible to separate and combine data
- Include tools for setting up queries
- Publish used data
- Drive business reengineering
4 Data in data warehouse
Data in the warehouse can be characterized as:
- Subject-oriented
- Integrated
- Non-volatile
- Time-variant
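The four characteristics above can be made concrete with a small sketch. It assumes a hypothetical customer history table; the names `CustomerSnapshot`, `CustomerHistory`, `load`, and `as_of`, and the sample rows, are invented for illustration and do not come from any particular warehouse product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a loaded row is never modified (non-volatile)
class CustomerSnapshot:
    customer_id: int   # subject-oriented: rows are organized around the customer
    city: str          # integrated: one agreed representation of the city
    valid_from: date   # time-variant: every row carries the date it applies from


class CustomerHistory:
    """Append-only store of customer snapshots (hypothetical, for illustration)."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot):
        # New facts are appended; existing rows are never updated or deleted.
        self._rows.append(snapshot)

    def as_of(self, customer_id, when):
        # Return the latest snapshot valid on the given date, or None.
        candidates = [r for r in self._rows
                      if r.customer_id == customer_id and r.valid_from <= when]
        return max(candidates, key=lambda r: r.valid_from, default=None)


history = CustomerHistory()
history.load(CustomerSnapshot(1, "Pune", date(2020, 1, 1)))
history.load(CustomerSnapshot(1, "Mumbai", date(2022, 6, 1)))

print(history.as_of(1, date(2021, 3, 15)).city)  # Pune
print(history.as_of(1, date(2023, 1, 1)).city)   # Mumbai
```

Because old rows are kept rather than overwritten, the same question can be answered for any past date, which is what an operational system that only stores the current address cannot do.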
5 Data warehouse architecture
The major components of a data warehouse are:
- Summarized data
  - Lightly summarized for the department level
  - Highly summarized for the enterprise level
- Operational system records
- Integration/transformation programs
- Current detail
- Metadata
- Archives
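The relationship between current detail, lightly summarized data, and highly summarized data can be sketched as two aggregation steps. This is only an illustration under stated assumptions: the sales rows, the field names, and the functions `lightly_summarize` and `highly_summarize` are made up for this example.

```python
from collections import defaultdict

# Current detail: one row per sale (made-up values for illustration).
current_detail = [
    {"department": "toys", "region": "north", "amount": 120.0},
    {"department": "toys", "region": "south", "amount": 80.0},
    {"department": "books", "region": "north", "amount": 50.0},
    {"department": "books", "region": "south", "amount": 150.0},
]


def lightly_summarize(rows):
    """Total sales per department (department-level summary)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["department"]] += row["amount"]
    return dict(totals)


def highly_summarize(department_totals):
    """Single enterprise-wide total built from the department totals."""
    return sum(department_totals.values())


by_department = lightly_summarize(current_detail)
enterprise_total = highly_summarize(by_department)

print(by_department)     # {'toys': 200.0, 'books': 200.0}
print(enterprise_total)  # 400.0
```

Each level discards detail (here, the region) in exchange for a smaller result that is quicker to query.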
7 Advantages of data warehouse
- Cost-effective decision-making
- Better enterprise intelligence
- Enhanced customer service
- Business reengineering
- Information system reengineering
8 Data warehouse structure
The structure consists of:
- Physical data warehouse
- Logical data warehouse
- Data mart
10 Data mining: Introduction
- Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions.
- When implemented on high-performance client/server or parallel processing computers, data mining tools can analyze massive databases to deliver answers to business questions.
11 Why data mining
- What goods should be promoted to this customer?
- What is the probability that the customer will respond to a planned promotion?
- Which share is the most profitable to buy?
- Will this customer default on a loan or pay it back on schedule?
- What medical diagnosis should be assigned to a particular patient?
- How large are the peak loads on a telephone or energy network?
- Why does the facility suddenly start to produce defective goods?
12 Evolution of data mining
- Data collection (1960)
- Data access (1980)
- Data warehousing and decision-making (1990)
- Data mining
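13 Verification vs. Discovery
- Verification: drilling down into the data to check a hypothesis the analyst already holds.
- Discovery: finding hidden patterns in the data without a hypothesis given in advance.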
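The difference between the two modes can be shown on a toy example. The sketch below assumes a handful of made-up shopping baskets; the functions `verify` and `discover` are hypothetical names chosen for this illustration, and the pair counting is a deliberately simple stand-in for real pattern discovery algorithms.

```python
from collections import Counter
from itertools import combinations

# Made-up market baskets for illustration.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "tea"},
    {"bread", "butter", "tea"},
]


def verify(baskets, a, b):
    """Verification: check a hypothesis the analyst already has.

    Returns the fraction of baskets containing `a` that also contain `b`.
    """
    with_a = [basket for basket in baskets if a in basket]
    if not with_a:
        return 0.0
    return sum(b in basket for basket in with_a) / len(with_a)


def discover(baskets, min_count=2):
    """Discovery: no hypothesis given; count every item pair, keep frequent ones."""
    pair_counts = Counter()
    for basket in baskets:
        pair_counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


print(verify(baskets, "bread", "butter"))  # 1.0
print(discover(baskets))                   # {('bread', 'butter'): 3}
```

In `verify` the analyst names the items to test; in `discover` the data itself surfaces the bread and butter pair.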
14 Advantages of Data mining
- Automated prediction of trends and behaviors
- Automated discovery of previously unknown patterns
- Databases can be larger in both breadth and depth
15 Technologies used in Data mining
- Neural networks
- Rule induction
- Evolutionary programming
- Case-based reasoning
- Decision trees
- Genetic algorithms
- Non-linear regression methods
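As a rough illustration of the decision tree idea from the list above, the following sketch fits a one-level tree (often called a decision stump) on a single numeric feature. It is a simplified teaching example, not a full tree-building algorithm; the function `fit_stump` and the loan data are invented for this purpose.

```python
def fit_stump(values, labels):
    """Find the threshold on one numeric feature that misclassifies the fewest rows.

    The stump predicts True when value > threshold.
    Returns (threshold, errors), or None if all values are identical.
    """
    best = None
    ordered = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2
        errors = sum((v > threshold) != y for v, y in zip(values, labels))
        if best is None or errors < best[1]:
            best = (threshold, errors)
    return best


# Made-up data: debt as a percentage of income, and whether the customer defaulted.
debt_percent = [10, 20, 25, 40, 55, 60]
defaulted = [False, False, False, True, True, True]

threshold, errors = fit_stump(debt_percent, defaulted)
print(threshold, errors)  # 32.5 0
```

The learned split reads as a rule ("predict default when debt exceeds 32.5% of income"). A full decision tree repeats this kind of split recursively on the resulting subsets.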
16 Tasks solved by data mining
- Prediction
- Classification
- Detection of relations
- Explicit modeling
- Clustering
- Deviation detection
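Deviation detection, the last task in the list, can be sketched with a simple z-score test. This is one of many possible approaches and is shown only as an assumption-laden example: the function `find_deviations`, the threshold of 2.0, and the load readings are all made up for illustration.

```python
from statistics import mean, pstdev


def find_deviations(values, z_threshold=2.0):
    """Return (index, value) pairs lying more than z_threshold
    population standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing deviates
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > z_threshold]


# Made-up hourly load readings with one obvious spike.
loads = [50, 52, 49, 51, 50, 48, 120, 51, 50, 49]
print(find_deviations(loads))  # [(6, 120)]
```

A single large spike also inflates the mean and standard deviation, so on real data more robust measures (for example, ones based on the median) are often preferred.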