Data Warehousing (The Need, Importance & the Big Picture)
1 Data Warehousing (The Need, Importance & the Big Picture) Naveed Iqbal, Assistant Professor NUCES, Islamabad Campus (Lecture Slides Week # 1)
2 Why this Course?
The world is changing (in fact, it has already changed): either change or be left behind.
Missing opportunities or going in the wrong direction has prevented us from growing.
What is the right direction? Harnessing data in the knowledge-driven economy, and doing what cannot be automated or is difficult to automate.
NUCES, Islamabad Campus Data Warehousing - Fall
3 The Need of the Time
Drowning in data AND/BUT starving for information.
Knowledge is power, BUT intelligence is absolute/super power.
4 The Need of the Time
[Figure: POWER ($/ ) increases up the hierarchy: Data, Information, Knowledge, Intelligence]
5 Evolution of Information Systems
8 Business Intelligence
10 Visualization
11 Data Warehousing: the big picture
Data (Tier 0): data sources, namely semistructured sources (www data), archived data and operational databases.
Extract, Transform, Load (ETL): moves data from the sources into the warehouse.
Data Warehouse Server (Tier 1): the data warehouse, data marts and metadata.
OLAP Servers (Tier 2): MOLAP and ROLAP.
Clients (Tier 3): query/reporting, analysis and data mining tools, used by IT users and business users.
13 Approach of the Course
- Develop an understanding of the underlying RDBMS concepts.
- Apply these concepts to VLDB / DSS environments and understand where and why they break down.
- Expose the differences between an RDBMS and a data warehouse in the context of VLDB.
- Provide the basics of DSS tools such as OLAP and data mining, and demonstrate their applications.
- Demonstrate the application of DSS concepts and the limitations of OLTP concepts through lab exercises.
14 Summary of the Course
- Introduction & Background
- Extract-Transform-Load (ETL)
- Normalization & De-Normalization
- Dimensional Modeling
- Online Analytical Processing (OLAP)
- Data Quality Management (DQM)
- Need for Speed (Parallelism, Join and Indexing Techniques)
- DWH Implementation Steps
- Complete Implementation Case Study
- Lab and Tool Usage
15 Books
Reference Books
- Golfarelli & Rizzi, Data Warehouse Design: Modern Principles and Methodologies, McGraw-Hill
- W. H. Inmon, Building the Data Warehouse, John Wiley & Sons Inc., NY
- R. Kimball, The Data Warehouse Toolkit, John Wiley & Sons Inc., NY
- A. Abdullah, Data Warehousing for Beginners: Concepts & Issues
- Paulraj Ponniah, Data Warehousing Fundamentals, John Wiley & Sons Inc., NY
...
16 Course Execution Plan
- Lecturing / Discussions
- Lab Work + Tutorials
- Assignments / Case Studies
- Projects
Marks Breakup:
- Mid-I: 12%
- Quizzes: 6%
- Mid-II: 13%
- Assignments/Case Study: 9%
- Final*: 40%
- Projects*: 20%
* Mandatory (Missing means F)
17 Code of Conduct
- Regularity: attendance criteria as per university policy
- Punctuality: no entry after 5 minutes from class start time (N/A for habitual late comers)
- Discipline: ABSOLUTELY NO COMPROMISE
- Positive Attitude
- High Level of Class Participation
- No Plagiarism, Cheating
- No Change in Deadlines
- No Usage of Mobile / Other Devices
18 Scenario 1
ABC Pvt Ltd is a company with branches at Karachi, Quetta, Peshawar and Lahore.
The Sales Manager wants a quarterly sales report.
Each branch has a separate operational system.
19 Scenario 1: ABC Pvt Ltd.
[Figure: the Sales Manager asks the separate Karachi, Quetta, Peshawar and Lahore systems for sales per item type per branch for the first quarter]
20 Solution 1: ABC Pvt Ltd.
Extract sales information from each database.
Store the information in a common repository at a single site.
21 Solution 1: ABC Pvt Ltd.
[Figure: the Karachi, Quetta, Peshawar and Lahore databases feed a Data Warehouse; query & analysis tools produce the report for the Sales Manager]
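The two steps of Solution 1 (extract from each branch, store in one repository) can be sketched roughly as follows. This is only an illustration under stated assumptions: the four in-memory SQLite databases stand in for the branches' operational systems, and the table names, column names, item types and figures are all hypothetical, not taken from the slides.

```python
import sqlite3

# Hypothetical sales rows per branch: (item_type, sale_date, units).
BRANCH_ROWS = {
    "Karachi": [("Item A", "2024-01-15", 10), ("Item B", "2024-02-03", 4)],
    "Quetta": [("Item A", "2024-03-20", 7)],
    "Peshawar": [("Item B", "2024-01-09", 5), ("Item B", "2024-04-02", 9)],
    "Lahore": [("Item A", "2024-02-11", 3), ("Item B", "2024-03-30", 6)],
}


def make_branch_db(rows):
    """Stand-in for one branch's separate operational system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (item_type TEXT, sale_date TEXT, units INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn


def build_warehouse(branch_dbs):
    """Extract sales from every branch and store them in one common repository."""
    wh = sqlite3.connect(":memory:")
    wh.execute(
        "CREATE TABLE sales_fact "
        "(branch TEXT, item_type TEXT, sale_date TEXT, units INTEGER)"
    )
    for branch, conn in branch_dbs.items():
        rows = conn.execute("SELECT item_type, sale_date, units FROM sales").fetchall()
        # Tag each extracted row with the branch it came from.
        wh.executemany(
            "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
            [(branch, *row) for row in rows],
        )
    wh.commit()
    return wh


def quarterly_report(wh, start, end):
    """Units sold per item type per branch for sale dates in [start, end)."""
    return wh.execute(
        "SELECT branch, item_type, SUM(units) FROM sales_fact "
        "WHERE sale_date >= ? AND sale_date < ? "
        "GROUP BY branch, item_type ORDER BY branch, item_type",
        (start, end),
    ).fetchall()


branch_dbs = {name: make_branch_db(rows) for name, rows in BRANCH_ROWS.items()}
warehouse = build_warehouse(branch_dbs)
report = quarterly_report(warehouse, "2024-01-01", "2024-04-01")
for row in report:
    print(row)
```

Once the rows sit in one table with a branch column, the Sales Manager's report is a single GROUP BY query instead of four separate extracts combined by hand.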
22 Scenario 2
One Stop Shopping Super Market has a huge operational database.
Whenever executives want a report, the OLTP system becomes slow and the data entry operators have to wait for some time.
23 Scenario 2: One Stop Shopping
[Figure: Management's report requests and the data entry operators share one operational database, so the operators have to wait]
24 Solution 2
Extract the data needed for analysis from the operational database.
Store it in the warehouse.
Refresh the warehouse at regular intervals so that it contains up-to-date information for analysis.
The warehouse will contain data with a historical perspective.
25 Solution 2
[Figure: data entry operators run transactions against the operational database; data is extracted into the Data Warehouse, which serves the Manager's reports]
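A periodic refresh of the kind Solution 2 describes could look like the sketch below. It assumes an insert-only transactions table with an increasing integer id, which is used as a simple high-water mark; the table and column names and the sample rows are hypothetical. Updates and deletes in the operational system would need a different change-tracking scheme.

```python
import sqlite3


def setup():
    """Create a stand-in operational (OLTP) database and an empty warehouse."""
    oltp = sqlite3.connect(":memory:")
    oltp.execute(
        "CREATE TABLE transactions (id INTEGER PRIMARY KEY, item TEXT, amount REAL)"
    )
    wh = sqlite3.connect(":memory:")
    wh.execute(
        "CREATE TABLE sales_history (id INTEGER PRIMARY KEY, item TEXT, amount REAL)"
    )
    return oltp, wh


def refresh(oltp, wh):
    """Copy only rows added since the last refresh; return how many were copied."""
    # The highest id already in the warehouse marks where the last refresh stopped.
    last_id = wh.execute("SELECT COALESCE(MAX(id), 0) FROM sales_history").fetchone()[0]
    new_rows = oltp.execute(
        "SELECT id, item, amount FROM transactions WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    wh.executemany("INSERT INTO sales_history VALUES (?, ?, ?)", new_rows)
    wh.commit()
    return len(new_rows)


oltp, wh = setup()
oltp.executemany(
    "INSERT INTO transactions (item, amount) VALUES (?, ?)",
    [("milk", 120.0), ("bread", 80.0)],
)
first = refresh(oltp, wh)   # initial load
oltp.execute("INSERT INTO transactions (item, amount) VALUES (?, ?)", ("eggs", 200.0))
second = refresh(oltp, wh)  # next scheduled refresh picks up only the new row
third = refresh(oltp, wh)   # nothing new since the last refresh

# Reports query the warehouse, so they do not compete with data entry on the OLTP side.
total = wh.execute("SELECT SUM(amount) FROM sales_history").fetchone()[0]
print(first, second, third, total)
```

Because earlier rows are never removed from the warehouse table, it accumulates the historical perspective that the operational database is not designed to keep.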
26 Scenario 3
Cakes & Cookies is a small, new company.
The President of the company wants the company to grow.
He needs information so that he can make correct decisions.
27 Solution 3
Improve the quality of data before loading it into the warehouse.
Perform data cleaning and transformation before loading the data.
Use query and analysis tools to support ad hoc queries.
28 Solution 3
[Figure: the President uses a query and analysis tool on the Data Warehouse; a chart of sales over time is labelled Expansion and Improvement]
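The cleaning and transformation step of Solution 3 might be sketched as below. The record layout, the city alias table and the rejection rules are assumptions made for illustration; a real warehouse would derive such rules from its own source systems.

```python
from typing import Optional

# Hypothetical spelling variants mapped to one canonical city name.
CITY_ALIASES = {
    "khi": "Karachi",
    "karachi": "Karachi",
    "lhr": "Lahore",
    "lahore": "Lahore",
}


def clean_row(raw: dict) -> Optional[dict]:
    """Return a cleaned copy of a raw sales record, or None if it must be rejected."""
    record_id = raw.get("id")
    city = CITY_ALIASES.get(str(raw.get("city", "")).strip().lower())
    product = str(raw.get("product", "")).strip().title()
    try:
        units = int(raw.get("units"))
    except (TypeError, ValueError):
        return None  # units missing or not a number
    if record_id is None or city is None or not product or units < 0:
        return None
    return {"id": record_id, "city": city, "product": product, "units": units}


def clean_batch(rows):
    """Clean all rows, drop rejects and repeated ids, keep input order."""
    seen, cleaned = set(), []
    for raw in rows:
        row = clean_row(raw)
        if row is None or row["id"] in seen:
            continue
        seen.add(row["id"])
        cleaned.append(row)
    return cleaned


raw_rows = [
    {"id": 1, "city": " KHI ", "product": "cookies", "units": "12"},
    {"id": 1, "city": "karachi", "product": "Cookies", "units": 12},  # repeated id
    {"id": 2, "city": "Lahore", "product": "cakes", "units": "n/a"},  # bad units
    {"id": 3, "city": "Mars", "product": "cakes", "units": 3},        # unknown city
    {"id": 4, "city": "lhr", "product": " cakes", "units": 5},
]
cleaned_rows = clean_batch(raw_rows)
print(cleaned_rows)
```

Only the cleaned rows would be loaded, so ad hoc queries group on one spelling per city and product rather than on every variant that was typed in.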
29 Case Study
AFCO Foods & Beverages is a new company which produces dairy, bread and meat products, with its production unit located at Gujranwala.
Their products are sold in all regions of Pakistan.
They have sales units at the provincial headquarters.
The President of the company wants sales information.
30 Sales Information
Report: The number of units sold: 113.
Report: The number of units sold over time.
[Table: units sold per month for January, February, March and April; the monthly values are missing from the transcript]
31 Sales Information
Report: The number of items sold for each product over time.
[Table: rows are products (Wheat Bread, Cheese, Swiss Rolls), columns are months (Jan, Feb, Mar, Apr); only two Wheat Bread values, 6 and 17, survive in the transcript]
32 Sales Information
Report: The number of items sold in each city for each product over time.
[Table: rows are products (Wheat Bread, Cheese, Swiss Rolls) grouped by city (Karachi, Lahore), columns are months (Jan, Feb, Mar, Apr); only the Lahore values 3 and 7 for Wheat Bread and 3 and 8 for Cheese survive in the transcript]
33 Sales Information
Report: The number of items sold and income in each region for each product over time.
[Table: rows are products (Wheat Bread, Cheese, Swiss Rolls) grouped by region (Karachi, Lahore); each month (Jan, Feb, Mar, Apr) has two columns, income (Rs) and units (U); the values are missing from the transcript]
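The four reports above add one dimension at a time (total, then time, then product, then city) and finally a second measure (income). A minimal sketch of that progression, using made-up fact rows rather than the case study's figures, which did not survive in the transcript:

```python
from collections import defaultdict

# Hypothetical fact rows: (city, product, month, units, income_rs).
FACTS = [
    ("Karachi", "Wheat Bread", "Jan", 4, 400),
    ("Karachi", "Cheese", "Jan", 2, 900),
    ("Karachi", "Wheat Bread", "Feb", 5, 500),
    ("Lahore", "Wheat Bread", "Jan", 3, 300),
    ("Lahore", "Swiss Rolls", "Feb", 6, 1200),
]

# Position of each dimension inside a fact row.
DIMS = {"city": 0, "product": 1, "month": 2}


def rollup(facts, dims):
    """Sum units and income for every combination of the chosen dimensions."""
    totals = defaultdict(lambda: [0, 0])
    for row in facts:
        key = tuple(row[DIMS[d]] for d in dims)
        totals[key][0] += row[3]  # units
        totals[key][1] += row[4]  # income in Rs
    return {key: tuple(value) for key, value in totals.items()}


total = rollup(FACTS, [])                               # one grand total
by_month = rollup(FACTS, ["month"])                     # over time
by_product_month = rollup(FACTS, ["product", "month"])  # per product over time
by_city_product_month = rollup(FACTS, ["city", "product", "month"])

print(total[()])
print(by_month)
```

Each report is the same aggregation with a longer list of grouping dimensions, which is the kind of question an OLAP server is built to answer quickly.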
34 Data Warehousing includes
- Building the Data Warehouse
- Online Analysis/Analytical Processing (OLAP)
- Presentation
[Figure: RDBMS and flat-file sources pass through cleaning, selection & integration into the warehouse & OLAP server, whose output is presented to the client]