Data Aging with mySAP Business Intelligence, SAP AG


1 Data Aging with mySAP Business Intelligence, SAP AG

2 Learning Objectives
As a result of this lecture, you will be able to:
- Describe the main challenges of Information Lifecycle Management
- Understand SAP's overall archiving strategy for ERP and DW components
- Explain the benefits of near-line storage in an SAP BW environment
- Appreciate that a successful archiving strategy can dramatically decrease the TCO of your data warehouse
- Understand that SAP Business Intelligence provides a rich toolkit for implementing your archiving and near-line storage strategy
SAP AG 2003, TechED Basel 2003, BW301_EMEA, Rainer Uhle / 2

3 Agenda
- Information Lifecycle Management: Lifecycle of Data
- Information Lifecycle Management: Lifecycle of Information
- Data aging strategies in SAP BW
- Conventional archiving in SAP BW 3.0
- NLS strategy of new SAP BW releases
- Partner studies
- Making ADK archives transparent for analyses
- Near-line storage: first steps
- Summary


5 Increasing Volume of Databases
- Meta Group: 70% of all enterprise data currently resides in databases
- Gartner Group: multi-terabyte databases are a reality today and will grow to hundreds of TB by 2006
- Database applications like SAP are a key driver of storage growth
- Growing at a 64% rate

6 The Attractiveness of Data Changes with Its Age
[Figure: frequency of access (reads, updates) plotted against age of data]

7 Distribution of Storage Costs
Hard disk costs account for less than a quarter of storage costs (Giga Information Group):
- Personnel: 45%
- Hard disk: 23%
- Storage management (software and hardware): 19%
- Miscellaneous (purchasing, training): 10%
- Environment (electricity, space): 3%
The administrative expense for 1 terabyte of storage is approximately five to seven times higher than the cost of the storage itself (Dataquest/Gartner).

8 Lifecycles, Storage Media, Costs, Strategies
[Figure: access frequency, cost, and performance (high to low) plotted against time (days, months, years). Storage media shown: RAM, DASD, virtual tape, access-centric tape, capacity-centric tape. Storage management spans online storage, near-line storage with direct SQL access, and the offline archive.]

9 State-of-the-Art Storage Management Systems: Key Capabilities
- Performance- and cost-optimized management for administering and maintaining data in multilevel, transparent storage systems
- Storage management systems combine all types of disk, tape, and optical devices
- Direct row-oriented data access to all types of devices
- Access strategies and aging patterns are available for the logical grouping of data volumes
- Dynamic migration of data files with decreasing access frequency to cheaper device types
- Automatic procedures for backup, shadowing, mirroring, recovery, etc.

10 Benefits of a Sound Data Archiving Strategy
Value proposition for deploying data archiving:
1. Availability: faster and simpler software and release management and upgrades; reduced backup and recovery times
2. Resource consumption: reduced hardware costs for hard disks, main memory, and CPU, and reduced system administration costs
3. Performance: faster dialog response times; faster data load times


12 Open Data Warehouse Architecture

13 BW Architecture: Layers and Accessibility
- Operational Data Store: operational reporting; near real-time / volatile; granular; built with ODS objects
- Data Warehouse: non-volatile; granular; historical foundation; integrated; built with ODS objects
- Multidimensional Model: multidimensional analysis; aggregated view; integrated; built with InfoCubes

14 Transaction Data Processing into the BW MM Layer
[Figure: process data (master data, documents, change documents) is loaded daily from SAP ERP into ODS objects in SAP BW, and from there into InfoCubes of the BW architected data mart layer on a daily, weekly, or monthly basis. Source document records carry customer, time, document number, position, material, and amount in local currency; the example shows a new booking (am), a correction booking (pm), and another new booking (pm) for customer A. The InfoCube has customer, material, and time dimensions and stores the amount in company currency.]

15 Transaction Data Processing with the BW EDW Layer
[Figure: same flow as on the previous slide, with an additional BW Enterprise Data Warehouse layer (the information base) between SAP ERP and the BW architected data mart layer. The ODS objects of the EDW layer hold the document records (customer, time, document number, position, material, amount in local currency, amount in company currency) loaded daily from SAP ERP; InfoCubes of the data mart layer are filled from them daily, weekly, or monthly.]

16 Impact of the Information Base (EDW Layer)
Pros:
- Designed for the future
- Foundation of data for future developments
- Easier delta load control
- Reduced redundancy in very granular data
- Greater control of data distribution
- Fine-tuning of data availability in InfoProviders
- Archiving of base data possible
Cons:
- Increasing data volume
- Data redundancy
- Increased data management and coordination
- Increased DW administration
- Upfront design considerations

17 Bill Inmon's Enterprise Data Warehousing
[Figure: corporate applications (ERP, CRM, eComm., Bus. Int.) feed changed data through ETL and a staging area into the EDW, alongside local ODS, global ODS, and operational mart. The EDW supplies departmental data marts (accounting, finance, marketing, sales), DSS applications, and an exploration warehouse / data mining. Internet data (web logs, session analysis, dialogue manager, cookie cognition, preformatted dialogues) enters through a granularity manager. Cross-media storage management connects the EDW to near-line storage and archives. Source: Bill Inmon]


19 Motivation for a Data Aging Strategy: Benefits
- Costs
  - Offline vs. online storage costs
  - System usage overhead (CPU, memory, etc.)
  - Control of system growth
- System availability vs. costs
  - Data availability: faster rollups, change runs, etc.
  - System availability: less downtime for backups, upgrades, etc.
- Performance vs. costs
  - Faster load times
  - Faster query times
- Legal requirements
See also: "Scalability with SAP Business Information Warehouse"

20 Data Archiving and Storage with mySAP
[Figure: in the mySAP system, application data from the database is written as data objects to the file system. An external storage system is connected via ArchiveLink; an HSM system and alternative storage are further options. Legend: SAP; third parties (optional).]

21 SAP BW: Customer Motivation to Archive
Customer requirements (derived from ASUG):
- Data objects relevant for archiving: InfoCubes, ODS objects, PSA, master data
- Functionality:
  - Both archiving and data deletion (without archiving)
  - Select data based on any criteria
  - Automatically scheduled on a periodic basis
  - Restoring of archived data
- Data retention time: 3 to 5 years in InfoCubes and ODS objects
- Consistent archiving processes


23 SAP BW Archiving Object Architecture
[Figure: the BW Repository activates InfoCubes and ODS objects and generates the archiving object. Archive Administration (SARA) schedules the write, delete, and read programs of the archiving object. The DataManager reads and deletes InfoCube and ODS object data; the Datamart Extractor reads archived data. ADK stores the archive in the file system, CMS, or HSM.]

24 Archiving in BW versus Other SAP Components
SAP BW:
- Technology platform for (analytic) applications
- Generated data structures
- Generation of archiving objects
- Mainly replicated data
- Highly consolidated data
- Flat archive data objects
- Selective one-step deletion after complete verification of archived data
- No special check for archivability
- Near-line storage
Other SAP components:
- Dedicated to a single application
- Delivered (fixed) data structures
- Delivered archiving objects
- Often original data
- Unconsolidated (local) data
- Segmented data objects
- Individual deletion of verified data objects
- Check of archivability for individual data objects prior to the archiving run
- Archive Information System and Document Relationship Browser

25 SAP BW Data Archiving Technology: Write Process (Time Slot)
Time slot archiving is a powerful tool for selecting time-dependent transactional data for archiving.
Example (current date: Mar 2003):
- ODS object contents: Sep 2002, Oct 2002, Nov 2002, Dec 2002, Jan 2003, Feb 2003, Mar 2003
- Time slot archiving: archive complete years 2002 & 2003; only complete fiscal years; exclude Nov 2002
- Resulting archive file: Sep 2002, Oct 2002, Dec 2002
- Archive areas are protected; new data loads shown: Dec 2002, Mar 2003
Pros: complex time selection options
Cons: limited to time selections only
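The time slot selection in the example above can be sketched in a few lines of Python. This is only an illustration, not SAP code: the function names are hypothetical, the fiscal year is assumed to equal the calendar year, and "protected" is assumed to mean "already contained in the archive".

```python
from datetime import date


def select_time_slots(periods, years, today, exclude=()):
    """Pick the (year, month) periods to archive.

    Hypothetical rule set mirroring the slide: only requested years that
    lie completely in the past qualify, and excluded periods are skipped.
    Assumption: the fiscal year equals the calendar year.
    """
    excluded = set(exclude)
    return [
        (year, month)
        for (year, month) in sorted(periods)
        if year in years and year < today.year and (year, month) not in excluded
    ]


def is_protected(period, archived):
    """Assumption: a period counts as protected once it is in the archive."""
    return period in set(archived)


ods_periods = [(2002, 9), (2002, 10), (2002, 11), (2002, 12),
               (2003, 1), (2003, 2), (2003, 3)]

archived = select_time_slots(
    ods_periods,
    years={2002, 2003},
    today=date(2003, 3, 1),
    exclude=[(2002, 11)],
)
print(archived)
```

With the slide's data this selects Sep, Oct, and Dec 2002: the year 2003 is requested but not yet complete, and Nov 2002 is explicitly excluded.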


27 SAP BW Data Archiving Technology: Write Process (Field Data Selection)
Data selection archiving is a very powerful tool for freely selecting transactional data for archiving, based on any criteria.
Example (current date: Mar 2003):
- ODS object contents: comp 10 Oct 2002, comp 10 Nov 2002, comp 20 Nov 2002, comp 10 Dec 2002, comp 20 Jan 2003, comp 20 Feb 2003, comp 20 Mar 2003
- Data selection archiving: archive years 2002 & 2003, company 10 only
- Resulting archive file: comp 10 Oct 2002, comp 10 Nov 2002, comp 10 Dec 2002
- New data loads: comp 10 Dec 2002, comp 20 Mar 2003
Pros: flexible data selection
Cons: time selections are more complex; no protected archiving areas
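As a rough sketch of the same example in Python, selection by arbitrary criteria amounts to partitioning the records with a predicate. The record layout and function names below are hypothetical and chosen only for illustration.

```python
def split_by_selection(records, predicate):
    """Partition records into those to archive and those that stay online."""
    to_archive = [r for r in records if predicate(r)]
    remaining = [r for r in records if not predicate(r)]
    return to_archive, remaining


# Contents of the ODS object from the slide: (company, (year, month)).
ods_records = [
    {"company": 10, "period": (2002, 10)},
    {"company": 10, "period": (2002, 11)},
    {"company": 20, "period": (2002, 11)},
    {"company": 10, "period": (2002, 12)},
    {"company": 20, "period": (2003, 1)},
    {"company": 20, "period": (2003, 2)},
    {"company": 20, "period": (2003, 3)},
]


def company_10_in_2002_or_2003(record):
    """Selection from the slide: years 2002 & 2003, company 10 only."""
    return record["company"] == 10 and record["period"][0] in (2002, 2003)


to_archive, remaining = split_by_selection(ods_records, company_10_in_2002_or_2003)
print([r["period"] for r in to_archive])
```

Because the predicate is free-form, nothing in this sketch prevents a later load from adding company 10 data for an already archived period, which matches the "no protected archiving areas" drawback named on the slide.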


29 SAP BW Data Archiving Technology: Extraction/Reloading
BW extractors and export DataSources are archive-enabled.
[Figure: data flow between InfoCube / ODS object, archive files, export DataSource, InfoPackage, and update rules]
- The InfoPackage is extended by an "archive selection" option
- Selection options for available archive sessions and files
- Only full extraction is supported
- Archive files are scanned with the selection criteria of the request
- Reloading to the original data target is possible but not recommended
Reload recommendation:
- Extract to a copy of the original data target instead
- Use a MultiProvider to combine the remaining data with the reloaded data
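The reload recommendation can be illustrated with a small Python sketch: the remaining data and the reloaded copy are kept separate and only combined at query time by a union that sums key figures per characteristic combination. The function name, row layout, and sample values are hypothetical; this is a conceptual model of a union, not the MultiProvider implementation.

```python
from collections import defaultdict


def union_providers(*providers):
    """Union rows from several providers and sum the key figure per key.

    Each row is (characteristics, amount), where characteristics is a
    hashable tuple such as (customer, year).
    """
    totals = defaultdict(int)
    for rows in providers:
        for characteristics, amount in rows:
            totals[characteristics] += amount
    return dict(totals)


# Data still in the original data target (hypothetical values).
remaining_data = [(("C1", 2003), 100), (("C1", 2003), 5), (("C2", 2003), 50)]
# Archived data reloaded into a copy of the data target.
reloaded_copy = [(("C1", 2002), 70), (("C2", 2002), 30)]

combined = union_providers(remaining_data, reloaded_copy)
print(combined)
```

Keeping the reloaded rows in a separate copy means the original data target never has to be modified, and the copy can be dropped again once the analysis is finished.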


31 Typical Data Warehouse Problems
End-user challenges: making timely, informed business decisions
- Users cannot wait for historical data to be restored
- Transparent access to data for regular reporting and ad-hoc analysis
IT management challenges: meeting end-user data demand while managing cost
- High costs of adding and managing online disk storage
- High costs of backup and recovery, especially when data is infrequently accessed
- Data protection and availability

32 Data Aging Strategy Implementation
Data aging is a strategy for managing data over time, balancing data access requirements with TCO. Each data aging strategy is uniquely determined by the customer's data and the business value of accessing that data.
Which tools should I consider using?
- Online database storage: frequently read/updated data
- Near-line storage: infrequently read data
- (BW) data archiving: very rarely read data

33 Why Would We Consider Near-line Storage?
Reasons for direct query access to archive data
Business-driven reasons:
- Introducing new characteristics with a historical background
- Strategic analysis of data over long periods
- Just because it's our data!
Legal reasons:
- Regulatory and industry-specific requirements
- Data must be immediately accessible from a legal and technical perspective
- Example: GDPdU demands a ten-year retention period for tax-relevant data, and requires that the data be immediately readable and machine-evaluable throughout the entire retention period (§§ 146 and 147 AO)
- Example: Food & Drug Administration (FDA)

34 Near-line Storage Solution for SAP BW (Key Points)
- Separates frequently used data (kept in the database) from infrequently used data (stored in near-line storage)
- Supports both InfoCubes and ODS objects
- Transparent access to "non-archived" and "archived" data for queries
- Hierarchical storage management (depending on the provider)
  - First level: BW database
  - Second level and further levels: near-line storage
- Intelligent data access
  - Data selection analysis and feedback
  - High-level index in the BW DB
  - Low-level index in near-line storage
- Openness: StorHouse / FileTek, CBW / PBS Software, DiskXtender for BW / Legato

35 SAP BW 4.x Open Solution (Planned)
[Figure: a query runs against an InfoCube/ODS object with near-line services. The request is split and dispatched using the high-level index in the BW DB, and the partial results are combined by a union. One branch goes through the BW DB interface and the Data Manager to the BW database on high-speed disk. The other goes through the near-line storage adapter (archive/restore) to the near-line storage partner solution with its low-level index, backed by robotic tape libraries, NAS or low-cost disk, and optical jukeboxes. The figure distinguishes data flow from control flow.]
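The "split and dispatch" and "union" steps named on this slide can be pictured with a short Python sketch. Everything here is an assumption made for illustration: the high-level index is modeled as a plain dictionary from period to storage location, and the two readers are stand-in functions, not SAP or partner interfaces.

```python
def split_and_dispatch(requested_periods, high_level_index):
    """Group the requested periods by the location the index reports."""
    plan = {}
    for period in requested_periods:
        location = high_level_index[period]
        plan.setdefault(location, []).append(period)
    return plan


def run_query(requested_periods, high_level_index, readers):
    """Read each partial request from its location and union the rows."""
    plan = split_and_dispatch(requested_periods, high_level_index)
    rows = []
    for location, periods in plan.items():
        rows.extend(readers[location](periods))
    return rows


# Hypothetical contents of the two storage levels: period -> amount.
online_data = {(2003, 1): 40, (2003, 2): 60}
nearline_data = {(2002, 11): 10, (2002, 12): 20}

# Hypothetical high-level index: which level holds which period.
high_level_index = {period: "online" for period in online_data}
high_level_index.update({period: "nearline" for period in nearline_data})

readers = {
    "online": lambda periods: [(p, online_data[p]) for p in periods],
    "nearline": lambda periods: [(p, nearline_data[p]) for p in periods],
}

result = run_query([(2002, 12), (2003, 1)], high_level_index, readers)
print(sorted(result))
```

The point of the high-level index in this sketch is that a query touching only recent periods never calls the near-line reader at all.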

36 SAP BW Near-line Storage Technology
A data view of the integration of BW and the near-line storage solution
[Figure: an ODS object holds Sep 2002 through Mar 2003. The complete year 2002 is relocated via FTP into a near-line DB table (Sep 2002, Oct 2002, Nov 2002, Dec 2002, Dec 2002), which is accessed through a virtual InfoCube. A MultiProvider provides a consistent view of the data to BW queries. New data loads: Mar 2003.]

37 Extending Your Application
[Figure: SAP and non-SAP sources feed the persistent staging area, operational data store, data warehouse, architected data marts, and information access layers. New: archiving and near-line storage.]

38 Near-line Storage: Enormous Challenges
- Analysis and reporting operate on a combination of online, near-line, and offline data volumes, so data consistency is an indispensable requirement.
- Archiving processes that touch several levels of near-line storage have to guarantee consistency:
  - Archiving and deleting of online data has to be one logical unit of work (LUW).
  - Rollback mechanisms for single archiving steps have to be available.
- The archive takes on the character of a database.
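The requirement that archiving and deleting form one logical unit of work can be sketched as follows. This is a deliberately simplified in-memory model with hypothetical names: both stores are Python lists, the rollback restores snapshots, and the `fail_before_delete` flag exists only to simulate a failure between the two steps.

```python
def archive_as_one_luw(online, archive, predicate, fail_before_delete=False):
    """Move matching rows from `online` to `archive` as all-or-nothing.

    If anything fails after the write step, both stores are restored to
    their previous state so no row is duplicated or lost.
    """
    online_before = list(online)
    archive_before = list(archive)
    try:
        selected = [row for row in online if predicate(row)]
        archive.extend(selected)  # step 1: write to the archive
        if fail_before_delete:
            raise RuntimeError("simulated failure between write and delete")
        online[:] = [row for row in online if not predicate(row)]  # step 2: delete
    except Exception:
        # Rollback: undo the archive write and leave the online data untouched.
        online[:] = online_before
        archive[:] = archive_before
        raise
    return len(selected)


online_years = [2001, 2002, 2003]
archived_years = []
moved = archive_as_one_luw(online_years, archived_years, lambda year: year < 2003)
print(moved, online_years, archived_years)
```

A real system would rely on the transaction mechanisms of the database and the storage layer rather than snapshots, but the observable contract is the same: after a failure, every row exists in exactly one place.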

39 Where Are Archiving and Near-line Storage Applicable?
Archiving with SAP BW (BW 3.X):
- For analyses, archived data must first be reloaded into the BW database
- Reduces the cost of data retention by using alternative media
Near-line storage (BW 4.X*):
- Queries access data on alternative storage media directly
- Performance and data retention costs for accessing aged data can be optimized
[Figures: for each approach, access frequency (reads) and decreasing TCO plotted against data age]
* Pilot project possible for BW 3.X


41 Transparency for Existing BW Archives
If you have already archived BW data with the standard BW 3.0 archiving tools, you might ask:
- How can I make my classical archive files transparent for BW queries?
- Are there any migration possibilities from BW archiving to the NLS option?
One of our partners offers another solution for you.

42 PBS archive add on CBW: Solution Overview
- Integrated data access with BW queries to non-archived and archived BW data
- Archive and index data are stored in the file system or on an archive server
- Connection to the archive server via ArchiveLink
[Figure: in SAP BW 3.0B, a BW query reads DB + archive, that is, the BW database together with SAP archive files (ADK) and PBS index files (ADK), which reside either in the file system or on the archive server]

43 PBS Add-on CBW: Most Important Features
- CBW is an ADK-based, generic archive solution that allows BW queries against both archived and database InfoCubes.
- It contains a flexible index tool that operates on customer-defined InfoCubes.
- The PBS indices are stored outside the BW database in ADK format.
- It allows a reduction of BW database growth without loss of data access for end users.

44 Archive Setup Steps
1. SAP standard BW data archiving process (archive and delete). This process is a prerequisite for PBS!
   1. Archive data: the SAP archiving/delete program (SARA) writes InfoCube data from the BW DB to BW archive files (ADK format) in the file system or on the archive server
   2. Delete data
The archive files contain the resolved dimension IDs!
[Figure: star schema. SID tables (customer ID and SID customer; sales area ID and SID sales area; material ID and SID material), dimension tables (DIM-ID with SID customer and SID sales area; DIM-ID with SID material), and the fact table (DIM-ID, DIM-ID, amount)]
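What "resolved dimension IDs" could look like is sketched below in Python. The table contents are invented, and the sketch assumes that "resolved" means each DIM-ID in a fact row is followed through the dimension table and the SID table down to the characteristic value, so that the archived record is flat and self-contained.

```python
# SID tables: SID -> characteristic value (hypothetical contents).
sid_customer = {1: "CUST-A", 2: "CUST-B"}
sid_sales_area = {7: "NORTH"}
sid_material = {5: "MAT-100"}

# Dimension tables: DIM-ID -> SIDs of the characteristics in that dimension.
dim_customer = {
    10: {"customer": 1, "sales_area": 7},
    11: {"customer": 2, "sales_area": 7},
}
dim_material = {20: {"material": 5}}

# Fact table rows reference dimensions only by DIM-ID.
fact_rows = [
    {"dim_customer": 10, "dim_material": 20, "amount": 100},
    {"dim_customer": 11, "dim_material": 20, "amount": 250},
]


def resolve_fact_row(row):
    """Replace DIM-IDs with the characteristic values they stand for."""
    customer_dim = dim_customer[row["dim_customer"]]
    material_dim = dim_material[row["dim_material"]]
    return {
        "customer": sid_customer[customer_dim["customer"]],
        "sales_area": sid_sales_area[customer_dim["sales_area"]],
        "material": sid_material[material_dim["material"]],
        "amount": row["amount"],
    }


flat_records = [resolve_fact_row(row) for row in fact_rows]
print(flat_records[0])
```

A flat record of this kind can be read back without the dimension tables that existed at archiving time, which is presumably why the slide stresses the point.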

45 Archive Setup Steps
2. Generation of PBS archive components in the PBS Administration Cockpit:
- Define PBS index attributes
- Define PBS indices
- Generate the PBS index archive object
- Generate PBS-specific programs
- Generate InfoProviders (virtual/multi)
- Copy BW queries to the generated InfoProviders

46 BW Query Data Access Concept
[Figure: in SAP BW, a BW query either reads DB and archive through a MultiProvider, which combines the basis InfoCube (BW database) with a virtual InfoCube with services, or reads the archive only through the virtual InfoCube. The virtual InfoCube accesses the PBS index files and BW archive files via the PBS ADK interface.]


48 Overview of the SAP BW 3.X Project Solution
[Figure: queries run against the basis cube or ODS object, or against a MultiProvider that combines it with a virtual InfoCube (label: "No Intelligence"). Data is copied through proprietary interfaces to selected 3rd-party providers. The virtual InfoCube reaches the near-line storage partner solution through the near-line storage adapter. Storage: BW database on high-speed disk; near-line storage on robotic tape libraries, NAS or low-cost disk, and optical jukeboxes.]

49 SAP BW 3.X Project Solution: Sample Screenshot
[Screenshot: 3rd-party options]


51 What About Your Data Volume Growth?
[Figure: allocated DB size and allocated DB content over time, compared with the expected size without archiving]
- Without archiving: DB growth of ~15 GB/month
- Initial archiving: reduction of ~60 GB
- With regular archiving: DB growth of ~7 GB/month
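The three figures on this slide are enough for a rough linear projection. The Python sketch below assumes constant monthly growth and a one-time initial reduction; the starting size of 500 GB is a made-up value used only to make the numbers concrete.

```python
GROWTH_WITHOUT_ARCHIVING = 15  # GB per month, from the slide
GROWTH_WITH_ARCHIVING = 7      # GB per month, from the slide
INITIAL_REDUCTION = 60         # GB, one-time effect of the initial archiving run


def projected_size(start_gb, months, archiving):
    """Linear projection of the database size after `months` months."""
    if archiving:
        return start_gb - INITIAL_REDUCTION + GROWTH_WITH_ARCHIVING * months
    return start_gb + GROWTH_WITHOUT_ARCHIVING * months


def savings(months):
    """Difference between the two projections; independent of the start size."""
    return INITIAL_REDUCTION + (GROWTH_WITHOUT_ARCHIVING - GROWTH_WITH_ARCHIVING) * months


start = 500  # hypothetical starting size in GB
print(projected_size(start, 12, archiving=False))
print(projected_size(start, 12, archiving=True))
print(savings(12))
```

Under these assumptions the gap after one year is 60 GB plus 12 times 8 GB, that is 156 GB, and it keeps widening by 8 GB every month.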

52 The Right Time to Start...
1. The healthy system: data archiving shouldn't be the final step to prevent a system from going into cardiac arrest!
2. Early planning: proactively maintain and sustain performance in the system
3. Interdisciplinary process: data archiving requires a high level of coordination between IT (technical) and application (functional) groups

53 Copyright 2003 SAP AG. All Rights Reserved
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.
Microsoft, WINDOWS, NT, EXCEL, Word, PowerPoint and SQL Server are registered trademarks of Microsoft Corporation. IBM, DB2, DB2 Universal Database, OS/2, Parallel Sysplex, MVS/ESA, AIX, S/390, AS/400, OS/390, OS/400, iSeries, pSeries, xSeries, zSeries, z/OS, AFP, Intelligent Miner, WebSphere, Netfinity, Tivoli, Informix and Informix Dynamic Server™ are trademarks of IBM Corporation in USA and/or other countries. ORACLE is a registered trademark of ORACLE Corporation. UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group. Citrix, the Citrix logo, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, MultiWin and other Citrix product names referenced herein are trademarks of Citrix Systems, Inc. HTML, DHTML, XML, XHTML are trademarks or registered trademarks of W3C, World Wide Web Consortium, Massachusetts Institute of Technology. JAVA is a registered trademark of Sun Microsystems, Inc. JAVASCRIPT is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape. MarketSet and Enterprise Buyer are jointly owned trademarks of SAP AG and Commerce One.
SAP, R/3, mySAP, mySAP.com, xApps, xApp, SAP NetWeaver and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are the trademarks of their respective companies.