Preparing for the Future with PureData for Analytics


1 Seattle Children's Hospital: Preparing for the Future with PureData for Analytics, 4/10/2013

2 Who am I and Why am I Here?
Wendy Soethe
Manager, EDW & BI
Knowledge Management, Information Services
Seattle Children's Hospital

3 Seattle Children's Hospital: Hospital Statistics, FY 2012
Location: Seattle, WA
Includes Seattle Children's Hospital, Research Institute and Foundation
Licensed beds: 254
Total Employees: 5,195
Active Medical Staff: 1,189
Hospital Admissions: 14,498
Clinic Visits: 290,671
ED Visits: 32,810

4 Our Mission We believe all children have unique needs and should grow up without illness or injury. With the support of the community and through our spirit of inquiry, we will prevent, treat and eliminate pediatric disease.

5 Seattle Children's Hospital Integrated Data Journey
2007 and prior: Decision support via manual processes to extract data, Invision into TSI
2008: Rolled out PowerInsight with BOE for Cerner reporting needs
Epic/Clarity go-live; Crystal reports for Revenue Cycle, ADT, Coding data
Signed deal with MS Amalga v1.5 for integrated data (Alpha Partners)
2009: Initiated Microsoft BI program to augment Amalga
Continued Amalga development led by MS; moved to v
Replaced Amalga with a more traditional SQL data warehouse environment as an interim solution
Rolled out Tableau to promote self-service and support power users
Focused on key initiatives to drive EDW work (e.g., CSW)
Conducted DWA Assessment with Brightlight Consulting
Conducted POC, which became part of the pilot implementing IBM PureData System for Analytics, powered by Netezza technology
Integrating more data monthly

6 Current Data Warehouse Profile
8 team members, including 1 Data Architect, 1 DBA, 4 EDW/BI Developers, 1 DA/Developer
Recently moved off SQL Server to a PureData for Analytics/Netezza DWA-based EDW
Currently 10 source systems and 10 CSV files
End users access data via BI solutions built in the Microsoft stack, Tableau and BOE

7 Interim EDW Architecture

8 Interim EDW Architecture
(Architecture diagram. Layers: Source, Replication DB, Acquisition, Integration, Distribution, BI Portals and Tools.)
Systems shown: Cerner, CIS_PRDLOGIC, Epic, Clarity, ERP, Lawson, MSOW, Active Dir, SoftMed, Center Point, UMRA, SoftMed (Backup), Center Point (Backup), EPSi
EDW components: KM EDW Stage; KM EDW and Data Marts Target; KM EDW Views and OLAP Cubes
Tools: Excel, Departmental Tableau, BO, Crystal, SSRS, SQL
Portals: Lawson Portal (GL and Payroll Reporting); BO InfoView Prod Portal (Clinical and Revenue Cycle Reporting); Knowledge Exchange SharePoint Portal (Inpatient Access, SC, HEAT Dashboards & Reports); Tableau Portal (Organizational Dashboards and Reporting)
Other data: Genomic, M2M, EMR, HL7, HIE, Clinical/Regulatory 3rd Party, DATSTAT, TSI, PHIS, CHARS, ClinDoc, Other
Legend: Existing; In Progress, Currently Approved; Long Term Vision

9 EDW Assessment with Brightlight

10 EDW Assessment Approach
Analyzed Children's unique business intelligence needs through on-site interviews with key business and technical team members (26 individuals in total)
Reviewed the BI environment and key documentation
Mapped assessment findings against Brightlight Consulting's extensive business intelligence knowledge base and experience, and against EDW environments at various other companies
Developed reports of preliminary findings and recommendations that were presented to and reviewed with key KM representatives
Prepared a final report of findings and recommendations for utilizing business intelligence at Seattle Children's (SCH)

11 Key Challenges
Rapidly increasing demand for integrated data
Excessive time to provision new storage to meet demands (3 to 6 months)
EDW architecture is inefficient and old
Existing infrastructure is not engineered for high-performance analytics (advanced analytical computations and fast query performance on large, complex data volumes)
No reliable server failover for Prod, Test or Dev (if the Prod server goes down, Test is used as a temporary solution until Prod is up again)
Data movement across 20+ servers (Dev, Test, Prod)
All KM servers provisioned for Amalga are approaching 4 years old

12 Key Challenges, Continued
Knowledge Management cannot keep up with the data demands of strategic initiatives
KM EDW/BI team spends more time tuning an inefficient EDW architecture and less time taking on new data integration projects
KM team spends more time satisfying one-off requests than focusing on larger strategic initiatives
KM Analysts spend most of their time performing manual data integration tasks. Such activities do not create a repeatable process, and manually integrated data cannot be automatically refreshed and reused.

13 What Was Attractive About a DWA
Purpose built: database, server, and storage tightly configured
MPP: optimized for analytical processing
High performance: SQL x faster; very fast loads (750+ GB/hour)
Simplicity: faster deployment; fewer resources to manage

14 Solution Options (the Short List)
Option: Maintain Status Quo
Benefits: No new training of staff
Risks/Concerns: Time lag for server/storage provisioning; long project cycles; limited bandwidth for new projects; not scalable with FTE count; ETL and query performance; ability to integrate data sets
Option: PureData for Analytics Data Warehouse Appliance
Benefits: Accelerate time to market and BI throughput; increase number of strategic projects; decrease FTE cost to maintain infrastructure; eliminate storage bottleneck; add capacity for future growth; decrease number of EDW servers; introduce failover/recovery; introduce flexible analytical sandbox environment
Risks/Concerns: New approach, requires training and consulting; new technology

15 Strategic Questions Require Access to Integrated Data

16 Estimated BI Throughput and Time to Value: Existing vs. PureData for Analytics
Table (Existing, PureData for Analytics, Var., %): EDW Hardened Projects; EDW Sandbox Projects; Total EDW Projects
More projects in PureData for Analytics by 2016 because:
Storage-related bottlenecks of 3 to 6 months are completely eliminated
1,800 fewer FTE days are required to maintain and tune the system, so most of this time can be reinvested in new project development
An environment engineered specifically for EDW enables more efficient and agile development. It will therefore cost $78K, or 37%, less to produce one medium-sized EDW project from the FTE cost perspective
Sandbox environments enable support of ad hoc projects (at least 10 projects per year)
Chart legend: PureData for Analytics; Existing
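
As a rough check on the cost claim in slide 16, the arithmetic can be sketched in Python. The only inputs taken from the slide are the $78K saving and the 37% reduction. The implied baseline and appliance costs are derived figures that the deck does not state, and the variable names are illustrative.

```python
# Figures quoted on the slide.
savings_per_project = 78_000   # dollars saved per medium-sized EDW project
savings_fraction = 0.37        # the same saving expressed as a fraction

# Derived values (assumption: both figures refer to the same baseline FTE cost).
baseline_cost = savings_per_project / savings_fraction
appliance_cost = baseline_cost - savings_per_project

print(f"Implied existing cost per project:  ${baseline_cost:,.0f}")
print(f"Implied appliance cost per project: ${appliance_cost:,.0f}")
```

Under that assumption, a medium-sized project would cost roughly $211K in the existing environment and roughly $133K on the appliance.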

17 Additional Benefits
ROI for advanced BI and analytical capability provided
Additional storage: PureData for Analytics 96 TB vs. existing 30 TB
Support for Sandbox environments and other growth
Phase out development VMware instances
Advanced monitoring capability provided in Brightlight Managed Services

18 SCH Data Volume Growth Analysis for the Next 5 Years
Due to increasing needs for information and analytics at SCH, the warehoused data volume may increase by 440% in the next 5 years
In a traditional data warehouse environment, the EDW would reach 4.6 TB to host data as well as indexes for performance optimization
In an environment engineered specifically for DW, the storage requirement could potentially be 25% lower, and the data volume could reach 3.7 TB

19 Findings / DWA Impact
The EDW has an estimated average yearly data volume growth of 135% over the next 5 years
The growth rate could be higher if the EDW organizational and technical constraints were lifted to enable higher BI project throughput
Additional unplanned data sources and environments, including unstructured data, could emerge due to changed business priorities
Expanded service to more patients in existing and newly built facilities could result in an increased data flow
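
The growth figures on slides 18 and 19 can be related with a small compound-growth sketch in Python. It assumes that "135% yearly growth" means the volume is multiplied by 1.35 each year, a reading the deck does not confirm. The variable names are illustrative.

```python
# Assumption: "135% yearly growth" is read as a x1.35 multiplier per year.
yearly_factor = 1.35
years = 5

# Compound growth over the 5-year horizon.
cumulative_factor = yearly_factor ** years

# Projected sizes quoted on slide 18, in TB.
traditional_tb = 4.6   # traditional warehouse, including indexes
appliance_tb = 3.7     # environment engineered for DW

# Relative storage reduction implied by the two quoted sizes.
storage_reduction = 1 - appliance_tb / traditional_tb

print(f"5-year growth factor: {cumulative_factor:.2f}x")
print(f"Implied storage reduction: {storage_reduction:.1%}")
```

Under this reading, five years of growth compound to about 4.5 times the starting volume, which is close to the 440% figure on slide 18. The two quoted sizes imply a reduction of about 20%, so the "potentially lower by 25%" is perhaps best read as an upper bound.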

20 Assessment Conclusions
DWA can offer high-capacity solutions to accommodate large data volumes for initial historical data loads and future growth, with minimal effort to manage storage
DWA can eliminate the additional storage needed in a traditional DW environment, e.g. for indexes, temp space, aggregate tables, and cubes
DWA simplifies the environment

21 DWA Project with IBM and Brightlight

22 Guiding Principles
Create a data warehouse that the business identifies as stable and dependable
Reduce and simplify data movement from one platform to another
Consolidate data within the enterprise
Maintain or improve the security of the business data
Improve the flexibility and resilience of the data load processes and the data services
Lower the long-term Total Cost of Ownership
Turn business data into business information faster
Create data warehouse services that provide for flexible information consumption
Integrate with the existing self-service environment

23 DWA Project Scope
Create a detailed plan for a Phase One implementation of a new BI/DW solution centered on a Data Warehouse Appliance (DWA), including plans to execute an initial POC to meet acceptance criteria clause
Set up, configure, and establish a new Linux/UNIX-based Data Acquisition layer (Dev, Test and Prod)
Install, set up, and configure the Brightlight Data Integration Framework (nzdif) for Development, Test, Pre-Production and Production
Integrate SCH security and access requirements for the Landing Zone and DWA

24 DWA Project Scope, Continued
Land source data sets that were part of the existing EDW solution into the new Data Acquisition layer
Migrate the existing EDW ETL processes into the Brightlight Data Integration Framework
Clone the reporting environments off the existing EDW into new environments that point at the DWA
Execute the DWA criteria acceptance plan
Grow team skill sets through knowledge transfer, best practices, and DWA subject matter expertise from Brightlight Consulting

25 DWA Project Schedule

26 Summary of POC Results

27 Gaps/Challenges
Data Model
Improvements to the Source Extraction Layer
Data Governance
Organizational Engagement/Governance
Blob/Row Limits in PureData for Analytics (64k)

28 Anticipated Project Impacts
Establish a BI architecture that will serve the organization for at least 5 years
Decrease time to delivery for the KM team
Add Sandbox functionality to allow more participation in building and testing BI solutions within the organization
Provide integrated data that the organization has never had before, to answer more complex questions more efficiently
Begin to address unstructured data; a large amount of clinical data is unstructured

29 Current EDW Architecture
(Architecture diagram. Layers: Source, Replication DB, Acquisition, Integration, Distribution, BI Portals and Tools.)
Systems shown: Cerner, CIS_PRDLOGIC, Epic, Clarity, ERP, Lawson, MSOW, Active Dir, SoftMed, Center Point, UMRA, SoftMed (Backup), Center Point (Backup), EPSi, CUMG
EDW components: KM EDW Stage; KM EDW and Data Marts Target; KM EDW Views and OLAP Cubes
Tools: Excel, Departmental Tableau, BO, Crystal, SSRS, SQL
Portals: Lawson Portal (GL and Payroll Reporting); BO InfoView Prod Portal (Clinical and Revenue Cycle Reporting); Knowledge Exchange SharePoint Portal (Inpatient Access, SC, HEAT Dashboards & Reports); Tableau Portal (Organizational Dashboards and Reporting)
Other data: Genomic, M2M, EMR, HL7, HIE, Clinical/Regulatory 3rd Party, DATSTAT, TSI, PHIS, CHARS, Other
Legend: Existing; In Progress, Currently Approved; Long Term Vision

30 Conclusion

31 PureData for Analytics: Setting SCH up for Success with Big Data
As internal and external demand for integrated, high-quality data grows exponentially, enabling self-service BI and advanced analytics have become the key EDW goals at SCH
SCH requires a robust platform that enables insight into performance across multiple business processes and research
The EDW has major opportunities to provide deep insight into the SCH business centered on patient care, enhance the quality of research, and promote metrics-based decision-making

32 Q & A

33 Contact Information
Wendy Soethe
Manager, EDW & BI
Knowledge Management, Information Services
Seattle Children's Hospital
wendy.soethe@seattlechildrens.org

34 Thanks!