The IBM Reference Architecture for Healthcare and Life Sciences

Janis Landry-Lane, World Wide Program Director, IBM Systems Group
janisll@us.ibm.com
Doing It Right Symposium, March 23-24, 2017
Big Data Symposium, March 24, 2017

The Era of Genomics Represents BIG DATA

A New Era of Precision Healthcare
Completion of the Human Genome Project in 2003 led to an expansion of research on the contributions of genomics in disease diagnosis, treatment, and prevention.
Green, ED et al. (2011). Charting a course for genomic medicine from base pairs to bedside. Nature 470: 204-213.
Key Client Interests:
- Early Discovery (University Research / Pharmaceutical R&D): What biological or environmental factors are causing disease? Can we design diagnostics and drugs to improve patient outcomes?
- Clinical Genomics (Hospital Systems): What does my patient's genomic information tell me about the treatment I should select?

Key Technical IT Challenges
- Big Data
- Evolving Frameworks & Databases
- Data Silos
- International Collaboration
- Complex Workload

Reference Architecture for Healthcare & Life Science Analytics
Layers:
- Industry Applications
- Applications & Frameworks
- Data Repositories and Databases
- Software-Defined Infrastructure
  - Workload Orchestration: optimize utilization of compute resources across the enterprise
  - Enterprise Data Management: improve data access and optimize storage utilization across the enterprise
- Compute & Storage Servers: Flash, Disk, Tape, x86, POWER, VM
Also shown: IT Administration; On / Off premises; Hybrid Cloud

Reference Architecture for Healthcare & Life Science Analytics (detailed view)
- Sample Workloads: Clinical Informatics, Genomic Analysis, Image Analysis, Cognitive Analytics
- Big Data Repository: EMR, Workflow Admin, LIMS, EDW / UDMH, CPOE, NGS, Ref Databases, Imaging / PACS, RIS, Literature, Ontologies, Omics DW, VNA, Knowledge Base
- Data access: HDFS, POSIX, Object
- Workload Orchestration (IBM Spectrum Computing): Resource Allocation, Workload Monitoring, Cluster Provisioning
- Enterprise Data Management (IBM Spectrum Storage): Metadata Collection, Information Lifecycle Management, High-Performance I/O, Data Sharing
- Compute & Storage Servers: Flash, Disk, Tape, x86, POWER, VM
Also shown: IT Administration; On / Off premises; Hybrid Cloud

A Hybrid Cloud Architecture
IBM designs a hybrid cloud architecture that supports seamless communication of workflows across on-premise and cloud environments.
- On-premise infrastructure: On-Premise Cluster, Spectrum LSF, Spectrum Scale (GPFS), IBM Elastic Storage
- Cloud infrastructure: Cloud Resident Cluster, Spectrum Scale (GPFS), IBM Elastic Storage
- Connecting the two: Secure VPN tunnel, IBM Aspera FASP, Spectrum Scale AFM

IBM Products Offer Flexibility for Customers
Single platform for workload management for automated resource sharing:
- Platform Process Manager: Workflows/Pipelines
- Platform Symphony: Hadoop MapReduce applications (Platform Symphony Scheduler)
- Spectrum LSF: MPI/Batch applications (Spectrum LSF Scheduler)
- Spectrum Conductor for Spark: Spark applications
- IBM Platform Computing: Resource Orchestration and Monitoring
IBM Spectrum Scale, File System / Data Store:
- Connectors: POSIX, NFS, HDFS, Object
- Media: Flash, Disk, Storage-rich servers, Tape
- Single file system for POSIX, NFS, and HDFS access for efficient data sharing

Case Study #1: Major Genomics Provider in NY
Mission: Deliver analysis to support personalized treatment for individual cancer patients where the Standard of Care has failed.
Requirements: A data architecture that provides Management, Resiliency, Scalability, Economics, and Long-Term Retention.
IBM provided:
- Spectrum Scale for performance, data management, and resiliency
- Spectrum Archive to support movement of 1.2 PB per month to tape
- A robust scheduler that supports multi-threaded processing streams to accelerate compute and can efficiently process unpredictable data streams
- The lowest overall cost of managing tens of PB of online data and an archive with site diversity
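To put the 1.2 PB per month tape figure from this case study in perspective, the short calculation below converts it to an average sustained transfer rate. It is only a back-of-the-envelope sketch: the decimal petabyte (10^15 bytes), the 30-day month, and the assumption of continuous, evenly spread movement are illustrative choices, not figures from the presentation, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def sustained_rate_mb_s(petabytes_per_month: float, days_per_month: int = 30) -> float:
    """Average rate in MB/s needed to move the given volume in one month.

    Assumes decimal units (1 PB = 1e15 bytes, 1 MB = 1e6 bytes) and
    round-the-clock transfer; both are illustrative assumptions.
    """
    total_bytes = petabytes_per_month * 1e15
    seconds = days_per_month * SECONDS_PER_DAY
    return total_bytes / seconds / 1e6


rate = sustained_rate_mb_s(1.2)
print(f"{rate:.0f} MB/s")  # prints "463 MB/s"
```

Real archive traffic is bursty, so peak rates would need to be higher than this average.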

The Timeline: Major Genomics Center in NY
[Chart: capacity in TB (0 to 8000) and ingest rate in TB/month, plotted at Month 1, Month 4, Month 11, Month 21, and Month 24. Chart annotations: "Infrastructure scales quickly at low $ add", "Infrastructure scales incrementally for planned growth", "DR Extension".]
Phases, in the order shown:
- Initial Assessment: 3 Spectrum Archive, 3 Spectrum Scale, TS4500/V7000, 6 TS1150, including air gap security (one tape copy). Variety of all data sources not known and growth unpredictable.
- Revised Assessment: Add 6 TS1150, add V7000 disk. Volume of data known and growth predictable.
- New Project Demands: Add 2 Spectrum Archive, add 2 Spectrum Scale, add TS4500 expansion frame, add 8 TS1150, add V7000 + SSD for metadata in cluster. 3 PB peak throughput per month. Value of data known.
- Organic Growth: No change.
- Site Diversity: Mission critical (planned). System-critical production operation. The second site is at a related institution, with high-bandwidth connectivity (Internet 2).

Case Study #2: Alberta Children's Hospital Research Institute
Mission: Provide Precision Medicine for a variety of childhood diseases, including the Care for Rare consortium. Make available a robust platform for new discovery and model organisms.
Requirements: A cost-effective, scalable platform that supports breaking down silos.
IBM provided:
- Spectrum Scale for performance, data management, and resiliency with an archive
- Compute enhancements to support projects with the ability to aggregate data into a data model for advanced analytics
- A global namespace that provides the ability to break down silos and share data among many research groups
- A robust scheduler that supports both existing and additional compute paradigms
- An overall architecture with the ability to add capacity incrementally
https://www.ibm.com/news/ca/en/2016/06/15/u599716r24585k82.html

Case Study #3: Mount Sinai School of Medicine
Mission: To achieve the most effective solution for genomic workloads without rearchitecting the industry-standard software, we performed a rigorous analysis of usage statistics, benchmarks, and available technologies to design a system for maximum throughput.
IBM provided:
- Spectrum Scale for performance, data management, and resiliency
- An IBM Flash tier, supported by Spectrum Scale, used as SCRATCH to optimize workflow performance and to support small file sizes
- A robust scheduler that could handle 700,000 jobs in the queue
- Data management to move data from the Flash tier to disk as soon as the workflow completes
- A global namespace to support a large and growing faculty/staff user base
- An overall architecture with the ability to add capacity incrementally
https://hpc.mssm.edu/files/sc15-bode-paper.pdf
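The idea of draining the Flash scratch tier to disk as soon as a workflow completes can be illustrated with a small, generic sketch. This is not how Spectrum Scale implements it (there the movement is handled by the file system's own data management rather than by application code); the function names and the directories standing in for the two tiers are hypothetical, and the "workflow" is just a callable that writes files into scratch.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path
from typing import Callable


def drain_scratch(flash_dir: Path, disk_dir: Path) -> list[Path]:
    """Move every file under flash_dir to disk_dir, keeping relative paths."""
    # Materialize the file list first so moving files does not disturb the walk.
    files = sorted(p for p in flash_dir.rglob("*") if p.is_file())
    moved = []
    for src in files:
        dest = disk_dir / src.relative_to(flash_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved


def run_workflow(flash_dir: Path, disk_dir: Path,
                 workflow: Callable[[Path], None]) -> list[Path]:
    """Run the workflow against flash scratch, then drain its output to disk."""
    workflow(flash_dir)
    return drain_scratch(flash_dir, disk_dir)


def demo_workflow(scratch: Path) -> None:
    """Hypothetical workflow that writes one small result file."""
    (scratch / "sample1").mkdir()
    (scratch / "sample1" / "calls.txt").write_text("variant calls")


with tempfile.TemporaryDirectory() as tmp:
    flash, disk = Path(tmp) / "flash", Path(tmp) / "disk"
    flash.mkdir()
    disk.mkdir()
    moved = run_workflow(flash, disk, demo_workflow)
    print([str(p.relative_to(disk)) for p in moved])
```

Draining immediately after completion keeps the small, fast tier free for the next job's small-file I/O, which is the design point the case study describes.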

Lessons Learned
Customer success is derived from:
- Managing explosive growth.
- Managing application performance, especially for clinical diagnostics and for patients for whom the standard of care has failed.
- A robust data management and metadata search system that allows scientists to retrieve past data. This is a must, because they will ultimately update their research with new algorithms or need the data for reproducible results.
- A global namespace that breaks down silos and provides a single copy of the data.
- A robust scheduler that supports both existing and additional compute paradigms, which is essential for sharing a single IT environment among a myriad of applications.
APPLICATIONS come and go, but ARCHITECTURE ENDURES.
