Applications of Big Data in Evidence-Based Medicine

Similar documents
Disruptive Innovation and the Future of Medicine: Are We Prepared?

Development Of Biomarkers For Precision Medicine In An Era Of Evolving Technology: Specimens, Standards, And Signatures

Preanalytical Variables in Blood Collection: Impact on Precision Medicine

Preanalytical Processing: The Biospecimen Quality Imperative

Transforming Big Data into Unlimited Knowledge

Our website:

SYMPOSIUM March 22-23, 2018

AIT - Austrian Institute of Technology

Emerging Impacts on Artificial Intelligence on Healthcare IT Session 300, February 20, 2017 James Golden, Ph.D., Christopher Ross, MBA

HOW TO USE ARTIFICIAL INTELLIGENCE & ANALYTICS IN AUDIT. Babu Jayendran

AI Today and Tomorrow

Artificial Intelligence and Machine Learning in IoT

Transforming Big Data into Unlimited Knowledge

Complex Adaptive Systems Forum: Transformative CAS Initiatives in Biomedicine

Sample to Insight. Dr. Bhagyashree S. Birla NGS Field Application Scientist

Machine learning applications in genomics: practical issues & challenges. Yuzhen Ye School of Informatics and Computing, Indiana University

Introduction to Bioinformatics

Great new technologies that can Radically Transform your business!

COGNITIVE COGNITIVE COMPUTING _

Specimens and Standards: Banking on Gold for Biomarker Development in Neurofibromatosis

BIOINFORMATICS THE MACHINE LEARNING APPROACH

Big Data, Predictive Analytics, and Security

Personalized Human Genome Sequencing

6th Form Open Day 15th July 2015

Artificial Intelligence in Life Sciences: The Formula for Pharma Success Across the Drug Lifecycle

Artificial Intelligence and Machine Learning in Mobile Networks

Business Analytics and Optimization An IBM Growth Priority

Our view on cdna chip analysis from engineering informatics standpoint

Lesson Overview. Studying the Human Genome. Lesson Overview Studying the Human Genome

IBM s Analytics Transformation

Is Machine Learning the future of the Business Intelligence?

The Learning Pharmacovigilance System

Computational Challenges of Medical Genomics

Artificial Intelligence in Workforce Management Systems

Are Next-Generation HPC Systems Ready for Population-level Genomics Data Analytics?

University of Athens - Medical School. pmedgr. The Greek Research Infrastructure for Personalized Medicine

Artificial intelligence A CORNERSTONE OF MONTRÉAL S ECONOMIC DEVELOPMENT

Disruptive Technologies Data Science & AI-Machine Learning

Christoph Bock ICPerMed First Research Workshop Milano, 26 June 2017

Artificial Intelligence and Its Impact on the Global Travel Industry

IBM Briefing Hva med 2017?

The Top 5 List: Finding and Fixing the Most Important Specimen Compromisers for Biomarker Measurement

Biomedical Informatics in BIG DATA Era

Cory Brouwer, Ph.D. Xiuxia Du, Ph.D. Anthony Fodor, Ph.D.

The EORTC Molecular Screening programme SPECTA

M a x i m i z in g Value from NGS Analytics in t h e E n terprise

The Role of the NIH Clinical Center in Translational and Clinical Research

Following text taken from Suresh Kumar. Bioinformatics Web - Comprehensive educational resource on Bioinformatics. 6th May.2005

LARGE DATA AND BIOMEDICAL COMPUTATIONAL PIPELINES FOR COMPLEX DISEASES

Precision Medicine and the Biospecimen Quality Imperative

Translational Medicine in the Era of Big Data: Hype or Real?

Topic: Genome-Environment Interactions in Inflammatory Skin Disease

Molecular Diagnostics: Multiplexity, Complexity and Ambiguity

Understanding Complexity in Biological Systems: From Simple Description to Synthetic Design

Studying the Human Genome. Lesson Overview. Lesson Overview Studying the Human Genome

Data Mining for Biological Data Analysis

Quantitative biomarker for clinical decision making

Make data work for Healthcare

Digitisation Hype, Trend or Reality?

RESEARCH REPORT. Published 3Q ADITYA KAUL Research Director. CLINT WHEELOCK Managing Director

- OMICS IN PERSONALISED MEDICINE

Meet Watson at Danske Bank's Service Desk -artificial intelligence at work!

Molecular Diagnostics Regulation Shifting from a Biomarker-Based Approach to an Algorithm-Based Approach

Genomic Medicine in France

Data representation for clinical data and metadata

Introduction to Microarray Analysis

IBM s Business, Data and IP Strategies in the era of AI and Cloud

DATA ROBOTICS 1 REPLY

Paul Chang Senior Consultant, Data Scientist, IBM Cloud tw.ibm.com

stories in H2020 healthcare

Genetics and Bioinformatics

Artificial Intelligence for Enterprise Applications

Konica Minolta to Acquire Invicro (US)

Welcome to the NGS webinar series

The Power to Cure: Therapeutic Innovation in Academia

The Changing Role of Data & Data Quality for Community Health Centers

Innovative Medicines Initiative

Bioinformatics : Gene Expression Data Analysis

Chapter 11. Managing Knowledge

The digital Workforce of the future How Mayo Clinic leverages RPA & Bots

CS262 Lecture 12 Notes Single Cell Sequencing Jan. 11, 2016

Nanthealth Patient Informed Consent

Women in Bioinformatics Panelists

Genomes contain all of the information needed for an organism to grow and survive.

Information Technology for Genetic and Genomic Based Personalized Medicine. Submitted. April 23, 2008

BrainChip Holdings Ltd. Corporate Presentation

How Data Analytics and Cognitive Computing are changing the game for Financial Services

USTGlobal. Big Data Analysis Revolutionizing Healthcare Industry

Benno Pütz. MPI of Psychiatry

CS4491/CS 7265 BIG DATA ANALYTICS

New Frontiers in Personalized Medicine

Machine Learning & AI in Transport and Logistics

Accelerating Precision Medicine with High Performance Computing Clusters

Whole Genome Sequencing in Cancer Diagnostics (research) Nederlandse Pathologiedagen 19 & 20 November 2015

Introduction to Bioinformatics

Application of Deep Learning to Drug Discovery

Future of manufacturing View on enabling technologies Restricted Siemens AG All rights reserved

AI by Design. with IBM Watson. Jaume Miralles IBM Watson Data & AI Platform Spain, Portugal, Greece, Israel CTO

Knowledge Discovery and Data Mining

Transcription:

Applications of Big Data in Evidence-Based Medicine Carolyn Compton, MD, PhD Professor Life Sciences, Arizona State University Professor Laboratory Medicine and Pathology, Mayo Clinic Adjunct Professor of Pathology, Johns Hopkins CMO, National Biomarker Development Alliance CMO, Complex Adaptive Systems Institute Scottsdale, AZ Financial Disclosure Unpaid board positions with travel reimbursements AJCC Indivumed GmbH CloudLims Roche/Ventana BBMRI PAIGE.AI, Inc HealthTell Inc Royalties: Up-To-Date Learning Objectives Explain the difference between big data and other data Describe the interplay between big data and artificial intelligence in enabling precision oncology Recognize the pearls in big biomedical data Recognize the perils in biomedical data, big and small Predict how big data will change oncology practice

Big Data COMPUTING: Noun extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations Precision Oncology is a Big Data Domain Exhibit #1: Sequencing a Genome Human genome (normal haploid) = 3.2 billion base pairs variable Whole Genome Analysis (60x coverage) 1,000 GB/sample* *Cancer genome is more complex. ** Each biopsy sample of a cancer may have a different genome. How Big Is Extremely Large? One genome = 1,000 gigabytes = 1 terabyte 235 TB = entire contents of the Library of Congress 1,0000 genomes = 1,000 terabytes (1,000 genomes) = 1 petabyte The 1 PB zettabyte = entire era storage (1,000,000,000,000,000,000,000 capacity (hard drive) of the human bytes) brain has arrived 2 PB = content of all US research libraries 50 PB = entire written works of humankind from the beginning of time Large Hadron collider produces 100 PB per day!

The Zettabyte Era WE ARE HERE The Big Data Dilemma: It s Getting Even Bigger Variety (multi-dimensional genomic, phenotypic, clinical data - complexity will increase) Velocity (sheer rate of data - generation now exceeds Moore s Law) Veracity Is it true or Value not? Is it useful or not? Volume (unprecedented amounts - it s still early) Adapted from Laney: Gartner 2001, 2012 NSF/NIH 2012 BIG DATA Cancer Is the Best Example of Enterprise-Wide, Large-Scale Data Production and Usage Clinical data Pathology data: Cancer morphological care has become and morphometric a 4-M effort: - Multi-expert Imaging and functional imaging data - Multi-modality Molecular data: genomic, - Multi-institutional transcriptomic, epigenomic and proteomic data - Multi-million Microbiomic data Health care environmental data

Mastery of Big Data Enables Big Opportunities for Cancer Medicine Health Care Itself Is Now a Big-Business, Data-Data Enterprise Clinical data, Laboratory data, Imaging data, Outcomes data, Billing data, Operational data, Quality management data, Personnel data, Compliance data, Revenue data, Costs data, Logistics data, Malpractice data, Insurance data, etc, etc, etc. But It s Complicated: Data Quality and Data Sharing Challenges The standards needed to assure data quality and reproducibility are lacking The incentives to change this have been lacking Require time, effort and resources for standards development and enforcement We have been misled into thinking that data volume can compensate for poor or unknown data quality The incentives to share data have been weak and the incentives to hoard data have MIT Technical Review been strong (e.g., IP and privacy)

The Reality: Biomedical Data Have Far Outstripped the Cognitive Capacity of Human Beings Precision medicine is completely dependent on artificial intelligence systems that integrate and interpret huge amounts of data to aid in decision-making. >2,000,000 proteins* 10 7 SNPs in HapMap 25,000 genes *AMA Science 2014 Stead W. Beyond Expert Based Practice. Presented at: Institute Of Medicine; October 8, 2007. Much of the World s Medical Data is Unstructured IBM s Watson: Processes natural language and uses cognitive computing to make decisions Instantly reads all medical literature and medical records, makes associations to answer questions Machine learning: constantly adapting to shifts in knowledge Jeopardy 16 February 2011 Utility for medical purposes is currently under study at major cancer centers Human Brain vs. Neuromorphic Computing Current Scorecard: Wetware vs. Hardware Human brain: Excels at complex tasks with low energy usage No programming needed 100 billion neurons; 100 trillion synapses Super computer brain (in evolution): IBM s Sequoia (1.5M networked processor chips) simulates network communication in human brain but is highly energy inefficient Would require 12 gigawatts to perform at brain speed 12 gigawatts = combined power consumption of LA and NYC

Human Brain vs. Neuromorphic Computing Current Scorecard: Wetware vs. Hardware IBM s TrueNorth neuromorphic computer, (transistors wired to form 1M digital neurons with 256M synapses ) accomplishes complex tasks like pattern recognition at high speed and low energy: brain on a chip Google s Deep Mind technology: goal create intelligence by combining machine learning and neuroscience to build algorithms for decision-making What Is Artificial Intelligence? Machine Learning? Deep Learning? Artificial Intelligence (AI) is intelligence exhibited by machines A machine that mimics "cognitive" functions that humans associate with other human minds, such as "learning" and "problem solving What Is Artificial Intelligence? Machine Learning? Deep Learning? Machine learning (ML), a fundamental part of AI, is the use of computer algorithms built on statistical techniques to improve the machine s performance automatically through experiences using data Unsupervised: ability to find patterns in a stream of data Supervised: the ability to generalize from a specific set of training data to a new set of data in a reasonable way Deep learning (ML subset): use of neural networks to understand large amounts of data and train itself to perform tasks

What Is Artificial Intelligence? Machine Learning? Deep Learning? Artificial Intelligence Equals or Outperforms Humans in Most Domains The Big Risks in AI: Variation, Vulnerability, Volition Variation The AI community is fragmented and diverse: there is no medical AI community Programmers Engineers Data Scientists Roboticists No universally employed best practices: variation reigns Strong individualism in AI research: no culture of sharing and harmonization

The Big Risks in AI: Variation, Vulnerability, Volition Vulnerability Hacking System failures (network crashes) Volition Can AI autonomously replace human decision??? Will AI ignore us? What about ethics? OpenAI (AI to benefit humanity and extension of Human Wills Is this enough?) Artificial Intelligence Equals or Outperforms Humans in Most Domains The Complexity Challenge of Cancer Dilemma: Extravagant Genomic Alterations Mutations per megabase tumor DNA (3K megabases in human genome) Average 10 per megabase for lung cancer and melanoma Copy number alterations in solid tumors

Landscape of Extreme Genomic Heterogeneity in Lung Cancer Each column is a separate cancer Malignant snowflakes : each cancer carries multiple unique mutations and other genome perturbations (such as epigenomic changes) Disturbing implications for therapeutic cure and development of new Rx Phylogenetic Profiles of Intratumoral Clonal Heterogeneity in 11 Lung Cancers: DIFFERENT CLONES IN SAME CANCER Trunk (Blue): Mutations present in all regions Branch (Green): Mutations present in some regions Private (Red): Mutations present in only one region J. Zhang et al. (2014) Science 346; 256 Malignant Snowflakes

Medicinal Complexity: Integrating the Complex Layers of Biology and Evolution as Disease Progresses Genomics Proteomics Molecular Pathways and Networks Network Regulatory Mechanisms ID of Causal Relationships Between Network Perturbations and Disease Patient-Specific Signals and Signatures of Disease or Predisposition to Disease Conclusion? Imperatives Ahead! - OpenMind May 2017 We can t do this without machines We must embrace this: learn to use machines to extend human capability We must control it We must build safety strong provisions Machines can t do this without data We must assure that the data are of high quality We must share the data: build a culture of data sharing The data are astronomically voluminous and complex This is just the beginning: more data flavors are on the way We MUST do this: patients are counting on us Applications of Big Data in Evidence-Based Medicine Carolyn Compton, MD, PhD Professor Life Sciences, Arizona State University Professor Laboratory Medicine and Pathology, Mayo Clinic Adjunct Professor of Pathology, Johns Hopkins CMO, National Biomarker Development Alliance CMO, Complex Adaptive Systems Institute Scottsdale, AZ