Introduction to Bioinformatics

Similar documents
FFPE in your NGS Study

DNA. bioinformatics. genomics. personalized. variation NGS. trio. custom. assembly gene. tumor-normal. de novo. structural variation indel.

Analytics Behind Genomic Testing

Frumkin, 2e Part 1: Methods and Paradigms. Chapter 6: Genetics and Environmental Health

Genomes contain all of the information needed for an organism to grow and survive.

Centro Nacional de Análisis Genómico. Where are the Bottlenecks of Genome Analysis Today? Teratec. Ecole Polytechnique, Palaiseau, F.

Genomic Technologies. Michael Schatz. Feb 1, 2018 Lecture 2: Applied Comparative Genomics

Biology 644: Bioinformatics

Welcome to the NGS webinar series

Bioinformatics Advice on Experimental Design

Sample to Insight. Dr. Bhagyashree S. Birla NGS Field Application Scientist

Assay Validation Services

resequencing storage SNP ncrna metagenomics private trio de novo exome ncrna RNA DNA bioinformatics RNA-seq comparative genomics

E2ES to Accelerate Next-Generation Genome Analysis in Clinical Research

DNA. bioinformatics. epigenetics methylation structural variation. custom. assembly. gene. tumor-normal. mendelian. BS-seq. prediction.

Chemical treatments cause cells and nuclei to burst The DNA is inherently sticky, and can be pulled out of the mixture This is called spooling DNA

Studying the Human Genome. Lesson Overview. Lesson Overview Studying the Human Genome

De novo assembly in RNA-seq analysis.

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014

Human Genetic Variation. Ricardo Lebrón Dpto. Genética UGR

Overcome limitations with RNA-Seq

High-throughput scale. Desktop simplicity.

NUCLEOTIDE RESOLUTION STRUCTURAL VARIATION DETECTION USING NEXT- GENERATION WHOLE GENOME RESEQUENCING

ILLUMINA SEQUENCING SYSTEMS

Genomic resources. for non-model systems

Pharmacogenetics: A SNPshot of the Future. Ani Khondkaryan Genomics, Bioinformatics, and Medicine Spring 2001

Total genomic solutions for biobanks. Maximizing the value of your specimens.

TECHNOLOGIES, PRODUCTS & SERVICES for MOLECULAR DIAGNOSTICS, MDx ABA 298

DNA. Clinical Trials. Research RNA. Custom. Reports CLIA CAP GCP. Tumor Genomic Profiling Services for Clinical Trials

Sequence Assembly and Alignment. Jim Noonan Department of Genetics

G E N OM I C S S E RV I C ES

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

SEQUENCING. M Ataei, PhD. Feb 2016

Transcriptomics analysis with RNA seq: an overview Frederik Coppens

Genomic solutions for complex disease

14 March, 2016: Introduction to Genomics

Lesson Overview. Studying the Human Genome. Lesson Overview Studying the Human Genome

GREG GIBSON SPENCER V. MUSE

SNP calling and VCF format

RNA-Sequencing analysis

Whole Transcriptome Analysis of Illumina RNA- Seq Data. Ryan Peters Field Application Specialist

The Basics of Understanding Whole Genome Next Generation Sequence Data

GENOMICS for DUMMIES

7. For What Molecule Do Genes Contain The

Satellite Education Workshop (SW4): Epigenomics: Design, Implementation and Analysis for RNA-seq and Methyl-seq Experiments

Understanding the science and technology of whole genome sequencing

Reads to Discovery. Visualize Annotate Discover. Small DNA-Seq ChIP-Seq Methyl-Seq. MeDIP-Seq. RNA-Seq. RNA-Seq.

Admera Health RUO Services

DNA Studies (Genetic Studies) Disclosure Statement. Definition of Genomics. Outline. Immediate Present and Near Future

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Almac Diagnostics. NGS Panels: From Patient Selection to CDx. Dr Katarina Wikstrom Head of US Operations Almac Diagnostics

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Matthew Tinning Australian Genome Research Facility. July 2012

PMDA Perspectives on Companion Diagnostics Development in Japan. Reiko Yanagihara, Ph.D. Deputy Review Director Office of In Vitro Diagnostics, PMDA

Whole Human Genome Sequencing Report This is a technical summary report for PG DNA

- OMICS IN PERSONALISED MEDICINE

GENETICS - CLUTCH CH.15 GENOMES AND GENOMICS.

Capabilities & Services

Using New ThiNGS on Small Things. Shane Byrne

Computational Challenges in Life Sciences Research Infrastructures

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016

Course Overview: Mutation Detection Using Massively Parallel Sequencing

NGS-based innovations within the Leiden Network

Next-generation sequencing technologies

Introduction to human genomics and genome informatics

Human Genomics. Higher Human Biology

Personalized Human Genome Sequencing

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

Introducing combined CGH and SNP arrays for cancer characterisation and a unique next-generation sequencing service. Dr. Ruth Burton Product Manager

Introduction to BIOINFORMATICS

Next Generation Sequencing. Target Enrichment

PHYSICIAN RESOURCES MAPPED TO GENOMICS COMPETENCIES AND GAPS IDENTIFIED WITH CURRENT EDUCATIONAL RESOURCES AVAILABLE 06/04/14

An innovative approach to genetic testing for improved patient care

Bioinformatics. Dick de Ridder. Tuinbouw Digitaal, 12/11/15

Human Genome Sequencing Over the Decades The capacity to sequence all 3.2 billion bases of the human genome (at 30X coverage) has increased

Computational Challenges of Medical Genomics

RNA-SEQUENCING ANALYSIS

Bioinformatics Programming and Analysis CSC Dr. Garrett Dancik

Exome Sequencing Exome sequencing is a technique that is used to examine all of the protein-coding regions of the genome.

Leonardo Mariño-Ramírez, PhD NCBI / NLM / NIH. BIOL 7210 A Computational Genomics 2/18/2015

University of Athens - Medical School. pmedgr. The Greek Research Infrastructure for Personalized Medicine

LARGE DATA AND BIOMEDICAL COMPUTATIONAL PIPELINES FOR COMPLEX DISEASES

Introduction to 'Omics and Bioinformatics

Introduction to NGS. Josef K Vogt Slides by: Simon Rasmussen Next Generation Sequencing Analysis

Rapid Transcriptome Characterization for a nonmodel organism using 454 pyrosequencing

A Crash Course in NGS for GI Pathologists. Sandra O Toole

NGS to address ncrna and viruses

How much sequencing do I need? Emily Crisovan Genomics Core

Basics of RNA-Seq. (With a Focus on Application to Single Cell RNA-Seq) Michael Kelly, PhD Team Lead, NCI Single Cell Analysis Facility

Webinar 1: Understanding Reimbursement Implications For 2013

Genetics and Bioinformatics

BIOINFORMATICS TO ANALYZE AND COMPARE GENOMES

Deep Sequencing technologies

Design a super panel for comprehensive genetic testing

How much sequencing do I need? Emily Crisovan Genomics Core September 26, 2018

Outline General NGS background and terms 11/14/2016 CONFLICT OF INTEREST. HLA region targeted enrichment. NGS library preparation methodologies

Practical Considerations for Implementation of Clinical Sequencing. Emily Winn-Deen, Ph.D. April 2017

Introduction to the UCSC genome browser

Corporate Medical Policy

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

Transcription:

Introduction to Bioinformatics Richard Corbett Canada s Michael Smith Genome Sciences Centre Vancouver, British Columbia June 28, 2017

Our mandate is to advance knowledge about cancer and other diseases and to use our technologies to improve health through disease prevention, diagnosis, and therapeutic approaches. As a Process Development Coordinator I help ensure our laboratory and analytical approaches are providing the best possible results. 2

3

Bioinformatics in Practice GATCCTGGAGGATCCTGGGAG GATCCTGGAGGATCCTGGGAG GATCCTGGAGAATCCTGGGAG GATCCTGGAGAATCCTGGGAG Sequence Other experiments Bioinformatics Sequence data Scientific publications Improved disease understanding and treatment Computational models Environmental discoveries 4

Bioinformatics in Practice GATCCTGGAGGATCCTGGGAG GATCCTGGAGGATCCTGGGAG GATCCTGGAGAATCCTGGGAG GATCCTGGAGAATCCTGGGAG Sequence Other experiments Bioinformatics Sequence data Scientific publications Improved disease understanding and treatment Computational models Environmental discoveries 5

Illumina Sequencing 1.) Cells 4.) Sequencing 2.) DNA 5.) Ready for bioinformatics 3.) Sheared DNA, with sequencing adapters AAAAAAAAAAAAAAAAACCCAAA CCCTTTTTGGGGAAGGGGGGGTT TCCCCCCCCCCCCCCCAAAAAAAT AAAGGGAAAGGGGTTTCCCAAA 6

Sequencers at the Genome Sciences Centre Bases Per Second # Machines Total Bases / Sec. HiSeq X 8,700,000 5 43.5 million HiSeq 2500 3,100,000 4 12.4 million NextSeq 1,300,000 2 2.6 million MiSeq 50,000 3 150 thousand ~55 million bases per second 7

How much sequence is that? Human Genome : 3,000,000,000 bases (approx.) At the Genome Sciences Centre, we can sequence the number of bases in 1 human genome every: 3 billion bases / 55 million bases per sec = 54.5 sec The first human genome draft sequence took roughly 10 years to sequence and assemble 8

How do we extract meaning from the sequence data? 2,000,000,000 reads per sample 150 bases per read 3,000,000,000 base reference genome 9

Data Interpretation For efficiency and to help interpretation, we often describe a sample by how it differs from a reference sample To compare samples, we: align sequence reads to a reference genome find locations where our sample differs from the reference 10

AGCTAGCCCTAGGATAG CTAGCTAGCCCTAGGAT GCTAGCTAGCCCTAGGA GCTAGCTAGCCCTAGGA GCTAGCTAGCCCTAGGA GCTAGCTAGCCCTAGGA Alignment Reference Genome...ACTCGCTAGCTAGCCCTAGGATAGCTTAGAGACCCTCGCGAAATAGACCCTCGAT... AGCCCTAGGATAGCTTA ACCCTCGCGAAATAGAC AGCTAGCCCTAGGATAG ACCCTCGCGAAATAGAC TAGCTAGCCCTAGGATA GACCCTCGCGAAATAGA AGACCCTCGCGAAATAG TAGAGACCCTCGCGAAA CTTAGAGACCCTCGCGA CTTAGAGACCCTCGCGA 11

Real Alignments This SNP, in combination with two others (not shown) give this individual a 90% chance of having blue/green eyes* *Duffy, David L., et al. "A three single-nucleotide polymorphism haplotype in intron 1 of OCA2 explains most human eye-color variation." The American Journal of Human Genetics 80.2 (2007): 241-252. 12

AGCTAGCCCTAGGATAG Assembly AGCCCTAGGATAGCTTA ACCCTCGCGAAATAGAC AGCTAGCCCTAGGATAG ACCCTCGCGAAATAGAC TAGCTAGCCCTAGGATA GACCCTCGCGAAATAGA CTAGCTAGCCCTAGGAT AGACCCTCGCGAAATAG GCTAGCTAGCCCTAGGA TAGAGACCCTCGCGAAA...GCTAGCTAGCCCTAGGATAGCTTAGAGACCCTCGCGAAATAGAC... Alignment Contig ACTCGCTAGCTAGCCCTAGGATAGCTTA ------- GAGACCCTCGCGAAATAGAC Reference Genome...ACTCGCTAGCTAGCCCTAGGATAGCTTAAAAAAAGAGACCCTCGCGAAATAGACCCTCGAT... Contig 13

Summary So Far We sequence billions of reads per genome sample Useful / actionable results are identified via: Read alignment Read assembly We describe samples by how they differ from a reference sample 14

Other Data Types Epigenomics investigates the chromatin and methylation status of regions of the genome Proteomics measures protein expression and modifications Exome/Capture queries specific targeted regions of the genome RNA sequencing measures RNA expression and variation 15

Genome and Transcriptome 16

Genome and Transcriptome Genes encoded in the genome (DNA) Expressed genes (RNA) 17

Genome and Transcriptome Genome sequencing allow us to find: SNVs (single nucleotide variants) CCCTTTTGGGGAA CNVs (copy number variants) SVs (structural variants) Gene1 Gene2 The transcriptome can be sequenced to find: Gene expression estimates Gene fusions Gene1a Gene2b SNVs in expressed genes 18

Personalized Medicine Normal Tumour (melanoma) 40X genome coverage Bioinformatics 80X genome coverage and transcriptome library 19

Somatic SNV calling z Personalized Medicine Intermediate Results RNA expression correlation Somatic copy number Gene fusion analysis 20

Report A detailed report of results for a patient s sample is provided to the clinician allowing them to view the genetic landscape of a patient s disease. This approach has enabled treating clinicians to make informed clinical decisions based on the genomic information integrated with other clinical features. 21

Summary Bioinformatics is (among other things) the process through which the interpretation of billions of sequence observations yields a distilled list of actionable findings This is accomplished in equal parts by: Powerful computing infrastructure Advanced algorithms Trained individuals from a diverse set of fields 22

Upcoming webinars & training events can be found on the WestGrid website: https://www.westgrid.ca/events If attendees would like to learn more about Compute Canada / WestGrid high performance computing resources and services, please visit: https://docs.computecanada.ca 23