Tools for the design and analysis of gene panel data

Tools for the design and analysis of gene panel data. Hospital Sant Pau, Barcelona, 16 Jun 2016. BIER. Francisco García (fgarcia@cipf.es), Computational Genomics Department, CIPF

Introduction: Príncipe Felipe Research Center (CIPF)
Goal: biomedical research.
Basic research in genes, targets, molecular and cellular processes, nanomedicine and computational medicine.
Translation into clinical practice: personalized medicine, cancer, rare diseases, metabolic and functional impairment.
http://www.cipf.es/

Who are we? The Computational Genomics Department at the Príncipe Felipe Research Center. Team: a multidisciplinary group of 14 researchers and technicians led by Joaquín Dopazo. http://bioinfo.cipf.es/ Introduction: Computational Genomics Department

Why are we interested in Computational Genomics?
The overall goal of the department: apply computational methods to biomedical and biotechnological problems.
Research interests:
- The development and application of novel bioinformatics methods aimed at discovering new drugs
- Identification of genes or proteins that may be considered therapeutic targets
- Personalized medicine: tools for discovery and diagnosis

Why are we interested in Computational Genomics? Personalized medicine.

Why are we interested in Computational Genomics? Personalized medicine and Mendelian diseases.

Big Data. Omics sciences: genomics, transcriptomics, epigenomics, metabolomics, lipidomics, proteomics.

Big Data. Clinical and biological databases. Sources span biological knowledge, clinical knowledge and regulatory elements: Gene Ontology, KEGG pathways, miRNA, CisRed, Biocarta pathways, InterPro motifs, HUMSAVAR, gene expression in tissues, transcription factor binding sites, bioentities from literature, ClinVar, HGMD, COSMIC.

How do we work?
Our department collaborates in different research projects and converts researchers' needs into bioinformatics solutions.
We release free software for several reasons:
- Any customer can try our tools
- The scientific community can test our software
- This is the current trend in Computational Genomics

How do we work? BIER
CIBER's thematic area on rare diseases (CIBERER) is the reference centre in Spain for research on rare diseases: http://www.ciberer.es/
Aim: to coordinate and foster basic, clinical and epidemiological research, to help the research carried out in laboratories reach the patient, and to give scientific answers to the questions that arise from the interaction between doctors and patients.
CIBERER comprises a team of more than 700 professionals and brings together 62 research groups.

How do we work? Genomic Data Analysis courses
CIBERER course on genomic data analysis, 28-30 Sep 2016, Valencia: http://bioinfo.cipf.es/mda15ciberer
International course of Genomic Data Analysis, Mar 2017, Valencia: http://bioinfo.cipf.es/gda16/program/
http://bioinfo.cipf.es/courses

Web tools to analyze gene panel data. BIER, CIPF Computational Genomics Department

Outline
1) Introduction to NGS Data Analysis
2) TEAM
3) PanelMaps
Current section: Introduction to NGS data analysis

NGS technologies. How do these technologies work? Reference genome.

Genomics Data Analysis Pipeline (primary and secondary analysis)
1. Sequence preprocessing: Fastq
2. Mapping: BAM
3. Variant calling: VCF
4. Variant annotation: VCF
5. Variant prioritization
Studies of genomic variation

Fastq format
We could say it is a "fasta with qualities". Each record has four lines:
1. Header (like the fasta header but starting with @)
2. Sequence (string of nt)
3. + and sequence ID (optional)
4. Encoded quality of the sequence
Example:
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
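The four-line record layout can be read with a few lines of Python. This is a minimal sketch, not part of the original slides: the names `FastqRecord` and `parse_fastq` are hypothetical, and the quality string is assumed to use the Phred+33 encoding (older Illumina files use a different offset).

```python
from typing import Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    header: str            # header line without the leading "@"
    sequence: str          # string of nucleotides
    qualities: List[int]   # one decoded Phred score per base


def parse_fastq(lines: List[str], offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from four-line FASTQ text (assumes Phred+offset qualities)."""
    cleaned = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(cleaned) - 3, 4):
        header, sequence, plus, quality = cleaned[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield FastqRecord(header[1:], sequence, [ord(c) - offset for c in quality])


# The record shown on the slide.
example = [
    "@SEQ_ID",
    "GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT",
    "+",
    "!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65",
]
record = next(parse_fastq(example))
print(record.header, len(record.sequence), record.qualities[:4])
```

Under the Phred+33 assumption the first character `!` decodes to quality 0, which is why the start of a read often looks worse than the rest in this encoding.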

BAM/SAM format
Header:
@PG ID:HPG-Aligner VN:1.0
@SQ SN:20 LN:63025520
Alignments:
HWI-ST700660_138:2:2105:7292:79900#2@0/1 16 20 76703 254 76= * 0 0 GTTTAGATACTGAAAGGTACATACTTCTTTGTAGGAACAAGCTATCATGCTGCATTTCTATAATATCACATGAATA GIJGJLGGFLILGGIEIFEKEDELIGLJIHJFIKKFELFIKLFFGLGHKKGJLFIIGKFFEFFEFGKCKFHHCCCF AS:i:254 NH:i:1 NM:i:0
HWI-ST700660_138:2:2208:6911:12246#2@0/1 16 20 76703 254 76= * 0 0 GTTTAGATACTGAAAGGTACATACTTCTTTGTAGGAACAAGCTATCATGCTGCATTTCTATAATATCACATGAATA HHJFHLGFFLILEGIKIEEMGEDLIGLHIHJFIKKFELFIKLEFGKGHEKHJLFHIGKFFDFFEFGKDKFHHCCCF AS:i:254 NH:i:1 NM:i:0
HWI-ST700660_138:2:1201:2973:62218#2@0/1 0 20 76655 254 76M * 0 0 AACCCCAAAAATGTTGGAAGAATAATGTAGGACATTGCAGAAGACGATGTTTAGATACTGAAAGGGACATACTTCT FEFFGHHHGGHFKCCJKFHIGIFFIFLDEJKGJGGFKIHLFIJGIEGFLDEDFLFGEIIMHHIKL$BBGFFJIEHE AS:i:254 NH:i:1 NM:i:1
HWI-ST700660_138:2:1203:21395:164917#2@0/1 256 20 68253 254 4M1D72M * 0 0 NCACCCATGATAGACCAGTAAAGGTGACCACTTAAATTCCTTGCTGTGCAGTGTTCTGTATTCCTCAGGACACAGA #4@ADEHFJFFEJDHJGKEFIHGHBGFHHFIICEIIFFKKIFHEGJEHHGLELEGKJMFGGGLEIKHLFGKIKHDG AS:i:254 NH:i:3 NM:i:1
HWI-ST700660_138:2:1105:16101:50526#6@0/1 16 20 126103 246 53M4D23M * 0 0 AAGAAGTGCAAACCTGAAGAGATGCATGTAAAGAATGGTTGGGCAATGTGCGGCAAAGGGACTGCTGTGTTCCAGC FEHIGGHIGIGJI6FCFHJIFFLJJCJGJHGFKKKKGIJKHFFKIFFFKHFLKHGKJLJGKILLEFFLIHJIEIIB AS:i:368 NH:i:1 NM:i:4
SAM Specification: http://samtools.sourceforge.net/sam1.pdf
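To make the columns of an alignment line concrete, the sketch below decodes the flag, the CIGAR string and the optional tags of one SAM record using only the standard library. The function names are hypothetical, and the example line is the last alignment from the slide with the sequence and quality columns replaced by `*` (which SAM allows) to keep it short.

```python
import re
from typing import Dict, List, Tuple, Union

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = frozenset("MDN=X")  # operations that advance along the reference


def parse_cigar(cigar: str) -> List[Tuple[int, str]]:
    """Split a CIGAR string such as '53M4D23M' into (length, operation) pairs."""
    return [(int(length), op) for length, op in CIGAR_RE.findall(cigar)]


def parse_sam_line(line: str) -> Dict[str, object]:
    """Parse the mandatory SAM columns plus the optional TAG:TYPE:VALUE fields."""
    # SAM is tab-separated; split() also accepts the spaces used on the slide.
    fields = line.split()
    flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
    span = sum(length for length, op in parse_cigar(cigar) if op in REF_CONSUMING)
    tags: Dict[str, Union[int, str]] = {}
    for raw in fields[11:]:
        tag, kind, value = raw.split(":", 2)
        tags[tag] = int(value) if kind == "i" else value
    return {
        "name": fields[0],
        "chrom": fields[2],
        "start": pos,                      # 1-based leftmost mapped position
        "end": pos + span - 1,             # last reference base covered
        "mapq": int(fields[4]),
        "reverse": bool(flag & 0x10),      # read maps to the reverse strand
        "secondary": bool(flag & 0x100),   # secondary alignment
        "tags": tags,
    }


line = ("HWI-ST700660_138:2:1105:16101:50526#6@0/1 16 20 126103 246 "
        "53M4D23M * 0 0 * * AS:i:368 NH:i:1 NM:i:4")
aln = parse_sam_line(line)
print(aln["chrom"], aln["start"], aln["end"], aln["reverse"], aln["tags"])
```

The 4-base deletion in `53M4D23M` consumes reference but not read bases, so a 76-base read spans 80 reference positions here.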

VCF format. http://www.1000genomes.org/
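The slide only names the format, so here is a small sketch of what reading a VCF data line involves: eight fixed tab-separated columns, with INFO holding `key=value` pairs and bare flags. The names `Variant` and `read_vcf` are hypothetical, and the data line is the illustrative record used in the VCF specification, not data from this talk.

```python
from typing import Dict, Iterable, List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int              # 1-based position
    id: str
    ref: str
    alts: List[str]
    qual: str
    filter: str
    info: Dict[str, str]  # flags without a value map to ""


def parse_vcf_line(line: str) -> Variant:
    """Parse the eight fixed columns of a VCF data line."""
    chrom, pos, vid, ref, alt, qual, flt, info = line.rstrip("\n").split("\t")[:8]
    info_dict: Dict[str, str] = {}
    for item in info.split(";"):
        if item and item != ".":
            key, _, value = item.partition("=")
            info_dict[key] = value
    return Variant(chrom, int(pos), vid, ref, alt.split(","), qual, flt, info_dict)


def read_vcf(lines: Iterable[str]) -> List[Variant]:
    """Skip meta-information (##) and header (#CHROM) lines, parse the rest."""
    return [parse_vcf_line(l) for l in lines if l.strip() and not l.startswith("#")]


vcf_text = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "\t".join(["20", "14370", "rs6054257", "G", "A", "29", "PASS",
               "NS=3;DP=14;AF=0.5;DB;H2"]),
]
variants = read_vcf(vcf_text)
print(variants[0].chrom, variants[0].pos, variants[0].alts, variants[0].info["DP"])
```

For real files a dedicated library is usually a better choice; the point here is only to show which columns the later annotation and prioritization steps work with.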

BED format. https://genome.ucsc.edu/faq/faqformat.html#format1
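A point that often causes off-by-one errors when combining BED with VCF or SAM is that BED intervals are 0-based and half-open, while VCF and SAM positions are 1-based. The sketch below parses BED lines and tests whether a 1-based position falls inside a target; `BedRegion`, `parse_bed` and the coordinates are hypothetical and only serve as illustration.

```python
from typing import Iterable, List, NamedTuple


class BedRegion(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str

    @property
    def length(self) -> int:
        return self.end - self.start

    def contains(self, pos_1based: int) -> bool:
        """True if a 1-based position (as used in VCF/SAM) lies in the region."""
        return self.start < pos_1based <= self.end


def parse_bed(lines: Iterable[str]) -> List[BedRegion]:
    """Parse the first three or four BED columns, skipping track/browser lines."""
    regions = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        name = fields[3] if len(fields) > 3 else "."
        regions.append(BedRegion(fields[0], int(fields[1]), int(fields[2]), name))
    return regions


# Hypothetical target regions for a gene panel.
bed_text = [
    "track name=panel_targets",
    "20\t76654\t76730\ttarget_1",
    "20\t126102\t126182\ttarget_2",
]
targets = parse_bed(bed_text)
print(targets[0].length, targets[0].contains(76655), targets[0].contains(76654))
```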

Genomics Data Analysis Pipeline (primary and secondary analysis)
1. Sequence preprocessing: Fastq
2. Mapping: BAM
3. Variant calling: VCF
4. Variant annotation: VCF
5. Variant prioritization
Studies of genomic variation
Tools covering these steps: PanelMaps, BiERapp, TEAM, CSVS

Outline
1) Introduction to NGS Data Analysis
2) TEAM
3) PanelMaps
Current section: TEAM (Targeted Enrichment Analysis and Management)

Can I interpret sequencing data for diagnosis? TEAM: Targeted Enrichment Analysis and Management. http://team.babelomics.org/beta/ BIER

Introduction. TEAM combines sequencing data with biological knowledge (ClinVar, HGMD, HUMSAVAR, COSMIC) to support diagnosis.

How does TEAM work?
Inputs:
1. VCF files
2. Gene panel
TEAM checks the variants against ClinVar, HUMSAVAR, HGMD and COSMIC.
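The idea of crossing variant calls with a gene panel and a knowledge base can be sketched in a few lines. This is not TEAM's implementation: the names `Call`, `genes_at` and `diagnose`, the gene names, the coordinates and the "known variant" entry are all hypothetical, and the dictionary stands in for a lookup against resources such as ClinVar or HGMD.

```python
from typing import Dict, List, NamedTuple, Tuple


class Call(NamedTuple):
    chrom: str
    pos: int   # 1-based, as in VCF
    ref: str
    alt: str


# gene -> (chrom, start, end), 1-based inclusive coordinates (hypothetical panel)
GenePanel = Dict[str, Tuple[str, int, int]]


def genes_at(call: Call, panel: GenePanel) -> List[str]:
    """Return the panel genes whose interval contains the variant position."""
    return [gene for gene, (chrom, start, end) in panel.items()
            if chrom == call.chrom and start <= call.pos <= end]


def diagnose(calls: List[Call], panel: GenePanel,
             known: Dict[Call, str]) -> List[Tuple[str, Call, str]]:
    """Keep only variants inside the panel and attach any known annotation."""
    report = []
    for call in calls:
        for gene in genes_at(call, panel):
            report.append((gene, call, known.get(call, "not in knowledge base")))
    return report


panel = {"GENE_A": ("1", 1000, 2000), "GENE_B": ("2", 5000, 6000)}
calls = [Call("1", 1500, "G", "A"), Call("2", 7000, "C", "T"), Call("2", 5500, "T", "C")]
known = {Call("1", 1500, "G", "A"): "pathogenic (hypothetical entry)"}

report = diagnose(calls, panel, known)
for gene, call, label in report:
    print(gene, call.chrom, call.pos, f"{call.ref}>{call.alt}", label)
```

Variants outside every panel gene are dropped, and variants inside the panel but absent from the knowledge base are kept with a neutral label so they can still be prioritized by predicted effect.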

Getting information
SIFT: predicts whether an amino acid substitution affects protein function. Interpretation: scores range from 1 (tolerated) to 0 (not tolerated). http://sift.jcvi.org/
PolyPhen (Polymorphism Phenotyping): predicts the possible impact of an amino acid substitution on the structure and function of a human protein. Interpretation: scores range from 1 (probably damaging) to 0 (benign). http://genetics.bwh.harvard.edu/pph2/index.shtml
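Because the two scores run in opposite directions (low SIFT is bad, high PolyPhen is bad), it helps to turn them into labels before filtering. The sketch below uses commonly quoted cut-offs, which are an assumption here and not taken from the slides: SIFT at 0.05, and PolyPhen at 0.446 and 0.908 as in the Ensembl annotation. The function names are hypothetical.

```python
def classify_sift(score: float) -> str:
    """Label a SIFT score; low scores mean the substitution is not tolerated."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores lie between 0 and 1")
    return "deleterious" if score <= 0.05 else "tolerated"  # assumed cut-off


def classify_polyphen(score: float) -> str:
    """Label a PolyPhen score; high scores mean the substitution is damaging."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("PolyPhen scores lie between 0 and 1")
    if score > 0.908:    # assumed cut-off
        return "probably damaging"
    if score > 0.446:    # assumed cut-off
        return "possibly damaging"
    return "benign"


def both_predict_damage(sift: float, polyphen: float) -> bool:
    """Conservative filter: flag a variant only when both tools agree."""
    return classify_sift(sift) == "deleterious" and classify_polyphen(polyphen) != "benign"


print(classify_sift(0.01), "/", classify_polyphen(0.99), "/", both_predict_damage(0.01, 0.99))
print(classify_sift(0.80), "/", classify_polyphen(0.10), "/", both_predict_damage(0.80, 0.10))
```

Requiring agreement reduces false positives at the cost of sensitivity; whether that trade-off is appropriate depends on the diagnostic setting.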

Getting information: consequence type or effect. http://www.ensembl.org/info/genome/variation/predicted_data.html

How does TEAM work? http://team.babelomics.org/beta/
1. Defining panel
2. Uploading input data
3. Getting results

How to define a panel?
1. Name of panel
2. Diseases
3. Adding: more genes, mutations
4. Save panel

How to define a panel? Adding new mutations. Checking mutations from Genome Viewer.

Web results: 1. OVERVIEW 2. DIAGNOSTIC

Web results: 3. SECONDARY FINDINGS

Reporting results: 4. REPORT PDF

Remarks
- TEAM is a free web tool
- Easy to use and powerful
- TEAM helps you with diagnosis

More information. TEAM tutorial: http://ciberer.es/bier/team BIER

Outline
1) Introduction to NGS Data Analysis
2) TEAM
3) PanelMaps
Current section: PanelMaps

Can I visualize and detect deletions in gene panel data? PanelMaps: a web tool to analyze gene panel data. BIER

How does PanelMaps work?
Inputs:
1. BAM files
2. BED file
PanelMaps: detecting altered regions for targeted sequencing

How does PanelMaps work? Coverage description.
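A coverage description per target region can be approximated from read intervals (as would be extracted from a BAM file) and target intervals (from a BED file). The sketch below is a simple heuristic for illustration and not the method PanelMaps uses: it computes the mean depth of each region and flags regions whose depth falls below a chosen fraction of the median across regions, which is one rough signal of a possible deletion. All names, coordinates and the 0.5 ratio are hypothetical.

```python
from statistics import median
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # chrom, start (0-based), end (exclusive)


def mean_depth(region: Interval, reads: List[Interval]) -> float:
    """Average number of reads covering each base of the region."""
    chrom, start, end = region
    covered_bases = 0
    for r_chrom, r_start, r_end in reads:
        if r_chrom == chrom:
            covered_bases += max(0, min(end, r_end) - max(start, r_start))
    return covered_bases / (end - start)


def flag_low_coverage(regions: Dict[str, Interval], reads: List[Interval],
                      ratio: float = 0.5) -> Dict[str, float]:
    """Return regions whose mean depth is below ratio * median depth."""
    depths = {name: mean_depth(region, reads) for name, region in regions.items()}
    baseline = median(depths.values())
    return {name: depth for name, depth in depths.items() if depth < ratio * baseline}


# Hypothetical targets and reads: the middle exon has far fewer reads.
regions = {
    "exon1": ("20", 100, 200),
    "exon2": ("20", 300, 400),
    "exon3": ("20", 500, 600),
}
reads = ([("20", 100, 200)] * 10) + ([("20", 300, 400)] * 2) + ([("20", 500, 600)] * 10)

flagged = flag_low_coverage(regions, reads)
print(flagged)
```

In practice depth also varies with capture efficiency and GC content, so comparing each region against the same region in other samples is usually more reliable than comparing regions within one sample.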

How does PanelMaps work? Detection of altered regions. http://panelmaps.juanes.xyz/dashboard/pmapsdemo

How does PanelMaps work? http://panelmaps.juanes.xyz/dashboard/pmapsdemo PanelMaps Detecting altered regions for targeted sequencing

Any comments or questions? Web tools for gene panel data