Comparative Genomics: Background and Strategy


1 Comparative Genomics: Background and Strategy Parimala Devi Lori Gladley Emily Norris Dhruvi Patel Jennifer Pentz Ying Sha Yuehui Zhao

2 Meningococcal meningitis Disease caused by the bacterium Neisseria meningitidis Prominent in Sub-Saharan Africa - also known as the African meningitis belt Capable of causing epidemics Sub-Saharan African meningitis belt

3 Meningitis Vaccine Project Mission: eliminate meningitis in sub-Saharan Africa Serogroup A vaccination has been highly successful Serogroup W has increased in vaccinated countries Alarming because serogroup W may have the potential to cause epidemics CDC Meningitis Vaccination Project

4 Questions Is there a biological difference between epidemic and endemic strains? Can CRISPRs be used to type Neisseria meningitidis? What are the dynamics of the capsule switch between serogroups C and W?

5 Epidemic vs. Endemic Presence/absence of genes Presence/absence of antibiotic resistance genes Presence/absence of virulence genes SNP variation and distribution

6 Outline for presence/absence of genes

7 Novel online resource systematically cataloguing a comprehensive pan-genome of all microbial clades Organizes currently available draft and finished bacterial genomes into quality-controlled clades Provides a comprehensive non-redundant reference gene catalog Number of Neisseria meningitidis genomes: 15, including FAM18 and MC58 Number of non-redundant Neisseria meningitidis proteins: 2776

8 Generation of gene profile
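
A gene presence/absence profile can be compared between groups with simple set operations. The sketch below is illustrative only: the function name, the strain labels, and the gene identifiers are hypothetical, and it assumes each strain has already been reduced to the set of catalog genes detected in it.

```python
def group_specific_genes(profile, epidemic, endemic):
    """Find genes present in every strain of one group and absent from the other.

    profile: dict mapping strain name -> set of gene identifiers present.
    epidemic, endemic: lists of strain names (hypothetical labels).
    """
    epi_core = set.intersection(*(profile[s] for s in epidemic))
    epi_any = set.union(*(profile[s] for s in epidemic))
    end_core = set.intersection(*(profile[s] for s in endemic))
    end_any = set.union(*(profile[s] for s in endemic))
    return {
        # In all epidemic strains, in no endemic strain.
        "epidemic_only": epi_core - end_any,
        # In all endemic strains, in no epidemic strain.
        "endemic_only": end_core - epi_any,
    }


# Toy profile: two epidemic (E1, E2) and two endemic (N1, N2) strains.
toy_profile = {
    "E1": {"g1", "g2", "g3"},
    "E2": {"g1", "g3"},
    "N1": {"g1", "g4"},
    "N2": {"g1", "g2", "g4"},
}
specific = group_specific_genes(toy_profile, ["E1", "E2"], ["N1", "N2"])
```

Requiring a gene to be present in all strains of one group is a strict criterion; with draft genomes a looser threshold may be more realistic.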

9 Database search for endemic-specific genes 1. CARD: Comprehensive Antibiotic Resistance Database 2. VFDB: Virulence Factors Database

10 Single Nucleotide Polymorphism (SNP) DNA sequence variation in which two genomes differ at a single nucleotide position Non-synonymous: changes the encoded amino acid (AA) Synonymous: silent mutation that doesn't change the AA
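
The synonymous/non-synonymous distinction can be checked by translating the codon before and after the change. This is a minimal sketch, assuming the SNP falls in a known reading frame and using the standard codon assignments; the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(ref_codon, alt_codon):
    """Return 'synonymous' if both codons encode the same amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"
```

For example, GAA to GAG leaves glutamate unchanged, while GAA to GAT replaces it with aspartate.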

11 SNPs SNPs can affect the phenotype of an organism by altering the proteins encoded within the genome Provide an excellent opportunity to compare genes and genomes within and across species Goal: detect SNPs in our samples and compare SNP patterns between endemic and epidemic strains, looking for mutually exclusive SNPs
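
One way to read "mutually exclusive SNPs" is as positions where the alleles seen in epidemic strains never appear in endemic strains. The sketch below implements that reading; the data layout (position to per-strain base calls) and the function name are assumptions for illustration.

```python
def exclusive_snp_positions(calls, epidemic, endemic):
    """Return positions where the two groups share no allele.

    calls: dict mapping position -> dict mapping strain name -> base call.
    epidemic, endemic: lists of strain names.
    """
    hits = []
    for pos in sorted(calls):
        epi_alleles = {calls[pos][s] for s in epidemic}
        end_alleles = {calls[pos][s] for s in endemic}
        if epi_alleles.isdisjoint(end_alleles):
            hits.append(pos)
    return hits


toy_calls = {
    101: {"E1": "A", "E2": "A", "N1": "G", "N2": "G"},  # groups differ
    250: {"E1": "C", "E2": "T", "N1": "C", "N2": "C"},  # shared allele C
    377: {"E1": "T", "E2": "T", "N1": "T", "N2": "T"},  # invariant
}
```

Missing calls are not handled here; a real analysis would need a rule for strains with no coverage at a position.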

12 SNP Detection Methods SAMtools (samtools mpileup) Input format: reference sequence in FASTA format, BAM file Output format: VCF file GATK (Genome Analysis Toolkit) FreeBayes
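
Whichever caller is used, the output is a VCF file whose first five tab-separated columns are CHROM, POS, ID, REF and ALT. As a rough sketch (not a replacement for a proper VCF library), single-nucleotide records can be pulled out like this; the function name and the example records are hypothetical.

```python
def parse_snps(vcf_text):
    """Extract SNP records (single-base REF and ALT alleles) from VCF text.

    Returns a list of (chrom, pos, ref, [alt, ...]) tuples.
    Header lines starting with '#' and indel records are skipped.
    """
    snps = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        alts = alt.split(",")
        is_snp = len(ref) == 1 and all(len(a) == 1 and a in "ACGT" for a in alts)
        if is_snp:
            snps.append((chrom, int(pos), ref, alts))
    return snps


example_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "contig1\t100\t.\tA\tG\t50\tPASS\t.\n"
    "contig1\t200\t.\tAT\tA\t50\tPASS\t.\n"
    "contig1\t300\t.\tC\tT,G\t50\tPASS\t.\n"
)
```

Quality and filter columns are ignored in this sketch; in practice low-quality calls would be filtered before comparing strains.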

13 Questions Is there a biological difference between epidemic and endemic strains? Can CRISPRs be used to type Neisseria meningitidis? What are the dynamics of the capsule switch between serogroups C and W?

14 CRISPR in bacteria CRISPR: clustered regularly interspaced short palindromic repeats Prashant et al., Nature Methods, 2013 CRISPR RNAs (crRNAs) Trans-activating crRNA (tracrRNA) Cas9 Protospacer adjacent motif (PAM)

15 Cas9 structure Hiroshi et al., Cell, 2014

16 CRISPR Unit in Bacteria and Archaea A functional CRISPR-Cas system has two distinguishable components required for activity. The first recognizable feature is the CRISPR locus/array located on the genome (either chromosome or plasmid), which contains the hypervariable spacers acquired from virus or plasmid DNA. The second feature is a diverse group of cas genes located in the vicinity of a CRISPR locus, which encode proteins (generically called Cas proteins) required for the multistep defense against invasive genetic elements. Devaki Bhaya et al., Annual Review of Genetics, 2011
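
Because a CRISPR array alternates a conserved repeat with variable spacers, the spacers can be recovered by splitting the array on the repeat. The sketch below assumes the repeat sequence is already known and occurs as exact copies, which real arrays do not always satisfy; the function name and the sequences are made up for illustration.

```python
def extract_spacers(array_seq, repeat):
    """Return the spacers of a CRISPR array, in order.

    array_seq: DNA string containing repeat-spacer-repeat units.
    repeat: exact repeat sequence (assumed identical in every unit).
    """
    parts = array_seq.split(repeat)
    # parts[0] and parts[-1] are the flanks outside the first and last repeat.
    return [p for p in parts[1:-1] if p]


toy_repeat = "GTTGTAG"
toy_array = (
    "CC" + toy_repeat + "AAACCCAAA" + toy_repeat + "CCCAAACCC" + toy_repeat + "AA"
)
```

Dedicated CRISPR finders tolerate mismatches in the repeat; exact splitting is only meant to show the structure of the array.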

17 Cas cascades Classification Cas1 and Cas2 are essential genes. Signature genes distinguish three Cas cascade types. Type I: Cas3 Type II: Cas9 (bacteria, N. meningitidis serogroup A) Type III: Cas10 (common in archaea) Kira S. Makarova et al., Nature Reviews Microbiology, 2011

18 Major Cas proteins (Cas1-Cas10) Devaki Bhaya et al., Annual Review Genetics, 2011

19 CRISPR repeats classification Based on sequence similarity, CRISPR repeats can be classified into 12 groups. Victor Kunin et al., Genome Biology, 2007

20 Victor Kunin et al., Genome Biology, 2007 Four groups of CRISPR repeats clearly correspond to distinct CRISPR-Cas subtypes: group 2 corresponds to subtype I-E systems, group 3 corresponds to subtype I-C systems, group 4 corresponds to subtype I-F systems, group 10 corresponds to type II systems. These four groups have the most stable operon structure. The other 8 groups are prone to recombination.

21 Comparative Biology Question Phylogenetics of CRISPR-Cas loci in N. meningitidis 1. Numbers, classifications and structures of CRISPR-Cas units. 2. Comparative analysis of Cas loci: phylogenetics of Cas cascades. 3. Comparative analysis of CRISPR loci: study the evolution of CRISPR loci in N. meningitidis based on viral infection history (likely to reflect geographic origin and to better resolve subtypes of the same strain).

22 Strategy

23 Finding family-type for CRISPR-Cas loci in 25 strains Strain 1 Strain 2 Strain 3 Strain 25 Type I Type II Type III

24 Phylogenetic analysis of CRISPR for each strain Compare the tree with metadata, i.e. geographic location, collection time, epidemic/endemic status, serogroup, etc. Test the power of CRISPR as a typing method
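
A simple starting point for CRISPR-based typing is to compare strains by shared spacer content. The sketch below uses the Jaccard distance between spacer sets, which is one possible choice and not necessarily the measure used in this project; the function names and the spacer labels are hypothetical.

```python
from itertools import combinations


def jaccard_distance(spacers_a, spacers_b):
    """1 - |A ∩ B| / |A ∪ B| for two collections of spacer sequences."""
    a, b = set(spacers_a), set(spacers_b)
    if not a and not b:
        return 0.0
    return 1 - len(a & b) / len(a | b)


def spacer_distances(spacers_by_strain):
    """Pairwise Jaccard distances, keyed by (strain1, strain2)."""
    strains = sorted(spacers_by_strain)
    return {
        (s, t): jaccard_distance(spacers_by_strain[s], spacers_by_strain[t])
        for s, t in combinations(strains, 2)
    }


toy_spacers = {
    "strain1": ["sp1", "sp2", "sp3"],
    "strain2": ["sp2", "sp3", "sp4"],
    "strain3": ["sp9"],
}
toy_distances = spacer_distances(toy_spacers)
```

The resulting distances could feed a distance-based tree, which can then be compared against the metadata listed above. Set-based distances ignore spacer order, which also carries historical information.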

25 Questions Is there a biological difference between epidemic and endemic strains? Can CRISPRs be used to type Neisseria meningitidis? What are the dynamics of the capsule switch between serogroups C and W?

26 Capsule switching Was the capsule switch an independent event? What is the directionality of the capsule switch? Where is the location of the break point?

27 Was capsule switch an independent event or multiple events? Strategy: Use phylogenetics Capsule Background

28 Was capsule switch an independent event or multiple events?

29 Was capsule switch an independent event or multiple events? Just background Just capsule

30 Was capsule switch an independent event or multiple events? Independent event - capsule itself and background will have same evolutionary history Just background Just capsule

31 Was capsule switch an independent event or multiple events? Multiple events - capsule itself and background will have different evolutionary histories Just background Just capsule

32 Strategy Create tree of whole genome Create tree of background Create tree of just capsular region

33 Building whole genome trees Step 1: Finding orthologs Assemble a list of orthologous genes between our genomes using a reciprocal BLASTp approach Step 2: Perform Multiple sequence alignment Clustal Omega aligns the orthologs Step 3: Build a tree Supermatrix tree approach: concatenates all alignments and generates tree
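
The supermatrix step amounts to joining the per-gene alignments end to end for each strain. A minimal sketch, assuming each alignment is a dict of equal-length aligned sequences keyed by strain name and that strains missing from a gene are padded with gaps (the padding rule is an assumption, not something stated on the slide):

```python
def build_supermatrix(alignments, taxa):
    """Concatenate per-gene alignments into one sequence per taxon.

    alignments: list of dicts mapping taxon -> aligned sequence.
    taxa: list of all taxon names to include.
    """
    matrix = {t: [] for t in taxa}
    for aln in alignments:
        length = len(next(iter(aln.values())))
        for t in taxa:
            # Pad with gaps when a taxon lacks this ortholog.
            matrix[t].append(aln.get(t, "-" * length))
    return {t: "".join(parts) for t, parts in matrix.items()}


toy_alignments = [
    {"A": "AT-G", "B": "ATCG"},
    {"A": "GG", "B": "GA", "C": "GG"},
]
toy_matrix = build_supermatrix(toy_alignments, ["A", "B", "C"])
```

Every row of the result has the same length, so it can be written out as a single alignment for tree building.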

34 Reciprocal BLAST Common method for predicting putative orthologous genes BLAST between sequence data from two species where each species is used as query and subject during the process BLAST query a against dataset B -> best hit b BLAST query b against dataset A -> best hit a' If a' = a then a and b are reciprocal best hits and the genes are said to be orthologous
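
Once the best hit in each direction has been tabulated, the reciprocal check is a dictionary lookup. The sketch below assumes the BLAST results have already been reduced to one best hit per query; the function name and gene identifiers are hypothetical.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return (a, b) pairs where a's best hit is b and b's best hit is a.

    best_a_to_b: dict mapping each gene in genome A -> its best hit in B.
    best_b_to_a: dict mapping each gene in genome B -> its best hit in A.
    """
    return sorted(
        (a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a
    )


toy_a_to_b = {"a1": "b1", "a2": "b2", "a3": "b2"}
toy_b_to_a = {"b1": "a1", "b2": "a3"}
```

Here a2 hits b2, but b2's best hit is a3, so only a1/b1 and a3/b2 are kept as putative orthologs.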

38 Tree building methods Maximum parsimony: simplest tree wins Neighbor-joining tree: distance matrix method Neighbors: two taxa connected by a single node in an unrooted tree Fast: practical for analyzing large data sets Robust when bootstrapping Maximum likelihood: finds the tree with the highest likelihood Allows the use of statistical techniques Computationally intensive
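
To make the neighbor-joining idea concrete, the sketch below performs only its first step: from a distance matrix it computes the standard criterion Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k) and picks the pair with the smallest value as the first neighbors to join. The function name and the toy distances are illustrative.

```python
from itertools import combinations


def nj_first_pair(dist, taxa):
    """Return the first pair neighbor-joining would merge, plus all Q values.

    dist: dict mapping frozenset({x, y}) -> distance between taxa x and y.
    taxa: list of taxon names.
    """
    n = len(taxa)

    def d(x, y):
        return 0 if x == y else dist[frozenset((x, y))]

    totals = {t: sum(d(t, u) for u in taxa) for t in taxa}
    q = {
        (i, j): (n - 2) * d(i, j) - totals[i] - totals[j]
        for i, j in combinations(taxa, 2)
    }
    return min(q, key=q.get), q


toy_taxa = ["a", "b", "c", "d", "e"]
toy_pairs = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
toy_dist = {frozenset(k): v for k, v in toy_pairs.items()}
first_pair, q_values = nj_first_pair(toy_dist, toy_taxa)
```

Note that d and e are the closest pair by raw distance, yet a and b are joined first: the criterion corrects for how far each taxon is from all the others. A full implementation would then replace the pair with a new node and repeat.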

39 What is the directionality of the capsule switch? Capsule switch occurs due to horizontal gene transfer IS elements, transposons, DNA competence genes We will also be using phylogenetic studies to infer the direction of horizontal gene transfer

40 What we expect to see if C to W HGT event occurred C C W W W W C C C A

41 Inferring Evolution of Strains from Multilocus Sequence Typing (MLST)

42 MLST what?? Exploits unambiguous nucleotide variation of housekeeping genes. Strains that share the same sequence type (ST) are likely to share a recent common ancestor. Clonal complexes are families of STs. Advantageous because comparable methods (MLEE) use electrophoretic mobility patterns, which can be difficult to compare across laboratories.

43 Epidemiology: Where does MLST fit in? Detect short-term or local events: identifying how strains may be related in the event of a local outbreak. PFGE can detect microvariation in local events because it utilizes rapidly evolving variation. Detect long-term or global events: identifying how strains causing disease in one part of the world could be related to strains causing disease more globally. MLST can detect more variation than MLEE, giving enhanced resolution and detection of hypervirulent lineages that emerge. However, slow diversification of these lineages may eventually make them hard to distinguish from the background of isolates.

44 How does MLST work?

45 Useful Programs Genes can be located using blastn and taking the best hit, or pulled directly from the annotations. BLAST databases of each gene prediction file may be created with the NCBI makeblastdb program. Genes can then be extracted from the database files with the NCBI blastdbcmd program and concatenated (cat) into one gene file. Genes can be aligned individually in MEGA and concatenated; MEGA is great for tree-drawing. PubMLST Neisseria may be used to search for alleles and STs.

46 Neisseria MLST Loci 7 housekeeping genes (each~ bp) abcZ: ABC transporter adk: adenylate kinase aroE: shikimate dehydrogenase fumC: fumarate hydratase class II gdh: glutamate dehydrogenase pdhC: dihydrolipoamide acetyltransferase pgm: phosphoglucomutase
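
Assigning a sequence type comes down to looking up the seven-allele profile in a table of known profiles. In the sketch below the allele numbers and ST labels are placeholders invented for illustration, not real PubMLST entries, and the function name is hypothetical.

```python
LOCI = ("abcZ", "adk", "aroE", "fumC", "gdh", "pdhC", "pgm")


def assign_st(alleles, st_table):
    """Look up the sequence type for a strain's allele profile.

    alleles: dict mapping locus name -> allele number.
    st_table: dict mapping 7-tuple of allele numbers -> ST label.
    Returns None when the profile is not in the table (a novel ST).
    """
    profile = tuple(alleles[locus] for locus in LOCI)
    return st_table.get(profile)


# Placeholder table: these profiles are NOT real PubMLST definitions.
toy_st_table = {
    (1, 1, 1, 1, 1, 1, 1): "ST-X",
    (1, 1, 2, 1, 1, 1, 1): "ST-Y",
}
toy_strain = {"abcZ": 1, "adk": 1, "aroE": 2, "fumC": 1,
              "gdh": 1, "pdhC": 1, "pgm": 1}
```

Two strains get the same ST only if all seven alleles match, which is why a shared ST across serogroups is informative.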

47 Neisseria MLST Hypervirulent lineages generally have identical alleles at all loci, whereas the general population has more diverse sets of alleles. Strains of the same sequence type (identical alleles at all loci) may cluster together in a phylogeny, indicating descent from a common ancestor. The fact that hypervirulent serogroup W strains share the same ST as hypervirulent serogroup C (ST11) indicates that a capsule switch likely occurred. The tree(s) will tell the story... stay tuned!

48 Capsule Recombination Breakpoint Want to know where the capsule switch occurred

50 Breakpoint Detection Methods 1. HREfinder software Homologous Recombination Events Based on SNPs and SNP positions Apply dynamic programming algorithm to model if the changes are a result of HREs Wang W-B, Jiang T, Gardner S (2013) Detection of Homologous Recombination Events in Bacterial Genomes. PLoS ONE 8(10): e doi: /journal.pone

51

52

53 HREfinder software Input: SNP alleles and positional information, genome sequences, phylogenetic tree Based on capsule genome Output: Phylogenetic tree with number of possible HREs Locations of HREs

54 Breakpoint Detection Methods 2. Maximum χ² Method BLAST using a sliding window against known parental serogroups Use average bit scores for parental serogroups and sum for upstream and downstream windows Compute χ² and p-values Plot against the sample genome and take local maxima as breakpoints Rishishwar L, et al. Genomic Basis of a Polyagglutinating Isolate of Neisseria meningitidis. J. Bacteriol. 194:
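
The scan can be sketched as follows, under several assumptions: the per-window scores against each parental serogroup are already computed, a 2x2 table is formed from the summed scores in the upstream and downstream flanks of each candidate position, and the p-value uses the chi-square distribution with one degree of freedom. The function names, flank size, and scores are hypothetical, and the published method may differ in detail.

```python
import math


def chi2_2x2(a, b, c, d):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def max_chi2_scan(scores_p1, scores_p2, flank):
    """Scan candidate breakpoints; return (position, chi2, p_value) tuples.

    scores_p1, scores_p2: per-window scores against parent 1 and parent 2.
    flank: number of windows summed on each side of the candidate position.
    """
    results = []
    for k in range(flank, len(scores_p1) - flank + 1):
        up1, up2 = sum(scores_p1[k - flank:k]), sum(scores_p2[k - flank:k])
        down1, down2 = sum(scores_p1[k:k + flank]), sum(scores_p2[k:k + flank])
        chi2 = chi2_2x2(up1, up2, down1, down2)
        # Upper tail of the chi-square distribution with 1 degree of freedom.
        p_value = math.erfc(math.sqrt(chi2 / 2))
        results.append((k, chi2, p_value))
    return results


# Toy scores: parent 1 matches the first half, parent 2 the second half.
toy_p1 = [10, 10, 10, 10, 1, 1, 1, 1]
toy_p2 = [1, 1, 1, 1, 10, 10, 10, 10]
scan = max_chi2_scan(toy_p1, toy_p2, flank=2)
best_position = max(scan, key=lambda r: r[1])[0]
```

The statistic peaks where the upstream and downstream flanks disagree most about which parent they resemble, which is the candidate breakpoint. Bit scores are not counts, so the p-values here are best treated as a ranking aid.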

55

56 Breakpoint Detection Methods 3. RDP3 Recombination Detection Program Statistical identification of recombination events Uses multiple non-parametric recombination detection methods Does not use population genetic models and makes no attempt to estimate the population recombination rate Examples 3SEQ: identifies mosaic structure or recombination in nucleotide sequence data 4SIS: uses chi-square values to optimize breakpoint detection based on phylogenetic clustering

57 Open rectangles are areas of recombination Height proportional to negative log of p-value Each color is distinct set of recombination events Wang, Q., Z. Shao, X. Wang, Y. Gao, M. Li, L. Xu, J. Xu, L. Wang Genetic Study of Capsular Switching between Neisseria meningitidis Sequence Type 7 Serogroup A and C Strains. Infect. Immun. 78: ,

58 Questions Is there a biological difference between epidemic and endemic strains? Can CRISPRs be used to type Neisseria meningitidis? What are the dynamics of the capsule switch between serogroups C and W?

59 Thank you!

Simulation-based statistical inference in computational biology

Simulation-based statistical inference in computational biology Simulation-based statistical inference in computational biology Case study: modeling the genome evolution of Streptococcus pneumoniae Pekka Marttinen, Nicholas J. Croucher, Michael U. Gutmann, Jukka Corander,

More information

CRISPR cas : Presented By: Pooya Rashvand Advised By: Dr. M.Aslanimehr

CRISPR cas : Presented By: Pooya Rashvand Advised By: Dr. M.Aslanimehr Journal Club & MSc Seminar CRISPR cas : Presented By: Pooya Rashvand Advised By: Dr. M.Aslanimehr CRISPR - cas : A New tool for Genetic Manipulations from Bacterial Immunity Systems Viral SS DNA RNA Guide

More information

BIOINFORMATICS TO ANALYZE AND COMPARE GENOMES

BIOINFORMATICS TO ANALYZE AND COMPARE GENOMES BIOINFORMATICS TO ANALYZE AND COMPARE GENOMES We sequenced and assembled a genome, but this is only a long stretch of ATCG What should we do now? 1. find genes Gene calling The simpliest thing is to look

More information

BIOINFORMATICS TO ANALYZE AND COMPARE GENOMES

BIOINFORMATICS TO ANALYZE AND COMPARE GENOMES BIOINFORMATICS TO ANALYZE AND COMPARE GENOMES We sequenced and assembled a genome, but this is only a long stretch of ATCG What should we do now? 1. find genes What are the starting and end points for

More information

Why study sequence similarity?

Why study sequence similarity? Sequence Similarity Why study sequence similarity? Possible indication of common ancestry Similarity of structure implies similar biological function even among apparently distant organisms Example context:

More information

Using CRISPR for genetic alteration

Using CRISPR for genetic alteration Using CRISPR for genetic alteration Joffrey Mianné. j.mianne@har.mrc.ac.uk Mary Lyon Centre, MRC Harwell. CRISPR/Cas origins Origin of the CRISPR/Cas system: Clustered-Regularly Interspaced Short Palindromic

More information

Technology. offer: New. method

Technology. offer: New. method Technology offer: New method to detect spacer acquisition in CRISPR structures Technology offer: New method to detect spacer acquisition in CRISPR structures. SUMMARY CRISPR structures are a component

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Flow chart of the study.

Nature Genetics: doi: /ng Supplementary Figure 1. Flow chart of the study. Supplementary Figure 1 Flow chart of the study. The strategy involving the four different approaches (epidemiological, clinical, genomics and experimental) used in this study is shown. C, clinical; F,

More information

Tutorial for Stop codon reassignment in the wild

Tutorial for Stop codon reassignment in the wild Tutorial for Stop codon reassignment in the wild Learning Objectives This tutorial has two learning objectives: 1. Finding evidence of stop codon reassignment on DNA fragments. 2. Detecting and confirming

More information

Multilocus Sequence Typing - MLST. Characterisation of bacteria. Hospital outbreak of resistant bacteria. Community outbreak of diarrhoea

Multilocus Sequence Typing - MLST. Characterisation of bacteria. Hospital outbreak of resistant bacteria. Community outbreak of diarrhoea Multilocus Sequence Typing - MLST Slides from Andreas Petersen Slide 1 Characterisation of bacteria Hospital outbreak of resistant bacteria Community outbreak of diarrhoea Global spread of respiratory

More information

CS 262 Lecture 14 Notes Human Genome Diversity, Coalescence and Haplotypes

CS 262 Lecture 14 Notes Human Genome Diversity, Coalescence and Haplotypes CS 262 Lecture 14 Notes Human Genome Diversity, Coalescence and Haplotypes Coalescence Scribe: Alex Wells 2/18/16 Whenever you observe two sequences that are similar, there is actually a single individual

More information

Genome-wide longitudinal analysis of emm1 invasive Group A Streptococcus isolated from Belgian patients during

Genome-wide longitudinal analysis of emm1 invasive Group A Streptococcus isolated from Belgian patients during #O226 Genome-wide longitudinal analysis of emm1 invasive Group A Streptococcus isolated from Belgian patients during 1994 2013 J. Coppens 1, B. B. Xavier, J. Sabirova 1, C. Lammens 1, K. Loens 2, S. Malhotra-Kumar

More information

FUNCTIONAL BIOINFORMATICS

FUNCTIONAL BIOINFORMATICS Molecular Biology-2018 1 FUNCTIONAL BIOINFORMATICS PREDICTING THE FUNCTION OF AN UNKNOWN PROTEIN Suppose you have found the amino acid sequence of an unknown protein and wish to find its potential function.

More information

Theory and Application of Multiple Sequence Alignments

Theory and Application of Multiple Sequence Alignments Theory and Application of Multiple Sequence Alignments a.k.a What is a Multiple Sequence Alignment, How to Make One, and What to Do With It Brett Pickett, PhD History Structure of DNA discovered (1953)

More information

03-511/711 Computational Genomics and Molecular Biology, Fall

03-511/711 Computational Genomics and Molecular Biology, Fall 03-511/711 Computational Genomics and Molecular Biology, Fall 2010 1 Study questions These study problems are intended to help you to review for the final exam. This is not an exhaustive list of the topics

More information

Scoring Alignments. Genome 373 Genomic Informatics Elhanan Borenstein

Scoring Alignments. Genome 373 Genomic Informatics Elhanan Borenstein Scoring Alignments Genome 373 Genomic Informatics Elhanan Borenstein A quick review Course logistics Genomes (so many genomes) The computational bottleneck Python: Programs, input and output Number and

More information

Comparative Genomics Background and Strategy

Comparative Genomics Background and Strategy Comparative Genomics Background and Strategy Faction 1 3/29/2017 Comparative Genomics is Comparison of structure and feature composition Comparison of derived metrics Li et al., 2015; Chaudhry & Patil,

More information

The Relative Contributions of Recombination and Mutation to the Divergence of Clones of Neisseria meningitidis

The Relative Contributions of Recombination and Mutation to the Divergence of Clones of Neisseria meningitidis The Relative Contributions of Recombination and Mutation to the Divergence of Clones of Neisseria meningitidis Edward J. Feil,* Martin C. J. Maiden,* Mark Achtman, and Brian G. Spratt* *Wellcome Trust

More information

Mutagenesis for Studying Gene Function Spring, 2007 Guangyi Wang, Ph.D. POST103B

Mutagenesis for Studying Gene Function Spring, 2007 Guangyi Wang, Ph.D. POST103B Mutagenesis for Studying Gene Function Spring, 2007 Guangyi Wang, Ph.D. POST103B guangyi@hawaii.edu http://www.soest.hawaii.edu/marinefungi/ocn403webpage.htm Overview of Last Lecture DNA microarray hybridization

More information

03-511/711 Computational Genomics and Molecular Biology, Fall

03-511/711 Computational Genomics and Molecular Biology, Fall 03-511/711 Computational Genomics and Molecular Biology, Fall 2011 1 Study questions These study problems are intended to help you to review for the final exam. This is not an exhaustive list of the topics

More information

ELE4120 Bioinformatics. Tutorial 5

ELE4120 Bioinformatics. Tutorial 5 ELE4120 Bioinformatics Tutorial 5 1 1. Database Content GenBank RefSeq TPA UniProt 2. Database Searches 2 Databases A common situation for alignment is to search through a database to retrieve the similar

More information

Comparison of the Legionella pneumophila population structure as determined by sequence-based typing and whole genome sequencing

Comparison of the Legionella pneumophila population structure as determined by sequence-based typing and whole genome sequencing Underwood et al. BMC Microbiology 2013, 13:302 RESEARCH ARTICLE Open Access Comparison of the Legionella pneumophila population structure as determined by sequence-based typing and whole genome sequencing

More information

The Basics of Understanding Whole Genome Next Generation Sequence Data

The Basics of Understanding Whole Genome Next Generation Sequence Data The Basics of Understanding Whole Genome Next Generation Sequence Data Heather Carleton-Romer, MPH, Ph.D. ASM-CDC Infectious Disease and Public Health Microbiology Postdoctoral Fellow PulseNet USA Next

More information

MOLECULAR TYPING TECHNIQUES

MOLECULAR TYPING TECHNIQUES MOLECULAR TYPING TECHNIQUES RATIONALE Used for: Identify the origin of a nosocomial infection Identify transmission of disease between individuals Recognise emergence of a hypervirulent strain Recognise

More information

Challenges and opportunities for whole genome sequencing based surveillance of antibiotic resistance

Challenges and opportunities for whole genome sequencing based surveillance of antibiotic resistance Challenges and opportunities for whole genome sequencing based surveillance of antibiotic resistance Prof. Willem van Schaik Professor in Microbiology and Infection Institute of Microbiology and Infection

More information

Evolutionary Genetics: Part 1 Polymorphism in DNA

Evolutionary Genetics: Part 1 Polymorphism in DNA Evolutionary Genetics: Part 1 Polymorphism in DNA S. chilense S. peruvianum Winter Semester 2012-2013 Prof Aurélien Tellier FG Populationsgenetik Color code Color code: Red = Important result or definition

More information

SNP calling and VCF format

SNP calling and VCF format SNP calling and VCF format Laurent Falquet, Oct 12 SNP? What is this? A type of genetic variation, among others: Family of Single Nucleotide Aberrations Single Nucleotide Polymorphisms (SNPs) Single Nucleotide

More information

Genomic epidemiology of bacterial pathogens. Sylvain BRISSE Microbial Evolutionary Genomics, Institut Pasteur, Paris

Genomic epidemiology of bacterial pathogens. Sylvain BRISSE Microbial Evolutionary Genomics, Institut Pasteur, Paris Genomic epidemiology of bacterial pathogens Sylvain BRISSE Microbial Evolutionary Genomics, Institut Pasteur, Paris Typing Population genetics Analysis of strain diversity within species Aim: Local epidemiology?

More information

Introduction to DNA-Sequencing

Introduction to DNA-Sequencing informatics.sydney.edu.au sih.info@sydney.edu.au The Sydney Informatics Hub provides support, training, and advice on research data, analyses and computing. Talk to us about your computing infrastructure,

More information

The use of bioinformatic analysis in support of HGT from plants to microorganisms. Meeting with applicants Parma, 26 November 2015

The use of bioinformatic analysis in support of HGT from plants to microorganisms. Meeting with applicants Parma, 26 November 2015 The use of bioinformatic analysis in support of HGT from plants to microorganisms Meeting with applicants Parma, 26 November 2015 WHY WE NEED TO CONSIDER HGT IN GM PLANT RA Directive 2001/18/EC As general

More information

Genetic Adaptation II. Microbial Physiology Module 3

Genetic Adaptation II. Microbial Physiology Module 3 Genetic Adaptation II Microbial Physiology Module 3 Topics Topic 4: Topic 5: Transposable Elements Exchange of Genetic Material Between Organisms Topic 5a: Protection Against Foreign DNA Aims and Objectives

More information

CRISPR/Cas9 From Yoghurt to Designer Babies. Source: nextbigfuture.com

CRISPR/Cas9 From Yoghurt to Designer Babies. Source: nextbigfuture.com CRISPR/Cas9 From Yoghurt to Designer Babies Source: nextbigfuture.com OBJECTIVES 1. Relate the functioning of bacterial CRISPR/Cas systems to acquired immunity 2. Describe how CRISPR/Cas9 cuts DNA 3. Explain

More information

Outline. Evolution. Adaptive convergence. Common similarity problems. Chapter 7: Similarity searches on sequence databases

Outline. Evolution. Adaptive convergence. Common similarity problems. Chapter 7: Similarity searches on sequence databases Chapter 7: Similarity searches on sequence databases All science is either physics or stamp collection. Ernest Rutherford Outline Why is similarity important BLAST Protein and DNA Interpreting BLAST Individualizing

More information

BME 110 Midterm Examination

BME 110 Midterm Examination BME 110 Midterm Examination May 10, 2011 Name: (please print) Directions: Please circle one answer for each question, unless the question specifies "circle all correct answers". You can use any resource

More information

Introduction and History of Genome Modification. Adam Clore, PhD Director, Synthetic Biology Design

Introduction and History of Genome Modification. Adam Clore, PhD Director, Synthetic Biology Design Introduction and History of Genome Modification Adam Clore, PhD Director, Synthetic Biology Design Early Non-site Directed Genome Modification Homologous recombination in yeast TARGET GENE 5 Arm URA3 3

More information

9/19/13. cdna libraries, EST clusters, gene prediction and functional annotation. Biosciences 741: Genomics Fall, 2013 Week 3

9/19/13. cdna libraries, EST clusters, gene prediction and functional annotation. Biosciences 741: Genomics Fall, 2013 Week 3 cdna libraries, EST clusters, gene prediction and functional annotation Biosciences 741: Genomics Fall, 2013 Week 3 1 2 3 4 5 6 Figure 2.14 Relationship between gene structure, cdna, and EST sequences

More information

Introduction to Bioinformatics Problem Set 7: Genome Analysis

Introduction to Bioinformatics Problem Set 7: Genome Analysis Introduction to Bioinformatics Problem Set 7: Genome Analysis 1. You are examining the sequence of a newly sequenced phage and run across the following features. In each case, you're wondering whether

More information

Crash-course in genomics

Crash-course in genomics Crash-course in genomics Molecular biology : How does the genome code for function? Genetics: How is the genome passed on from parent to child? Genetic variation: How does the genome change when it is

More information

Functional Annotation: Preliminary Results

Functional Annotation: Preliminary Results Functional Annotation: Preliminary Results Vani Rajan Gena Tang Neha Varghese Kevin Lee Gabriel Mitchell Tripp Jones Robert Petit Shaupu Qin Outline Motivation Naming scheme Preliminary Program Results

More information

BIOINFORMATICS FOR DUMMIES MB&C2017 WORKSHOP

BIOINFORMATICS FOR DUMMIES MB&C2017 WORKSHOP Jasper Decuyper BIOINFORMATICS FOR DUMMIES MB&C2017 WORKSHOP MB&C2017 Workshop Bioinformatics for dummies 2 INTRODUCTION Imagine your workspace without the computers Both in research laboratories and in

More information

What I hope you ll learn. Introduction to NCBI & Ensembl tools including BLAST and database searching!

What I hope you ll learn. Introduction to NCBI & Ensembl tools including BLAST and database searching! What I hope you ll learn Introduction to NCBI & Ensembl tools including BLAST and database searching What do we learn from database searching and sequence alignments What tools are available at NCBI What

More information

Resolution of fine scale ribosomal DNA variation in Saccharomyces yeast

Resolution of fine scale ribosomal DNA variation in Saccharomyces yeast Resolution of fine scale ribosomal DNA variation in Saccharomyces yeast Rob Davey NCYC 2009 Introduction SGRP project Ribosomal DNA and variation Computational methods Preliminary Results Conclusions SGRP

More information

Introducing Your Students To Gene Editing With CRISPR

Introducing Your Students To Gene Editing With CRISPR Introducing Your Students To Gene Editing With CRISPR Brian Ell, Ph.D. Edvotek www.edvotek.com Follow @Edvotek EDVOTEK Biotech The Biotechnology Education Company Celebrating 30 years of science education!

More information

Supplementary Materials

Supplementary Materials Supplementary Materials The number of sequence variations among the known phage genomes can provide an estimate of the P. acnes phage population diversity within each skin community. We calculated the

More information

Why learn sequence database searching? Searching Molecular Databases with BLAST

Why learn sequence database searching? Searching Molecular Databases with BLAST Why learn sequence database searching? Searching Molecular Databases with BLAST What have I cloned? Is this really!my gene"? Basic Local Alignment Search Tool How BLAST works Interpreting search results

More information

The University of California, Santa Cruz (UCSC) Genome Browser

The University of California, Santa Cruz (UCSC) Genome Browser The University of California, Santa Cruz (UCSC) Genome Browser There are hundreds of available userselected tracks in categories such as mapping and sequencing, phenotype and disease associations, genes,

More information

Bioinformatics for Biologists. Comparative Protein Analysis

Bioinformatics for Biologists. Comparative Protein Analysis Bioinformatics for Biologists Comparative Protein nalysis: Part I. Phylogenetic Trees and Multiple Sequence lignments Robert Latek, PhD Sr. Bioinformatics Scientist Whitehead Institute for Biomedical Research

More information

Whole Genome Sequence Data Quality Control and Validation

Whole Genome Sequence Data Quality Control and Validation Whole Genome Sequence Data Quality Control and Validation GoSeqIt ApS / Ved Klædebo 9 / 2970 Hørsholm VAT No. DK37842524 / Phone +45 26 97 90 82 / Web: www.goseqit.com / mail: mail@goseqit.com Table of

More information

Using Galaxy for the analysis of NGS-derived pathogen genomes in clinical microbiology

Using Galaxy for the analysis of NGS-derived pathogen genomes in clinical microbiology Using Galaxy for the analysis of NGS-derived pathogen genomes in clinical microbiology Anthony Underwood*, Paul-Michael Agapow, Michel Doumith and Jonathan Green. Bioinformatics Unit, Health Protection
