What is Bioinformatics?

"Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline." - NCBI

"The ultimate goal of the field is to enable the discovery of new biological insights as well as to create a global perspective from which unifying principles in biology can be discerned." - NCBI

Source: http://www.ncbi.nlm.nih.gov/about/primer/bioinformatics.html

DNA Sequencing: chain terminator
A 3' hydroxyl group is essential for chain elongation.
2005 Prentice Hall Inc. / A Pearson Education Company / Upper Saddle River, New Jersey 07458

Capillary Gel Electrophoresis The sequencing reaction is run out in a single capillary gel. The gel is scanned by a laser. The sequence is read automatically using computer software from the pattern of different wavelengths emitted by the fluorescent dyes.
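The automatic read-out described above can be illustrated with a toy base caller. This is a minimal sketch, not the software any real sequencer uses: it assumes the laser scan has already been reduced to one tuple of four fluorescence intensities per peak, and the channel order in `CHANNELS` is a hypothetical choice made for the example.

```python
# Hypothetical channel order; real instruments define their own dye sets.
CHANNELS = ("A", "C", "G", "T")


def call_bases(trace):
    """Call one base per peak by picking the channel with the strongest signal.

    `trace` is a sequence of 4-tuples of fluorescence intensities, one tuple
    per peak, ordered as CHANNELS. Peaks with no signal or with a tie for the
    strongest channel are reported as "N" (ambiguous).
    """
    calls = []
    for intensities in trace:
        top = max(intensities)
        if top <= 0 or intensities.count(top) > 1:
            calls.append("N")
        else:
            calls.append(CHANNELS[intensities.index(top)])
    return "".join(calls)


# Toy data: the third peak has two equally strong channels, so it is ambiguous.
toy_trace = [(90, 5, 3, 2), (4, 80, 6, 1), (2, 3, 3, 1), (1, 2, 4, 70)]
print(call_bases(toy_trace))  # ACNT
```

Production base callers also model peak shape, spacing, and dye cross-talk, and attach a quality score to every call; none of that is attempted here.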


Automated sequencers: ABI 3700
- Made by Applied Biosystems
- Most widely used automated sequencer
- 96 capillaries
- Robot loading from 384-well plates
- Two to three hours per run
- 600 to 700 bases per capillary per run
Figure labels: robotic arm and syringe, 96 glass capillaries, 96-well plate, load bar

Workflow of conventional vs. second-generation sequencing
High-throughput shotgun Sanger sequencing:
- Template amplification
- Sanger cycle sequencing
- Capillary electrophoresis
- 96 or 384 long reads per run
Cyclic array shotgun sequencing:
- (Template amplification)
- Template immobilization
- Sequencing by synthesis or hybridization
- Millions of short reads per run

Illumina. Figure from M. Metzker, Nat Rev Genet, Jan. 2010

Cost of Sequence per megabase

Benefits of Next-gen sequencing https://genomevolution.org/wiki/images/1/16/plant_genome_growth.png

Why do we sequence?
- Genome Annotation: A complete genome sequence provides us with the raw data to construct a "parts list".
- Comparative Genomics: Conserved regions in the genome are more likely to play an important role in the biology of the species.
- Functional Genomics: Sequencing the RNA gives us insight into the transcriptionally active regions of the genome.
- Population Genetics and Genomics: Genetic structure and diversity reveal the history and distribution of phenotypic traits (e.g. disease susceptibility alleles).
- Genetic Analysis: Map and characterize the molecular basis of allelic variants.

We have the genome sequence, now what? Well...
- We don't know how many genes there are!
- We don't know where they are!
- We don't know what they do!

Definitions of Annotation
- Interpreting raw sequence data into useful biological information
- Information attached to genomic coordinates with a start and end point; can occur at different levels
- Addition of as much reliable and up-to-date information as possible to describe a sequence
- Identification, structural description, and characterization of putative protein products and other features in the primary genomic sequence
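The second definition, information attached to genomic coordinates with a start and end point, maps naturally onto a small record type. The sketch below is only an illustration: the `Feature` class, its field names, and the example genes are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotation: a piece of information tied to a genomic interval."""
    seqid: str      # chromosome or contig name
    start: int      # 1-based, inclusive (an assumption of this sketch)
    end: int        # 1-based, inclusive
    kind: str       # e.g. "gene", "exon"
    note: str = ""  # free-text description


def features_overlapping(features, seqid, start, end):
    """Return the features on `seqid` that overlap the interval [start, end]."""
    return [
        f for f in features
        if f.seqid == seqid and f.start <= end and f.end >= start
    ]


# Made-up annotations for illustration only.
annotations = [
    Feature("chr1", 100, 500, "gene", "geneA (hypothetical)"),
    Feature("chr1", 450, 900, "gene", "geneB (hypothetical)"),
    Feature("chr2", 100, 500, "gene", "geneC (hypothetical)"),
]

for feature in features_overlapping(annotations, "chr1", 480, 600):
    print(feature.kind, feature.start, feature.end, feature.note)
```

A linear scan is enough for a handful of features; genome-scale tools typically index the intervals so that region queries do not have to touch every record.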

Genome annotation: Two Main Levels
- Structural annotation = nucleotide- and protein-level annotation. Finding genes and other biologically relevant sites, thus building up a model of the genome as objects with specific locations.
- Functional annotation = objects are used in database searches (and experiments); the aim is to attribute biologically relevant information to the whole sequence and to individual objects.
In large-scale genome analysis projects, the rate-limiting step is annotation.
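As a very simplified illustration of structural annotation, the sketch below scans the forward strand of a DNA string for open reading frames (ORFs): runs of codons that begin with ATG and end at the first in-frame stop codon. It is an assumption-laden toy, not a gene finder: it ignores the reverse strand, introns, and alternative start codons, and the function name and `min_codons` parameter are made up for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Find forward-strand ORFs in `seq`.

    Returns a list of (start, end, frame) tuples, where start and end are
    1-based inclusive coordinates covering the ATG through the stop codon,
    and frame is 0, 1, or 2. ORFs shorter than `min_codons` codons
    (counting the start and stop codons) are skipped.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3, frame))
                start = None
    return orfs


# Toy sequence: ATG AAA TTT TAG sits in frame 2, at positions 3 to 14.
print(find_orfs("CCATGAAATTTTAGGG"))  # [(3, 14, 2)]
```

Each tuple returned is exactly the kind of located "object" that functional annotation would then take into database searches.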

How do we get from here

to here,

Summary of gene annotation steps

Gene prediction through comparative genomics
- Highly similar (conserved) regions between two genomes are likely to be functional; otherwise they would have diverged.
- If the genomes are too closely related, all regions are similar, not just genes.
- If the genomes are too far apart, analogous regions may be too dissimilar to be found.
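One simple way to make "highly similar regions" concrete is to slide a window along a pairwise alignment and report the fraction of identical positions in each window. The sketch below assumes the two sequences are already aligned to equal length, with "-" marking gaps; the window size, step, and 0.8 threshold are arbitrary illustrative choices, not values from the slides.

```python
def window_identity(aln_a, aln_b, window=10, step=5):
    """Return (start, identity) for each window of a pairwise alignment.

    `aln_a` and `aln_b` are equal-length aligned strings. Identity is the
    number of matching, non-gap columns divided by the window size, and
    `start` is the 0-based offset of the window in the alignment.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    results = []
    for start in range(0, len(aln_a) - window + 1, step):
        columns = zip(aln_a[start:start + window], aln_b[start:start + window])
        matches = sum(1 for x, y in columns if x == y and x != "-")
        results.append((start, matches / window))
    return results


def conserved_windows(aln_a, aln_b, threshold=0.8, window=10, step=5):
    """Return the start offsets of windows whose identity meets `threshold`."""
    return [
        start
        for start, identity in window_identity(aln_a, aln_b, window, step)
        if identity >= threshold
    ]


# Toy alignment: the first ten columns are identical, then the sequences drift.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGTACTTGCATTCGA"
print(window_identity(seq_a, seq_b))    # [(0, 1.0), (5, 0.7), (10, 0.5)]
print(conserved_windows(seq_a, seq_b))  # [0]
```

The two caveats on the slide show up directly in such a scan: for very close genomes almost every window passes the threshold, and for very distant ones almost none do, so the cutoff has to suit the evolutionary distance.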

Mouse-human comparison

From: J.W. Thomas et al., Nature, 14 August 2003

The ENCODE Project Consortium (2011) A User's Guide to the Encyclopedia of DNA Elements (ENCODE). PLoS Biol 9(4)

Automated Manual Merged

Basic Distributed Annotation Systems (DAS)

Contents of an Integrated Database
- Experimental data: microarray, ChIP-chip
- Genome and functional annotation: predicted genes, GO, MIPS FunCat
- Data to support modeling efforts: protein-protein interactions, protein-DNA interactions, pathways (KEGG, AraCyc)

Bioinformaticians integrate the data into one database
1) Find the data. Databases are decentralized and the data come in different formats: experiments, function, models.
2) Convert to a common format. XML is a good idea (SBML).
3) Data integration.
   - Manual: Excel sheet comparisons (biologists)
   - Automated: Perl scripts (informaticians)
   - Database: queries, e.g. SQL (high-production labs)
4) Gene list intersect. Annotation.
5) Modeling. Find the biological function in the gene list; this needs visualization and network modeling tools.
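Steps 3 and 4 can be sketched in a few lines. The example below intersects two gene lists and then looks up annotation for the shared genes with an SQL query against an in-memory SQLite database. It uses Python rather than the Perl mentioned on the slide, and the function name, table names, gene identifiers, and annotation terms are all hypothetical.

```python
import sqlite3


def annotate_shared(list_a, list_b, annotation):
    """Intersect two gene lists and attach annotation via an SQL join.

    `annotation` maps gene identifiers to a descriptive term. Returns a list
    of (gene, term) tuples sorted by gene; term is None when a shared gene
    has no annotation record.
    """
    shared = sorted(set(list_a) & set(list_b))
    if not shared:
        return []
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE hits (gene TEXT PRIMARY KEY)")
        conn.execute("CREATE TABLE annotation (gene TEXT PRIMARY KEY, term TEXT)")
        conn.executemany("INSERT INTO hits VALUES (?)", [(g,) for g in shared])
        conn.executemany(
            "INSERT INTO annotation VALUES (?, ?)", list(annotation.items())
        )
        # LEFT JOIN keeps shared genes even when no annotation exists for them.
        rows = conn.execute(
            "SELECT h.gene, a.term FROM hits AS h "
            "LEFT JOIN annotation AS a ON a.gene = h.gene "
            "ORDER BY h.gene"
        ).fetchall()
    finally:
        conn.close()
    return rows


# Made-up gene lists, e.g. one from an expression experiment, one from binding data.
expressed = ["geneA", "geneB", "geneC"]
bound = ["geneB", "geneC", "geneD"]
terms = {"geneA": "transport", "geneB": "transcription regulation"}

print(annotate_shared(expressed, bound, terms))
```

Keeping the un-annotated genes in the result, rather than silently dropping them, makes it obvious where the annotation step still has gaps before any modeling starts.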

UCSC browser

Examples of Large Genome Projects
- 1000 Genomes Project (www.1000genomes.org): An effort to sequence the genomes of at least 1,000 people to catalogue genetic variants that occur at a frequency of 1% or more in the human population.
- 1001 Arabidopsis thaliana Genomes Project (www.1001genomes.org): Studies the genomes and phenotypes of 1,001 strains to explain differences in phenotype caused by adaptation to different conditions.
- Metagenomics, e.g. the Human Microbiome Project (http://commonfund.nih.gov/hmp/): Sequencing of DNA samples from environments such as the mouth, skin, and digestive system to identify the different bacterial species present.

Your genome
- Personal Genome Sequencing: Several companies provide a service where you can submit your DNA to be sequenced. This can help you learn more about your heritage and about which diseases you may be susceptible to.
- Medical Genomic Studies: There is already a collection of genetic testing procedures that look for specific genes. Unfortunately, these tests are not always accurate, which can lead individuals to make bad decisions. The hope is that, as more genes are characterized, we can make better and more informed decisions.