Computational Methods in Bioinformatics

Similar documents
Introduction to BIOINFORMATICS

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

This place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology.

Following text taken from Suresh Kumar. Bioinformatics Web - Comprehensive educational resource on Bioinformatics. 6th May.2005

Introduction to Bioinformatics

Basic modeling approaches for biological systems -II. Mahesh Bule

Introduction to Bioinformatics

Teaching Bioinformatics in the High School Classroom. Models for Disease. Why teach bioinformatics in high school?

Data Mining for Biological Data Analysis

BIOINFORMATICS AND SYSTEM BIOLOGY (INTERNATIONAL PROGRAM)

Comparative Bioinformatics. BSCI348S Fall 2003 Midterm 1

The Integrated Biomedical Sciences Graduate Program

Molecular Basis of Inheritance

Computational Genomics

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008

In 1996, the genome of Saccharomyces cerevisiae was completed due to the work of

BIMM 143: Introduction to Bioinformatics (Winter 2018)

What is Bioinformatics? Bioinformatics is the application of computational techniques to the discovery of knowledge from biological databases.

Photosynthetic production of biofuels from CO 2 by cyanobacteria using Algenol s Direct to Ethanol process Strain development aspects

Class XII Chapter 6 Molecular Basis of Inheritance Biology

Engineering Genetic Circuits

Bioinformatics: Sequence Analysis. COMP 571 Luay Nakhleh, Rice University

Outline and learning objectives. From Proteomics to Systems Biology. Integration of omics - information

Computational Genomics ( )

Frumkin, 2e Part 1: Methods and Paradigms. Chapter 6: Genetics and Environmental Health

Klinisk kemisk diagnostik BIOINFORMATICS

Networks of Signal Transduction and Regulation in Cellular Systems

Cory Brouwer, Ph.D. Xiuxia Du, Ph.D. Anthony Fodor, Ph.D.

Genetics and Bioinformatics

Bioinformatics. Dick de Ridder. Tuinbouw Digitaal, 12/11/15

Introduction to BIOINFORMATICS

Introduction to Bioinformatics and Gene Expression Technologies

Introduction to Bioinformatics and Gene Expression Technologies

High peformance computing infrastructure for bioinformatics

MARINE BIOINFORMATICS & NANOBIOTECHNOLOGY - PBBT305

Biology 644: Bioinformatics

Metabolism and metabolic networks. Metabolism is the means by which cells acquire energy and building blocks for cellular material

BGGN 213: Foundations of Bioinformatics (Fall 2017)

Challenging algorithms in bioinformatics

bioinformatica 6EF2F181AA1830ABC10ABAC56EA5E191 Bioinformatica 1 / 5

Introduction to Bioinformatics

Pioneering Clinical Omics

Microbial Metabolism Systems Microbiology

Practical Bioinformatics for Biologists (BIOS 441/641)

Q2 (1 point). How many carbon atoms does a glucose molecule contain?

M. Phil. (Computer Science) Programme < >

From Proteomics to Systems Biology. Integration of omics - information

Motivation From Protein to Gene

Genomes contain all of the information needed for an organism to grow and survive.

ECS 234: Introduction to Computational Functional Genomics ECS 234

Introduction to Bioinformatics

Practical Bioinformatics for Biologists (BIOS493/700)

Prentice Hall Science Explorer: Physical Science 2002 Correlated to: Idaho State Board of Education Achievement Standards for Science (Grades 9-12)

Grundlagen der Bioinformatik Summer Lecturer: Prof. Daniel Huson

21.5 The "Omics" Revolution Has Created a New Era of Biological Research

Correlations between the expression of genes encoding enzymes in the Krebs cycle and control of DNA replication in human cells

DOWNLOAD OR READ : UNDERSTANDING BIOINFORMATICS PDF EBOOK EPUB MOBI

Biology Registration Newsletter for Fall 2018 Courses

GREG GIBSON SPENCER V. MUSE

NGS Approaches to Epigenomics

Bioinformatics 2. Lecture 1

Drosophila White Paper 2003 August 13, 2003

Chapter 15 THE HUMAN GENOME PROJECT AND GENOMICS

Mixing and matching, the power of combinatorial evolution:

Gene expression analysis. Biosciences 741: Genomics Fall, 2013 Week 5. Gene expression analysis

Course Competencies Template Form 112

Lecture 1. Bioinformatics 2. About me... The class (2009) Course Outcomes. What do I think you know?

BIOL 205. Fall Term (2017)

C57BL/6N Female. C57BL/6N Male. Prl2 WT Allele. Prl2 Targeted Allele **** **** WT Het KO PRL bp. 230 bp CNX. C57BL/6N Male 30.

Genetics Lecture 21 Recombinant DNA

Name period date AP BIO- 2 nd QTR 6 Week Test Review

Leonardo Mariño-Ramírez, PhD NCBI / NLM / NIH. BIOL 7210 A Computational Genomics 2/18/2015

Human Genomics. Higher Human Biology

Functional Genomics Overview RORY STARK PRINCIPAL BIOINFORMATICS ANALYST CRUK CAMBRIDGE INSTITUTE 18 SEPTEMBER 2017

College- and Career Readiness Standards for Science Genetics

Dna Technology And Genomics Study Guide READ ONLINE

Introduction to Bioinformatics

Why learn sequence database searching? Searching Molecular Databases with BLAST

Introduction to Bioinformatics and Gene Expression Technology

Molecular Biotechnology. Principles and applications of recombinant DNA

From Poisson Approximations to the Blueprint of Life

Conference Feedback Report

ST 591: Introduction to Quantitative Genomics Syllabus

Introduction to 'Omics and Bioinformatics

Active Learning Exercise 9. The Hereditary Material: DNA

Examination Assignments

ECS 234: Introduction to Computational Functional Genomics ECS 234

STUDY GUIDE SECTION 13-1 DNA Technology

Introduction to Bioinformatics. Ulf Leser

Applications in Bio-informatics and Biomedical Engineering

OMICS International Conferences

Chapter 10 Genetic Engineering: A Revolution in Molecular Biology

1.1 What is bioinformatics? What is computational biology?

Introduction to Genome Biology

Expression of the genome. Books: 1. Molecular biology of the gene: Watson et al 2. Genetics: Peter J. Russell

Human Genomics. 1 P a g e

Department of Genetics

CHAPTER 21 LECTURE SLIDES

Transcription:

Computational Methods in Bioinformatics Ying Xu 2017/12/6 1

What We Intend to Teach A general introduction to the field of bioinformatics what problems people have been and are currently working on how people solve these problems what key computational techniques are used and needed how much help computing has provided to biological research A way of thinking -- tackling biological problem computationally how to look at a biological problem from a computational point of view how to collect statistics from biological data how to build a computational model how to solve a computational modeling problem algorithmically 2017/12/6 2

What We Intend to Teach Some exposure to computational biology and bioinformatics research what are some of the interesting research areas key challenges available tools and resources Basic topics of bioinformatics, covering computational genomics computational proteomics structural bioinformatics computational systems biology 2017/12/6 3

Detailed Topics Introduction to bioinformatics Biosequence alignment Computational gene finding Protein structure prediction transcription regulation motif finding gene-expression data analysis protein function prediction comparative genomics, including operon prediction pathways and network prediction 2017/12/6 4

Prerequisites A little knowledge in computer programming Basic knowledge of molecular biology Strong interests in learning bioinformatics and computational biology! 2017/12/6 5

Reference book References Current topics in computational molecular biology, T Jiang, Y Xu and MQ Zhang, MIT Press, 2002 Required reading Primer on Molecular Genetics, US Department of Energy Handouts by instructors http://csbl.bmb.uga.edu/mirrors/jlu/bioinformaticscourse/ 2017/12/6 6

Workload Regular class attendance One term project To analyze a bacterial genome to predict and summarize Protein encoding genes Functions Operons Structures 2017/12/6 7

A Brief Introduction to Bioinformatics and Computational Biology 2017/12/6 8

The Basics ccgtacgtacgtagagtgctagtctagtcgtagcgccgtagtcgatc gtgtgggtagtagctgatatgatgcgaggtaggggataggatagc aacagatgagcggatgctgagtgcagtggcatgcgatgtcgatga tagcggtaggtagacttcgcgcataaagctgcgcgagatgattgc aaagragttagatgagctgatgctagaggtcagtgactgatgatcg atgcatgcatggatgatgcagctgatcgatgtagatgcaataagtc gatgatcgatgatgatgctagatgatagctagatgtgatcgatggta ggtaggatggtaggtaaattgatagatgctagatcgtaggta cell chromosome genome: DNA sequence genes protein metabolic pathway/network 2017/12/6 9

Information storage Working molecules 2017/12/6 10

Bioinformatics (or computational biology) ccgtacgtacgtagagtgctagtctagtcgtagcgccgtagtcgatc gtgtgggtagtagctgatatgatgcgaggtaggggataggatagc aacagatgagcggatgctgagtgcagtggcatgcgatgtcgatga tagcggtaggtagacttcgcgcataaagctgcgcgagatgattgc aaagragttagatgagctgatgctagaggtcagtgactgatgatcg atgcatgcatggatgatgcagctgatcgatgtagatgcaataagtc gatgatcgatgatgatgctagatgatagctagatgtgatcgatggta ggtaggatggtaggtaaattgatagatgctagatcgtaggta This interdisciplinary science is about providing computational support to studies on linking the behavior of cells, organisms and populations to the information encoded in the genomes Temple Smith 2017/12/6 11

Bioinformatics It is about developing and using computational techniques to interpret biological data predict structures and functions of biological entities model the dynamic behavior of biological processes and systems People have used mathematical or computing techniques to solve biological problems since early 1900 s e.g., evolution and genetic analyses by R.A. Fisher, J.B.S. Haldane, S. Wright So what is new? 2017/12/6 12

A Historical Perspective Realization of the existence of genes in our cells by Hermann Müller, a student of Morgan (1921) Understanding of the physical natures of genes by F. Sanger (e.g., 1949), E. Chargaff (e.g., 1950), J. Kendrew (e.g., 1958) in 40 and 50 s 2017/12/6 13

A Historical Perspective Understanding of the double helical structure of DNA by James Watson and Frances Crick in 1953 Development of sequencing technology, first of proteins and then of genomic DNA, based on the work of F. Sanger on sequencing of insulin (1956), W Gilbert and A. Maxam on sequencing of Lactose operator (1977) which demonstrated that the genetic sequence of a genome, including human s, is sequence-able! 2017/12/6 14

A Historical Perspective Development of a science of analyzing protein and DNA sequences, particularly in protein sequence analyses and evolution by Margaret Dayhoff (60 s) phylogenetic analyses and comparative sequence analyses by W. Fitch and E. Margoliash (1967) and by R. Doolittle (1983) 2017/12/6 15

A Historical Perspective Development of sequence comparison algorithms Needleman and Wunsch (1970) Smith and Waterman (1981) Organization of biological data into databases Protein Data Bank (PDB, 1973) of protein structures GENBANK (1982) of DNA sequences Computational methods for gene finding in genomic sequences Work by Borodovsky, Claverie, Uberbacher from mid-80 s to early 90 s 2017/12/6 16

A Historical Perspective Sequencing of Human and other genomes (1986 2003) Development of high-throughput measurement technologies microarray/rna-seq for functional states of genes two-hybrid systems for protein-protein interactions DNA methylation data for epigenomic information structural genomics for structure determination ccgtacgtacgtagagtgctagtct agtcgtagcgccgtagtcgatcgt gtgggtagtagctgatatgatgcga ggtaggggataggatagcaacag atgagcggatgctgagtgcagtgg 2017/12/6 catgcgatgtgatagctagatgtga 17 tcgatggtaggtaggatggtaggt genes

A Historical Perspective These high-throughput probing technologies and others are being used to generate enormous amounts of data about the existence, the structure, the functional state, the relationship of biological molecules and machineries.. but what are these data telling us? 2017/12/6 18

A DNA Sequence ccgtacgtacgtagagtgctagtctagtcgtagcgccgtagtcgatcgtgtgg gtagtagctgatatgatgcgaggtaggggataggatagcaacagatgagc ggatgctgagtgcagtggcatgcgatgtcgatgatagcggtaggtagacttc gcgcataaagctgcgcgagatgattgcaaagragttagatgagctgatgcta gaggtcagtgactgatgatcgatgcatgcatggatgatgcagctgatcgatgt agatgcaataagtcgatgatcgatgatgatgctagatgatagctagatgtgat cgatggtaggtaggatggtaggtaaattgatagatgctagatcgtaggtagta gctagatgcagggataaacacacggaggcgagtgatcggtaccgggctg aggtgttagctaatgatgagtacgtatgaggcaggatgagtgacccgatgag gctagatgcgatggatggatcgatgatcgatgcatggtgatgcgatgctagat gatgtgtgtcagtaagtaagcgatgcggctgctgagagcgtaggcccgaga ggagagatgtaggaggaaggtttgatggtagttgtagatgattgtgtagttgta gctgatagtgatgatcgtag 2017/12/6 19

A Historical Perspective So what is new? It is the amount, the type of and relationships among the biological data about the cellular states, molecular structures and functions, generated by highthroughput technologies, that have driven the rapid advancement of bioinformatics! 2017/12/6 20

An Example of Computation for Biology Lactococcus is a premier model microorganism for a wide array of studies in molecular biology, and is nonpathogenic Streptococcus is closely related to Lactococcus and could become pathogenic upon triggers Question: What make one pathogenic and the other nonpathogenic? specific genes? unique pathways? different regulatory mechanisms? 2017/12/6 21

An Example of Computation for Biology X years ago,. to search for potential genes that possibly make the difference, researchers had to remove various parts of DNA sequence, then observe if they may have any relevance x x x x acggtcgtacgtacgtgttagccgataatccagtgtgagatacacatcatcgaaacacatgaggcgtgcgatagatgatcc...???? This could be a very lengthy process 2017/12/6 22

An Example of Computation for Biology Since the Human genome project (1986), computational scientists have developed computer programs to locate genes in genomic DNA sequence GRAIL, Gene-Scan, Glimmer,. acggtcgtacgtacgtgttagccgataatccagtgtgagatacacatcatcgaaacacatgaggcgtgcgatagatgatcc... genes With gene-prediction programs, researchers only need to knock-out regions predicted to be genes in their search for relevant genes 2017/12/6 23

Example of Computation for Biology Over the years, many genes have been thoroughly studied in different organisms, e.g., human, mouse, fly,., rice, their biological functions have been identified and documented Computational scientists have developed computer programs to associate newly identified genes to genes with known functions! existing methods can associate > 60-80% of newly identified genes to genes with known functions Now, researchers only need to knock-out genes with possibly relevant functions in their search for understanding of a

Example of Computation for Biology Computational programs have been springing out that can predict if two proteins interact with each other if a group of gene products work in the same pathway which regulators regulate a particular pathway functions of genes at a genome scale se capabilities allow researchers to study complex biology problems like erstanding the difference between Lactococcus and Streptococcus in a

Example of Computation for Biology comparative genome analysis entify unique genes in each genome entify unique metabolic pathways and regulatory networks

Examples of Computation for Biology omputer-assisted drug design 3D structure models of G protein-coupled receptors were used to computationally screen 100,000+ compounds as possible drug targets and 100 were selected Follow-up experiments confirmed a high hit rate of 12%-21%

Examples of Computation for Biology omputational analysis of Plasmodium falciparum metabolism Plasmodium causes human malaria computational prediction of metabolic pathways of plasmodium computational simulations have helped to identify 216 chokepoints in this pathway model among all 24 previously suggested drug targets, 21 target at the chokepoints among the three popular drugs for malaria, they all targeted at the chokepoints

Computation for Biology Though computation may not solve a biological problem directly, it can help quickly narrow down the search space rching a needle in a haystack

Change of Paradigm he human genome sequencing project has led to fundamental hanges in how biological science is done! It represents biology s first foray into big science Science, editorial, 2003 he coordinated efforts in high-throughput production of biological ata beyond sequences have fueled the rapid transition of biology om cottage industry science to big science functional genomic data structural genomic data epigenomic data proteomic data metabolonic data

Change of Paradigm h-throughput ta production data mining and interpretation models or hypotheses domain expertise rational experimental design specific data generation

Change of Paradigm Howdy, want to do biology together? omic data

amples of Integrated Computational and Experimental Biology ntification of disordered regions in proteins perimental data suggest that some portions of some protein uctures do not have rigid structures mputational studies suggested that disorder is a common enomena, particularly in signaling proteins, validated experimentally mputational prediction ggests that protein-protein eraction interfaces are often ordered

DH What Could Potentially be Done glucose CO 2 PEP 2NAD 2NADH lactate Biosynthesis in E coli pyruvate NAD NADH NAD NADH NAD NADH formate CO 2 acetyl CoA oxaloacetate ADP ATP 2NADH 2NAD acetate acetate malate citrate ethanol ethanol fumarate isocitrate

Biology for Computation hat computational science is to molecular iology is like what mathematics has been to hysics...

Look into the Future ike physics, where general rules and laws are taught at e start, biology will surely be presented to future enerations of students as a set of basic systems... uplicated and adapted to a very wide range of cellular nd organismic functions, following basic evolutionary rinciples constrained by Earth s geological history. Temple Smith, Current Topics in Computational Molecular Biology

Take-Home Message he driving force of bioinformatics is biological data roduction through high-throughput technologies omputation is becoming increasingly indispensable in iological research ombination of high-throughput data generation and omputation allows scientists to look at more complex iological problems at systems level