Bioinformatics (Globex, Summer 2015) Lecture 1

Similar documents
CISC 436/636 Computational Biology &Bioinformatics (Fall 2016) Lecture 1

Lecture 2: Central Dogma of Molecular Biology & Intro to Programming

Central Dogma. 1. Human genetic material is represented in the diagram below.

DNA. translation. base pairing rules for DNA Replication. thymine. cytosine. amino acids. The building blocks of proteins are?

90 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, December 4, 2006

Introduction to Microarray Data Analysis and Gene Networks. Alvis Brazma European Bioinformatics Institute

translation The building blocks of proteins are? amino acids nitrogen containing bases like A, G, T, C, and U Complementary base pairing links

DNA Structure & the Genome. Bio160 General Biology

What is Bioinformatics? Bioinformatics is the application of computational techniques to the discovery of knowledge from biological databases.

AGENDA for 10/10/13 AGENDA: HOMEWORK: Due end of the period OBJECTIVES: Due Fri, 10-11

Chapter 13 - Concept Mapping

DNA is the genetic material. DNA structure. Chapter 7: DNA Replication, Transcription & Translation; Mutations & Ames test

AGENDA for 10/11/13 AGENDA: HOMEWORK: Due end of the period OBJECTIVES:

Adv Biology: DNA and RNA Study Guide

Introduction to Bioinformatics

Nucleic acids and protein synthesis

DNA replication. Begins at specific sites on a double helix. Proceeds in both directions. Is initiated at many points in eukaryotic chromosomes.

BIOLOGY LTF DIAGNOSTIC TEST DNA to PROTEIN & BIOTECHNOLOGY

Introducing Bioinformatics Concepts in CS1

NON MENDELIAN GENETICS. DNA, PROTEIN SYNTHESIS, MUTATIONS DUE DECEMBER 8TH

DNA and RNA 2/14/2017. What is a Nucleic Acid? Parts of Nucleic Acid. DNA Structure. RNA Structure. DNA vs RNA. Nitrogen bases.

THE COMPONENTS & STRUCTURE OF DNA

Name Class Date. Information and Heredity, Cellular Basis of Life Q: What is the structure of DNA, and how does it function in genetic inheritance?

DNA and RNA. Chapter 12

DNA replication: Enzymes link the aligned nucleotides by phosphodiester bonds to form a continuous strand.

DNA vs. RNA B-4.1. Compare DNA and RNA in terms of structure, nucleotides and base pairs.

STUDY GUIDE SECTION 10-1 Discovery of DNA

Nucleic Acids: DNA and RNA

O C. 5 th C. 3 rd C. the national health museum

Higher Human Biology Unit 1: Human Cells Pupils Learning Outcomes

Name: Date: Pd: Nucleic acids

Nucleic acids. What important polymer is located in the nucleus? is the instructions for making a cell's.

Bundle 5 Test Review

DNA Structure and Analysis. Chapter 4: Background

Division Ave. High School Ms. Foglia AP Biology. Nucleic acids. AP Biology Nucleic Acids. Information storage

Chapter 3 Nucleic Acids, Proteins, and Enzymes

Biochemistry study of the molecular basis of life

DNA - The Double Helix

Name Class Date. Practice Test

Nucleic acids AP Biology

Understanding DNA Structure

DNA - The Double Helix

Replication Review. 1. What is DNA Replication? 2. Where does DNA Replication take place in eukaryotic cells?

Molecular Biology Primer. CptS 580, Computational Genomics, Spring 09

DNA - The Double Helix

DNA - The Double Helix

MATH 5610, Computational Biology

Chapter 8: DNA and RNA

DNA, Replication and RNA

DNA Replication. Packet #17 Chapter #16

CHAPTER 17 FROM GENE TO PROTEIN. Section C: The Synthesis of Protein

Chromosomes. Chromosomes. Genes. Strands of DNA that contain all of the genes an organism needs to survive and reproduce

Prokaryotic Transcription

RNA & PROTEIN SYNTHESIS

DNA- THE MOLECULE OF LIFE. Link

DNA- THE MOLECULE OF LIFE

DNA, Genes and their Regula4on. Me#e Voldby Larsen PhD, Assistant Professor

Multiple choice questions (numbers in brackets indicate the number of correct answers)

By the end of today, you will have an answer to: How can 1 strand of DNA serve as a template for replication?

Chapter 13. From DNA to Protein

Examination Assignments

Molecular Genetics I DNA

PROTEIN SYNTHESIS Flow of Genetic Information The flow of genetic information can be symbolized as: DNA RNA Protein

AGRO/ANSC/BIO/GENE/HORT 305 Fall, 2016 Overview of Genetics Lecture outline (Chpt 1, Genetics by Brooker) #1

Friday, April 17 th. Crash Course: DNA, Transcription and Translation. AP Biology

BIOINFORMATICS Introduction

UNIT 4. DNA, RNA, and Gene Expression

4) separates the DNA strands during replication a. A b. B c. C d. D e. E. 5) covalently connects segments of DNA a. A b. B c. C d. D e.

PowerPoint Notes on Chapter 9 - DNA: The Genetic Material

The Central Dogma of Molecular Biology

Basic Bioinformatics: Homology, Sequence Alignment,

For items 1-3 utilize the following information. If an answer cannot be derived write cannot be determined.

The Molecular Basis of Inheritance

Manipulating DNA. Nucleic acids are chemically different from other macromolecules such as proteins and carbohydrates.

Molecular Biology of the Gene

Chapter 10. DNA: The Molecule of Heredity. Lectures by Gregory Ahearn. University of North Florida. Copyright 2009 Pearson Education, Inc.

DNA: The Molecule of Heredity How did scientists discover that genes are made of DNA?

RNA Secondary Structure Prediction

RNA and Protein Synthesis

Griffith Avery Franklin Watson and Crick

Nucleic acids deoxyribonucleic acid (DNA) ribonucleic acid (RNA) nucleotide

Lecture 3 (FW) January 28, 2009 Cloning of DNA; PCR amplification Reading assignment: Cloning, ; ; 330 PCR, ; 329.

1.5 Nucleic Acids and Their Functions Page 1 S. Preston 1

DNA Replication AP Biology

CHAPTER 20 DNA TECHNOLOGY AND GENOMICS. Section A: DNA Cloning

Genome Sequence Assembly

Genome Architecture Structural Subdivisons

Protein Synthesis

Genetic material must be able to:

From Gene to Protein Transcription and Translation

GAUTENG DEPARTMENT OF EDUCATION SENIOR SECONDARY INTERVENTION PROGRAMME. LIFE SCIENCE Grade 12 Session 9: Nucleic Acids DNA and RNA (LEARNER NOTES)

DNA Structure and Protein synthesis

Chapter 10 - Molecular Biology of the Gene

An introduction to genetics and molecular biology

The common structure of a DNA nucleotide. Hewitt

Active Learning Exercise 9. The Hereditary Material: DNA

2. The instructions for making a protein are provided by a gene, which is a specific segment of a molecule.

Fig. 16-7a. 5 end Hydrogen bond 3 end. 1 nm. 3.4 nm nm

Transcription:

Bioinformatics (Globex, Summer 2015) Lecture 1 Course Overview Li Liao Computer and Information Sciences University of Delaware

Dela-where? 2

Administrative stuff Syllabus and tentative schedule Workload 4 homework assignments (40%) Mid-term exam (20%) Final exam (40%) Late policy: Only one day delay is allowed; subject to 15% penalty.

Bioinformatics Books Markketa Zvelebil and Jeremy Baum, Understanding Bioinformatics, Garland Science, 2008. Dan E. Krane & Michael L. Raymer, Fundamental Concepts of Bioinformatics, Benjamin Cummings 2002 R. Durbin, S. Eddy, A. Krogh, and G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998. João Meidanis & João Carlos Setubal. Introduction to Computational Molecular Biology. PWS Publishing Company, Boston, 1996. Peter Clote and Rolf Backofen, Computational Molecular Biology: An Introduction, Willey 2000. Dan Gusfield. Algorithms on String, Trees, and Sequences. Cambridge University Press, 1997. P. Baldi and S. Brunak, Bioinformatics, The Machine Learning Approach, The MIT press, 1998. D.W. Mount, Bioinformaics: Sequence and Genome Analysis, CSHLP 2004.

Molecular Biology Books Free materials: Kimball's biology Lawrence Hunter: Molecular biology for computer scientists DOE s Molecular Genetics Primer Books: Instant Notes series: Biochemistry, Molecular Biology, and Genetics Molecular Biology of The Cell, by Alberts et al

Bioinformatics - use and develop computing methods to solve biological problems The field is characterized by an explosion of data Big data!!! difficulty in interpreting the data a large number of open problems until recently, relative lack of sophistication of computational techniques (compared with, say, signal processing, graphics, etc.)

Why is this course good for you? Expand your knowledge base Bioinformatics is a computational wing of biotechnology. Computational techniques in bioinformatics are widely useful elsewhere - Cybergenomics: detect, dissect malware Big data!!! Job market is strong

Industry is moving in IBM: BlueGene, the fastest computer with 1 million CPU Blueprint worldwide collects all the protein information Bioinformatics segment will be $40 billion in 2004 up from $22 billion in 2000 GlaxoSmithKline Celera Merck AstraZeneca

Computing and IT skills Algorithm design and model building Working with unix system/web server Programming (in PERL, Java, etc.) RDBMS: SQL, Oracle PL/SQL

People International Society for Computational Biology (www.iscb.org) ~ 3400 members (as of 2015) Severe shortage for qualified bioinformatians / bioinformaticists

Conferences ISMB (Intelligent Systems for Molecular Biology) started in 1992 RECOMB (International Conference on Computational Molecular Biology) started in 1997 PSB (Pacific Symposium on Biocomputing) started 1996 ACM-BCB BIBM...

Journals Bioinformatics BMC Bioinformatics PLOS Computational Biology Journal of Computational Biology Genome Biology Genomics Genome Research Nucleic Acids Research...

A short history after human genome: 2000 -- 2010

Human Genome at Ten:

Human Genome at Ten:

Human Genome at Ten:

Human Genome at Ten:

It is much easier to teach people with those skills about biology than to teach biologists how to code well.

Tree of Life

10-9 10-6 10-3

Organisms: three kingdoms -- eukaryotes, eubacteria, and archea Cell: the basic unit of life Chromosome (DNA) > circular, also called plasmid when small (for bacteria) > linear (for eukaryotes) Genes: segments on DNA that contain the instructions for organism's structure and function Proteins: the workhorse for the cell. > establishment and maintenance of structure > transport. e.g., hemoglobin, and integral transmembrane proteins > protection and defense. e.g., immunoglobin G > Control and regulation. e.g., receptors, and DNA binding proteins > Catalysis. e.g., enzymes

Bioinformatics in a cell

Central Dogma of Molecular Biology Genetic Information Flow:

Cellular processes are programmed, regulated, and yet are stochastic

Small molecules: > sugar: carbohydrate > fatty acids > nucleotides: A, C, G, T --> DNA (double helix, hydrogen bond, complementary bases A-T, G-C) four bases: adenine, cytosine, guanine, and thymidine (uracil) 5' end phosphate group 3' end is free 1' position is attached with the base double strand DNA sequences form a helix via hydrogen bonds between complementary bases hydrogen bond: - weak: about 3~5 kj/mol (A covalent C-C bond has 380 kj/mol), will break when heated - saturation: - specific:

Hack the life atcgggctatcgatagctatagcgcgatatatcgcgcgtatatgcgcgcatattag tagctagtgctgattcatctggactgtcgtaatatatacgcgcccggctatcgcgct atgcgcgatatcgcgcggcgctatataaatattaaaaaataaaatatatatatatgc tgcgcgatagcgctataggcgcgctatccatatataggcgctcgcccgggcgcga tgcatcggctacggctagctgtagctagtcggcgattagcggcttatatgcggcga gcgatgagagtcgcggctataggcttaggctatagcgctagtatatagcggctagc cgcgtagacgcgatagcgtagctagcggcgcgcgtatatagcgcttaagagcca aaatgcgtctagcgctataatatgcgctatagctatatgcggctattatatagcgca gcgctagctagcgtatcaggcgaggagatcgatgctactgatcgatgctagagca gcgtcatgctagtagtgccatatatatgctgagcgcgcgtagctcgatattacgcta cctagatgctagcgagctatgatcgtagca.

Genetic Code: codons

Information Expression 1-D information array 3-D biochemical structure

Challenges in Life Sciences Understanding correlation between genotype and phenotype Predicting genotype <=> phenotype Phenotypes: drug/therapy response drug-drug interactions for expression drug mechanism interacting pathways of metabolism

Profiling 14 Structure Prediction Credit:Kellis & Indyk

Topics Mapping and assembly Sequence analysis (Similarity -> Homology): Pairwise alignment (database searching) Multiple sequence alignment, profiling Gene prediction Pattern (Motif) discovery and recognition Phylogenetics analysis Character based Distance based Structure prediction RNA Secondary Protein Secondary & tertiary Network analysis: Gene expression Regulatory networks

How much should I know about biology? - Apparently, the more the better - The least, Pavzner's 3-page "All you need to know about Molecular biology". - Chapters 1 & 2 of the text. - We adopt an "object-oriented" scheme, namely, we will transform biological problems into abstract computing problems and hide unnecessary details. So another big goal of this course is learn how to do abstraction.

Goals? At the end of this course, you should be able to - Describe the main computational challenges in molecular biology. - Implement and use basic algorithms. - Describe several advanced algorithms. Sequence alignment using dynamics programming Hidden Markov models Hierarchical clustering K-means Gradient descent optimization - Know the existing resources: Databases, Software,