7.91 Lecture #1 Introduction to Bioinformatics

Size: px
Start display at page:

Download "7.91 Lecture #1 Introduction to Bioinformatics"

Transcription

1 7.91 Lecture #1 Introduction to Bioinformatics Focus on Kinases Michael Yaffe & Pairwise Sequence Comparisons ARDFSHGLLENKLLGCDSMRWE.::..:::..:::: :::. GRDYKMALLEQWILGCD-MRWD Reading: This lecture: Mount pp.1-7, 29-35, 45-48, Next lecture: Mount pp. 8-9, 65-89, , +more The Protein Data Bank (PDB - is the single worldwide repository for the processing and distribution of 3-D biological macromolecular structure data. Berman, H. M., J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, and P. E. Bourne. The Protein Data Bank. Nucleic Acids Research 28 (2): (PDB Advisory Notice on using materials available in the archive:

2 Outline Review of biological fundamentals Genetics example Biological databases/ncbi resources Simplified sequence analysis where to start Simple sequence comparisons Definitions of related sequences Concepts and types of alignments the good, the bad, and the ugly Dot matrix alignments Computational efficiency Recursion and dynamic programming Substitution matrices: PAM, BLOSUM, Gonnet

3 Outline (cont) Gaps Applied dynamic programming: global alignments: Needleman-Wunsch Applied dynamic programming: local alignments Smith-Waterman Basic statistics of sequence alignments

4 Review of biological fundamentals Genetic material gene as a concept DNA as hereditary material DNA RNA Protein Critical cell functions

5 DNA structure A G C T U Bases

6 DNA structure A T G C Base pairing

7 DNA structure Nucleoside Nucleotide

8 DNA structure 5 3

9 DNA structure 5 3

10 DNA structure

11 RNA structure

12 Transcription OH 3 5 DNA OH RNA OH 5 3

13 Protein structure

14 Nonpolar, Hydrophobic R-groups H O H O H O H O H O H 3 N + C C H H O - 3 N + C C H CH O - 3 N + C C H H O - 3 N + 3 N + C C C C 3 CH CH O - H CH O C CH3 CH3 CH 3 CH 3 CH2 Glycine (Gly, G) Alanine (Ala, A) Valine (Val, V) CH 3 Leucine (Leu, L) Isoleucine (IIe, I) C C H O H O H O H O H 3 N + C C H 3 N + H 3 N + C C H 2 N + C C O CH - O - 2 CH2 CH 2 CH2 NH O - H 2 C CH 2 CH 2 O - S CH 3 Phenylalanine (Phe, F) Proline (Pro, P) Methionine (Met, M) Tryptophan (Trp, W) Polar, Hydrophillic R-groups H O H 3 N + C C O - CH2 OH Serine (Ser, S) H H O H O H O H O O H H H 3 N + C C H O - 3 N + C C H O - 3 N + C C 3 N + 3 N + C C C C O O - O CH2 CH - CH - CH2 CH2 2 OH CH C CH 3 SH 2 OH NH 2 O C NH 2 O Threonine (Thr, T) Cysteine (Cys, C) Tyrosine (Tyr, Y) Asparagine (Asn, N) Glutamine (Gln, Q) Electrically charged Acidic H O H 3 N + H C C O O - H C C CH2 3 N + H 3 N + O - CH2 C - O O CH 2 C - O O Glutamic acid (Glu, E) Aspartic acid (Asp, D) Basic H O H O C C H 3 N + C C CH2 O - CH2 O - CH 2 CH 2 CH 2 CH 2 CH 2 NH NH + 3 C NH + 2 Lysine (Lys, K) NH 2 Arginine (Arg, R) Aspartic acid (Asp, D) Arginine (Arg, R) O H C H 3 N + C O - CH2 NH NH + Histidine (His, H)

15 Codon Table (5')...pNpNpN...(3') in mrna Base at 5' End of Codon U Middle Base of Codon C A G Base at 3' End of Codon U phe (UUU) ser tyr cys U phe ser tyr cys C leu ser termination termination A leu ser termination trp G C leu leu leu pro pro pro his his gln arg arg arg U C A leu pro gln arg G A ile thr asn ser U ile thr asn ser C ile thr lys arg A met (and initiation) thr lys arg G G val val ala ala asp asp gly gly U C val ala glu gly A val ala glu gly G

16 DNA structure 3 5 Sense strand Anti-Sense strand 5 Convention: Sense strand Read 5 Æ 3 3 Reverse Complement

17 Genetics experiment: Isolate a yeat mutant that has increased chromosome number phenotype Can rescue the phenotype with a piece of DNA corresponds to a site of mutation in the yeasts DNA genotype CGTTTTTCCTGTAAAGCGCTTAATTTGTTTACCATTCTATAAAAACCTTGAGCTAAGGCCAACTGATGCA ATTGCTCAAGTGAATGCATAAACAAAGCAAGATCATTCTTAGCGCAAAAAAAACTGGGATTTTGAAATAC AACAAAAGAAAGAAGTAAAAAGGGAATGCAACGCAATAGTTTAGTAAATATCAAACTAAACGCTAATTCG CCATCGAAAAAGACCACAACAAGACCAAATACGTCCAGGATCAATAAACCATGGAGAATATCCCATTCGC CGCAGCAAAGAAACCCGAATTCAAAAATACCTTCACCTGTAAGAGAAAAATTGAACAGATTACCTGTAAA CAATAAGAAGTTTTTGGATATGGAAAGCTCCAAAATTCCATCACCTATAAGGAAAGCGACTTCTTCCAAA ATGATACACGAAAATAAGAAGCTACCTAAATTTAAATCCCTATCACTCGATGACTTTGAACTGGGGAAGA AATTAGGAAAGGGTAAATTCGGTAAAGTTTATTGCGTTCGGCACAGGAGTACAGGATATATTTGCGCACT GAAAGTAATGGAGAAGGAAGAAATAATAAAGTATAATTTACAGAAACAATTCAGAAGGGAGGTAGAAATA CAAACATCGCTAAATCATCCGAATCTAACTAAATCATACGGCTATTTTCATGATGAAAAAAGAGTGTACC TGCTAATGGAATACTTAGTCAATGGGGAAATGTATAAACTATTGAGGTTACACGGACCCTTCAACGATAT TTTAGCATCAGATTATATTTATCAAATTGCCAATGCCCTAGATTATATGCATAAAAAGAATATTATTCAT AGAGATATTAAACCTGAAAATATACTAATAGGGTTCAATAATGTCATTAAGTTAACGGACTTCGGATGGA GTATAATAAATCCGCCAGAAAATAGAAGGAAAACTGTCTGTGGGACAATTGACTACCTTTCTCCAGAAAT GGTGGAGTCAAGGGAATATGATCACACTATAGATGCATGGGCTCTTGGCGTCCTGGCGTTTGAACTACTG ACCGGTGCCCCTCCGTTCGAAGAAGAAATGAAAGATACTACATATAAAAGGATAGCAGCACTGGATATCA AAATGCCCAGTAACATTTCTCAGGATGCGCAAGATTTAATACTTAAACTACTAAAATACGACCCCAAAGA TAGAATGCGCCTTGGAGACGTAAAAATGCATCCTTGGATACTAAGAAACAAGCCCTTTTGGGAAAATAAG CGGTTATAGAATTAAAGTATGACAGAATCGTTTGAAGGGCACTATTAATCACTCCCGCACATATCACATA ATACTAAGTATCCATTTCTAATATTTCATTACTCTTTTCGGCATCGTATATTGCGATATTTGATTAAATT TTCTTGTTCATTTTTTCCTCTTTTCTTTCGCCTTTGTCGAAAGAAAAGAGGAAAACAAGCTGAAAATTGC TATGCATTAAAGTAGCAGATTTACTTTGTTGAGTTGGTTCTGATCAATAATAAGAGTAATGAAAGAAAGC AAAAAAATGGCTAAAGATAATTTAACTAATTTGCTCTCTCAATTGAACATTCAATTGTCTCAA

18 The National Center for Biotechnology Information Created as a part of NLM in 1988 Establish public databases Perform research in computational biology Develop software tools for sequence analysis Disseminate biomedical information

19 Molecular Databases Primary Databases Original submissions by experimentalists Database staff organize but don t add additional information Example: GenBank Derivative Databases Human curated compilation and correction of data Example: SWISS-PROT, NCBI RefSeq mrna Computationally Derived Example: UniGene Combinations Example: NCBI Genome Assembly

20 What is GenBank? NCBI s Primary Sequence Database Nucleotide only sequence database Archival in nature GenBank Data Direct submissions individual records (BankIt, Sequin) Batch submissions via (EST, GSS, STS) ftp accounts sequencing centers Data shared three collaborating databases GenBank DNA Database of Japan (DDBJ). European Molecular Biology Laboratory Database (EMBL) at EBI.

21 STSs BACs X Chromosome 163 Mb Relationships of chromosomes to genome sequencing markers. The X chromosome is about 163 Mb in length. In this diagram, there are 16 overlapping BAC clones that span the entire length. In reality, 1,48 BACs were needed to span the X chromosome. Arrows (top) mark STSs scattered throughout the chromosome and on overlapping BACs.

22 GenBank: NCBI s Primary Sequence Database Release 133 December 23 19,88,11 Records 28,57,99,166 Nucleotides 11, + Species full release every two months incremental and cumulative updates daily available only through internet ftp://ftp.ncbi.nih.gov/genbank/ 17.7 Gigabytes of data

23 GenBank Divisions Bulk Sequence Divisions PAT EST STS GSS HTG HTC CON Patent Expressed Sequence Tags Sequence Tagged Sites Genome Survey Sequences High Throughput Genome High Throughput cdna Contig Traditional Divisions BCT INV MAM PHG PLN PRI ROD SYN UNA VRL VRT

24 The International Sequence Database Collaboration NIH Submissions Updates NCBI Entrez GenBank EMBL Submissions Updates NIG CIB getentry DDBJ Submissions Updates SRS EBI EMBL

25 Web Access Text Entrez Sequence BLAST Structure VAST

26 Genetics experiment: Isolate a yeat mutant that has increased chromosome number phenotype Can rescue the phenotype with a piece of DNA corresponds to a site of mutation in the yeasts DNA genotype CGTTTTTCCTGTAAAGCGCTTAATTTGTTTACCATTCTATAAAAACCTTGAGCTAAGGCCAACTGATGCA ATTGCTCAAGTGAATGCATAAACAAAGCAAGATCATTCTTAGCGCAAAAAAAACTGGGATTTTGAAATAC AACAAAAGAAAGAAGTAAAAAGGGAATGCAACGCAATAGTTTAGTAAATATCAAACTAAACGCTAATTCG CCATCGAAAAAGACCACAACAAGACCAAATACGTCCAGGATCAATAAACCATGGAGAATATCCCATTCGC CGCAGCAAAGAAACCCGAATTCAAAAATACCTTCACCTGTAAGAGAAAAATTGAACAGATTACCTGTAAA CAATAAGAAGTTTTTGGATATGGAAAGCTCCAAAATTCCATCACCTATAAGGAAAGCGACTTCTTCCAAA ATGATACACGAAAATAAGAAGCTACCTAAATTTAAATCCCTATCACTCGATGACTTTGAACTGGGGAAGA AATTAGGAAAGGGTAAATTCGGTAAAGTTTATTGCGTTCGGCACAGGAGTACAGGATATATTTGCGCACT GAAAGTAATGGAGAAGGAAGAAATAATAAAGTATAATTTACAGAAACAATTCAGAAGGGAGGTAGAAATA CAAACATCGCTAAATCATCCGAATCTAACTAAATCATACGGCTATTTTCATGATGAAAAAAGAGTGTACC TGCTAATGGAATACTTAGTCAATGGGGAAATGTATAAACTATTGAGGTTACACGGACCCTTCAACGATAT TTTAGCATCAGATTATATTTATCAAATTGCCAATGCCCTAGATTATATGCATAAAAAGAATATTATTCAT AGAGATATTAAACCTGAAAATATACTAATAGGGTTCAATAATGTCATTAAGTTAACGGACTTCGGATGGA GTATAATAAATCCGCCAGAAAATAGAAGGAAAACTGTCTGTGGGACAATTGACTACCTTTCTCCAGAAAT GGTGGAGTCAAGGGAATATGATCACACTATAGATGCATGGGCTCTTGGCGTCCTGGCGTTTGAACTACTG ACCGGTGCCCCTCCGTTCGAAGAAGAAATGAAAGATACTACATATAAAAGGATAGCAGCACTGGATATCA AAATGCCCAGTAACATTTCTCAGGATGCGCAAGATTTAATACTTAAACTACTAAAATACGACCCCAAAGA TAGAATGCGCCTTGGAGACGTAAAAATGCATCCTTGGATACTAAGAAACAAGCCCTTTTGGGAAAATAAG CGGTTATAGAATTAAAGTATGACAGAATCGTTTGAAGGGCACTATTAATCACTCCCGCACATATCACATA ATACTAAGTATCCATTTCTAATATTTCATTACTCTTTTCGGCATCGTATATTGCGATATTTGATTAAATT TTCTTGTTCATTTTTTCCTCTTTTCTTTCGCCTTTGTCGAAAGAAAAGAGGAAAACAAGCTGAAAATTGC TATGCATTAAAGTAGCAGATTTACTTTGTTGAGTTGGTTCTGATCAATAATAAGAGTAATGAAAGAAAGC AAAAAAATGGCTAAAGATAATTTAACTAATTTGCTCTCTCAATTGAACATTCAATTGTCTCAA

27

28

29

30

31

32 Using Entrez An integrated database search and retrieval system

33 Entrez: Database Integration Word weight PubMed abstracts Phylogeny Taxonomy Genomes 3-D Structur e VAST BLAST Nucleotide sequences Protein sequences BLAST

34 The (ever) Expanding Entrez System UniGene PubMed Nucleotide Protein Journals Structure CDD Genome SNP Entrez PopSet 3D Domains UniSTS Taxonomy OMIM ProbeSet Books

35 Entrez Databases PubMed Biomedical literature Books Online textbooks Nucleotide GenBank, EMBL, DDBJ, RefSeq, PDB Protein [GenBank, EMBL, DDBJ], RefSeq, SWISS-PROT, PIR, PRF, PDB Genome Complete genomes Taxonomy Organisms in NCBI sequence databases Structure MMDB: experimental 3D structures Domains CDD: conserved protein domains 3D Domains Compact 3D protein domains in MMDB OMIM Online Mendelian Inheritance in Man SNP Single nucleotide polymorphisms UniSTS Sequence Tagged Site markers ProbeSet Gene expression and microarray datasets PopSet Population study datasets UniGene Gene-based expressed sequence clusters

36 A Traditional GenBank Record Definition =Title ACCESSION VERSION Version Number U7163 U GI:46243 Accession Number GI Number NCBI s Taxonomy

37 FASTA Format FASTA Definition Line >gi gb U SCU7163 > gi number Locus Name Database Identifiers gb GenBank emb EMBL dbj DDBJ sp SWISS-PROT pdb Protein Databank pir PIR prf PRF ref RefSeq Accession number

38 Abstract Syntax Notation: ASN.1 GenPept GenBank ASN.1 FASTA Protein FASTA Nucleotide

39 /************************************************************************ * * asn2ff.c * convert an ASN.1 entry to flat file format, using the FFPrintArray. * *********************************** Toolbox Sources ***************************************/ #include <accentr.h> #include "asn2ff.h" #include "asn2ffp.h" #include "ffprint.h" #include <subutil.h> ftp> open ftp.ncbi.nih.gov #include <objall.h> #include <objcode.h>. #include <lsqfetch.h> #include <explore.h> #ifdef ENABLE_ID1 #include <accid1.h> #endif FILE *fpl; NCBI Toolbox. ftp> cd toolbox ftp> cd ncbi_tools ftp://ftp.ncbi.nlm.gov/toolbox/ncbi_tools Args myargs[] = { {"Filename for asn.1 input","stdin",null,null,true,'a',arg_file_in,.,,null}, {"Input is a Seq-entry","F", NULL,NULL,TRUE,'e',ARG_BOOLEAN,.,,NULL}, {"Input asnfile in binary mode","f",null,null,true,'b',arg_boolean,.,,null}, {"Output Filename","stdout", NULL,NULL,TRUE,'o',ARG_FILE_OUT,.,,NULL}, {"Show Sequence?","T", NULL,NULL,TRUE,'h',ARG_BOOLEAN,.,,NULL},

40 /protein_id="aaa2496.1" /db_xref="gi:46244" GenPept Protein IDS

41

42

43

44

45

46

47

48

49

50

51

52

53 Why align sequences? Functional predictions based on identifying homologues. Assumes: conservation of sequence conservation of function BUT: Function carried out at level of proteins, i.e. 3-D structure Sequence conservation carried out at level of DNA 1-D sequence

54 Implicit Assumption of Evolution in Models of Sequence Homology Assume: Sequence conservation Structure conservation Note that the converse is NOT necessarily true! Structure conservation X Sequence conservation

55 Definitions Homologue: a relatively non-specific term (meaningless?) that conveys the idea that two sequences are somehow related Orthologue: Ortho = (greek) straight.implies direct descent, 1 ancestor Vav human Vav bovine Vav avian Vav elegans -or- human Vav avian Vav bovine Vav dinosaur Vav Ye olde Vav

56 Definitions Paralogue: Para = (greek) along side of.implies some type of gene duplication event Vav1 human Vav2human Vav3human -or-

57 Complex problem Chicken Vav Vav1 human Vav2human Vav3human Orthologue or Paralogue? Need to know evolutionary relationships Can try and get these from alignments as well

58 Alignments- Good Bad and Ugly HgbA-human GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKL :+ +::+:::::+ :+++++::+: ::+::+ :: GNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKL HgbB-human

59 Alignments- Good Bad and Ugly HgbA-human SPURIOUS ALIGNMENT GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSD----LHAHKL ::+ ++: + ++:+: ++ :+ :+ : +:+ : ++::+ GSGYLVGDSLTFVDLL--VAQHTADLLAANAALLDEFPQFKAHQE Nematode glutathione S-transferase HgbA-human GSAQVKGHGKKVADALTNAVAHV---D--DMPNALSALSDLHAHKL :+ ::+ ++ +:++ + +: :+ +:+ : NNPELQAHAGKVFKLVYEAAIQLQVTGVVVTDATLKNLGSVHVSKG Leghemoglobin, yellow lupin

60 Alignments Types: Local Global Ungapped Gapped (2 types- linear, affine) Methods: Dot matrix Dynamic Programming Word, k-tup

61 Simplest alignment representation Dot plot Model: Need a metric of similarity between amino acid pairs Simplest metric unitary matrix, identity matrix A A 1 C D E F G H C 1 D 1 E 1 F 1 G 1 H 1 I 1 K I K 1

62 Example self alignment G G 1 F D S F K F D 1 S 1 1 F 1 1 K 1 R 1 L 1 E 1 1 F 1 S 1 R L E F S E V

63 Example self alignment..with sliding window G G 1 F D S F K F D 1 S 1 1 F 1 1 K 1 R 1 L 1 E 1 1 F 1 S 1 R L E F S E V

64 Example self alignment G G 1 F D S F K F 1 1 D 1 S 1 F 1 K 1 R 1 L 1 E 1 F 1 S 1 R L E F S E V

65 Simple alignments Self alignment Dot Matrix DNA symmetric

66 Now align two different sequences * Consider other similarity matrices besides identity. Chemical similarity binary decision Amino acid conservation in aligned protein families min. similarity score (+/- window) Average of multiple scoring systems

67 Dot Matrix Alignments Sequence#1 Sequence#1 1 n 1 m Note NOT symmetric A Local Alignment

68 Dot Matrix Alignments Sequence#1 Sequence#1 1 1 n Insertion Insertion m A Global Alignment

69 General Rules for Dot Matrices Advantage let s your eyes/brain do the work VERY EFFICIENT!!!! DNA comparisons long windows and high stringencies (11/7, 15/11) Proteins short windows and stringencies (1/1) EXCEPT in looking for domains then longer windows and smaller stringencies (15/5). Note we can use these types of self-aligned matrices for more than just sequence comparisons i.e. distance between side Cα atoms in 3d-structures etc.maybe later in the course!

70 Computational Efficiency Measure efficiency in cpu run time and memory O() = big-oh notation Both scale as size of the problem, measured in number of units, n, in the problem, i.e. run time is f(n). Analyse the asymptotic worst-case running time. Sometimes just do the experiment and measure it. If problem scales as square of the number of units in the problem O(n 2 ) (=order n-squared)

71 Examples O(n k ) is polynomial time as long as K<3..tractable Consider our un-gapped dot matrix Global alignment: 1 m 1 n essentially an O(mn) problem

72 O.K. Examples O(n) better than O(n log(n)), better than O(n 2 ), better than O(n 3 ) Terrible Examples O(k n ) = exponential time.horrible!!!! NP problems- no known polynomial time Solutions = non-deterministic polynomial Problems.

11 questions for a total of 120 points

11 questions for a total of 120 points Your Name: BYS 201, Final Exam, May 3, 2010 11 questions for a total of 120 points 1. 25 points Take a close look at these tables of amino acids. Some of them are hydrophilic, some hydrophobic, some positive

More information

Basic concepts of molecular biology

Basic concepts of molecular biology Basic concepts of molecular biology Gabriella Trucco Email: gabriella.trucco@unimi.it Life The main actors in the chemistry of life are molecules called proteins nucleic acids Proteins: many different

More information

A Field Guide to GenBank and NCBI Molecular Biology Resources

A Field Guide to GenBank and NCBI Molecular Biology Resources A Field Guide to GenBank and NCBI Molecular Biology Resources slightly modified from Peter Cooper ftp://ftp.ncbi.nih.gov/pub/cooper/fieldguide/ Eric Sayers ftp://ftp.ncbi.nih.gov/pub/sayers/field_guide/u_penn/

More information

Bi Lecture 3 Loss-of-function (Ch. 4A) Monday, April 8, 13

Bi Lecture 3 Loss-of-function (Ch. 4A) Monday, April 8, 13 Bi190-2013 Lecture 3 Loss-of-function (Ch. 4A) Infer Gene activity from type of allele Loss-of-Function alleles are Gold Standard If organism deficient in gene A fails to accomplish process B, then gene

More information

Algorithms in Bioinformatics ONE Transcription Translation

Algorithms in Bioinformatics ONE Transcription Translation Algorithms in Bioinformatics ONE Transcription Translation Sami Khuri Department of Computer Science San José State University sami.khuri@sjsu.edu Biology Review DNA RNA Proteins Central Dogma Transcription

More information

DNA.notebook March 08, DNA Overview

DNA.notebook March 08, DNA Overview DNA Overview Deoxyribonucleic Acid, or DNA, must be able to do 2 things: 1) give instructions for building and maintaining cells. 2) be copied each time a cell divides. DNA is made of subunits called nucleotides

More information

Lecture 2 Introduction to Data Formats

Lecture 2 Introduction to Data Formats Introduction to Bioinformatics for Medical Research Gideon Greenspan gdg@cs.technion.ac.il Lecture 2 Introduction to Data Formats Introduction to Data Formats Real world, data and formats Sequences and

More information

Basic concepts of molecular biology

Basic concepts of molecular biology Basic concepts of molecular biology Gabriella Trucco Email: gabriella.trucco@unimi.it What is life made of? 1665: Robert Hooke discovered that organisms are composed of individual compartments called cells

More information

CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools

CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools CAP 5510: Introduction to Bioinformatics : Bioinformatics Tools ECS 254A / EC 2474; Phone x3748; Email: giri@cis.fiu.edu My Homepage: http://www.cs.fiu.edu/~giri http://www.cs.fiu.edu/~giri/teach/bioinfs15.html

More information

EE550 Computational Biology

EE550 Computational Biology EE550 Computational Biology Week 1 Course Notes Instructor: Bilge Karaçalı, PhD Syllabus Schedule : Thursday 13:30, 14:30, 15:30 Text : Paul G. Higgs, Teresa K. Attwood, Bioinformatics and Molecular Evolution,

More information

Types of Databases - By Scope

Types of Databases - By Scope Biological Databases Bioinformatics Workshop 2009 Chi-Cheng Lin, Ph.D. Department of Computer Science Winona State University clin@winona.edu Biological Databases Data Domains - By Scope - By Level of

More information

ELE4120 Bioinformatics. Tutorial 5

ELE4120 Bioinformatics. Tutorial 5 ELE4120 Bioinformatics Tutorial 5 1 1. Database Content GenBank RefSeq TPA UniProt 2. Database Searches 2 Databases A common situation for alignment is to search through a database to retrieve the similar

More information

APPENDIX. Appendix. Table of Contents. Ethics Background. Creating Discussion Ground Rules. Amino Acid Abbreviations and Chemistry Resources

APPENDIX. Appendix. Table of Contents. Ethics Background. Creating Discussion Ground Rules. Amino Acid Abbreviations and Chemistry Resources Appendix Table of Contents A2 A3 A4 A5 A6 A7 A9 Ethics Background Creating Discussion Ground Rules Amino Acid Abbreviations and Chemistry Resources Codons and Amino Acid Chemistry Behind the Scenes with

More information

Bioinformatics. ONE Introduction to Biology. Sami Khuri Department of Computer Science San José State University Biology/CS 123A Fall 2012

Bioinformatics. ONE Introduction to Biology. Sami Khuri Department of Computer Science San José State University Biology/CS 123A Fall 2012 Bioinformatics ONE Introduction to Biology Sami Khuri Department of Computer Science San José State University Biology/CS 123A Fall 2012 Biology Review DNA RNA Proteins Central Dogma Transcription Translation

More information

The University of California, Santa Cruz (UCSC) Genome Browser

The University of California, Santa Cruz (UCSC) Genome Browser The University of California, Santa Cruz (UCSC) Genome Browser There are hundreds of available userselected tracks in categories such as mapping and sequencing, phenotype and disease associations, genes,

More information

Materials Protein synthesis kit. This kit consists of 24 amino acids, 24 transfer RNAs, four messenger RNAs and one ribosome (see below).

Materials Protein synthesis kit. This kit consists of 24 amino acids, 24 transfer RNAs, four messenger RNAs and one ribosome (see below). Protein Synthesis Instructions The purpose of today s lab is to: Understand how a cell manufactures proteins from amino acids, using information stored in the genetic code. Assemble models of four very

More information

Problem: The GC base pairs are more stable than AT base pairs. Why? 5. Triple-stranded DNA was first observed in 1957. Scientists later discovered that the formation of triplestranded DNA involves a type

More information

Chapter 2: Access to Information

Chapter 2: Access to Information Chapter 2: Access to Information Outline Introduction to biological databases Centralized databases store DNA sequences Contents of DNA, RNA, and protein databases Central bioinformatics resources: NCBI

More information

6-Foot Mini Toober Activity

6-Foot Mini Toober Activity Big Idea The interaction between the substrate and enzyme is highly specific. Even a slight change in shape of either the substrate or the enzyme may alter the efficient and selective ability of the enzyme

More information

Introduction to Bioinformatics. What are the goals of the course? Who is taking this course? Textbook. Web sites. Literature references

Introduction to Bioinformatics. What are the goals of the course? Who is taking this course? Textbook. Web sites. Literature references Introduction to Bioinformatics Who is taking this course? People with very diverse backgrounds in biology Some people with backgrounds in computer science and biostatistics Most people (will) have a favorite

More information

ENZYMES AND METABOLIC PATHWAYS

ENZYMES AND METABOLIC PATHWAYS ENZYMES AND METABOLIC PATHWAYS This document is licensed under the Attribution-NonCommercial-ShareAlike 2.5 Italy license, available at http://creativecommons.org/licenses/by-nc-sa/2.5/it/ 1. Enzymes build

More information

CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS

CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS 1 CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS * Some contents are adapted from Dr. Jean Gao at UT Arlington Mingon Kang, PhD Computer Science, Kennesaw State University 2 Genetics The discovery of

More information

BIOSTAT516 Statistical Methods in Genetic Epidemiology Autumn 2005 Handout1, prepared by Kathleen Kerr and Stephanie Monks

BIOSTAT516 Statistical Methods in Genetic Epidemiology Autumn 2005 Handout1, prepared by Kathleen Kerr and Stephanie Monks Rationale of Genetic Studies Some goals of genetic studies include: to identify the genetic causes of phenotypic variation develop genetic tests o benefits to individuals and to society are still uncertain

More information

Granby Transcription and Translation Services plc

Granby Transcription and Translation Services plc ompany Resources ranby Transcription and Translation Services plc has invested heavily in the Protein Synthesis business. mongst the resources available to new recruits are: the latest cellphones which

More information

Nucleic acid and protein Flow of genetic information

Nucleic acid and protein Flow of genetic information Nucleic acid and protein Flow of genetic information References: Glick, BR and JJ Pasternak, 2003, Molecular Biotechnology: Principles and Applications of Recombinant DNA, ASM Press, Washington DC, pages.

More information

Protein Structure Analysis

Protein Structure Analysis BINF 731 Protein Structure Analysis http://binf.gmu.edu/vaisman/binf731/ Secondary Structure: Computational Problems Secondary structure characterization Secondary structure assignment Secondary structure

More information

Molecular Biology. Biology Review ONE. Protein Factory. Genotype to Phenotype. From DNA to Protein. DNA à RNA à Protein. June 2016

Molecular Biology. Biology Review ONE. Protein Factory. Genotype to Phenotype. From DNA to Protein. DNA à RNA à Protein. June 2016 Molecular Biology ONE Sami Khuri Department of Computer Science San José State University Biology Review DNA RNA Proteins Central Dogma Transcription Translation Genotype to Phenotype Protein Factory DNA

More information

DNA is normally found in pairs, held together by hydrogen bonds between the bases

DNA is normally found in pairs, held together by hydrogen bonds between the bases Bioinformatics Biology Review The genetic code is stored in DNA Deoxyribonucleic acid. DNA molecules are chains of four nucleotide bases Guanine, Thymine, Cytosine, Adenine DNA is normally found in pairs,

More information

G4120: Introduction to Computational Biology

G4120: Introduction to Computational Biology G4120: Introduction to Computational Biology Oliver Jovanovic, Ph.D. Columbia University Department of Microbiology Lecture 3 February 13, 2003 Copyright 2003 Oliver Jovanovic, All Rights Reserved. Bioinformatics

More information

Using DNA sequence, distinguish species in the same genus from one another.

Using DNA sequence, distinguish species in the same genus from one another. Species Identification: Penguins 7. It s Not All Black and White! Name: Objective Using DNA sequence, distinguish species in the same genus from one another. Background In this activity, we will observe

More information

Protein Synthesis. Application Based Questions

Protein Synthesis. Application Based Questions Protein Synthesis Application Based Questions MRNA Triplet Codons Note: Logic behind the single letter abbreviations can be found at: http://www.biology.arizona.edu/biochemistry/problem_sets/aa/dayhoff.html

More information

Protein Bioinformatics Part I: Access to information

Protein Bioinformatics Part I: Access to information Protein Bioinformatics Part I: Access to information 260.655 April 6, 2006 Jonathan Pevsner, Ph.D. pevsner@kennedykrieger.org Outline [1] Proteins at NCBI RefSeq accession numbers Cn3D to visualize structures

More information

CECS Introduction to Bioinformatics University of Louisville Spring 2004 Dr. Eric Rouchka

CECS Introduction to Bioinformatics University of Louisville Spring 2004 Dr. Eric Rouchka Dr. Awwad Abdoh Radwan Dept Pharm Organic Chemistry, Faculty of Pharmacy, Assiut University. 29/09/2005 Required Texts Image Source: http://www.amazon.com/ Other Bioinformatics Books Image Source: http://www.amazon.com/

More information

NCBI web resources I: databases and Entrez

NCBI web resources I: databases and Entrez NCBI web resources I: databases and Entrez Yanbin Yin Most materials are downloaded from ftp://ftp.ncbi.nih.gov/pub/education/ 1 Homework assignment 1 Two parts: Extract the gene IDs reported in table

More information

Dynamic Programming Algorithms

Dynamic Programming Algorithms Dynamic Programming Algorithms Sequence alignments, scores, and significance Lucy Skrabanek ICB, WMC February 7, 212 Sequence alignment Compare two (or more) sequences to: Find regions of conservation

More information

Introduction to BIOINFORMATICS

Introduction to BIOINFORMATICS Introduction to BIOINFORMATICS Antonella Lisa CABGen Centro di Analisi Bioinformatica per la Genomica Tel. 0382-546361 E-mail: lisa@igm.cnr.it http://www.igm.cnr.it/pagine-personali/lisa-antonella/ What

More information

NORWEGIAN UNIVERSITY OF SCIENCE AND TECHNOLOGY DEPARTMENT OF BIOTECHNOLOGY Professor Bjørn E. Christensen, Department of Biotechnology

NORWEGIAN UNIVERSITY OF SCIENCE AND TECHNOLOGY DEPARTMENT OF BIOTECHNOLOGY Professor Bjørn E. Christensen, Department of Biotechnology Page 1 NRWEGIAN UNIVERSITY F SCIENCE AND TECNLGY DEPARTMENT F BITECNLGY Professor Bjørn E. Christensen, Department of Biotechnology Contact during the exam: phone: 73593327, 92634016 EXAM TBT4135 BIPLYMERS

More information

What is necessary for life?

What is necessary for life? Life What is necessary for life? Most life familiar to us: Eukaryotes FREE LIVING Or Parasites First appeared ~ 1.5-2 10 9 years ago Requirements: DNA, proteins, lipids, carbohydrates, complex structure,

More information

What should you do to reinforce the lecture material?

What should you do to reinforce the lecture material? Bioinformatics and Functional Genomics Course Overview, Introduction of Bioinformatics, Biology Background Biol4230 Thurs, Jan 17, 2017 Bill Pearson wrp@virginia.edu 4-2818 Jordan 6-057 Goals of today

More information

Fundamentals of Protein Structure

Fundamentals of Protein Structure Outline Fundamentals of Protein Structure Yu (Julie) Chen and Thomas Funkhouser Princeton University CS597A, Fall 2005 Protein structure Primary Secondary Tertiary Quaternary Forces and factors Levels

More information

03-511/711 Computational Genomics and Molecular Biology, Fall

03-511/711 Computational Genomics and Molecular Biology, Fall 03-511/711 Computational Genomics and Molecular Biology, Fall 2011 1 Problem Set 0 Due Tuesday, September 6th This homework is intended to be a self-administered placement quiz, to help you (and me) determine

More information

Introduction to Bioinformatics. What are the goals of the course? Who is taking this course? Different user needs, different approaches

Introduction to Bioinformatics. What are the goals of the course? Who is taking this course? Different user needs, different approaches Introduction to Bioinformatics Who is taking this course? Monday, November 19, 2012 Jonathan Pevsner pevsner@kennedykrieger.org Bioinformatics M.E:800.707 People with very diverse backgrounds in biology

More information

info@rcsb.org www.pdb.org Bioinformatics of Green Fluorescent Protein This bioinformatics tutorial explores the relationship between gene sequence, protein structure, and biological function in the context

More information

Unit 1. DNA and the Genome

Unit 1. DNA and the Genome Unit 1 DNA and the Genome Gene Expression Key Area 3 Vocabulary 1: Transcription Translation Phenotype RNA (mrna, trna, rrna) Codon Anticodon Ribosome RNA polymerase RNA splicing Introns Extrons Gene Expression

More information

Computational Biology and Bioinformatics

Computational Biology and Bioinformatics Computational Biology and Bioinformatics Computational biology Development of algorithms to solve problems in biology Bioinformatics Application of computational biology to the analysis and management

More information

Genome Informatics. Systems Biology and the Omics Cascade (Course 2143) Day 3, June 11 th, Kiyoko F. Aoki-Kinoshita

Genome Informatics. Systems Biology and the Omics Cascade (Course 2143) Day 3, June 11 th, Kiyoko F. Aoki-Kinoshita Genome Informatics Systems Biology and the Omics Cascade (Course 2143) Day 3, June 11 th, 2008 Kiyoko F. Aoki-Kinoshita Introduction Genome informatics covers the computer- based modeling and data processing

More information

BME205: Lecture 2 Bio systems. David Bernick

BME205: Lecture 2 Bio systems. David Bernick BME205: Lecture 2 Bio systems David Bernick Bioinforma;cs Infer pa>erns from life biological sequences structures molecular pathways. Suggest hypotheses from inferred pa>erns Structure and Func;on Novel

More information

Data Retrieval from GenBank

Data Retrieval from GenBank Data Retrieval from GenBank Peter J. Myler Bioinformatics of Intracellular Pathogens JNU, Feb 7-0, 2009 http://www.ncbi.nlm.nih.gov (January, 2007) http://ncbi.nlm.nih.gov/sitemap/resourceguide.html Accessing

More information

36. The double bonds in naturally-occuring fatty acids are usually isomers. A. cis B. trans C. both cis and trans D. D- E. L-

36. The double bonds in naturally-occuring fatty acids are usually isomers. A. cis B. trans C. both cis and trans D. D- E. L- 36. The double bonds in naturally-occuring fatty acids are usually isomers. A. cis B. trans C. both cis and trans D. D- E. L- 37. The essential fatty acids are A. palmitic acid B. linoleic acid C. linolenic

More information

What is necessary for life?

What is necessary for life? Life What is necessary for life? Most life familiar to us: Eukaryotes FREE LIVING Or Parasites First appeared ~ 1.5-2 10 9 years ago Requirements: DNA, proteins, lipids, carbohydrates, complex structure,

More information

MATH 5610, Computational Biology

MATH 5610, Computational Biology MATH 5610, Computational Biology Lecture 2 Intro to Molecular Biology (cont) Stephen Billups University of Colorado at Denver MATH 5610, Computational Biology p.1/24 Announcements Error on syllabus Class

More information

Textbook Reading Guidelines

Textbook Reading Guidelines Understanding Bioinformatics by Marketa Zvelebil and Jeremy Baum Last updated: May 1, 2009 Textbook Reading Guidelines Preface: Read the whole preface, and especially: For the students with Life Science

More information

Key Concept Translation converts an mrna message into a polypeptide, or protein.

Key Concept Translation converts an mrna message into a polypeptide, or protein. 8.5 Translation VOBLRY translation codon stop codon start codon anticodon Key oncept Translation converts an mrn message into a polypeptide, or protein. MIN IDES mino acids are coded by mrn base sequences.

More information

What You NEED to Know

What You NEED to Know What You NEED to Know Major DNA Databases NCBI RefSeq EBI DDBJ Protein Structural Databases PDB SCOP CCDC Major Protein Sequence Databases UniprotKB Swissprot PIR TrEMBL Genpept Other Major Databases MIM

More information

Homework. A bit about the nature of the atoms of interest. Project. The role of electronega<vity

Homework. A bit about the nature of the atoms of interest. Project. The role of electronega<vity Homework Why cited articles are especially useful. citeulike science citation index When cutting and pasting less is more. Project Your protein: I will mail these out this weekend If you haven t gotten

More information

A Zero-Knowledge Based Introduction to Biology

A Zero-Knowledge Based Introduction to Biology A Zero-Knowledge Based Introduction to Biology Konstantinos (Gus) Katsiapis 25 Sep 2009 Thanks to Cory McLean and George Asimenos Cells: Building Blocks of Life cell, membrane, cytoplasm, nucleus, mitochondrion

More information

Introduction to Bioinformatics CPSC 265. What is bioinformatics? Textbooks

Introduction to Bioinformatics CPSC 265. What is bioinformatics? Textbooks Introduction to Bioinformatics CPSC 265 Thanks to Jonathan Pevsner, Ph.D. Textbooks Johnathan Pevsner, who I stole most of these slides from (thanks!) has written a textbook, Bioinformatics and Functional

More information

Bioinformatics Prof. M. Michael Gromiha Department of Biotechnology Indian Institute of Technology, Madras. Lecture - 5a Protein sequence databases

Bioinformatics Prof. M. Michael Gromiha Department of Biotechnology Indian Institute of Technology, Madras. Lecture - 5a Protein sequence databases Bioinformatics Prof. M. Michael Gromiha Department of Biotechnology Indian Institute of Technology, Madras Lecture - 5a Protein sequence databases In this lecture, we will mainly discuss on Protein Sequence

More information

BLAST. compared with database sequences Sequences with many matches to high- scoring words are used for final alignments

BLAST. compared with database sequences Sequences with many matches to high- scoring words are used for final alignments BLAST 100 times faster than dynamic programming. Good for database searches. Derive a list of words of length w from query (e.g., 3 for protein, 11 for DNA) High-scoring words are compared with database

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introduction to Bioinformatics Sequence Alignment Luke Huan Electrical Engineering and Computer Science http://people.eecs.ku.edu/~jhuan/ Database What is database An organized set of data Can

More information

Lecture 19A. DNA computing

Lecture 19A. DNA computing Lecture 19A. DNA computing What exactly is DNA (deoxyribonucleic acid)? DNA is the material that contains codes for the many physical characteristics of every living creature. Your cells use different

More information

Zool 3200: Cell Biology Exam 3 3/6/15

Zool 3200: Cell Biology Exam 3 3/6/15 Name: Trask Zool 3200: Cell Biology Exam 3 3/6/15 Answer each of the following questions in the space provided; circle the correct answer or answers for each multiple choice question and circle either

More information

He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Today. Admin Stuff. CSE527 Computational Biology

He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Today. Admin Stuff. CSE527 Computational Biology CSE527 Computational Biology http://www.cs.washington.edu/527 He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Larry Ruzzo Autumn 2006 -- Chinese Proverb UW CSE Computational

More information

First&year&tutorial&in&Chemical&Biology&(amino&acids,&peptide&and&proteins)&! 1.&!

First&year&tutorial&in&Chemical&Biology&(amino&acids,&peptide&and&proteins)&! 1.&! First&year&tutorial&in&Chemical&Biology&(amino&acids,&peptide&and&proteins& 1.& a. b. c. d. e. 2.& a. b. c. d. e. f. & UsingtheCahn Ingold Prelogsystem,assignstereochemicaldescriptorstothe threeaminoacidsshownbelow.

More information

Structure formation and association of biomolecules. Prof. Dr. Martin Zacharias Lehrstuhl für Molekulardynamik (T38) Technische Universität München

Structure formation and association of biomolecules. Prof. Dr. Martin Zacharias Lehrstuhl für Molekulardynamik (T38) Technische Universität München Structure formation and association of biomolecules Prof. Dr. Martin Zacharias Lehrstuhl für Molekulardynamik (T38) Technische Universität München Motivation Many biomolecules are chemically synthesized

More information

Aipotu II: Biochemistry

Aipotu II: Biochemistry Aipotu II: Biochemistry Introduction: The Biological Phenomenon Under Study In this lab, you will continue to explore the biological mechanisms behind the expression of flower color in a hypothetical plant.

More information

Sequence Databases and database scanning

Sequence Databases and database scanning Sequence Databases and database scanning Marjolein Thunnissen Lund, 2012 Types of databases: Primary sequence databases (proteins and nucleic acids). Composite protein sequence databases. Secondary databases.

More information

Name: TOC#. Data and Observations: Figure 1: Amino Acid Positions in the Hemoglobin of Some Vertebrates

Name: TOC#. Data and Observations: Figure 1: Amino Acid Positions in the Hemoglobin of Some Vertebrates Name: TOC#. Comparing Primates Background: In The Descent of Man, the English naturalist Charles Darwin formulated the hypothesis that human beings and other primates have a common ancestor. A hypothesis

More information

CHAPTER 1. DNA: The Hereditary Molecule SECTION D. What Does DNA Do? Chapter 1 Modern Genetics for All Students S 33

CHAPTER 1. DNA: The Hereditary Molecule SECTION D. What Does DNA Do? Chapter 1 Modern Genetics for All Students S 33 HPER 1 DN: he Hereditary Molecule SEION D What Does DN Do? hapter 1 Modern enetics for ll Students S 33 D.1 DN odes For Proteins PROEINS DO HE nitty-gritty jobs of every living cell. Proteins are the molecules

More information

7.014 Quiz II Handout

7.014 Quiz II Handout 7.014 Quiz II Handout Quiz II: Wednesday, March 17 12:05-12:55 54-100 **This will be a closed book exam** Quiz Review Session: Friday, March 12 7:00-9:00 pm room 54-100 Open Tutoring Session: Tuesday,

More information

NCBI Molecular Biology Resources. NCBI Resources

NCBI Molecular Biology Resources. NCBI Resources NBI Molecular Biology Resources A Field Guide NBI Resources The NBI Entrez System NBI Sequence Databases Primary data: GenBank Derivative data: RefSeq, Gene Protein Structure and Function Sequence polymorphisms

More information

Biology: The substrate of bioinformatics

Biology: The substrate of bioinformatics Bi01_1 Unit 01: Biology: The substrate of bioinformatics What is Bioinformatics? Bi01_2 handling of information related to living organisms understood on the basis of molecular biology Nature does it.

More information

Important points from last time

Important points from last time Important points from last time Subst. rates differ site by site Fit a Γ dist. to variation in rates Γ generally has two parameters but in biology we fix one to ensure a mean equal to 1 and the other parameter

More information

2018 Protein Modeling Exam Key

2018 Protein Modeling Exam Key 2018 Protein Modeling Exam Key Multiple Choice: 1. Which of the following amino acids has a negative charge at ph 7? a. Gln b. Glu c. Ser d. Cys 2. Which of the following is an example of secondary structure?

More information

The combination of a phosphate, sugar and a base forms a compound called a nucleotide.

The combination of a phosphate, sugar and a base forms a compound called a nucleotide. History Rosalin Franklin: Female scientist (x-ray crystallographer) who took the picture of DNA James Watson and Francis Crick: Solved the structure of DNA from information obtained by other scientist.

More information

Additional Case Study: Amino Acids and Evolution

Additional Case Study: Amino Acids and Evolution Student Worksheet Additional Case Study: Amino Acids and Evolution Objectives To use biochemical data to determine evolutionary relationships. To test the hypothesis that living things that are morphologically

More information

Proteins: Wide range of func2ons. Polypep2des. Amino Acid Monomers

Proteins: Wide range of func2ons. Polypep2des. Amino Acid Monomers Proteins: Wide range of func2ons Proteins coded in DNA account for more than 50% of the dry mass of most cells Protein func9ons structural support storage transport cellular communica9ons movement defense

More information

Why learn sequence database searching? Searching Molecular Databases with BLAST

Why learn sequence database searching? Searching Molecular Databases with BLAST Why learn sequence database searching? Searching Molecular Databases with BLAST What have I cloned? Is this really!my gene"? Basic Local Alignment Search Tool How BLAST works Interpreting search results

More information

Steroids. Steroids. Proteins: Wide range of func6ons. lipids characterized by a carbon skeleton consis3ng of four fused rings

Steroids. Steroids. Proteins: Wide range of func6ons. lipids characterized by a carbon skeleton consis3ng of four fused rings Steroids Steroids lipids characterized by a carbon skeleton consis3ng of four fused rings 3 six sided, and 1 five sided Cholesterol important steroid precursor component in animal cell membranes Although

More information

Basic Concepts of Human Genetics

Basic Concepts of Human Genetics Basic Concepts of Human Genetics The genetic information of an individual is contained in 23 pairs of chromosomes. Every human cell contains the 23 pair of chromosomes. One pair is called sex chromosomes

More information

Genomics and Database Mining (HCS 604.3) April 2005

Genomics and Database Mining (HCS 604.3) April 2005 Genomics and Database Mining (HCS 604.3) April 2005 David M. Francis OARDC 1680 Madison Ave Wooster, OH 44691 e-mail: francis.77@osu.edu Introduction: Computers have changed the way biologists go about

More information

How life. constructs itself.

How life. constructs itself. How life constructs itself Life constructs itself using few simple rules of information processing. On the one hand, there is a set of rules determining how such basic chemical reactions as transcription,

More information

Name. Student ID. Midterm 2, Biology 2020, Kropf 2004

Name. Student ID. Midterm 2, Biology 2020, Kropf 2004 Midterm 2, Biology 2020, Kropf 2004 1 1. RNA vs DNA (5 pts) The table below compares DNA and RNA. Fill in the open boxes, being complete and specific Compare: DNA RNA Pyrimidines C,T C,U Purines 3-D structure

More information

He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Tonight. Admin Stuff. CSEP590A Computational Biology

He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Tonight. Admin Stuff. CSEP590A Computational Biology CSEP590A Computational Biology http://www.cs.washington.edu/csep590a He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Larry Ruzzo Summer 2006 -- Chinese Proverb UW

More information

THE GENETIC CODE Figure 1: The genetic code showing the codons and their respective amino acids

THE GENETIC CODE Figure 1: The genetic code showing the codons and their respective amino acids THE GENETIC CODE As DNA is a genetic material, it carries genetic information from cell to cell and from generation to generation. There are only four bases in DNA and twenty amino acids in protein, so

More information

LIST OF ACRONYMS & ABBREVIATIONS

LIST OF ACRONYMS & ABBREVIATIONS LIST OF ACRONYMS & ABBREVIATIONS ALA ARG ASN ATD CRD CYS GLN GLU GLY GPCR HIS hstr ILE LEU LYS MET mglur1 NHDC PDB PHE PRO SER T1R2 T1R3 TMD TRP TYR THR 7-TM VFTM ZnSO 4 Alanine, A Arginine, R Asparagines,

More information

Biochemistry and Cell Biology

Biochemistry and Cell Biology Biochemistry and Cell Biology Monomersare simple molecules that can be linked into chains. (Nucleotides, Amino acids, monosaccharides) Polymersare the long chains of monomers. (DNA, RNA, Cellulose, Protein,

More information

Bioinformatics for Proteomics. Ann Loraine

Bioinformatics for Proteomics. Ann Loraine Bioinformatics for Proteomics Ann Loraine aloraine@uab.edu What is bioinformatics? The science of collecting, processing, organizing, storing, analyzing, and mining biological information, especially data

More information

Chemistry 121 Winter 17

Chemistry 121 Winter 17 Chemistry 121 Winter 17 Introduction to Organic Chemistry and Biochemistry Instructor Dr. Upali Siriwardane (Ph.D. Ohio State) E-mail: upali@latech.edu Office: 311 Carson Taylor Hall ; Phone: 318-257-4941;

More information

Aipotu I & II: Genetics & Biochemistry

Aipotu I & II: Genetics & Biochemistry Aipotu I & II: Genetics & Biochemistry Objectives: To reinforce your understanding of Genetics, Biochemistry, and Molecular Biology To show the connections between these three disciplines To show how these

More information

Protein Sequence Analysis. BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl)

Protein Sequence Analysis. BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl) Protein Sequence Analysis BME 110: CompBio Tools Todd Lowe April 19, 2007 (Slide Presentation: Carol Rohl) Linear Sequence Analysis What can you learn from a (single) protein sequence? Calculate it s physical

More information

Daily Agenda. Warm Up: Review. Translation Notes Protein Synthesis Practice. Redos

Daily Agenda. Warm Up: Review. Translation Notes Protein Synthesis Practice. Redos Daily Agenda Warm Up: Review Translation Notes Protein Synthesis Practice Redos 1. What is DNA Replication? 2. Where does DNA Replication take place? 3. Replicate this strand of DNA into complimentary

More information

Two Mark question and Answers

Two Mark question and Answers 1. Define Bioinformatics Two Mark question and Answers Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline. There are three

More information

He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Today. Admin Stuff. CSE527 Computational Biology

He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Today. Admin Stuff. CSE527 Computational Biology CSE527 Computational Biology http://www.cs.washington.edu/527 He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Larry Ruzzo Autumn 2007 -- Chinese Proverb UW CSE Computational

More information

Translating the Genetic Code. DANILO V. ROGAYAN JR. Faculty, Department of Natural Sciences

Translating the Genetic Code. DANILO V. ROGAYAN JR. Faculty, Department of Natural Sciences Translating the Genetic Code DANILO V. ROGAYAN JR. Faculty, Department of Natural Sciences An overview of gene expression Figure 13.2 The Idea of A Code 20 amino acids 4 nucleotides How do nucleic acids

More information

Introduc)on to Databases and Resources Biological Databases and Resources

Introduc)on to Databases and Resources Biological Databases and Resources Introduc)on to Bioinforma)cs Online Course : IBT Introduc)on to Databases and Resources Biological Databases and Resources Learning Objec)ves Introduc)on to Databases and Resources - Understand how bioinforma)cs

More information

7.014 Problem Set 4 Answers to this problem set are to be turned in. Problem sets will not be accepted late. Solutions will be posted on the web.

7.014 Problem Set 4 Answers to this problem set are to be turned in. Problem sets will not be accepted late. Solutions will be posted on the web. MIT Department of Biology 7.014 Introductory Biology, Spring 2005 Name: Section : 7.014 Problem Set 4 Answers to this problem set are to be turned in. Problem sets will not be accepted late. Solutions

More information

Name Section Problem Set 3

Name Section Problem Set 3 Name Section 7.013 Problem Set 3 The completed problem must be turned into the wood box outside 68120 by 4:40 pm, Thursday, March 13. Problem sets will not be accepted late. Question 1 Based upon your

More information

Turning λ Cro into a

Turning λ Cro into a Turning λ Cro into a 3 Transcriptional Activator Figure by MIT pencourseware. Fred Bushman and Mark Ptashne Cell (1988) 54:191-197 Presented by atalie Kuldell for 0.90 February 4th, 009 Small patch of

More information

Connect the dots DNA to DISEASE

Connect the dots DNA to DISEASE Teachers Material Connect the dots DNA to DISEASE Developed by: M. Oltmann & A. James. California State Standards Cell Biology 1.d. Students know the central dogma of molecular biology outlines the flow

More information