Introduction to Bioinformatics. What are the goals of the course? Who is taking this course? Different user needs, different approaches

Size: px
Start display at page:

Download "Introduction to Bioinformatics. What are the goals of the course? Who is taking this course? Different user needs, different approaches"

Transcription

1 Introduction to Bioinformatics Who is taking this course? Monday, November 19, 2012 Jonathan Pevsner Bioinformatics M.E: People with very diverse backgrounds in biology Some people with backgrounds in computer science and biostatistics Most people (will) have a favorite gene, protein, or disease, or a high throughput dataset Different user needs, different approaches web-based or graphical command line user interface (GUI) What are the goals of the course? UCSC, Ensembl genome browsers NCBI EBI, central resources Partek MEGA5 RStudio Galaxy: web implementation of browser data, NGS tools Perl, R: manipulate data files Software for data analysis, large databases Linux: nextgeneration sequencing, other tools To provide an introduction to bioinformatics with a focus on the National Center for Biotechnology Information (NCBI), UCSC, and EBI To focus on the analysis of DNA, RNA and proteins To introduce you to the analysis of genomes To combine theory and practice to help you solve research problems Workflow #1: How do we analyze a protein? Workflow #2: How do we analyze a human genome? Suppose we are interested in learning about beta globin (HBB), a subunit of hemoglobin NCBI offers information at the levels of DNA (e.g. variation in disease), RNA (e.g. gene expression data from microarrays), protein (e.g. 3D structure), pathways EBI offers comparable information Other portals such as ExPASy offer hundreds of analysis tools Choose an individual (e.g. a patient), obtain informed consent, get blood, purify genomic DNA Obtain raw DNA reads by next-generation sequencing Align the reads to a reference human genome Call variants (single nucleotide variants, indels) Determine the functional significance of variants (deleterious or neutral) 1

2 Textbook The course textbook has no required textbook. I wrote Bioinformatics and Functional Genomics (Wiley-Blackwell, 2 nd edition 2009). The lectures in this course correspond closely to chapters. I will make pdfs of the chapters available to everyone. You can also purchase a copy at the bookstore, at amazon.com, or at Wiley with a 20% discount through the book s website Web sites The course website is reached via moodle: (or Google moodle bioinformatics ) --This site contains the powerpoints for each lecture, including black & white versions for printing --The weekly quizzes are here --You can ask questions via the forum --Audiovisual files of each lecture will be posted here The textbook website is: This has powerpoints, URLs, etc. organized by chapter. This is most useful to find web documents corresponding to each chapter. Themes throughout the course: the beta globin gene/protein family Computer labs We will use beta globin as a model gene/protein throughout the course. Globins including hemoglobin and myoglobin carry oxygen. We will study globins in a variety of contexts including --sequence alignment --gene expression --protein structure --phylogeny --homologs in various species There is no in-class computer lab, but the seven weekly quizzes function as a take-home computer lab. To solve the questions, you will need to go to websites, use databases, and use software. Most quizzes are due in 7 days. Because of Thanksgiving, the first quiz will be due in 9 days (November 28 th at noon). Grading Google moodle bioinformatics to get here; Click Bioinformatics to sign in; The enrollment key you need is 60% moodle quizzes (your top 6 out of 7 quizzes). Quizzes are taken at the moodle website, and are due one week after the relevant lecture. Special extended due date for quizzes due immediately after Thanksgiving and the New Year. 40% final exam Thursday, January 17 (2pm, in class). Closed book, cumulative, no computer, short answer / multiple choice. Past exams will be made available ahead of time. 2

3 Outline for the course (all on Mondays) 1. Accessing information about DNA and proteins Nov Pairwise alignment and BLAST Nov Advanced BLAST to next-generation sequencing Dec Multiple sequence alignment Dec Molecular phylogeny and evolution Dec Microarrays Jan Next-generation sequencing Jan. 14 Final exam (Thursday 2:00 pm) Jan. 17 Outline for today Learning objectives for today Definition of bioinformatics Overview of the NCBI website Accession numbers, RefSeq, and Entrez Gene Two genome browsers: UCSC and Ensembl From UCSC Table Browser to Galaxy [1] You should be able to explain what accession numbers are and why the RefSeq project is significant [2] You should be able to find accession numbers for any gene (or protein) from any organism via Entrez Gene [3] You should be able to locate any human gene using the UCSC Genome Browser [4] You should be able to locate information about genes (and proteins) using the UCSC Table Browser and the related resource Galaxy What is bioinformatics? Interface of biology and computers Analysis of proteins, genes and genomes using computer algorithms and computer databases Genomics is the analysis of genomes. The tools of bioinformatics are used to make sense of the billions of base pairs of DNA that are sequenced by genomics projects. Three perspectives on bioinformatics The cell The organism The tree of life Page 4 3

4 First perspective: the cell Central dogma of molecular biology DNA RNA protein genome transcriptome proteome Central dogma of bioinformatics and genomics Growth of GenBank DNA genomic DNA databases RNA cdna ESTs UniGene protein protein sequence databases phenotype Fig. 2.2 Page 18 Sequences (millions) Base pairs of DNA (millions) Fig. 2.1 Year Page 17 Sequence Read Archive (SRA) at NCBI: over 900 terabases of sequence (~1 petabase) Arrival of next-generation sequencing: We have sequenced hundreds of terabases (November 2012) Size, terabases 6 years ago GenBank celebrated reaching 100 billion base pairs of DNA Year A project to sequence 10,000 human genomes requires about 3 petabytes of storage (3,000 terabytes). To buy a server with 100 Tb now costs $10,000. Now when my lab sequences the genome of one patient, we obtain ~150 billion base pairs of sequence and ~1 terabyte of data for $3,000. This course will reflect the major impact of increased DNA sequencing on all areas of biology. 4

5 There are three major public DNA databases There storage of next-generation sequence data is an emerging challenge EMBL Housed at EBI European Bioinformatics Institute GenBank Housed at NCBI National Center for Biotechnology Information DDBJ Housed in Japan The underlying raw DNA sequences are identical NCBI offers a sequence read archive (SRA), but the best storage strategies are uncertain. The European Bioinformatics Institute (EBI) offers the European Nucleotide Archive (ENA), For individual labs, tens of terabytes of storage are routinely needed. Page 14 Second perspective: the organism Third perspective: the tree of life Time of development Body region, physiology, pharmacology, pathology Page 5 After Pace NR (1997) Science 276:734 Page 6 Taxonomy at NCBI: >250,000 species are represented in GenBank The most sequenced organisms in GenBank Homo sapiens 16.3 billion bases Mus musculus 10.0b Rattus norvegicus 6.5b Bos taurus 5.4b Zea mays 5.1b Sus scrofa 4.9b Danio rerio 3.1b Strongylocentrotus purpurata 1.4b Macaca mulatta 1.3b Oryza sativa (japonica) 1.3b 11/12 Page 16 Updated Nov GenBank release Excluding WGS, organelles, metagenomics Table 2-2 Page 17 5

6 Outline for today National Center for Biotechnology Information (NCBI) Definition of bioinformatics Overview of the NCBI website Accession numbers, RefSeq, and Entrez Gene Two genome browsers: UCSC and Ensembl From UCSC Table Browser to Galaxy Fig. 2.4 Page 24 NCBI key features: PubMed National Library of Medicine's search service 22 million citations in MEDLINE (as of 2012) links to participating online journals PubMed tutorial on the site or visit NLM: Be sure to access PubMed via Welch Library! Also get to know your informationists Peggy Gross, Rob Wright et al. Page 23 NCBI key features: Entrez search and retrieval system Entrez integrates the scientific literature; DNA and protein sequence databases; 3D protein structure data; population study data sets; assemblies of complete genomes Page 24 Outline for today Accession numbers are labels for sequences Definition of bioinformatics Overview of the NCBI website Accession numbers, RefSeq, and Entrez Gene Two genome browsers: UCSC and Ensembl From UCSC Table Browser to Galaxy NCBI includes databases (such as GenBank) that contain information on DNA, RNA, or protein sequences. You may want to acquire information beginning with a query such as the name of a protein of interest, or the raw nucleotides comprising a DNA sequence of interest. DNA sequences and other molecular data are tagged with accession numbers that are used to identify a sequence or other record relevant to molecular data. Page 26 6

7 What is an accession number? An accession number is label that used to identify a sequence. It is a string of letters and/or numbers that corresponds to a molecular sequence. Examples (all for beta globin, HBB): NCBI s important RefSeq project: best representative sequences RefSeq (accessible via the main page of NCBI) provides an expertly curated accession number that corresponds to the most stable, agreed-upon reference version of a sequence. X02775 NG_ rs GenBank genomic DNA sequence RefSeqGene dbsnp (single nucleotide polymorphism) AA An expressed sequence tag (1 of 2,345) NM_ RefSeq DNA sequence (from a transcript) NP_ CAA Q YE0 B RefSeq protein GenBank protein SwissProt protein Protein Data Bank structure record DNA RNA protein Page 27 RefSeq identifiers include the following formats: Complete genome NC_###### Complete chromosome NC_###### Genomic contig NT_###### mrna (DNA format) NM_###### e.g. NM_ Protein NP_###### e.g. NP_ Page 27 NCBI s RefSeq project: many accession number formats for genomic, mrna, protein sequences Access to sequences: Entrez Gene at NCBI Accession Molecule Method Note AC_ Genomic Mixed Alternate complete genomic AP_ Protein Mixed Protein products; alternate NC_ Genomic Mixed Complete genomic molecules NG_ Genomic Mixed Incomplete genomic regions NM_ mrna Mixed Transcript products; mrna NM_ mrna Mixed Transcript products; 9-digit NP_ Protein Mixed Protein products; NP_ Protein Curation Protein products; 9-digit NR_ RNA Mixed Non-coding transcripts NT_ Genomic Automated Genomic assemblies NW_ Genomic Automated Genomic assemblies NZ_ABCD Genomic Automated Whole genome shotgun data XM_ mrna Automated Transcript products XP_ Protein Automated Protein products XR_ RNA Automated Transcript products YP_ Protein Auto. & Curated Protein products ZP_ Protein Automated Protein products Gene is a great starting point: it collects key information on each gene/protein from major databases. It covers all major organisms. RefSeq provides a curated, optimal accession number for each DNA (NM_ for beta globin DNA corresponding to mrna) or protein (NP_ ) Page 29 From the NCBI home page, type beta globin and hit Search Fig. 2.5 Page 28 Follow the link to Gene Fig. 2.5 Page 28 7

8 Entrez Gene is in the header Note the Official Symbol HBB for beta globin Note the limits option Entrez Gene (top of page): Note a useful summary, and links to other databases Page 30 Gene page at NCBI offers a wealth of information Entrez Gene (bottom of page): non-refseq accessions (it s unclear what these are, highlighting usefulness of RefSeq) Genomic context Bibliography Phenotypes Gene Ontology (organizing principles of biological process, molecular function, cellular component) Reference sequences Additional (non-refseq sequences) Many, many links to NCBI resources (e.g. HomoloGene) Many, many links to external resources Page 29 Entrez Protein: accession, organism, literature Entrez Protein: features of a protein, and its sequence in the one-letter amino acid code Fig. 2.8 Page 31 Fig. 2.8 Page 31 8

9 You should learn the one-letter amino acid code! Entrez Protein: You can change the display (as shown) Name 3 Letter 1 Letter Alanine Ala A Arginine Arg R Asparagine Asn N Aspartic acid Asp D Cysteine Cys C Glutamic Acid Glu E Glutamine Gln Q Glycine Gly G Histidine His H Isoleucine Ile I Name 3 Letter 1 Letter Leucine Leu L Lysine Lys K Methionine Met M Phenylalanine Phe F Proline Pro P Serine Ser S Threonine Thr T Tryptophan Trp W Tyrosine Tyr Y Valine Val V Page 31 FASTA format: versatile, compact with one header line followed by a string of nucleotides or amino acids in the single letter code While FASTA is one file format, there are many others FASTA Sequences in one letter DNA or protein code FASTQ DNA sequences with quality scores for each base BAM compressed binary version of SAM SAM Sequence Alignment/Map file (tab-delimited) VCF variant call format (genomic variants; indels) (See genome.ucsc.edu/faq/faqformat.html for the following:) BED WIG GFF a table including chromosome, start, end wiggle format (displays dense, continuous data) General Feature Format (tab separated) Fig. 2.9 Page 32 Also, besides Excel (.xls,.xlsx) spreadsheets can also be:.txt tab-delimited text file (or space delimited).csv comma separated text file FASTQ format specification The FASTQ format has four lines per read (and typically has millions of reads) Sequencing run information Sequence read (like FASTA) Quality scores (per base) Outline for today Definition of bioinformatics Overview of the NCBI website Accession numbers, RefSeq, and Entrez Gene Two genome browsers: UCSC and Ensembl From UCSC Table Browser to Galaxy 9

10 Genome Browsers: increasingly important resources Genomic DNA is organized in chromosomes. Genome browsers display ideograms (pictures) of chromosomes, with user-selected annotation tracks that display many kinds of information. Ensembl genome browser ( note BioMart The two most essential human genome browsers are at Ensembl and UCSC. We will focus on UCSC (but the two are equally important). The browser at NCBI is less commonly used. click human enter beta globin Ensembl output for beta globin includes views of chromosome 11 (top), the region (middle), and a detailed view (bottom). There are various horizontal annotation tracks. A quiz question about Ensembl/BioMart For this week s quiz/computer lab, one question asks you to use BioMart at Ensembl to find all the micrornas on chromosome 11, as well as their GC content. There are step-by-step instructions and it should take no more than a couple minutes. The UCSC Genome Browser: an increasingly important resource This browser s focus is on humans and other eukaryotes you can select which tracks to display (and how much information for each track) tracks are based on data generated by the UCSC team and by the broad research community you can create custom tracks of your own data! Just format a spreadsheet properly and upload it The Table Browser is equally important as the more visual Genome Browser, and you can move between the two 10

11 [1] Visit click Genome Browser Note that there are choices of assemblies such as hg19 [2] Choose organisms, enter query (beta globin), hit submit Page 36 An assembly (or build ) is a fixed version of a genome. Builds are released every several years. In practice, you should always be aware whether you are using hg18 or hg19 for the human genome. They are annotated with different types of information such as experimental data sets. To learn more visit [3] Choose the RefSeq beta globin gene (HBB) [4] On the UCSC Genome Browser: --choose which tracks to display --add custom tracks --the Table Browser is complementary Exploring the UCSC Genome Browser The human genome can be viewed with different assemblies (hg18, hg19). These contain different data sets. You can get information about a track by clicking its header (e.g. RefSeq Genes ). You can choose the density to display each track (e.g. hide, dense, squish, pack). Outline for today Definition of bioinformatics Overview of the NCBI website Accession numbers, RefSeq, and Entrez Gene Two genome browsers: UCSC and Ensembl From UCSC Table Browser to Galaxy 11

12 You can reach the Table Browser from the Genome Browser Use the UCSC Table Browser to get tabular data related to any of the Genome Browser tracks. It is quantitative, not visual. species assembly position (where you were on the Genome Browser) check box to send a BED file to Galaxy Get output Get a summary of the output Step 3: In history panel, click eye to view main panel; click edit attributes to change name to snps Tools panel Display panel History panel Step 2: click Send query to Galaxy Step 3 (continued): Thus, there are ~890,000 SNPs on human chromosome 11 Step 4 (continued): To get coding exons, set the group to Genes and the track to RefSeq Genes, then get output Step 4: To get coding exons, click Get Data (under Tools) then UCSC Main table browser 12

13 Step 4 (continued): Set the output to Coding Exons then send the query to Galaxy Step 5: In Galaxy, a new data set appears. Rename it coding exons by clicking the pencil. Under Tools click Operate on Genomic Intervals. Step 6: Click Intersect the intervals of two datasets. In the central panel, choose snps and coding exons. Execute. Step 7. Here s the answer. There are 11,652 SNPs in coding exons. Finally, click display at UCSC main. Step 7. This returns us to the UCSC Genome Browser, where our Galaxy results are displayed as a custom track (entitled User Supplied Track ). Summary: Galaxy Galaxy s URL is Its features include: Integration with UCSC and many other major genomics resources such as BioMart It is intuitive and does not require knowledge of computer programming, but it offers access to sophisticated software There are excellent videocasts and tutorials It fosters reproducibility. Your history and workflows are saved. See the User tab and sign up for a free account! 13

14 Summary: today s lecture [1] NCBI is a central resource for bioinformatics. Reminder: Please enroll! Google moodle bioinformatics to get here; click Bioinformatics to sign in; The enrollment key is [2] We described accession numbers and the RefSeq project, which offers a trusted version of a sequence. Use Entrez Gene as a key source of gene information. [3] We used the UCSC Genome Browser to visualize information along chromosomes. [4] We then used the UCSC Table Browser to obtain tabular outputs of queries. The underlying data are the same in the Genome and Table Browsers. [5] Next we used Galaxy to solve a problem, and sent the final result back to the UCSC Genome Browser. 14

Introduction to Bioinformatics. What are the goals of the course? Who is taking this course? Textbook. Web sites. Literature references

Introduction to Bioinformatics. What are the goals of the course? Who is taking this course? Textbook. Web sites. Literature references Introduction to Bioinformatics Who is taking this course? People with very diverse backgrounds in biology Some people with backgrounds in computer science and biostatistics Most people (will) have a favorite

More information

Chapter 2: Access to Information

Chapter 2: Access to Information Chapter 2: Access to Information Outline Introduction to biological databases Centralized databases store DNA sequences Contents of DNA, RNA, and protein databases Central bioinformatics resources: NCBI

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics 260.602.01 September 1, 2006 Jonathan Pevsner, Ph.D. pevsner@kennedykrieger.org Teaching assistants Hugh Cahill (hugh@jhu.edu) Jennifer Turney (jturney@jhsph.edu) Meg Zupancic

More information

Computational Biology and Bioinformatics

Computational Biology and Bioinformatics Computational Biology and Bioinformatics Computational biology Development of algorithms to solve problems in biology Bioinformatics Application of computational biology to the analysis and management

More information

Introduction to Bioinformatics CPSC 265. What is bioinformatics? Textbooks

Introduction to Bioinformatics CPSC 265. What is bioinformatics? Textbooks Introduction to Bioinformatics CPSC 265 Thanks to Jonathan Pevsner, Ph.D. Textbooks Johnathan Pevsner, who I stole most of these slides from (thanks!) has written a textbook, Bioinformatics and Functional

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introduction to Bioinformatics Sequence Alignment Luke Huan Electrical Engineering and Computer Science http://people.eecs.ku.edu/~jhuan/ Database What is database An organized set of data Can

More information

Protein Bioinformatics Part I: Access to information

Protein Bioinformatics Part I: Access to information Protein Bioinformatics Part I: Access to information 260.655 April 6, 2006 Jonathan Pevsner, Ph.D. pevsner@kennedykrieger.org Outline [1] Proteins at NCBI RefSeq accession numbers Cn3D to visualize structures

More information

11 questions for a total of 120 points

11 questions for a total of 120 points Your Name: BYS 201, Final Exam, May 3, 2010 11 questions for a total of 120 points 1. 25 points Take a close look at these tables of amino acids. Some of them are hydrophilic, some hydrophobic, some positive

More information

Algorithms in Bioinformatics ONE Transcription Translation

Algorithms in Bioinformatics ONE Transcription Translation Algorithms in Bioinformatics ONE Transcription Translation Sami Khuri Department of Computer Science San José State University sami.khuri@sjsu.edu Biology Review DNA RNA Proteins Central Dogma Transcription

More information

Basic concepts of molecular biology

Basic concepts of molecular biology Basic concepts of molecular biology Gabriella Trucco Email: gabriella.trucco@unimi.it Life The main actors in the chemistry of life are molecules called proteins nucleic acids Proteins: many different

More information

NCBI web resources I: databases and Entrez

NCBI web resources I: databases and Entrez NCBI web resources I: databases and Entrez Yanbin Yin Most materials are downloaded from ftp://ftp.ncbi.nih.gov/pub/education/ 1 Homework assignment 1 Two parts: Extract the gene IDs reported in table

More information

The University of California, Santa Cruz (UCSC) Genome Browser

The University of California, Santa Cruz (UCSC) Genome Browser The University of California, Santa Cruz (UCSC) Genome Browser There are hundreds of available userselected tracks in categories such as mapping and sequencing, phenotype and disease associations, genes,

More information

Basic concepts of molecular biology

Basic concepts of molecular biology Basic concepts of molecular biology Gabriella Trucco Email: gabriella.trucco@unimi.it What is life made of? 1665: Robert Hooke discovered that organisms are composed of individual compartments called cells

More information

APPENDIX. Appendix. Table of Contents. Ethics Background. Creating Discussion Ground Rules. Amino Acid Abbreviations and Chemistry Resources

APPENDIX. Appendix. Table of Contents. Ethics Background. Creating Discussion Ground Rules. Amino Acid Abbreviations and Chemistry Resources Appendix Table of Contents A2 A3 A4 A5 A6 A7 A9 Ethics Background Creating Discussion Ground Rules Amino Acid Abbreviations and Chemistry Resources Codons and Amino Acid Chemistry Behind the Scenes with

More information

DNA.notebook March 08, DNA Overview

DNA.notebook March 08, DNA Overview DNA Overview Deoxyribonucleic Acid, or DNA, must be able to do 2 things: 1) give instructions for building and maintaining cells. 2) be copied each time a cell divides. DNA is made of subunits called nucleotides

More information

Bioinformatics. ONE Introduction to Biology. Sami Khuri Department of Computer Science San José State University Biology/CS 123A Fall 2012

Bioinformatics. ONE Introduction to Biology. Sami Khuri Department of Computer Science San José State University Biology/CS 123A Fall 2012 Bioinformatics ONE Introduction to Biology Sami Khuri Department of Computer Science San José State University Biology/CS 123A Fall 2012 Biology Review DNA RNA Proteins Central Dogma Transcription Translation

More information

BIOSTAT516 Statistical Methods in Genetic Epidemiology Autumn 2005 Handout1, prepared by Kathleen Kerr and Stephanie Monks

BIOSTAT516 Statistical Methods in Genetic Epidemiology Autumn 2005 Handout1, prepared by Kathleen Kerr and Stephanie Monks Rationale of Genetic Studies Some goals of genetic studies include: to identify the genetic causes of phenotypic variation develop genetic tests o benefits to individuals and to society are still uncertain

More information

EE550 Computational Biology

EE550 Computational Biology EE550 Computational Biology Week 1 Course Notes Instructor: Bilge Karaçalı, PhD Syllabus Schedule : Thursday 13:30, 14:30, 15:30 Text : Paul G. Higgs, Teresa K. Attwood, Bioinformatics and Molecular Evolution,

More information

CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools

CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools CAP 5510: Introduction to Bioinformatics : Bioinformatics Tools ECS 254A / EC 2474; Phone x3748; Email: giri@cis.fiu.edu My Homepage: http://www.cs.fiu.edu/~giri http://www.cs.fiu.edu/~giri/teach/bioinfs15.html

More information

Bi Lecture 3 Loss-of-function (Ch. 4A) Monday, April 8, 13

Bi Lecture 3 Loss-of-function (Ch. 4A) Monday, April 8, 13 Bi190-2013 Lecture 3 Loss-of-function (Ch. 4A) Infer Gene activity from type of allele Loss-of-Function alleles are Gold Standard If organism deficient in gene A fails to accomplish process B, then gene

More information

ENZYMES AND METABOLIC PATHWAYS

ENZYMES AND METABOLIC PATHWAYS ENZYMES AND METABOLIC PATHWAYS This document is licensed under the Attribution-NonCommercial-ShareAlike 2.5 Italy license, available at http://creativecommons.org/licenses/by-nc-sa/2.5/it/ 1. Enzymes build

More information

DNA is normally found in pairs, held together by hydrogen bonds between the bases

DNA is normally found in pairs, held together by hydrogen bonds between the bases Bioinformatics Biology Review The genetic code is stored in DNA Deoxyribonucleic acid. DNA molecules are chains of four nucleotide bases Guanine, Thymine, Cytosine, Adenine DNA is normally found in pairs,

More information

CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS

CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS 1 CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS * Some contents are adapted from Dr. Jean Gao at UT Arlington Mingon Kang, PhD Computer Science, Kennesaw State University 2 Genetics The discovery of

More information

BGGN 213: Foundations of Bioinformatics (Fall 2017)

BGGN 213: Foundations of Bioinformatics (Fall 2017) BGGN 213: Foundations of Bioinformatics (Fall 2017) Course Instructor: Dr. Barry J. Grant ( bjgrant@ucsd.edu ) Course Website: https://bioboot.github.io/bggn213_f17/ DRAFT: 2017-08-10 (15:02:30 PDT on

More information

BIMM 143: Introduction to Bioinformatics (Winter 2018)

BIMM 143: Introduction to Bioinformatics (Winter 2018) BIMM 143: Introduction to Bioinformatics (Winter 2018) Course Instructor: Dr. Barry J. Grant ( bjgrant@ucsd.edu ) Course Website: https://bioboot.github.io/bimm143_w18/ DRAFT: 2017-12-02 (20:48:10 PST

More information

Using DNA sequence, distinguish species in the same genus from one another.

Using DNA sequence, distinguish species in the same genus from one another. Species Identification: Penguins 7. It s Not All Black and White! Name: Objective Using DNA sequence, distinguish species in the same genus from one another. Background In this activity, we will observe

More information

Gene-centered resources at NCBI

Gene-centered resources at NCBI COURSE OF BIOINFORMATICS a.a. 2014-2015 Gene-centered resources at NCBI We searched Accession Number: M60495 AT NCBI Nucleotide Gene has been implemented at NCBI to organize information about genes, serving

More information

Problem: The GC base pairs are more stable than AT base pairs. Why? 5. Triple-stranded DNA was first observed in 1957. Scientists later discovered that the formation of triplestranded DNA involves a type

More information

BCHM 6280 Tutorial: Gene specific information using NCBI, Ensembl and genome viewers

BCHM 6280 Tutorial: Gene specific information using NCBI, Ensembl and genome viewers BCHM 6280 Tutorial: Gene specific information using NCBI, Ensembl and genome viewers Web resources: NCBI database: http://www.ncbi.nlm.nih.gov/ Ensembl database: http://useast.ensembl.org/index.html UCSC

More information

Unit 1. DNA and the Genome

Unit 1. DNA and the Genome Unit 1 DNA and the Genome Gene Expression Key Area 3 Vocabulary 1: Transcription Translation Phenotype RNA (mrna, trna, rrna) Codon Anticodon Ribosome RNA polymerase RNA splicing Introns Extrons Gene Expression

More information

Granby Transcription and Translation Services plc

Granby Transcription and Translation Services plc ompany Resources ranby Transcription and Translation Services plc has invested heavily in the Protein Synthesis business. mongst the resources available to new recruits are: the latest cellphones which

More information

Nucleic acid and protein Flow of genetic information

Nucleic acid and protein Flow of genetic information Nucleic acid and protein Flow of genetic information References: Glick, BR and JJ Pasternak, 2003, Molecular Biotechnology: Principles and Applications of Recombinant DNA, ASM Press, Washington DC, pages.

More information

BIOLOGY. Monday 14 Mar 2016

BIOLOGY. Monday 14 Mar 2016 BIOLOGY Monday 14 Mar 2016 Entry Task List the terms that were mentioned last week in the video. Translation, Transcription, Messenger RNA (mrna), codon, Ribosomal RNA (rrna), Polypeptide, etc. Agenda

More information

Hands-On Four Investigating Inherited Diseases

Hands-On Four Investigating Inherited Diseases Hands-On Four Investigating Inherited Diseases The purpose of these exercises is to introduce bioinformatics databases and tools. We investigate an important human gene and see how mutations give rise

More information

Applied Bioinformatics

Applied Bioinformatics Applied Bioinformatics Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu Course overview What is bioinformatics Data driven science: the creation and advancement

More information

Protein Synthesis. Application Based Questions

Protein Synthesis. Application Based Questions Protein Synthesis Application Based Questions MRNA Triplet Codons Note: Logic behind the single letter abbreviations can be found at: http://www.biology.arizona.edu/biochemistry/problem_sets/aa/dayhoff.html

More information

BIOL 274 Introduction to Bioinformatics Fall 2016

BIOL 274 Introduction to Bioinformatics Fall 2016 Instructor: Eric S. Ho (hoe@lafayette.edu) Office: Kunkel 13 Office hours: TTh 2-4 pm Lecture: MWF 11:00-11:50 am, Venue: Kunkel 117 Lab: M/W 1:10-4:00 pm, Venue: Kunkel 313B TAs: Amy Boles (bolesa@lafayette.edu),

More information

Bioinformatics for Proteomics. Ann Loraine

Bioinformatics for Proteomics. Ann Loraine Bioinformatics for Proteomics Ann Loraine aloraine@uab.edu What is bioinformatics? The science of collecting, processing, organizing, storing, analyzing, and mining biological information, especially data

More information

Types of Databases - By Scope

Types of Databases - By Scope Biological Databases Bioinformatics Workshop 2009 Chi-Cheng Lin, Ph.D. Department of Computer Science Winona State University clin@winona.edu Biological Databases Data Domains - By Scope - By Level of

More information

BIOINFORMATICS FOR DUMMIES MB&C2017 WORKSHOP

BIOINFORMATICS FOR DUMMIES MB&C2017 WORKSHOP Jasper Decuyper BIOINFORMATICS FOR DUMMIES MB&C2017 WORKSHOP MB&C2017 Workshop Bioinformatics for dummies 2 INTRODUCTION Imagine your workspace without the computers Both in research laboratories and in

More information

Materials Protein synthesis kit. This kit consists of 24 amino acids, 24 transfer RNAs, four messenger RNAs and one ribosome (see below).

Materials Protein synthesis kit. This kit consists of 24 amino acids, 24 transfer RNAs, four messenger RNAs and one ribosome (see below). Protein Synthesis Instructions The purpose of today s lab is to: Understand how a cell manufactures proteins from amino acids, using information stored in the genetic code. Assemble models of four very

More information

Molecular Biology. Biology Review ONE. Protein Factory. Genotype to Phenotype. From DNA to Protein. DNA à RNA à Protein. June 2016

Molecular Biology. Biology Review ONE. Protein Factory. Genotype to Phenotype. From DNA to Protein. DNA à RNA à Protein. June 2016 Molecular Biology ONE Sami Khuri Department of Computer Science San José State University Biology Review DNA RNA Proteins Central Dogma Transcription Translation Genotype to Phenotype Protein Factory DNA

More information

6-Foot Mini Toober Activity

6-Foot Mini Toober Activity Big Idea The interaction between the substrate and enzyme is highly specific. Even a slight change in shape of either the substrate or the enzyme may alter the efficient and selective ability of the enzyme

More information

What should you do to reinforce the lecture material?

What should you do to reinforce the lecture material? Bioinformatics and Functional Genomics Course Overview, Introduction of Bioinformatics, Biology Background Biol4230 Thurs, Jan 17, 2017 Bill Pearson wrp@virginia.edu 4-2818 Jordan 6-057 Goals of today

More information

Ensembl workshop. Thomas Randall, PhD bioinformatics.unc.edu. handouts, papers, datasets

Ensembl workshop. Thomas Randall, PhD bioinformatics.unc.edu.   handouts, papers, datasets Ensembl workshop Thomas Randall, PhD tarandal@email.unc.edu bioinformatics.unc.edu www.unc.edu/~tarandal/ensembl handouts, papers, datasets Ensembl is a joint project between EMBL - EBI and the Sanger

More information

Week 1 BCHM 6280 Tutorial: Gene specific information using NCBI, Ensembl and genome viewers

Week 1 BCHM 6280 Tutorial: Gene specific information using NCBI, Ensembl and genome viewers Week 1 BCHM 6280 Tutorial: Gene specific information using NCBI, Ensembl and genome viewers Web resources: NCBI database: http://www.ncbi.nlm.nih.gov/ Ensembl database: http://useast.ensembl.org/index.html

More information

Investigating Inherited Diseases

Investigating Inherited Diseases Investigating Inherited Diseases The purpose of these exercises is to introduce bioinformatics databases and tools. We investigate an important human gene and see how mutations give rise to inherited diseases.

More information

CECS Introduction to Bioinformatics University of Louisville Spring 2004 Dr. Eric Rouchka

CECS Introduction to Bioinformatics University of Louisville Spring 2004 Dr. Eric Rouchka Dr. Awwad Abdoh Radwan Dept Pharm Organic Chemistry, Faculty of Pharmacy, Assiut University. 29/09/2005 Required Texts Image Source: http://www.amazon.com/ Other Bioinformatics Books Image Source: http://www.amazon.com/

More information

NORWEGIAN UNIVERSITY OF SCIENCE AND TECHNOLOGY DEPARTMENT OF BIOTECHNOLOGY Professor Bjørn E. Christensen, Department of Biotechnology

NORWEGIAN UNIVERSITY OF SCIENCE AND TECHNOLOGY DEPARTMENT OF BIOTECHNOLOGY Professor Bjørn E. Christensen, Department of Biotechnology Page 1 NRWEGIAN UNIVERSITY F SCIENCE AND TECNLGY DEPARTMENT F BITECNLGY Professor Bjørn E. Christensen, Department of Biotechnology Contact during the exam: phone: 73593327, 92634016 EXAM TBT4135 BIPLYMERS

More information

Genome Informatics. Systems Biology and the Omics Cascade (Course 2143) Day 3, June 11 th, Kiyoko F. Aoki-Kinoshita

Genome Informatics. Systems Biology and the Omics Cascade (Course 2143) Day 3, June 11 th, Kiyoko F. Aoki-Kinoshita Genome Informatics Systems Biology and the Omics Cascade (Course 2143) Day 3, June 11 th, 2008 Kiyoko F. Aoki-Kinoshita Introduction Genome informatics covers the computer- based modeling and data processing

More information

BIOINF525: INTRODUCTION TO BIOINFORMATICS LAB SESSION 1

BIOINF525: INTRODUCTION TO BIOINFORMATICS LAB SESSION 1 BIOINF525: INTRODUCTION TO BIOINFORMATICS LAB SESSION 1 Bioinformatics Databases http://bioboot.github.io/bioinf525_w17/module1/#1.1 Dr. Barry Grant Jan 2017 Overview: The purpose of this lab session is

More information

ab initio and Evidence-Based Gene Finding

ab initio and Evidence-Based Gene Finding ab initio and Evidence-Based Gene Finding A basic introduction to annotation Outline What is annotation? ab initio gene finding Genome databases on the web Basics of the UCSC browser Evidence-based gene

More information

Briefly, this exercise can be summarised by the follow flowchart:

Briefly, this exercise can be summarised by the follow flowchart: Workshop exercise Data integration and analysis In this exercise, we would like to work out which GWAS (genome-wide association study) SNP associated with schizophrenia is most likely to be functional.

More information

36. The double bonds in naturally-occuring fatty acids are usually isomers. A. cis B. trans C. both cis and trans D. D- E. L-

36. The double bonds in naturally-occuring fatty acids are usually isomers. A. cis B. trans C. both cis and trans D. D- E. L- 36. The double bonds in naturally-occuring fatty acids are usually isomers. A. cis B. trans C. both cis and trans D. D- E. L- 37. The essential fatty acids are A. palmitic acid B. linoleic acid C. linolenic

More information

BME205: Lecture 2 Bio systems. David Bernick

BME205: Lecture 2 Bio systems. David Bernick BME205: Lecture 2 Bio systems David Bernick Bioinforma;cs Infer pa>erns from life biological sequences structures molecular pathways. Suggest hypotheses from inferred pa>erns Structure and Func;on Novel

More information

Aipotu II: Biochemistry

Aipotu II: Biochemistry Aipotu II: Biochemistry Introduction: The Biological Phenomenon Under Study In this lab, you will continue to explore the biological mechanisms behind the expression of flower color in a hypothetical plant.

More information

Galaxy for Next Generation Sequencing 初探次世代序列分析平台 蘇聖堯 2013/9/12

Galaxy for Next Generation Sequencing 初探次世代序列分析平台 蘇聖堯 2013/9/12 Galaxy for Next Generation Sequencing 初探次世代序列分析平台 蘇聖堯 2013/9/12 What s Galaxy? Bringing Developers And Biologists Together. Reproducible Science Is Our Goal An open, web-based platform for data intensive

More information

INTRODUCTION TO BIOINFORMATICS. SAINTS GENETICS Ian Bosdet

INTRODUCTION TO BIOINFORMATICS. SAINTS GENETICS Ian Bosdet INTRODUCTION TO BIOINFORMATICS SAINTS GENETICS 12-120522 - Ian Bosdet (ibosdet@bccancer.bc.ca) Bioinformatics bioinformatics is: the application of computational techniques to the fields of biology and

More information

CHAPTER 1. DNA: The Hereditary Molecule SECTION D. What Does DNA Do? Chapter 1 Modern Genetics for All Students S 33

CHAPTER 1. DNA: The Hereditary Molecule SECTION D. What Does DNA Do? Chapter 1 Modern Genetics for All Students S 33 HPER 1 DN: he Hereditary Molecule SEION D What Does DN Do? hapter 1 Modern enetics for ll Students S 33 D.1 DN odes For Proteins PROEINS DO HE nitty-gritty jobs of every living cell. Proteins are the molecules

More information

Your Name. Mean ± SD = 72.3 ± 17.6 Range = out of 112 points N = Points

Your Name. Mean ± SD = 72.3 ± 17.6 Range = out of 112 points N = Points Introduction to Biochemistry Final Examination - Individual (Part I) Thursday, 24 May 2012 3:30 5:30 PM, 206 Brown Lab H. B. White Instructor Your Name Mean ± SD = 72.3 ± 17.6 Range = 23 99 out of 112

More information

From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow

From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow Technical Overview Import VCF Introduction Next-generation sequencing (NGS) studies have created unanticipated challenges with

More information

Introduction to BIOINFORMATICS

Introduction to BIOINFORMATICS Introduction to BIOINFORMATICS Antonella Lisa CABGen Centro di Analisi Bioinformatica per la Genomica Tel. 0382-546361 E-mail: lisa@igm.cnr.it http://www.igm.cnr.it/pagine-personali/lisa-antonella/ What

More information

Daily Agenda. Warm Up: Review. Translation Notes Protein Synthesis Practice. Redos

Daily Agenda. Warm Up: Review. Translation Notes Protein Synthesis Practice. Redos Daily Agenda Warm Up: Review Translation Notes Protein Synthesis Practice Redos 1. What is DNA Replication? 2. Where does DNA Replication take place? 3. Replicate this strand of DNA into complimentary

More information

He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Today. Admin Stuff. CSE527 Computational Biology

He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Today. Admin Stuff. CSE527 Computational Biology CSE527 Computational Biology http://www.cs.washington.edu/527 He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Larry Ruzzo Autumn 2006 -- Chinese Proverb UW CSE Computational

More information

Data Retrieval from GenBank

Data Retrieval from GenBank Data Retrieval from GenBank Peter J. Myler Bioinformatics of Intracellular Pathogens JNU, Feb 7-0, 2009 http://www.ncbi.nlm.nih.gov (January, 2007) http://ncbi.nlm.nih.gov/sitemap/resourceguide.html Accessing

More information

What is necessary for life?

What is necessary for life? Life What is necessary for life? Most life familiar to us: Eukaryotes FREE LIVING Or Parasites First appeared ~ 1.5-2 10 9 years ago Requirements: DNA, proteins, lipids, carbohydrates, complex structure,

More information

Grundlagen der Bioinformatik Summer Lecturer: Prof. Daniel Huson

Grundlagen der Bioinformatik Summer Lecturer: Prof. Daniel Huson Grundlagen der Bioinformatik, SoSe 11, D. Huson, April 11, 2011 1 1 Introduction Grundlagen der Bioinformatik Summer 2011 Lecturer: Prof. Daniel Huson Office hours: Thursdays 17-18h (Sand 14, C310a) 1.1

More information

Protein Structure Analysis

Protein Structure Analysis BINF 731 Protein Structure Analysis http://binf.gmu.edu/vaisman/binf731/ Secondary Structure: Computational Problems Secondary structure characterization Secondary structure assignment Secondary structure

More information

Gene-centered databases and Genome Browsers

Gene-centered databases and Genome Browsers COURSE OF BIOINFORMATICS a.a. 2015-2016 Gene-centered databases and Genome Browsers We searched Accession Number: M60495 AT NCBI Nucleotide Gene has been implemented at NCBI to organize information about

More information

Gene-centered databases and Genome Browsers

Gene-centered databases and Genome Browsers COURSE OF BIOINFORMATICS a.a. 2016-2017 Gene-centered databases and Genome Browsers We searched Accession Number: M60495 AT NCBI Nucleotide Gene has been implemented at NCBI to organize information about

More information

Genomics and Database Mining (HCS 604.3) April 2005

Genomics and Database Mining (HCS 604.3) April 2005 Genomics and Database Mining (HCS 604.3) April 2005 David M. Francis OARDC 1680 Madison Ave Wooster, OH 44691 e-mail: francis.77@osu.edu Introduction: Computers have changed the way biologists go about

More information

user s guide Question 1

user s guide Question 1 Question 1 How does one find a gene of interest and determine that gene s structure? Once the gene has been located on the map, how does one easily examine other genes in that same region? doi:10.1038/ng966

More information

Genetics and Bioinformatics

Genetics and Bioinformatics Genetics and Bioinformatics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be Lecture 1: Setting the pace 1 Bioinformatics what s

More information

Introduction to NGS analyses

Introduction to NGS analyses Introduction to NGS analyses Giorgio L Papadopoulos Institute of Molecular Biology and Biotechnology Bioinformatics Support Group 04/12/2015 Papadopoulos GL (IMBB, FORTH) IMBB NGS Seminar 04/12/2015 1

More information

Additional Case Study: Amino Acids and Evolution

Additional Case Study: Amino Acids and Evolution Student Worksheet Additional Case Study: Amino Acids and Evolution Objectives To use biochemical data to determine evolutionary relationships. To test the hypothesis that living things that are morphologically

More information

Lecture 19A. DNA computing

Lecture 19A. DNA computing Lecture 19A. DNA computing What exactly is DNA (deoxyribonucleic acid)? DNA is the material that contains codes for the many physical characteristics of every living creature. Your cells use different

More information

ELE4120 Bioinformatics. Tutorial 5

ELE4120 Bioinformatics. Tutorial 5 ELE4120 Bioinformatics Tutorial 5 1 1. Database Content GenBank RefSeq TPA UniProt 2. Database Searches 2 Databases A common situation for alignment is to search through a database to retrieve the similar

More information

Array-Ready Oligo Set for the Rat Genome Version 3.0

Array-Ready Oligo Set for the Rat Genome Version 3.0 Array-Ready Oligo Set for the Rat Genome Version 3.0 We are pleased to announce Version 3.0 of the Rat Genome Oligo Set containing 26,962 longmer probes representing 22,012 genes and 27,044 gene transcripts.

More information

Basic Bioinformatics: Homology, Sequence Alignment,

Basic Bioinformatics: Homology, Sequence Alignment, Basic Bioinformatics: Homology, Sequence Alignment, and BLAST William S. Sanders Institute for Genomics, Biocomputing, and Biotechnology (IGBB) High Performance Computing Collaboratory (HPC 2 ) Mississippi

More information

Bioinformatics Prof. M. Michael Gromiha Department of Biotechnology Indian Institute of Technology, Madras. Lecture - 5a Protein sequence databases

Bioinformatics Prof. M. Michael Gromiha Department of Biotechnology Indian Institute of Technology, Madras. Lecture - 5a Protein sequence databases Bioinformatics Prof. M. Michael Gromiha Department of Biotechnology Indian Institute of Technology, Madras Lecture - 5a Protein sequence databases In this lecture, we will mainly discuss on Protein Sequence

More information

Lecture 2 Introduction to Data Formats

Lecture 2 Introduction to Data Formats Introduction to Bioinformatics for Medical Research Gideon Greenspan gdg@cs.technion.ac.il Lecture 2 Introduction to Data Formats Introduction to Data Formats Real world, data and formats Sequences and

More information

Guided tour to Ensembl

Guided tour to Ensembl Guided tour to Ensembl Introduction Introduction to the Ensembl project Walk-through of the browser Variations and Functional Genomics Comparative Genomics BioMart Ensembl Genome browser http://www.ensembl.org

More information

Today He who asks is a fool for five minutes, but he who does not ask remains a fool forever.

Today He who asks is a fool for five minutes, but he who does not ask remains a fool forever. CSE527 Computational Biology http://www.cs.washington.edu/527 Larry Ruzzo Autumn 2009 UW CSE Computational Biology Group Today He who asks is a fool for five minutes, but he who does not ask remains a

More information

A Zero-Knowledge Based Introduction to Biology

A Zero-Knowledge Based Introduction to Biology A Zero-Knowledge Based Introduction to Biology Konstantinos (Gus) Katsiapis 25 Sep 2009 Thanks to Cory McLean and George Asimenos Cells: Building Blocks of Life cell, membrane, cytoplasm, nucleus, mitochondrion

More information

How life. constructs itself.

How life. constructs itself. How life constructs itself Life constructs itself using few simple rules of information processing. On the one hand, there is a set of rules determining how such basic chemical reactions as transcription,

More information

Biology: The substrate of bioinformatics

Biology: The substrate of bioinformatics Bi01_1 Unit 01: Biology: The substrate of bioinformatics What is Bioinformatics? Bi01_2 handling of information related to living organisms understood on the basis of molecular biology Nature does it.

More information

He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Tonight. Admin Stuff. CSEP590A Computational Biology

He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Tonight. Admin Stuff. CSEP590A Computational Biology CSEP590A Computational Biology http://www.cs.washington.edu/csep590a He who asks is a fool for five minutes, but he who does not ask remains a fool forever. Larry Ruzzo Summer 2006 -- Chinese Proverb UW

More information

Bioinformatics for Cell Biologists

Bioinformatics for Cell Biologists Bioinformatics for Cell Biologists 15 19 March 2010 Developmental Biology and Regnerative Medicine (DBRM) Schedule Monday, March 15 09.00 11.00 Introduction to course and Bioinformatics (L1) D224 Helena

More information

This software/database/presentation is a "United States Government Work" under the terms of the United States Copyright Act. It was written as part

This software/database/presentation is a United States Government Work under the terms of the United States Copyright Act. It was written as part This software/database/presentation is a "United States Government Work" under the terms of the United States Copyright Act. It was written as part of the author's official duties as a United States Government

More information

Fundamentals of Protein Structure

Fundamentals of Protein Structure Outline Fundamentals of Protein Structure Yu (Julie) Chen and Thomas Funkhouser Princeton University CS597A, Fall 2005 Protein structure Primary Secondary Tertiary Quaternary Forces and factors Levels

More information

Leonardo Mariño-Ramírez, PhD NCBI / NLM / NIH. BIOL 7210 A Computational Genomics 2/18/2015

Leonardo Mariño-Ramírez, PhD NCBI / NLM / NIH. BIOL 7210 A Computational Genomics 2/18/2015 Leonardo Mariño-Ramírez, PhD NCBI / NLM / NIH BIOL 7210 A Computational Genomics 2/18/2015 The $1,000 genome is here! http://www.illumina.com/systems/hiseq-x-sequencing-system.ilmn Bioinformatics bottleneck

More information

Key Concept Translation converts an mrna message into a polypeptide, or protein.

Key Concept Translation converts an mrna message into a polypeptide, or protein. 8.5 Translation VOBLRY translation codon stop codon start codon anticodon Key oncept Translation converts an mrn message into a polypeptide, or protein. MIN IDES mino acids are coded by mrn base sequences.

More information

03-511/711 Computational Genomics and Molecular Biology, Fall

03-511/711 Computational Genomics and Molecular Biology, Fall 03-511/711 Computational Genomics and Molecular Biology, Fall 2011 1 Problem Set 0 Due Tuesday, September 6th This homework is intended to be a self-administered placement quiz, to help you (and me) determine

More information

Forensic Science: DNA Evidence Unit

Forensic Science: DNA Evidence Unit Day 2 : Cooperative Lesson Topic: Protein Synthesis Duration: 55 minutes Grade Level: 10 th Grade Forensic Science: DNA Evidence Unit Purpose: The purpose of this lesson is to review and build upon prior

More information

user s guide Question 3

user s guide Question 3 Question 3 During a positional cloning project aimed at finding a human disease gene, linkage data have been obtained suggesting that the gene of interest lies between two sequence-tagged site markers.

More information

Sequence Based Function Annotation

Sequence Based Function Annotation Sequence Based Function Annotation Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University Sequence Based Function Annotation 1. Given a sequence, how to predict its biological

More information

Aipotu I & II: Genetics & Biochemistry

Aipotu I & II: Genetics & Biochemistry Aipotu I & II: Genetics & Biochemistry Objectives: To reinforce your understanding of Genetics, Biochemistry, and Molecular Biology To show the connections between these three disciplines To show how these

More information

What is necessary for life?

What is necessary for life? Life What is necessary for life? Most life familiar to us: Eukaryotes FREE LIVING Or Parasites First appeared ~ 1.5-2 10 9 years ago Requirements: DNA, proteins, lipids, carbohydrates, complex structure,

More information

Introduction to Bioinformatics for Computer Scientists. Lecture 2

Introduction to Bioinformatics for Computer Scientists. Lecture 2 Introduction to Bioinformatics for Computer Scientists Lecture 2 Preliminaries Knowledge test I didn't expect you to know all that stuff Oral exam We can do it in German if you like Summer seminar: Hot

More information

info@rcsb.org www.pdb.org Bioinformatics of Green Fluorescent Protein This bioinformatics tutorial explores the relationship between gene sequence, protein structure, and biological function in the context

More information