CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS

Similar documents
Nucleic acids and protein synthesis

DNA RNA PROTEIN SYNTHESIS -NOTES-

DNA. translation. base pairing rules for DNA Replication. thymine. cytosine. amino acids. The building blocks of proteins are?

DNA Structure and Replication, and Virus Structure and Replication Test Review

DNA and RNA. Chapter 12

1. DNA, RNA structure. 2. DNA replication. 3. Transcription, translation

Nucleic Acids: DNA and RNA

DNA - DEOXYRIBONUCLEIC ACID

Chapter 10: Gene Expression and Regulation

What happens after DNA Replication??? Transcription, translation, gene expression/protein synthesis!!!!

Chapter 12. DNA TRANSCRIPTION and TRANSLATION

Bundle 5 Test Review

BIOLOGY LTF DIAGNOSTIC TEST DNA to PROTEIN & BIOTECHNOLOGY

Protein Synthesis. Application Based Questions

Nucleic acids deoxyribonucleic acid (DNA) ribonucleic acid (RNA) nucleotide

Neurospora mutants. Beadle & Tatum: Neurospora molds. Mutant A: Mutant B: HOW? Neurospora mutants

Daily Agenda. Warm Up: Review. Translation Notes Protein Synthesis Practice. Redos

36. The double bonds in naturally-occuring fatty acids are usually isomers. A. cis B. trans C. both cis and trans D. D- E. L-

Adv Biology: DNA and RNA Study Guide

DNA & Protein Synthesis UNIT D & E

Protein Synthesis & Gene Expression

Key Concept Translation converts an mrna message into a polypeptide, or protein.

Do you remember. What is a gene? What is RNA? How does it differ from DNA? What is protein?

PROTEIN SYNTHESIS Flow of Genetic Information The flow of genetic information can be symbolized as: DNA RNA Protein

Chapter 13 - Concept Mapping

Chapter 13. From DNA to Protein

11 questions for a total of 120 points

Create a model to simulate the process by which a protein is produced, and how a mutation can impact a protein s function.

Unit VII DNA to RNA to protein The Central Dogma

Chapter 10 - Molecular Biology of the Gene

Biology: The substrate of bioinformatics

Problem Set Unit The base ratios in the DNA and RNA for an onion (Allium cepa) are given below.

From Gene to Protein Transcription and Translation

Unit #5 - Instructions for Life: DNA. Background Image

UNIT (12) MOLECULES OF LIFE: NUCLEIC ACIDS

Flow of Genetic Information

6- Important Molecules of Living Systems. Proteins Nucleic Acids Taft College Human Physiology

DNA: The Molecule of Heredity

Protein Synthesis. DNA to RNA to Protein

DNA is the genetic material. DNA structure. Chapter 7: DNA Replication, Transcription & Translation; Mutations & Ames test

Independent Study Guide The Blueprint of Life, from DNA to Protein (Chapter 7)

Gene Expression Transcription/Translation Protein Synthesis

DNA Structure and Protein synthesis

From Gene to Protein via Transcription and Translation i

Name: Class: Date: ID: A

DNA Begins the Process

The Central Dogma of Molecular Biology

Transcription and Translation

DNA and RNA. Chapter 12

Protein Synthesis

PROTEIN SYNTHESIS. copyright cmassengale

Nucleic Acids: Structure and Function

translation The building blocks of proteins are? amino acids nitrogen containing bases like A, G, T, C, and U Complementary base pairing links

Prokaryotic Transcription

From Gene to Protein Transcription and Translation i

Lezione 10. Bioinformatica. Mauro Ceccanti e Alberto Paoluzzi

What is DNA??? DNA = Deoxyribonucleic acid IT is a molecule that contains the code for an organism s growth and function

Bio11 Announcements. Ch 21: DNA Biology and Technology. DNA Functions. DNA and RNA Structure. How do DNA and RNA differ? What are genes?

6. Which nucleotide part(s) make up the rungs of the DNA ladder? Sugar Phosphate Base

RNA & PROTEIN SYNTHESIS

Bundle 6 Test Review

DNA, RNA, PROTEIN SYNTHESIS, AND MUTATIONS UNIT GUIDE Due December 9 th. Monday Tuesday Wednesday Thursday Friday 16 CBA History of DNA video

DNA Replication and Protein Synthesis

RNA and PROTEIN SYNTHESIS. Chapter 13

UNIT 4. DNA, RNA, and Gene Expression

DNA, Replication and RNA

Higher Human Biology Unit 1: Human Cells Pupils Learning Outcomes

Transcription. Unit: DNA. Central Dogma. 2. Transcription converts DNA into RNA. What is a gene? What is transcription? 1/7/2016

Protein Synthesis: Transcription and Translation

DNA vs. RNA B-4.1. Compare DNA and RNA in terms of structure, nucleotides and base pairs.

DNA- THE MOLECULE OF LIFE

Ch 10.4 Protein Synthesis

Review of Protein (one or more polypeptide) A polypeptide is a long chain of..

DNA- THE MOLECULE OF LIFE. Link

Frederick Griffith. Dead Smooth Bacteria. Live Smooth Bacteria. Live Rough Bacteria. Live R+ dead S Bacteria

Chapter 8: DNA and RNA

The Genetic Code and Transcription. Chapter 12 Honors Genetics Ms. Susan Chabot

DNA and RNA 2/14/2017. What is a Nucleic Acid? Parts of Nucleic Acid. DNA Structure. RNA Structure. DNA vs RNA. Nitrogen bases.

Protein Synthesis. OpenStax College

UNIT I RNA AND TYPES R.KAVITHA,M.PHARM LECTURER DEPARTMENT OF PHARMACEUTICS SRM COLLEGE OF PHARMACY KATTANKULATUR

DNA, RNA and Protein Synthesis

C. Incorrect! Threonine is an amino acid, not a nucleotide base.

Replication Review. 1. What is DNA Replication? 2. Where does DNA Replication take place in eukaryotic cells?

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

STUDY GUIDE SECTION 10-1 Discovery of DNA

Summary 12 1 DNA RNA and Protein Synthesis Chromosomes and DNA Replication. Name Class Date

CHAPTER 1. DNA: The Hereditary Molecule SECTION D. What Does DNA Do? Chapter 1 Modern Genetics for All Students S 33

RNA : functional role

2. The instructions for making a protein are provided by a gene, which is a specific segment of a molecule.

Name 10 Molecular Biology of the Gene Test Date Study Guide You must know: The structure of DNA. The major steps to replication.

DNA/RNA STUDY GUIDE. Match the following scientists with their accomplishments in discovering DNA using the statement in the box below.

RNA and Protein Synthesis

THE GENETIC CODE Figure 1: The genetic code showing the codons and their respective amino acids

NON MENDELIAN GENETICS. DNA, PROTEIN SYNTHESIS, MUTATIONS DUE DECEMBER 8TH

Activity A: Build a DNA molecule

Thr Gly Tyr. Gly Lys Asn

Transcription:

1 CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS * Some contents are adapted from Dr. Jean Gao at UT Arlington Mingon Kang, PhD Computer Science, Kennesaw State University

2 Genetics The discovery of DNA (deoxyribonucleic acid) In the early 1950s, Rosalind Franklin and Maurice Wilkins, studied DNA using X-Rays. James Watson and Francis Crick studied the 3D structure of DNA. X-ray diffraction image of DNA (Omikron/Science Source)

3 Genetics The human genome has approximately 3 billion base pairs of DNA Represented by a single letter: A (adenine), C (cytosine), G (guanine), and T (thymine) Arranged into 23 pairs of chromosomes in a cell

4 Human Genome Project The world's largest collaborative biological project mainly funded ($2.7 billion in 1991 dollars) by National Institutes of Health (NIH) Identify and Map the nucleotides contained in a human haploid reference genome Officially launched in 1990 and completed in 2003 10 years to sequence whole human genomes

5 DNA Sequencing Cost Sequencing cost was below $1,000 in 2015 Ref. https://www.genome.gov/sequencingcosts/

6 Human Genetic Disorders About 99% of DNA is same between humans and chimpanzees. About 99.9% of human genome (3 billions) are same Only 0.1% (3 hundred million) genetic variations makes the uniqueness of each individual. 99% same

7 Human Genetic Disorders Disease caused by a change, or mutation, in an individual s DNA sequence Many human diseases have genetic components. Autism Parkinson's disease Cancer (Colon 2, Breast 3, Prostate 5, Skin) * : the * th leading cause of death in America

8 Core Research Question? What DNA variations are implicated in a human disease? What are Genetic Biomarkers of human diseases? What are Genetic Mechanisms of a human disease? Consist of complex interactions of multiple biological processes Genetic, Epigenetic, and Transcriptional regulations

9 Biological Systems in Genetics The Central Dogma of molecular biology Transcription and Translation DNA RNA Protein DNA: Encodes hereditary information RNA: Intermediary between DNA and protein Protein: determines cell activities

image from the DOE Human Genome Program http://www.ornl.gov/hgmis 10

image from the DOE Human Genome Program http://www.ornl.gov/hgmis 11

image from the DOE Human Genome Program http://www.ornl.gov/hgmis 12

Bioinformatics Schematic of a Cell Lipid membrane Proteins DNA Nucleolus Nucleus

14 Chromosomes Chromatin: a fine network of threads composed of DNA in association with proteins. When a cell divides, the chromatin threads become highly condensed to form chromosomes. prokaryotes (single-celled organisms lacking nuclei) have a single circular chromosome eukaryotes (organisms with nuclei) have a speciesspecific number of linear chromosomes

DNA packs in the nucleus to form chromosome

16 Eg: Human Chromosomes Ref: https://www.guam.net/pub/sshs/depart/science/mancuso/apbiolecture/12_chromohg/humangen.htm

17 DNA can be thought of as the blueprint for an organism four different nucleotides distinguished by the four bases: adenine (A), cytosine (C), guanine (G) and thymine (T)

18 DNA a single strand of DNA can be thought of as a string composed of the four letters: A, C, G, T ctgctggaccgggtgctaggaccctgactgcccgg ggccgggggtgcggggcccgctgag

19 The Double Helix DNA molecules usually consist of two strands arranged in the famous Watson-Crick double helix

20 Watson-Crick Base Pairs in double-stranded DNA A always bonds to T C always bonds to G A-T, G-C: stabilized by hydrogen bonds. (2 between A-T paring, 3 between G-C paring.) Sequence = order of base pairs

21 The Double Helix each strand of DNA has a direction at one end, the terminal carbon atom in the backbone is the 5 carbon atom of the terminal sugar linked to a free phosphate group. at the other end, the terminal carbon atom is the 3 carbon atom of the terminal sugar attached to an unlinked hydroxy group. therefore we can talk about the 5 and the 3 ends of a DNA strand The direction from the 5 end to the 3 end is call FORWARD or POSITIVE. in a double helix, the strands are antiparallel (arrows drawn from the 5 end to the 3 end go in opposite directions)

22 RNA (Ribonucleic acid) RNA is like DNA except: backbone is a little different (Not deoxy) usually single stranded (folds up) the base uracil (U) is used in place of thymine (T) a strand of RNA can be thought of as a string composed of the four letters: A, C, G, U

23 Genes Gene: a segment of DNA sequence corresponding to a single protein (or to a single catalytic or structural RNA molecule for those genes that produce RNA but not protein). a gene is said to encode a protein A single molecule of DNA contains many genes. genes are the basic units of heredity Let s check UCSC Genome Browser

24 Genes Start codon: genes begins with certain combinations of three DNA bases In Eukaryotes, most start codon is ATG In Prokaryotes, ATG(83%), GTG (14%), TTG (3%) and possibly ATT and CTG Stop codon: ends with TAG, TAA, and TGA. Exon/Intron: parts of genes Exons: DNA converted into mrna, coding regions Intron: DNAs do not directly code for protein, non-coding regions. Introns are only found in eukaryotic cells. Prokaryotes have only exons

Genes Codon: a three-base sequence in mrna that specifies one amino acid. Each codon is complimentary to a three-base code word in DNA.

Codons and Reading Frames 26

27 DNA encodes protein DNA uses an alphabet of 4 letters (ATCG), mostly common called bases. Long sequences of these 4 letters are linked together to create GENES and CONTROL INFORMATION. Each of the twenty protein amino acids can be specified by 3 consecutive DNA bases, which is a triplet code word. 61 of the 64 possible triplets are used to specify 20 amino acids. This means a given AA is usually specified by more than one code word. The 3 triplets that don t specify amino acids are known as termination code words. (ACT, ATT, ATC)

28 Genomes Genome: a total collection of genetic information in an organism that specifies a particular organism s (species ) structure and function. Can be thought of most directly as the base sequences of all the DNA molecules in a cell. The human genome consists of 46 chromosomes, containing between 50,000 and 100,000 genes. (old: 1995) (New discovery: less than 30,000 genes (year 2004)) Every cell (except sex cells and mature red blood cells) contains the complete genome of an organism

Gene Density not all of the DNA in a genome encodes protein: microbes 90% coding kb/gene human 3% coding 35kb/gene

30 The Central Dogma The mechanism for expressing genetic information in all living organisms. transcription translation DNA RNA Protein

31 Transcription Transcription: the genetic message is passed from DNA to mrna in the nucleus. RNA that is transcribed from a gene is called messenger RNA (mrna) we ll talk about other varieties of RNA later in the course RNA polymerase is the enzyme that builds an RNA strand from a gene

Transcription 32

RNA Splicing in eukaryotes 33

34 Transcription Question: Which of the two DNA strands is used as template for RNA synthesis? Answer: This is determined by a specific sequence of nucleotides in DNA, called promoter, located at the beginning of each gene. Promoter is present in only one of the two DNA strands. Genes are transcribed only when RNA polymerase can bind to their promoter sites.

35 Translation Process that RNAs makes Proteins Ribosomes are the machines in the cytoplasm that synthesize proteins from mrna The grouping of codons is called the reading frame Translation begins with the start codon Translation ends with the stop codon

Translation short summary 36

37 Transfer RNA (trna) trna is synthesized in the nucleus by base-paring with DNA nucleotides at specific trna genes, and then moves to the cytoplasm. Smallest of the three types of RNA (about 80 nucleotides long). Key role: can combine with both a specific amino acid and a codon in ribosome-bound mrna specific for that AA.

38 Translation Ribosome

Genetic Code The 64 mappings of 3 bases to 1 amino acid is called the GENETIC CODE and is universal (on earth...). Ex:

image from the DOE Human Genome Program http://www.ornl.gov/hgmis 40

41 The Central Dogma https://youtu.be/j3hvvi2k2no

42 Protein Functions Structural support Storage of amino acids Transport of other substances Coordination of an organism s activities Response of cell to chemical stimuli Movement Protection against disease Selective acceleration of chemical reactions

43 Proteins The subunit of protein structure is an amino acid. Cells build their proteins from 20 different amino acids A peptide is a compound consisting of 2 or more amino acids. Polypeptides and proteins are chains of 10 or more amino acids, but peptides consisting of more than 50 amino acids are classified as proteins. proteins account for about 50% of the organic material in the body

Amino Acids Alanine Ala A Arginine Arg R Aspartic Acid Asp D Asparagine Asn N Cysteine Cys C Glutamic Acid Glu E Glutamine Gln Q Glycine Gly G Histidine His H Isoleucine Ile I Leucine Leu L Lysine Lys K Methionine Met M Phenylalanine Phe F Proline Pro P Serine Ser S Threonine Thr T Tryptophan Trp W Tyrosine Tyr Y Valine Val V

Genetic Code for AA If 3 RNA bases code for 1 AA, RNA could code for 64 AA. More than enough coding capacity for 20 AA Code is redundant for most AA. Reference: http://www.slideshare.net/cgales/central-dogma-and-protein-synthesis

Peptide Bond Peptide bond: the bond formed between the amino and carboxyl groups. Polypeptide: a sequence of amino acids linked by peptide bonds. Peptide: If the number of amino acids in a polypeptide is 50 or less, the molecule is known as peptide. Protein: If the sequence is more than 50 amino acid units long, the polypeptide is known as a protein.

Four levels of Protein Structure Reference: http://www.slideshare.net/kaizettedetaza/central-dogma-12027441

48 Mutation Reference: http://www.slideshare.net/kaizettedetaza/central-dogma-12027441

SNP Single-Nucleotide Polymorphisms (SNP) DNA Variation occurring commonly within a population (e.g., 1%) in a single nucleotide. There are two alleles, A and G, at this locus 49/28

SNP Examine DNA variations at prespecified loci (about 10 millions) Bi-allele (major/minor allele) Assumed that Minor allele may cause a specific rare phenotype (e.g., diseases) while Major allele is mainly observed in a population. Data Encoding SNP1 SNP2 SNP3 G A A C C T G G A A C C G A C C C T SNP1 SNP2 SNP3 1 1 1 0 0 0 1 2 1 50/28

51 GWAS Genome-Wide Association Study Focus on associations between SNPs and a phenotype (e.g., cancer) Conventionally, statistically compare SNPs of two groups (case and control) based on a linear regression model. Pairwise univariate analysis Perform t-test between each SNP and a phenotype individually X y y = xx x 12 β 3 ββ 12 3 + εε ε SNP SNP SNP. Phenotype 1 2 3 0 2 0. 1 0 1 1. 1 0 0 0. 0 y = x i β i + ε y

52 eqtl expression Quantitative Trait Loci (eqtl) mapping study Association between SNPs and mrna Find genomic loci that regulate expression levels of mrnas (gene expression) Capture the insight of the genetic architecture of gene expression. X SNP SNP SNP. Gene1 1 2 3 0 2 0. 3.25 0 1 1. 3.10 0 0 0. 1.11 Y Gene2 2.22 2.32 2.42