NB536: Bioinformatics
- Gerard Parsons
- 5 years ago
Transcription
1 NB536: Bioinformatics
2 Instructor: Prof. Jong Kyoung Kim, Department of New Biology. Office: E Homepage:
3 Course website /nb536-bioinformatics/
4 Office hours: Monday 17:00-18:00
5 Evaluation: midterm exam 30%, final exam 30%, homework 40%
6 Objectives. We will learn about:
1. a statistical and computational framework for representing, analyzing, and integrating high-throughput sequencing data
2. a suite of tools and resources widely used in analyzing sequencing data, together with their assumptions and limitations within the probabilistic and statistical framework
7 Analyzing and manipulating DNA
8 Nucleotide: subunit of the nucleic acids. A nucleotide consists of a nitrogen-containing base, a five-carbon sugar, and one or more phosphate groups.
9 Nucleotide and nucleoside. Nucleotide: base + sugar + phosphate. Nucleoside: base + sugar.
10 Nucleotide
11 Nucleotides are joined together to form nucleic acids
12 Complementary base pairs in the DNA double helix
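The base-pairing rule behind this slide (A pairs with T, G pairs with C, and the two strands run antiparallel) can be sketched in a few lines of Python. This is an illustration only, not part of the original slides; the function names are hypothetical.

```python
# Watson-Crick pairing: A<->T and G<->C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement, read in the same direction."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'.

    The strands of a double helix are antiparallel, so the partner
    strand is the complement read in reverse.
    """
    return complement(seq)[::-1]


print(reverse_complement("ATGC"))    # GCAT
print(reverse_complement("GAATTC"))  # GAATTC (its own reverse complement)
```

A sequence such as GAATTC that equals its own reverse complement reads the same on both strands, which is the case for the restriction sites shown later in the lecture.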
13 Recombinant DNA technology. The ability to manipulate DNA with precision in a test tube or an organism:
1. Cleavage of DNA at specific sites by restriction nucleases.
2. DNA ligation, which makes it possible to seamlessly join together DNA molecules from widely different sources.
3. DNA cloning, in which a portion of a genome is purified away from the remainder of the genome by repeatedly copying it to generate many billions of identical molecules.
14 4. Nucleic acid hybridization, which makes it possible to identify any specific sequence of DNA or RNA with great accuracy and sensitivity, based on its ability to selectively bind a complementary nucleic acid sequence.
5. DNA synthesis, which makes it possible to chemically synthesize DNA molecules with any sequence of nucleotides, whether or not the sequence occurs in nature.
6. Rapid determination of the sequence of nucleotides of any DNA or RNA molecule.
15 Cleavage of DNA: how?
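16 Restriction nucleases
HaeIII recognizes 5'-GGCC-3' (3'-CCGG-5') and cuts between GG and CC on both strands, leaving blunt ends: GG + CC.
EcoRI recognizes 5'-GAATTC-3' (3'-CTTAAG-5') and cuts after the first G on each strand, leaving sticky ends: G + AATTC on the top strand, CTTAA + G on the bottom strand.
Restriction nucleases restrict the transfer of foreign DNA into bacteria. Different bacterial species produce different restriction nucleases, each cutting at a different, specific nucleotide sequence. Because these target sequences are short (4-8 bp), many cleavage sites will occur by chance in any long DNA sequence.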
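A rough sketch of what a restriction digest does to one strand, and of why short sites recur by chance, is given below. It is an illustration under stated assumptions, not the course's own code: the function names and the example sequence are made up, only the top strand is modeled, and the chance estimate assumes a random sequence in which all four bases are equally likely.

```python
# Recognition site and cut offset on the top strand (0 = before the first base).
# EcoRI cuts G^AATTC; HaeIII cuts GG^CC.
ENZYMES = {"EcoRI": ("GAATTC", 1), "HaeIII": ("GGCC", 2)}


def cut_positions(seq, site, offset):
    """Return every index at which the top strand is cut."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzyme):
    """Split a linear top-strand sequence at every site of the enzyme."""
    site, offset = ENZYMES[enzyme]
    fragments = []
    previous = 0
    for cut in cut_positions(seq, site, offset):
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


def expected_spacing(site_length):
    """Average distance between chance occurrences of a site of this length,
    assuming a random sequence with equal base frequencies."""
    return 4 ** site_length


dna = "TTGAATTCAAGGCCTTGAATTCGG"  # hypothetical example sequence
print(digest(dna, "EcoRI"))   # ['TTG', 'AATTCAAGGCCTTG', 'AATTCGG']
print(digest(dna, "HaeIII"))  # ['TTGAATTCAAGG', 'CCTTGAATTCGG']
print(expected_spacing(4), expected_spacing(6), expected_spacing(8))  # 256 4096 65536
```

Under the equal-frequency assumption, a 4 bp site is expected about once every 256 bp and a 6 bp site about once every 4096 bp, so a genome-length sequence contains many sites for any such enzyme.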
17 Gel electrophoresis
18 Restriction map
19 Mapping with restriction map. [Figure: cleavage sites for restriction nucleases A, B, C, and D; random fragments]
20 DNA ligation. [Figure: ligase + ATP joins a 3' OH end to a 5' phosphate end; shown for sticky ends and for blunt ends]
21 DNA cloning. DNA cloning refers to 1. the act of making many identical copies of a DNA molecule, and 2. the isolation of a particular stretch of DNA from the rest of the cell's genome.
22 DNA cloning
23 Genomic DNA library. DNA library: the collection of cloned plasmid molecules. Genomic DNA library: the DNA fragments derived directly from the chromosomal DNA of the organism of interest, representing the entire genome of that organism.
24 Hybridization. DNA double helices -> (heat) denaturation to single strands -> (slowly cool) renaturation (hybridization) to DNA double helices.
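The idea that a single-stranded probe finds its target by binding a complementary sequence can be sketched as a string search. This is a simplified illustration, not a model of real hybridization: it requires a perfect match and ignores temperature, salt, and mismatches. The function names and sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def binding_sites(target, probe):
    """Return the start indices on a single-stranded target where the probe
    could base-pair perfectly.

    A probe pairs with the stretch of the target that equals the probe's
    reverse complement.
    """
    site = reverse_complement(probe)
    return [
        i
        for i in range(len(target) - len(site) + 1)
        if target[i:i + len(site)] == site
    ]


target = "ATGGCCATTGTAATGGGCC"  # hypothetical denatured (single) strand
probe = "CAATG"                 # hypothetical probe, complementary to CATTG

print(binding_sites(target, probe))    # [5]
print(binding_sites(target, "TTTTT"))  # []
```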
25 The chemistry of DNA synthesis
26 Polymerase chain reaction
27 First-generation sequencing technologies
28 Sanger sequencing
29 Sanger sequencing
30 Sanger sequencing
Developed by Dr. Frederick Sanger in 1977
Read length up to 1000 bp*
Per-base accuracies as high as %*
Low throughput
High cost ($0.50 per kb)
*Nature Biotechnology 26, (2008)
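To see what "low-throughput, high-cost" means in practice, the slide's figures ($0.50 per kb, reads of up to 1000 bp) can be plugged into a back-of-the-envelope calculation. The genome size of 3.1 Gbp and the 1x coverage below are assumptions chosen for illustration and do not come from the slides; the function names are hypothetical.

```python
import math

COST_PER_KB = 0.5   # dollars per kb, from the slide
READ_LENGTH = 1000  # bp, upper bound from the slide


def sanger_cost(genome_bp, coverage=1.0, cost_per_kb=COST_PER_KB):
    """Dollars needed to sequence genome_bp bases at the given coverage."""
    return genome_bp / 1000 * coverage * cost_per_kb


def reads_needed(genome_bp, coverage=1.0, read_length=READ_LENGTH):
    """Number of full-length reads needed to reach the given coverage."""
    return math.ceil(genome_bp * coverage / read_length)


GENOME_BP = 3_100_000_000  # assumed genome size (roughly human-sized)

print(sanger_cost(GENOME_BP))   # 1550000.0
print(reads_needed(GENOME_BP))  # 3100000
```

Even at a single pass over the sequence, these assumptions give more than a million dollars and millions of separate reads; real projects need several-fold coverage, which scales both numbers linearly.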
31 Shotgun sequencing
32 Top-down sequencing
33 Next-generation sequencing technologies
34 Sequencing cost per Mbp
35 Sequencing cost per genome
36 Overview. Short-read NGS: sequencing by ligation; sequencing by synthesis (Illumina). Long-read NGS: single-molecule approach; synthetic approach.
37 General principles of short-read NGS
1. Construct a library of fragments
2. Generate clonal template populations
3. Massively parallel DNA sequencing reactions
4. Analyze data
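The workflow above can be mimicked with a toy simulation. This is a minimal sketch under strong assumptions, not a model of any real instrument: the genome is random, fragments are uniform in length, reads are error-free, and clonal amplification is not modeled because it only strengthens the signal without changing the sequence that is read. All names and parameter values are hypothetical.

```python
import random


def random_genome(length, rng):
    """Make a random sequence to stand in for a genome."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def build_library(genome, n_fragments, fragment_length, rng):
    """Construct a library: random fragments kept as (start, sequence)."""
    max_start = len(genome) - fragment_length
    starts = [rng.randint(0, max_start) for _ in range(n_fragments)]
    return [(s, genome[s:s + fragment_length]) for s in starts]


def sequence_library(library, read_length):
    """Sequence every fragment in parallel: one read from each fragment's start."""
    return [(start, fragment[:read_length]) for start, fragment in library]


def depth_per_base(genome_length, reads):
    """Analyze data: count how many reads cover each genome position."""
    depth = [0] * genome_length
    for start, read in reads:
        for pos in range(start, start + len(read)):
            depth[pos] += 1
    return depth


rng = random.Random(0)  # fixed seed so the run is repeatable
genome = random_genome(1000, rng)
library = build_library(genome, n_fragments=200, fragment_length=300, rng=rng)
reads = sequence_library(library, read_length=100)
depth = depth_per_base(len(genome), reads)

mean_depth = sum(depth) / len(genome)
print(f"mean depth: {mean_depth:.1f}x")  # mean depth: 20.0x
```

The mean depth is simply (number of reads x read length) / genome length, here 200 x 100 / 1000 = 20. The depth at individual positions varies around that mean because fragment starts are random.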
38 Illumina: Library preparation
39 Illumina: Library preparation
40 Illumina: Cluster amplification
41 Illumina: Cluster amplification
42 Illumina: Sequencing by synthesis
43 Illumina: Summary