Primer Design Workshop. École d'été en géné-que des champignons 2012 Dr. Will Hintz University of Victoria

Similar documents
Lecture 10, 20/2/2002: The process of solution development - The CODEHOP strategy for automatic design of consensus-degenerate primers for PCR

Materials Protein synthesis kit. This kit consists of 24 amino acids, 24 transfer RNAs, four messenger RNAs and one ribosome (see below).

Disease and selection in the human genome 3

NAME:... MODEL ANSWER... STUDENT NUMBER:... Maximum marks: 50. Internal Examiner: Hugh Murrell, Computer Science, UKZN

ORFs and genes. Please sit in row K or forward

Lecture 11: Gene Prediction

Lecture 19A. DNA computing

Homework. A bit about the nature of the atoms of interest. Project. The role of electronega<vity

Det matematisk-naturvitenskapelige fakultet

Project 07/111 Final Report October 31, Project Title: Cloning and expression of porcine complement C3d for enhanced vaccines

G+C content. 1 Introduction. 2 Chromosomes Topology & Counts. 3 Genome size. 4 Replichores and gene orientation. 5 Chirochores.

Arabidopsis actin depolymerizing factor AtADF4 mediates defense signal transduction triggered by the Pseudomonas syringae effector AvrPphB

Figure S1. Characterization of the irx9l-1 mutant. (A) Diagram of the Arabidopsis IRX9L gene drawn based on information from TAIR (the Arabidopsis

Codon Bias with PRISM. 2IM24/25, Fall 2007

Electronic Supplementary Information

Protein Structure Analysis

Supplemental Data Supplemental Figure 1.

Hes6. PPARα. PPARγ HNF4 CD36

Supplementary Figure 1A A404 Cells +/- Retinoic Acid

PGRP negatively regulates NOD-mediated cytokine production in rainbow trout liver cells

for Programmed Chemo-enzymatic Synthesis of Antigenic Oligosaccharides

Supplementary. Table 1: Oligonucleotides and Plasmids. complementary to positions from 77 of the SRα '- GCT CTA GAG AAC TTG AAG TAC AGA CTG C

National PHL TB DST Reference Center PSQ Reporting Language Table of Contents

Supporting Information. Copyright Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim, 2006

Supporting information for Biochemistry, 1995, 34(34), , DOI: /bi00034a013

Expression of Recombinant Proteins

Supplement 1: Sequences of Capture Probes. Capture probes were /5AmMC6/CTG TAG GTG CGG GTG GAC GTA GTC

PROTEIN SYNTHESIS Study Guide

Y-chromosomal haplogroup typing Using SBE reaction

Genomic Sequence Analysis using Electron-Ion Interaction

Supplemental material

Supporting Information

Lezione 10. Bioinformatica. Mauro Ceccanti e Alberto Paoluzzi

Cat. # Product Size DS130 DynaExpress TA PCR Cloning Kit (ptakn-2) 20 reactions Box 1 (-20 ) ptakn-2 Vector, linearized 20 µl (50 ng/µl) 1


Supplemental Data. mir156-regulated SPL Transcription. Factors Define an Endogenous Flowering. Pathway in Arabidopsis thaliana

Add 5µl of 3N NaOH to DNA sample (final concentration 0.3N NaOH).

Gene synthesis by circular assembly amplification

Multiplexing Genome-scale Engineering

FROM DNA TO GENETIC GENEALOGY Stephen P. Morse

Creation of A Caspese-3 Sensing System Using A Combination of Split- GFP and Split-Intein

Supplementary Materials for

Supplemental Data. Bennett et al. (2010). Plant Cell /tpc

Supplementary Information. Construction of Lasso Peptide Fusion Proteins

Table S1. Bacterial strains (Related to Results and Experimental Procedures)

ΔPDD1 x ΔPDD1. ΔPDD1 x wild type. 70 kd Pdd1. Pdd3

strain devoid of the aox1 gene [1]. Thus, the identification of AOX1 in the intracellular

SAY IT WITH DNA: Protein Synthesis Activity by Larry Flammer

Dierks Supplementary Fig. S1

Supporting Online Information

II 0.95 DM2 (RPP1) DM3 (At3g61540) b

SUPPORTING INFORMATION

RPA-AB RPA-C Supplemental Figure S1: SDS-PAGE stained with Coomassie Blue after protein purification.

Introduction to Bioinformatics Dr. Robert Moss

Supporting Information

Converting rabbit hybridoma into recombinant antibodies with effective transient production in an optimized human expression system

Thr Gly Tyr. Gly Lys Asn

evaluated with UAS CLB eliciting UAS CIT -N Libraries increase in the

Legends for supplementary figures 1-3

Overexpression Normal expression Overexpression Normal expression. 26 (21.1%) N (%) P-value a N (%)

DNA sentences. How are proteins coded for by DNA? Materials. Teacher instructions. Student instructions. Reflection

2

PCR analysis was performed to show the presence and the integrity of the var1csa and var-

Quantitative reverse-transcription PCR. Transcript levels of flgs, flgr, flia and flha were

Sampling Random Bioinformatics Puzzles using Adaptive Probability Distributions

Supplemental Table 1. Mutant ADAMTS3 alleles detected in HEK293T clone 4C2. WT CCTGTCACTTTGGTTGATAGC MVLLSLWLIAAALVEVR

MacBlunt PCR Cloning Kit Manual

Supporting Information

Nucleic Acids Research

hcd1tg/hj1tg/ ApoE-/- hcd1tg/hj1tg/ ApoE+/+

SUPPLEMENTAL MATERIAL GENOTYPING WITH MULTIPLEXING TARGETED RESEQUENCING

S4B fluorescence (AU)

Additional Table A1. Accession numbers of resource records for all rhodopsin sequences downloaded from NCBI. Species common name

A Circular Code in the Protein Coding Genes of Mitochondria

Supplemental Table 1. Primers used for PCR.

PCR-based Markers and Cut Flower Longevity in Carnation

Complexity of the Ruminococcus flavefaciens FD-1 cellulosome reflects an expansion of family-related protein-protein interactions

SUPPLEMENTARY INFORMATION

SUPPLEMENTAL TABLE S1. Additional descriptions of plasmid constructions and the oligonucleotides used Plasmid or Oligonucleotide

Evolution of protein coding sequences

Supplemental Information. Human Senataxin Resolves RNA/DNA Hybrids. Formed at Transcriptional Pause Sites. to Promote Xrn2-Dependent Termination

INTRODUCTION TO THE MOLECULAR GENETICS OF THE COLOR MUTATIONS IN ROCK POCKET MICE

Search for and Analysis of Single Nucleotide Polymorphisms (SNPs) in Rice (Oryza sativa, Oryza rufipogon) and Establishment of SNP Markers

FAT10 and NUB1L bind the VWA domain of Rpn10 and Rpn1 to enable proteasome-mediated proteolysis

11th Meeting of the Science Working Group. Lima, Peru, October 2012 SWG-11-JM-11

Supplemental Information. Target-Mediated Protection of Endogenous. MicroRNAs in C. elegans. Inventory of Supplementary Information

Supplemental Figure 1 Real-time RT-PCR analysis of 16 genes other than those shown in Fig. 6C.

SUPPLEMENTARY MATERIALS AND METHODS. E. coli strains, plasmids, and growth conditions. Escherichia coli strain P90C (1)

Supplementary Figures

Supplemental Data. Distinct Pathways for snorna and mrna Termination

BioInformatics and Computational Molecular Biology. Course Website

Chapter 13 Chromatin Structure and its Effects on Transcription

Appendix 1a. Microsatellite analysis of P1-hyg, P2-neo and their. Amplified cdna (base pairs) P1-hyg P2-neo All progeny A A

Genomics and Gene Recognition Genes and Blue Genes

Table S1. Sequences of mutagenesis primers used to create altered rdpa- and sdpa genes

NESTED Sequence-based Typing (SBT) protocol for epidemiological typing of Legionella pneumophila directly from clinical samples

Molecular Level of Genetics

SUPPORTING INFORMATION FILE

1. DNA, RNA structure. 2. DNA replication. 3. Transcription, translation

Transcription:

Primer Design Workshop École d'été en géné-que des champignons 2012 Dr. Will Hintz University of Victoria

Scenario You have discovered the presence of a novel endophy5c organism living inside the cells of a tropical palm species They synthesize alkaloids that give some protec5on to the plant against insect a;ack The organism cannot be cultured but is thought to be related to the Clavicitaceae, Ascomycetes (a fungus)

Bio mining Several genes from endophy5c fungi related to toxin produc5on are now being cloned and studied in depth i.e. (DMAT Synthase) dime5lail tryptophane synthase the first step in ergot synthesis in Claviceps purpurea Mycotoxin genes are oken clustered First you have to iden5fy the endophyte

Challenges The endophyte may be difficult to dissociate from the host and may unculturable You need to dis5nguish between the host genome and the endophyte genome You have absolutely no gene5c informa5on on the endophyte nor any idea of its iden5ty First step: How is this organism related to others in its phylum?

Approach: Reverse Transla5on from compiled consensus pep5de sequence How can you derive the DNA sequence for a gene when you don t have any pep5de or DNA sequence informa5on? Infer the polypep5de sequence by alignment of similar sequences from other closely related organisms Reverse translate with op5mal condi5ons Today we will use as an example the gene for dihydrofolate reductase

Why DHFR? Ubiquitous enzyme found in all organisms Dihydrofolate reductase catalyzes reduc5on of dihydrofolic acid to tetrahydrofolic acid (folate) an essen5al vitamin B9 h;p://proteopedia.org/ wiki/index.php/

In Class Assignment Locate two regions of the DHFR gene suitable for primer design support your choice Derive oligonucleo5de primers specific to these regions for PCR amplifica5on of the intervening DNA Give the an5cipated length of the amplifica5on product and explain how you would confirm its iden5ty

Reverse Transla5on Steps Align several DHFR genes from database Iden5fy conserved domains Reverse translate the protein into short DNA oligomers Develop (op5mized) PCR primers to amplify gene from the unknown organism (and possibly host) Do Blast search and construct phylogene5c tree for DHRF sequences to possibly gain iden5ty of unknown organism

Compile DHFR Sequence List

Align all known pep5de sequences h;p://www.ebi.ac.uk/tools/msa/clustalw2/ Note only three sequences shown

Iden5fy highly conserved amino acids Note only three sequences shown

Also look at less conserved regions Note only three sequences shown

Re check You can also recheck the alignment using other programs to see if you get consistent results Checking with COBALT h;p://www.ncbi.nlm.nih.gov/tools/cobalt/ gave much the same result Next derive a consensus sequence for the aligned regions

Consensus sequence You can draw a consensus sequence from the alignment or have Clustal do it for you

Reverse translate the pep5de Many simple sokware tools to do this (i.e. h;p://www.vivo.colostate.edu/molkit/rtranslate/index.html

Codon Usage Database h;p://www.kazusa.or.jp/codon/ Search the database for a codon preference for an organism which you suspect is related to your unknown

G Gly GGG 0.16 W Trp TGG 1.00 G Gly GGA 0.20 * End TGA 0.44 G Gly GGT 0.30 C Cys TGT 0.31 G Gly GGC 0.34 C Cys TGC 0.72 E Glu GAG 0.62 * End TAG 0.17 E Glu GAA 0.38 * End TAA 0.39 D Asp GAT 0.50 Y Tyr TAT 0.34 D Asp GAC 0.50 Y Tyr TAC 0.66 V Val GTG 0.21 L Leu TTG 0.13 V Val GTA 0.09 L Leu TTA 0.07 V Val GTT 0.29 F Phe TTT 0.30 V Val GTC 0.37 F Phe TTC 0.70 A Ala GCG 0.19 S Ser TCG 0.16 A Ala GCA 0.17 S Ser TCA 0.14 A Ala GCT 0.29 S Ser TCT 0.21 A Ala GCC 0.35 S Ser TCC 0.23 R Arg AGG 0.10 R Arg CGG 0.18 R Arg AGA 0.08 R Arg CGA 0.17 S Ser AGT 0.09 R Arg CGT 0.20 S Ser AGC 0.17 R Arg CGC 0.27 Construct codon preference table for highly expressed genes K Lys AAG 0.70 Q Gln CAG 0.61 K Lys AAA 0.30 Q Gln CAA 0.39 N Asn AAT 0.29 H His CAT 0.45 N Asn AAC 0.71 H His CAC 0.55 M Met ATG 1.00 L Leu CTG 0.24 I Ile ATA 0.11 L Leu CTA 0.10 I Ile ATT 0.31 L Leu CTT 0.20 I Ile ATC 0.58 L Leu CTC 0.26 T Thr ACG 0.22 P Pro CCG 0.22 T Thr ACA 0.19 P Pro CCA 0.22 T Thr ACT 0.25 P Pro CCT 0.27 T Thr ACC 0.34 P Pro CCC 0.28

G Gly GGG 0.16 W Trp TGG 1.00 G Gly GGA 0.20 * End TGA 0.44 G Gly GGT 0.30 C Cys TGT 0.31 G Gly GGC 0.34 C Cys TGC 0.72 E Glu GAG 0.62 * End TAG 0.17 E Glu GAA 0.38 * End TAA 0.39 D Asp GAT 0.50 Y Tyr TAT 0.34 D Asp GAC 0.50 Y Tyr TAC 0.66 V Val GTG 0.21 L Leu TTG 0.13 V Val GTA 0.09 L Leu TTA 0.07 V Val GTT 0.29 F Phe TTT 0.30 V Val GTC 0.37 F Phe TTC 0.70 A Ala GCG 0.19 S Ser TCG 0.16 A Ala GCA 0.17 S Ser TCA 0.14 A Ala GCT 0.29 S Ser TCT 0.21 A Ala GCC 0.35 S Ser TCC 0.23 R Arg AGG 0.10 R Arg CGG 0.18 R Arg AGA 0.08 R Arg CGA 0.17 S Ser AGT 0.09 R Arg CGT 0.20 S Ser AGC 0.17 R Arg CGC 0.27 Iden5fy low redundancy codons and highly preferred codons ( > 58 % usage) K Lys AAG 0.70 Q Gln CAG 0.61 K Lys AAA 0.30 Q Gln CAA 0.39 N Asn AAT 0.29 H His CAT 0.45 N Asn AAC 0.71 H His CAC 0.55 M Met ATG 1.00 L Leu CTG 0.24 I Ile ATA 0.11 L Leu CTA 0.10 I Ile ATT 0.31 L Leu CTT 0.20 I Ile ATC 0.58 L Leu CTC 0.26 T Thr ACG 0.22 P Pro CCG 0.22 T Thr ACA 0.19 P Pro CCA 0.22 T Thr ACT 0.25 P Pro CCT 0.27 T Thr ACC 0.34 P Pro CCC 0.28

G Gly GGG 0.16 W Trp TGG 1.00 G Gly GGA 0.20 * End TGA 0.44 G Gly GGT 0.30 C Cys TGT 0.31 G Gly GGC 0.34 C Cys TGC 0.72 E Glu GAG 0.62 * End TAG 0.17 E Glu GAA 0.38 * End TAA 0.39 D Asp GAT 0.50 Y Tyr TAT 0.34 D Asp GAC 0.50 Y Tyr TAC 0.66 V Val GTG 0.21 L Leu TTG 0.13 V Val GTA 0.09 L Leu TTA 0.07 V Val GTT 0.29 F Phe TTT 0.30 V Val GTC 0.37 F Phe TTC 0.70 A Ala GCG 0.19 S Ser TCG 0.16 A Ala GCA 0.17 S Ser TCA 0.14 A Ala GCT 0.29 S Ser TCT 0.21 A Ala GCC 0.35 S Ser TCC 0.23 R Arg AGG 0.10 R Arg CGG 0.18 R Arg AGA 0.08 R Arg CGA 0.17 S Ser AGT 0.09 R Arg CGT 0.20 S Ser AGC 0.17 R Arg CGC 0.27 K Lys AAG 0.70 Q Gln CAG 0.61 K Lys AAA 0.30 Q Gln CAA 0.39 N Asn AAT 0.29 H His CAT 0.45 N Asn AAC 0.71 H His CAC 0.55 M Met ATG 1.00 L Leu CTG 0.24 I Ile ATA 0.11 L Leu CTA 0.10 I Ile ATT 0.31 L Leu CTT 0.20 I Ile ATC 0.58 L Leu CTC 0.26 Also look at medium redundancy codons or lower prererences ( 45 55% usage) T Thr ACG 0.22 P Pro CCG 0.22 T Thr ACA 0.19 P Pro CCA 0.22 T Thr ACT 0.25 P Pro CCT 0.27 T Thr ACC 0.34 P Pro CCC 0.28

M Met ATG 1.00 W Trp TGG 1.00 C Cys TGC 0.72 N Asn AAC 0.71 F Phe TTC 0.70 K Lys AAG 0.70 Y Tyr TAC 0.66 E Glu GAG 0.62 Q Gln CAG 0.61 I Ile ATC 0.58 H His CAC 0.55 D Asp GAT 0.50 Create ranked list and begin to scan sequence for op5mal codons

Integrate both sources of informa5on Red = highly conserved Blue = less conserved Green = low redundancy

Tips for Primer Design Roughly calculate Tm by adding 4 C for every G or C and 2 C for every A or T Aim for a Tm of 60 62 C Increase length when you suspect a mismatch might occur End primer aker two bases of the triplet Take reverse complement of the sequence for the return primer Check for snap backs, self annealing, or cross annealing between primers

Example Primer 01 Designs

Confirma5on of design h;p://blast.ncbi.nlm.nih.gov/blast.cgi?cmd=web&page_type=blasthome h;p://www.humgen.nl/primer_design.html

h;p://www.idtdna.com/analyzer/applica5ons/oligoanalyzer/

Maximum Change in the Gibbs Free Energy ΔG is 64 kcal.mol 1

Change in the Gibbs Free Energy ΔG (kcal.mol 1 ) Melt temp Tm ( C) 2.86 54.5 2.54 42.3 2.20 39.1 2.07 38.9

First Primer Analysis Run oligo analysis on your own primer sequence to determine the Tm, poten5al for snap back and self annealing, and finally BLAST primer Note that this region can be used as either a forward or return primer and that the complement is listed on the results page

Second Primer Analysis Determine which region will be used for the forward primer design and which will be used for the return primer design Design return primer and analyze as before Run oligo analysis on both primer sequences to determine whether there is a poten5al for annealing between primers From the DHFR pep5de alignment predict the size of amplifica5on product

Next steps Amplify from environmental sample, clone the PCR products (using pgem or equivalent) and determine the DNA sequence Match the sequence between the primers with derived polypep5de sequence to determine whether the authen5c locus has been amplified Use authen5c sequence to walk upstream and downstream to capture the en5re gene

TA Cloning A PCR Product A Many Taq polymerases preferen5ally add an adenine to the 3' end of the product Vector tailed with a single 3 overhang thymine residue on each end T and A are paired and gap DNA linked by Ligase Transform into E. coli and determine sequence using universal primers

Analysis Determine sequence of your PCR product and compare to original alignment If the sequence between the priming sites is more similar to fungal DNAs than plant DNAs you have your first piece of endophyte DHFR DNA What now?

Take a walk into the unknown Chromosome walking is a technique for cloning everything in the genome around a known piece of DNA (the star5ng probe) Using defined unique primers to your sequence begin walking upstream and downstream of your endophyte DHFR DNA

Construct Linker Library

Anchored PCR

Bubble Linker

Analysis This process is repeated using new specific primers un5l you have recovered the en5re coding region

Analysis Now you have the en5re coding region of the endophyte DHFR gene Conduct a phylogene5c analysis to find its nearest neighbor Design primers (to non conserved regions) to specifically screen for the presence of this endophyte in other host species