Oligonucleotide Design by Multilevel Optimization
January 21, Technical Report. Oligonucleotide Design by Multilevel Optimization. Colas Schretter & Michel C. Milinkovitch
Colas Schretter (1), Michel C. Milinkovitch (1)
(1) Laboratory of Evolutionary Genetics, Institute of Molecular Biology and Medicine (IBMM), Free University of Brussels (ULB)
Colas Schretter: cschrett@ulb.ac.be; Michel C. Milinkovitch: mcmilink@ulb.ac.be

Abstract

Background: Many molecular biology experiments make use of short RNA or DNA sequences called oligonucleotides, and their success depends heavily on oligonucleotide design. The constraints and required properties of oligonucleotides vary among applications, such as long oligos for micro-arrays, primer pairs for PCR amplification and sequencing, and siRNA to knock down gene expression. Most methods proposed in the literature are conceived with a single, specific application in mind. The aim of our work is to specify a general framework for building design applications. Each algorithm presented here is a building block that can be combined with others to create a customized oligonucleotide design pipeline.

Results: We present a collection of complementary techniques for the selection of high-quality oligonucleotides for PCR and DNA array experiments. The general pipeline proceeds by successively selecting the best candidates according to various criteria, such as minimization of secondary structures, using statistical mechanics approaches, and maximization of specificity. The latter is optimized by searching the genome for each member of a short list of finalist candidates. Furthermore, we maintain diversity in the population of candidates to ensure exploration of the solution domain.

Conclusions: The candidate selection method we developed yields high-quality oligonucleotides and is implemented in a collection of design applications that is available online.
1 Background

A comparison of the most widely recognized solutions to the problem of oligonucleotide design [1,2] underscores the importance of several key features for high-quality design. Specifically, a successful computer-assisted oligonucleotide design should: enforce hard bounds on properties such as the estimated melting temperature and the oligonucleotide length; minimize the likelihood of secondary structure formation, namely hairpins, homodimers and heterodimers; achieve perfect local complementarity with the query sequence; and minimize cross-hybridization with non-target sequences within the considered genome.

However, most of these constraints are independent, so no single objective function yields solutions that are optimized across all of them. Hence, we formulate the problem as a multi-criteria optimization problem. Multi-criteria optimization methods [3] aim to find the Pareto-optimal set of solutions. A design is said to be Pareto-optimal if no feasible design improves one of the objectives without simultaneously worsening at least one other objective.

Our approach consists in pruning the domain of candidate oligonucleotides by optimizing each set of independent constraints in a tour-by-tour process. After each tour, the size of the set of candidate solutions is reduced. Furthermore, we maintain diversity within the population of candidates by selecting only nearly non-overlapping candidates, which allows a trade-off between exploitation of the best solutions found so far and exploration of the domain of potential solutions.

2 Method

Each possible candidate sub-sequence enters the selection pipeline shown in Figure 1. At each stage, the set of best candidates is determined by searching for the cluster of best solutions according to the current criterion. This approach is a heuristic, because a possibly optimal oligonucleotide could be discarded at an early stage of the selection pipeline without being tested against the other criteria. Hence, we keep the largest possible set of acceptable candidates at each stage.
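The stage-by-stage pruning described above can be illustrated with a minimal Python sketch. It is not the report's implementation: the `Stage` type, `run_pipeline`, and the two toy criteria (`gc_distance`, `longest_run`) are hypothetical names chosen for illustration, and the real pipeline uses the criteria of Sections 2.2 to 2.6 instead.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# A stage pairs a scoring function (lower is better) with the number of
# candidates allowed to survive it. Both are illustrative assumptions.
Stage = Tuple[Callable[[str], float], int]


def run_pipeline(candidates: Iterable[str], stages: Sequence[Stage]) -> List[str]:
    """Apply each stage in turn: sort by its criterion, keep the best few."""
    survivors = list(candidates)
    for score, keep in stages:
        survivors = sorted(survivors, key=score)[:keep]
    return survivors


def gc_distance(oligo: str) -> float:
    """Toy criterion: distance of the GC fraction from 50%."""
    gc = sum(base in "GC" for base in oligo) / len(oligo)
    return abs(gc - 0.5)


def longest_run(oligo: str) -> float:
    """Toy criterion: length of the longest run of one repeated base."""
    best = run = 1
    for prev, cur in zip(oligo, oligo[1:]):
        run = run + 1 if prev == cur else 1
        best = max(best, run)
    return float(best)


pool = ["ACGTACGT", "AAAAAAAA", "GGGGCCCC", "ACGGTCAT", "ATATATAT"]
finalists = run_pipeline(pool, [(gc_distance, 3), (longest_run, 2)])
```

In this toy run, a candidate dropped by the first stage is never scored by the second one, which is exactly why the approach is a heuristic and why generous `keep` values are preferable at early stages.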
[Figure 1: flowchart of the selection pipeline. Boxes: Query Sequence, Constraints Parameters, Combinatorial Generation, Constraints Filter, Dimer Energy Optimization, Domain Sampling, Specificity Optimization, Result Set.]
Figure 1: The selection pipeline. After each stage, only a subset of the candidates is retained.

2.1 Generation of Candidates

Every candidate oligonucleotide that conforms to the user-specified minimum and maximum lengths is generated from the query sequence. The number of such candidates is

$$N = \sum_{i=L_{\min}}^{L_{\max}} (W - i + 1), \quad \text{with } L_{\min} \le L_{\max} \le W,$$

where $W$ is the design area width, and $L_{\min}$ and $L_{\max}$ are the minimum and maximum oligonucleotide lengths, respectively. $N$ grows linearly with $W$ for a fixed length range (and at most quadratically when the length range grows with $W$), so the computational resources of a workstation allow the generation of every candidate for the values of $W$ and $(L_{\max} - L_{\min})$ used in practice. For example, $N = 5236$ if $W = 500$, $L_{\min} = 20$ and $L_{\max} = 30$. The complete coverage of the solution domain at this early stage ensures that an a priori optimal solution is not missed.

2.2 Filtering from Constraints

The set of candidates is then filtered against a series of user-specified hard constraints: accepted range of melting temperature estimation ($T_m$);
accepted tolerance of overlap with repeat or microsatellite regions; and minimum and mean query sequence quality at the oligonucleotide positions. Quality filtering is optional, and the user can skip that criterion. Furthermore, in the case of primer design, we provide independent quality testing for the 3'-end of the oligonucleotides.

2.2.1 $T_m$ Estimation

We use well-known, validated methods for $T_m$ estimation. If the length of the oligonucleotide sequence is < 20, we use the Wallace model [4]

$$T_m = 2\,(A + T) + 4\,(G + C)$$

where $A$, $C$, $G$ and $T$ are the numbers of the corresponding nucleotides; otherwise we use a more elaborate thermodynamic method [5]

$$T_m = \frac{T\,\Delta H}{\Delta H - \Delta G + R\,T \ln(C)} + \log[\mathrm{salt}]$$

where $T$ is an experimental temperature (in K), $\Delta H$ and $\Delta G$ are, respectively, the sums of the nearest-neighbor enthalpies and Gibbs free energies (in cal/mol), $R = 1.987$ cal/(mol K) is the molar gas constant, $C$ is the oligonucleotide concentration, and $[\mathrm{salt}]$ is a correction term that depends on the experimental salt conditions.

2.2.2 Repetitions and Microsatellites Masking

Because repeated regions are generally very numerous within genomes, they are non-specific, and oligonucleotides containing repetitions should be avoided. Therefore, we attach to each query sequence a binary mask that indicates the positions of repeats. The union of all repetition regions is found by regular expression matching. We accept masked bases within oligonucleotides with a tolerance proportional to the length of the candidate oligonucleotide. Indeed, a given oligonucleotide can be a very high-quality solution even if a few of its positions overlap masked regions.

2.3 Internal Energy Optimization

All candidate oligonucleotides that passed the above-described stages are sorted according to their internal energy, i.e., their relative risk of forming hairpin and homodimer secondary structures. All possible hairpin and homodimer configurations $s$ are enumerated.
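The candidate generation of Section 2.1 and the $T_m$ range filter of Section 2.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the report's code: the function names are hypothetical, only the Wallace rule is implemented (the report reserves it for oligonucleotides shorter than 20 bases and uses the thermodynamic method otherwise), and the query sequence and $T_m$ bounds are arbitrary.

```python
from typing import Iterable, Iterator, List, Tuple

Candidate = Tuple[int, str]  # (start position in the query, subsequence)


def count_candidates(width: int, l_min: int, l_max: int) -> int:
    """Number of candidates: sum over accepted lengths i of (W - i + 1)."""
    return sum(width - i + 1 for i in range(l_min, l_max + 1))


def generate_candidates(query: str, l_min: int, l_max: int) -> Iterator[Candidate]:
    """Yield every window of the query whose length lies in [l_min, l_max]."""
    for length in range(l_min, l_max + 1):
        for start in range(len(query) - length + 1):
            yield start, query[start:start + length]


def wallace_tm(oligo: str) -> int:
    """Wallace rule: 2 degrees per A or T, 4 degrees per G or C."""
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def filter_by_tm(candidates: Iterable[Candidate], tm_min: float,
                 tm_max: float) -> List[Candidate]:
    """Keep the candidates whose estimated T_m lies in the accepted range."""
    return [(start, oligo) for start, oligo in candidates
            if tm_min <= wallace_tm(oligo) <= tm_max]


# Arbitrary short query, so that the Wallace rule applies to every window.
candidates = list(generate_candidates("ACGTTGCAAGGCTA", 4, 6))
kept = filter_by_tm(candidates, 12, 14)
```

With `width=500`, `l_min=20` and `l_max=30`, `count_candidates` reproduces the value N = 5236 quoted in Section 2.1, and the generator yields the same number of windows for a query of 500 bases.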
We slide one oligonucleotide or primer over itself to enumerate homodimer and hairpin configurations, or over the other primer for heterodimer estimation. Each possible offset corresponds to a state $s \in S$. The value $G/U_{\mathrm{ref}}$ estimates the risk of hairpin and homodimer formation [6]:

$$G = k_B T \ln\left[\sum_{s \in S} e^{(U_s / k_B T)}\right]$$

where $k_B$ is the Boltzmann constant and $U_s$ is the internal energy of the state $s$. Hence

$$U_s = k_B T \ln e^{(U_s / k_B T)}$$

For each state $s$, we use the Wallace model [4] to weight the sum of the interactions for each base. Indeed, we estimate $U$ as

$$U = k_B \left[ 2\,(A + T) + 4\,(G + C) \right]$$

where $A$, $C$, $G$ and $T$ count the nucleotides of each type that are paired with their complement in the current dimer configuration. We normalize each total energy estimation by a reference energy value $U_{\mathrm{ref}}$ to select the best candidates regardless of oligonucleotide length. Indeed, a sorting criterion directly proportional to $U$ would systematically favor short oligonucleotides, because of their intrinsically lower hybridization energy. To compute $U_{\mathrm{ref}}$, we simply sum the interaction factor (2 or 4) associated with each base of the oligonucleotide sequence and multiply the sum by $k_B$.

2.4 Domain Sampling

We select nearly non-overlapping candidates to ensure domain exploration and to diversify the population of candidates for the subsequent specificity optimization stage. We proceed by walking through the set of candidates sorted by internal energy, as shown in Figure 2. An item is discarded if it overlaps the union of previously retained oligonucleotides by more than $t$ nucleotide positions, with a threshold $t$ of about 10. As the list is initially sorted by increasing internal energy, the procedure gives priority to oligonucleotides with lower internal energy, i.e., high-quality solutions are selected first.

The domain sampling stage is motivated by the observation that close neighbors in the candidate space, i.e., largely overlapping oligonucleotides, exhibit nearly identical scores. Therefore, diversification of the population of candidates is needed to avoid selecting a single cluster of close candidates.
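Two steps of Sections 2.3 and 2.4 are specified closely enough to be sketched in Python: the enumeration of sliding-offset dimer states with Wallace interaction factors, and the priority-based sampling. The sketch below rests on several assumptions that are not in the report: all names are hypothetical, energies are expressed in units of $k_B$, hairpin states are not modelled, every facing complementary pair is counted (contiguous or not), and the aggregation of the state energies into $G$ is left out.

```python
from typing import List, Tuple

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
WEIGHT = {"A": 2, "T": 2, "G": 4, "C": 4}  # Wallace interaction factors


def dimer_state_weights(oligo_a: str, oligo_b: str) -> List[int]:
    """Weighted count of complementary facing pairs for every sliding offset.

    oligo_b is reversed so that the strands face each other antiparallel.
    Pass the same sequence twice to enumerate homodimer states.
    """
    rev_b = oligo_b[::-1]
    weights = []
    for offset in range(-(len(rev_b) - 1), len(oligo_a)):
        total = 0
        for j, base_b in enumerate(rev_b):
            i = offset + j
            if 0 <= i < len(oligo_a) and COMPLEMENT[oligo_a[i]] == base_b:
                total += WEIGHT[base_b]
        weights.append(total)
    return weights


def reference_weight(oligo: str) -> int:
    """U_ref in units of k_B: sum of the factor (2 or 4) of each base."""
    return sum(WEIGHT[base] for base in oligo)


Scored = Tuple[float, int, int]  # (internal energy score, start, length)


def sample_domain(candidates: List[Scored], max_overlap: int = 10) -> List[Scored]:
    """Walk candidates by increasing score and keep those overlapping the
    union of already retained spans by at most max_overlap positions."""
    covered = set()
    retained = []
    for score, start, length in sorted(candidates):
        span = set(range(start, start + length))
        if len(span & covered) > max_overlap:
            continue  # too close to previously retained candidates
        retained.append((score, start, length))
        covered |= span
    return retained


states = dimer_state_weights("GCGC", "GCGC")
scored = [(0.1, 0, 20), (0.2, 5, 20), (0.3, 15, 20), (0.4, 40, 20), (0.5, 25, 20)]
sampled = sample_domain(scored, max_overlap=10)
```

In the toy sampling run, the second candidate is rejected because 15 of its positions are already covered by the first one, while the third is kept because it overlaps the retained region by only 5 positions. The fifth is rejected because its overlap is measured against the union of all retained spans, not against each span separately.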
Figure 2: Priority-based sampling. Rectangles represent the relative positions and lengths of oligos. All candidates are sorted vertically by their internal energy score. If a candidate is selected, its rectangle is filled in black. Regions masked by previously selected spans are cast in grey onto the following candidates. The first oligonucleotide, i.e., the one with the lowest internal energy, is always retained. The 6th candidate, for example, is discarded because of its overlap with the union of previously retained oligonucleotides.

2.5 Pairing of Primers

Oligonucleotide design for PCR applications generally requires further constraints, such as the pairing of oligonucleotides (then called primers), a low difference in $T_m$ between the two members of the pair, a maximum amplicon size, and a minimal risk of heterodimer formation. We generate a fixed number of primer pairs in increasing order of internal energy score. Then, we reject pairs whose amplicon falls outside the user-defined size range. Valid pairs are sorted by increasing heterodimer risk, evaluated with the thermodynamic model presented in section 2.3.

2.6 Specificity Optimization

The final selection stage identifies solutions that minimize cross-hybridization (i.e., hybridization with a non-target sequence within the genome). We evaluate specificity by determining, from the output of a Blast query:
1. whether the first hit corresponds to a perfect match of the candidate on the genome,
2. the number of base matches in the second hit.
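The two criteria above can be turned into a ranking key, as in the following Python sketch. It assumes that the Blast output has already been parsed into a small record per candidate; the `BlastSummary` type and its fields are hypothetical, not part of any Blast tool or of the report. Ranking candidates without a perfect first hit last (rather than discarding them) is also an assumption. The pair score takes the worse of its two members, as the report defines it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlastSummary:
    """Hypothetical digest of a parsed Blast report for one candidate."""
    oligo: str
    first_hit_matches: int   # matched bases in the best hit
    second_hit_matches: int  # matched bases in the second-best hit (0 if none)


def has_perfect_first_hit(summary: BlastSummary) -> bool:
    """Criterion 1: the best hit matches the candidate over its full length."""
    return summary.first_hit_matches == len(summary.oligo)


def rank_by_specificity(summaries: List[BlastSummary]) -> List[BlastSummary]:
    """Sort by increasing second-hit matches (criterion 2); candidates that
    lack a perfect first hit are ranked last (an assumption of this sketch)."""
    return sorted(
        summaries,
        key=lambda s: (not has_perfect_first_hit(s), s.second_hit_matches),
    )


def pair_specificity(forward: BlastSummary, reverse: BlastSummary) -> int:
    """Score of a primer pair: the worse (larger) score of its two members."""
    return max(forward.second_hit_matches, reverse.second_hit_matches)


a = BlastSummary("ACGTACGTAC", first_hit_matches=10, second_hit_matches=7)
b = BlastSummary("TTGACCGTAA", first_hit_matches=10, second_hit_matches=4)
c = BlastSummary("GGCATTCAGA", first_hit_matches=9, second_hit_matches=2)
ranking = rank_by_specificity([a, b, c])
```

Here `b` ranks first because its second-best hit shares the fewest bases with it, and `c` ranks last despite its low second-hit count because its first hit is not a perfect match.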
Candidates are sorted by increasing Blast score of the second best hit: more specific oligonucleotides are higher in the list. The specificity score of a primer pair is defined as the worse specificity score of its two members. Our approach therefore requires, for each candidate, a Blast pass on the considered genome or on a database of mRNA, depending on the specific application. To speed up this process, we propose to extract, for the considered genome, a database of non-specific regions dedicated to specificity testing. A given region is defined to be non-specific if it is similar to another region within the genome. Hence, a significant first hit against this database demonstrates poor specificity. A few alternative specificity evaluation heuristics are proposed in the literature [7,8].

3 Implementation and Results

Two oligonucleotide design applications have been implemented in Java using our common multilevel optimization pipeline, namely OptiAmp (Design of Primers for PCR Amplifications) and LOD (Long Oligo Design). The OligoFaktory web portal embeds these bioinformatic tools in a web-based framework. The dynamic and interactive web application provides a consistent form-based input interface and presentation of outputs. Each plugin tool reads an input parameter file and writes its results to an output file. Both input and output files conform to a common XML interchange file format. An XHTML form is associated with each tool to collect parameters from the user's input and to produce the input XML files. For all applications, a unified presentation of result sheets provides distribution graphs and location bar graphs to visualize the result set. Moreover, easy-to-spot warning flags are shown in case of problems with hairpin and homodimer secondary structures and/or with specificity. The project aims to assist researchers with painless, rapid, automated and reliable design.
3.1 Brucellas Design

A hybrid micro-array was designed to capture the expression of genes for both the Brucella suis 1330 and Brucella melitensis 16M microbial species. Pairing of orthologous genes and alignments of consensus sequences were performed as explained in [9]. This preprocessing yielded 2853 consensus sequences. Figure 3 shows charts of relevant features of the designed oligonucleotide set. Note that the oligonucleotide locations tend to cluster at the 3'-end of the query sequences. The specificity score indicates the maximum number of non-specific base pairings on the second best hit found on the genome of Brucella suis. The first hit corresponds to the perfect match of the oligonucleotide on its target genes in both suis and melitensis. The temperature range does not exceed two degrees, as specified by the constraints.

[Figure 3: four panels. (a) Oligo length distribution; (b) relative oligo locations; (c) $T_m$ distances from 79°; (d) specificity score distribution.]
Figure 3: Features of the designed micro-array.

4 Discussion and Conclusions

We have presented a general framework for building oligonucleotide design applications. The method is based on the principles of multilevel optimization strategies. A general pipeline of candidate selection is described, and algorithmic details of each stage are given. Compared to current techniques, the main contribution of our method is the internal energy optimization stage, based on statistical mechanics principles. Indeed, we observed that a poor evaluation of dimerization risk is one of the most important causes of failure in applications, if not the most important. Our proposal is a refined and realistic model for evaluating the risk of dimer formation. Evaluating this model requires more computational resources than the common heuristics. However, today's workstations allow the design of large batches of queries with our method. A refined model based on nearest neighbors could be used to better estimate the internal energy $U$. However, we observed that the classification of dimer configurations performed with the computationally cheap Wallace model is accurate in practice.
The domain sampling strategy is essential to allow a trade-off between exploration of the solution space and exploitation of the best solutions found so far. Furthermore, additional optimization constraints can be taken into account by appending successive domain sampling stages, each followed by selection. Introducing diversification within the population of candidates is a key point for a parameterizable multi-criteria optimization pipeline.

Acknowledgments

The Internal Energy Optimization section has greatly benefited from discussions with Daniel Van Belle on statistical mechanics. This work was supported by the Université Libre de Bruxelles (ULB) and the Région Wallonne (BioRobot-Initiative).

References

1. Burpo JF: A critical review of PCR primer design algorithms and crosshybridization case study. Tech. rep., Department of Chemical Engineering, Stanford University.
2. Kampke T, Kieninger M, Mecklenburg M: Efficient primer design algorithms. Bioinformatics 2001, 17(3).
3. Candler W, Norton R: Multilevel Programming. Tech. rep., unpublished research memorandum, DRC, World Bank, Washington.
4. Wallace RB, Shaffer J, Murphy RF, Bonner J, Hirose T, Itakura K: Hybridization of synthetic oligodeoxyribonucleotides to phi chi 174 DNA: the effect of single base pair mismatch. Nucleic Acids Research 1979.
5. Wetmur JG, Sninsky JJ: PCR Strategies, Academic Press 1995, chap. 6.
6. Groebe DR, Uhlenbeck OC: Characterization of RNA hairpin loop stability. Nucleic Acids Research 1988.
7. Kurata K, Nakamura H: Novel Method for Primer/Probe Design and Sequence Analysis. Tech. rep., School of Engineering, The University of Tokyo.
8. Rahmann S: Fast Large Scale Oligonucleotide Selection Using the Longest Common Factor Approach. Journal of Bioinformatics and Computational Biology 2003, 1(2).
9. Schretter C, Milinkovitch MC: Automated Long Oligo Design on Consensus Regions of Similar Genomes. Tech. rep., Unit of Evolutionary Genetics, Université Libre de Bruxelles.
More informationFactors affecting PCR
Lec. 11 Dr. Ahmed K. Ali Factors affecting PCR The sequences of the primers are critical to the success of the experiment, as are the precise temperatures used in the heating and cooling stages of the
More informationChIP-seq and RNA-seq. Farhat Habib
ChIP-seq and RNA-seq Farhat Habib fhabib@iiserpune.ac.in Biological Goals Learn how genomes encode the diverse patterns of gene expression that define each cell type and state. Protein-DNA interactions
More informationOligo. Version 6 for Macintosh. Primer Analysis Software. DEMO Guide. Wojciech Rychlik. 1999, Molecular Biology Insights, Inc.
Oligo Primer Analysis Software Version 6 for Macintosh DEMO Guide Wojciech Rychlik Molecular Biology Insights, Inc. 8685 Hwy 24 Cascade, CO 889 Phone: (8) 747-462 (79) 684 7988 Fax: (79) 684 7989 www.oligo.net
More informationSupplementary Information for:
Supplementary Information for: A streamlined and high-throughput targeting approach for human germline and cancer genomes using Oligonucleotide-Selective Sequencing Samuel Myllykangas 1, Jason D. Buenrostro
More informationHuman Genome Sequencing Over the Decades The capacity to sequence all 3.2 billion bases of the human genome (at 30X coverage) has increased
Human Genome Sequencing Over the Decades The capacity to sequence all 3.2 billion bases of the human genome (at 30X coverage) has increased exponentially since the 1990s. In 2005, with the introduction
More informationFeature Selection of Gene Expression Data for Cancer Classification: A Review
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 50 (2015 ) 52 57 2nd International Symposium on Big Data and Cloud Computing (ISBCC 15) Feature Selection of Gene Expression
More informationSMRT Analysis Barcoding Overview (v6.0.0)
SMRT Analysis Barcoding Overview (v6.0.0) Introduction This document applies to PacBio RS II and Sequel Systems using SMRT Link v6.0.0. Note: For information on earlier versions of SMRT Link, see the document
More informationA Greedy Algorithm for Minimizing the Number of Primers in Multiple PCR Experiments
A Greedy Algorithm for Minimizing the Number of Primers in Multiple PCR Experiments Koichiro Doi Hiroshi Imai doi@is.s.u-tokyo.ac.jp imai@is.s.u-tokyo.ac.jp Department of Information Science, Faculty of
More informationSequence Analysis. II: Sequence Patterns and Matrices. George Bell, Ph.D. WIBR Bioinformatics and Research Computing
Sequence Analysis II: Sequence Patterns and Matrices George Bell, Ph.D. WIBR Bioinformatics and Research Computing Sequence Patterns and Matrices Multiple sequence alignments Sequence patterns Sequence
More informationChapter 5. Structural Genomics
Chapter 5. Structural Genomics Contents 5. Structural Genomics 5.1. DNA Sequencing Strategies 5.1.1. Map-based Strategies 5.1.2. Whole Genome Shotgun Sequencing 5.2. Genome Annotation 5.2.1. Using Bioinformatic
More informationDesigning TaqMan MGB Probe and Primer Sets for Gene Expression Using Primer Express Software Version 2.0
Designing TaqMan MGB Probe and Primer Sets for Gene Expression Using Primer Express Software Version 2.0 Overview This tutorial details how a TaqMan MGB Probe can be designed over a specific region of
More informationThe goal of this project was to prepare the DEUG contig which covers the
Prakash 1 Jaya Prakash Dr. Elgin, Dr. Shaffer Biology 434W 10 February 2017 Finishing of DEUG4927010 Abstract The goal of this project was to prepare the DEUG4927010 contig which covers the terminal 99,279
More informationDNA concentration and purity were initially measured by NanoDrop 2000 and verified on Qubit 2.0 Fluorometer.
DNA Preparation and QC Extraction DNA was extracted from whole blood or flash frozen post-mortem tissue using a DNA mini kit (QIAmp #51104 and QIAmp#51404, respectively) following the manufacturer s recommendations.
More informationAdvisors: Prof. Louis T. Oliphant Computer Science Department, Hiram College.
Author: Sulochana Bramhacharya Affiliation: Hiram College, Hiram OH. Address: P.O.B 1257 Hiram, OH 44234 Email: bramhacharyas1@my.hiram.edu ACM number: 8983027 Category: Undergraduate research Advisors:
More informationInsights from the first RT-qPCR based human transcriptome profiling based on wet lab validated assays
Slide 1 of 38 Insights from the first RT-qPCR based human transcriptome profiling based on wet lab validated assays Jan Hellemans, PhD CEO Biogazelle qpcr & NGS 2013 Freising, Germany March 19, 2013 Biogazelle
More informationBootcamp: Molecular Biology Techniques and Interpretation
Bootcamp: Molecular Biology Techniques and Interpretation Bi8 Winter 2016 Today s outline Detecting and quantifying nucleic acids and proteins: Basic nucleic acid properties Hybridization PCR and Designing
More informationDNA Chip Technology Benedikt Brors Dept. Intelligent Bioinformatics Systems German Cancer Research Center
DNA Chip Technology Benedikt Brors Dept. Intelligent Bioinformatics Systems German Cancer Research Center Why DNA Chips? Functional genomics: get information about genes that is unavailable from sequence
More informationA critical review of PCR primer design algorithms and crosshybridization
Biochemistry 218 A critical review of PCR primer design algorithms and crosshybridization case study F. John Burpo Department of Chemical Engineering, Stanford University, CA 94305 Submitted August 11,
More informationMethods of Biomaterials Testing Lesson 3-5. Biochemical Methods - Molecular Biology -
Methods of Biomaterials Testing Lesson 3-5 Biochemical Methods - Molecular Biology - Chromosomes in the Cell Nucleus DNA in the Chromosome Deoxyribonucleic Acid (DNA) DNA has double-helix structure The
More informationPolymerase Chain Reaction-361 BCH
Polymerase Chain Reaction-361 BCH 1-Polymerase Chain Reaction Nucleic acid amplification is an important process in biotechnology and molecular biology and has been widely used in research, medicine, agriculture
More informationMolecular Biology and Pooling Design
Molecular Biology and Pooling Design Weili Wu 1, Yingshu Li 2, Chih-hao Huang 2, and Ding-Zhu Du 2 1 Department of Computer Science, University of Texas at Dallas, Richardson, TX 75083, USA weiliwu@cs.utdallas.edu
More informationMotivation From Protein to Gene
MOLECULAR BIOLOGY 2003-4 Topic B Recombinant DNA -principles and tools Construct a library - what for, how Major techniques +principles Bioinformatics - in brief Chapter 7 (MCB) 1 Motivation From Protein
More informationGenetics and Genomics in Medicine Chapter 3. Questions & Answers
Genetics and Genomics in Medicine Chapter 3 Multiple Choice Questions Questions & Answers Question 3.1 Which of the following statements, if any, is false? a) Amplifying DNA means making many identical
More informationNew Plant Breeding Technologies
New Plant Breeding Technologies Ricarda A. Steinbrecher, PhD EcoNexus / ENSSER Berlin, 07 May 2015 r.steinbrecher@econexus.info distributed by EuropaBio What are the NPBTs? *RNAi *Epigenetic alterations
More informationM1D2: Diagnostic Primer Design 2/10/15
M1D2: Diagnostic Primer Design 2/10/15 Announcements 1. Expanded office hours for this week: Wednesday, 3-5pm in 16-319 Friday, 3-5pm in 16-319 Sunday, 3-5pm in 16-319 2. Weekly office hours (starting
More informationDeakin Research Online
Deakin Research Online This is the published version: Church, Philip, Goscinski, Andrzej, Wong, Adam and Lefevre, Christophe 2011, Simplifying gene expression microarray comparative analysis., in BIOCOM
More informationSelecting Specific PCR Primers with MFEprimer. Wubin Qu and Chenggang Zhang
Chapter 15 Selecting Specific PCR Primers with MFEprimer Abstract Selecting specific primers is crucial for polymerase chain reaction (PCR). Nonspecific primers will bind to unintended genes and result
More informationChang Xu Mohammad R Nezami Ranjbar Zhong Wu John DiCarlo Yexun Wang
Supplementary Materials for: Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller Chang Xu Mohammad R Nezami Ranjbar Zhong Wu John
More informationMicroarray Probe Design Using ɛ-multi-objective Evolutionary Algorithms with Thermodynamic Criteria
Microarray Probe Design Using ɛ-multi-objective Evolutionary Algorithms with Thermodynamic Criteria Soo-Yong Shin, In-Hee Lee, and Byoung-Tak Zhang Biointelligence Laboratory, School of Computer Science
More informationMotif Discovery from Large Number of Sequences: a Case Study with Disease Resistance Genes in Arabidopsis thaliana
Motif Discovery from Large Number of Sequences: a Case Study with Disease Resistance Genes in Arabidopsis thaliana Irfan Gunduz, Sihui Zhao, Mehmet Dalkilic and Sun Kim Indiana University, School of Informatics
More informationMetaheuristics. Approximate. Metaheuristics used for. Math programming LP, IP, NLP, DP. Heuristics
Metaheuristics Meta Greek word for upper level methods Heuristics Greek word heuriskein art of discovering new strategies to solve problems. Exact and Approximate methods Exact Math programming LP, IP,
More informationSequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es
Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es Outline Sequencing technologies: Sanger 2nd generation sequencing: 3er generation sequencing: 454 Illumina SOLiD Ion Torrent PacBio
More informationA Sequencing Heuristic to Minimize Weighted Flowtime in the Open Shop
A Sequencing Heuristic to Minimize Weighted Flowtime in the Open Shop Eric A. Siy Department of Industrial Engineering email : eric.siy@dlsu.edu.ph Abstract: The open shop is a job shop with no precedence
More informationRNA Secondary Structure Prediction Computational Genomics Seyoung Kim
RNA Secondary Structure Prediction 02-710 Computational Genomics Seyoung Kim Outline RNA folding Dynamic programming for RNA secondary structure prediction Covariance model for RNA structure prediction
More informationAnalysis of Microarray Data
Analysis of Microarray Data Lecture 1: Experimental Design and Data Normalization George Bell, Ph.D. Senior Bioinformatics Scientist Bioinformatics and Research Computing Whitehead Institute Outline Introduction
More informationMicroarrays & Gene Expression Analysis
Microarrays & Gene Expression Analysis Contents DNA microarray technique Why measure gene expression Clustering algorithms Relation to Cancer SAGE SBH Sequencing By Hybridization DNA Microarrays 1. Developed
More informationGrundlagen der Bioinformatik Summer Lecturer: Prof. Daniel Huson
Grundlagen der Bioinformatik, SoSe 11, D. Huson, April 11, 2011 1 1 Introduction Grundlagen der Bioinformatik Summer 2011 Lecturer: Prof. Daniel Huson Office hours: Thursdays 17-18h (Sand 14, C310a) 1.1
More informationThousands of corresponding human and mouse genomic regions unalignable in primary sequence contain. Elfar Þórarinsson February 2006
Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure Elfar Þórarinsson February 2006 It s interesting to note that: Approximately half
More informationFinishing of Fosmid 1042D14. Project 1042D14 is a roughly 40 kb segment of Drosophila ananassae
Schefkind 1 Adam Schefkind Bio 434W 03/08/2014 Finishing of Fosmid 1042D14 Abstract Project 1042D14 is a roughly 40 kb segment of Drosophila ananassae genomic DNA. Through a comprehensive analysis of forward-
More informationBIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology. Lecture 2: Microarray analysis
BIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology Lecture 2: Microarray analysis Genome wide measurement of gene transcription using DNA microarray Bruce Alberts, et al., Molecular Biology
More informationRoche Molecular Biochemicals Technical Note No. LC 6/99
Roche Molecular Biochemicals Technical Note No. LC 6/99 LightCycler Selection of Hybridization Probe Sequences for Use with the LightCycler Olfert Landt and Andreas Nitsche, TIB MOLBIOL, Berlin 1. Introduction
More information