Characterization of transcription factor binding sites by high-throughput SELEX. Overview of the HTPSELEX Database
- Stephanie Johnston
Transcription
1 Characterization of transcription factor binding sites by high-throughput SELEX: Overview of the HTPSELEX Database
Transcription Factor Binding Sites: Features and Facts
- Degenerate sequence motifs
- Typical length: 6-20 bp
- Low information content: 8-12 bits (1 site per … bp)
- Quantitative recognition mechanism: the measurable affinity of different sites may vary over three orders of magnitude
- Regulatory function often depends on cooperative interactions with neighboring sites
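As a rough guide to what an information content of 8-12 bits means in practice, a motif with IC bits is expected to match random sequence about once every 2^IC positions on one strand. The sketch below assumes a uniform background base composition and ignores the second strand; the function name is illustrative only.

```python
def expected_spacing_bp(information_bits: float) -> float:
    """Approximate number of random positions per chance match (2**IC).

    Assumption: uniform background (25% each base), one strand only.
    """
    return 2.0 ** information_bits


for bits in (8, 12):
    print(f"{bits} bits: about 1 chance match per {expected_spacing_bp(bits):.0f} bp")
```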
2 Representation of the binding specificity by a scoring matrix (also referred to as a weight matrix)
[Figure: scoring matrix with one row per base and one column per site position. Example scores: strong binding site = 43, random sequence = -83.]
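Scoring a candidate site with a weight matrix amounts to adding one weight per position, chosen by the base found there. The following is a minimal sketch; the four-column matrix and its values are hypothetical and are not the matrix shown on the slide.

```python
# Hypothetical 4-position weight matrix (illustrative values only).
# Each column maps a base to its weight at that position.
WEIGHTS = [
    {"A": -10, "C": -10, "G": 12, "T": -8},
    {"A": -9, "C": -11, "G": 11, "T": -7},
    {"A": -12, "C": 13, "G": -10, "T": -9},
    {"A": 5, "C": -6, "G": -8, "T": -5},
]


def score(site: str, weights=WEIGHTS) -> int:
    """Sum the position-specific weights of the bases in `site`."""
    if len(site) != len(weights):
        raise ValueError("site length must equal the number of matrix columns")
    return sum(column[base] for column, base in zip(weights, site))


print(score("GGCA"))  # best-matching site for this toy matrix
print(score("ATAT"))  # poorly matching site
```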
3 Physical interpretation of a weight matrix
Weight matrix elements represent relative binding energies between DNA base pairs and protein surface areas (base-pair acceptor sites). A weight matrix column describes the base preferences of one base-pair acceptor site.

Berg-von Hippel model of protein-DNA interactions
The weight matrix score expresses the binding free energy of the protein-DNA complex in arbitrary units:
\[ S(x) = \sum_{i=1}^{N} w_i(x_i), \qquad \Delta G(x) = -S(x) + \text{const.} \]
It is convenient to express the binding free energy in dimension-free RT units:
\[ E(x) = \sum_{i=1}^{N} \varepsilon_i(x_i), \qquad \varepsilon_i(b) \propto -\frac{w_i(b)}{RT} \]
On a relative scale, the binding constant for sequence x is given by:
\[ K_{rel}(x) = e^{-E(x)} \]
For sequences longer than the weight matrix:
\[ K_{rel}(x) = \sum_{i} e^{-E(x_i \ldots x_{i+N-1})} \quad \text{or} \quad K_{rel}(x) = \max_{i} e^{-E(x_i \ldots x_{i+N-1})} \]
(the index i runs over all subsequence starting positions on both strands)
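The relative binding constant of a sequence longer than the matrix can be sketched as follows: every window of matrix length on both strands contributes a Boltzmann factor exp(-E), and the factors are either summed or reduced to their maximum. The energy matrix below is hypothetical, with energies assumed to be in dimension-free RT units and the best base set to zero.

```python
import math

# Hypothetical 3-position energy matrix in RT units (illustrative values only).
ENERGIES = [
    {"A": 2.0, "C": 2.0, "G": 0.0, "T": 2.0},
    {"A": 2.0, "C": 2.0, "G": 0.0, "T": 2.0},
    {"A": 2.0, "C": 0.0, "G": 2.0, "T": 2.0},
]

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def energy(site: str, energies=ENERGIES) -> float:
    """Additive binding energy E(x) of a site of matrix length."""
    return sum(column[base] for column, base in zip(energies, site))


def window_energies(seq: str, energies=ENERGIES) -> list:
    """Energies of all windows of matrix length on both strands."""
    n = len(energies)
    result = []
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - n + 1):
            result.append(energy(strand[i:i + n], energies))
    return result


def k_rel_sum(seq: str) -> float:
    """Relative binding constant as the sum of Boltzmann factors."""
    return sum(math.exp(-e) for e in window_energies(seq))


def k_rel_max(seq: str) -> float:
    """Relative binding constant approximated by the best window only."""
    return max(math.exp(-e) for e in window_energies(seq))


print(k_rel_sum("AAGGCAA"), k_rel_max("AAGGCAA"))
```

The sum is never smaller than the maximum; the two agree closely when a single window dominates.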
4 Berg-von Hippel Theory and Information Content
The energy terms of a weight matrix can be computed from the base frequencies p_i(b) found in in vitro or in vivo selected binding sites:
\[ \varepsilon_i(b) = -\frac{1}{\lambda} \ln \frac{p_i(b)}{q(b)} \]
Here q(b) is the background frequency of base b, and λ is an unknown parameter related to the stringency of the binding conditions.
The information content of a binding site has been defined as the relative entropy of the base frequency matrix with respect to the background base frequencies:
\[ IC = \sum_{i=1}^{N} \sum_{b=A}^{T} p_i(b) \log_2 \frac{p_i(b)}{q(b)} \]
Paradox: λ depends on the selection conditions (e.g. the protein concentration). The base frequencies observed in selected binding sites therefore do not reflect a protein-intrinsic property.

Weight matrices/profiles from a biochemical viewpoint
- A weight matrix expresses the sequence specificity of a DNA-binding protein.
- A column describes the base preferences of a surface area of the DNA-binding protein.
- Weights of a weight matrix can be interpreted as additive binding energy contributions. No interactions between binding site positions!
- According to the Berg-von Hippel theory, negated binding energies are proportional to the logarithms of the base frequencies observed in an in vivo or in vitro selected set of binding sites.
- Weight matrices can thus be used to compute relative binding energies or dissociation constants for oligonucleotides of any sequence, which in turn can be determined experimentally by gel shift experiments.
- An accurate weight matrix for the binding specificity of a transcription factor is one that accurately predicts binding constants.
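Both quantities on this slide can be computed directly from a base frequency matrix. The sketch below assumes a uniform background and treats λ as a free parameter; the function names and the example frequencies are illustrative, and the energy function assumes all frequencies are positive (for instance after adding pseudocounts).

```python
import math

UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def energies_from_frequencies(freqs, background=UNIFORM, lam=1.0):
    """epsilon_i(b) = -(1/lambda) * ln(p_i(b) / q(b)); requires p_i(b) > 0."""
    return [
        {b: -math.log(p / background[b]) / lam for b, p in column.items()}
        for column in freqs
    ]


def information_content(freqs, background=UNIFORM):
    """Relative entropy in bits; terms with p_i(b) = 0 contribute nothing."""
    return sum(
        p * math.log2(p / background[b])
        for column in freqs
        for b, p in column.items()
        if p > 0
    )


# Hypothetical two-column frequency matrix.
example = [
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.5, "C": 0.2, "G": 0.2, "T": 0.1},
]
print(information_content(example))
print(energies_from_frequencies(example, lam=1.0)[0]["G"])
print(energies_from_frequencies(example, lam=2.0)[0]["G"])
```

The same frequencies yield different energies for different λ, which is the point of the paradox: without knowing the selection stringency, the energy scale is undetermined.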
5 Experimental techniques for estimating the parameters of a TF specificity matrix
Technique: result
- Competitive bandshifts (EMSA): relative binding constants of oligonucleotides
- Alignment of in vivo sites: base frequency matrix (from sequences)
- In vitro selection (SELEX): base frequency matrix (up to 200 sequences)
- SAGE/SELEX: base frequency matrix (up to … binding sequences)
- Exhaustive mutagenesis + K_rel assay: intrinsic specificity matrix
- Protein binding arrays + magic algorithm: intrinsic specificity matrix

Some problems and limitations:
- A base probability matrix is generated by an alignment or probabilistic modeling algorithm, not by direct observation
- K_rel is usually not very precise (within a factor of 2)
- Point mutations may create a binding site in another frame

Modeling of a Transcription Factor Binding Site from High Throughput SELEX Data Using a Hidden Markov Modeling Approach
Emmanuelle Roulet, Nicolas Mermod (Center for Biotechnology UNIL-EPFL, Lausanne, Switzerland)
Anamaria A Camargo, Andrew JG Simpson (Ludwig Institute of Cancer Research, Sao Paulo, Brazil)
Philipp Bucher (Swiss Institute for Experimental Cancer Research and Swiss Institute of Bioinformatics, Epalinges s/Lausanne, Switzerland)
Nat. Biotechnol. 20 (2002)
6 Motivation and Goals of the Project
Motivation: Accurate and reliable computational tools to predict transcription factor binding sites are still not available.
Potential reasons:
1. Lack of adequate experimental data
2. Lack of adequate computational models
3. Lack of an adequate method to estimate the parameters of a computational model from the experimental data
Goal: To develop a combined computational-experimental protocol to derive an accurate predictive model of the sequence specificity of a DNA-binding protein
Potential benefits:
1. Being able to predict transcription factor binding in genome sequences
2. Insights into molecular mechanisms of sequence-specific protein-DNA interactions
3. Ability to rationally design gene control regions of desired properties for biotechnological applications
7 Our Approach to the Problem of Characterizing the Sequence Specificity of a DNA-Binding Transcription Factor
1. Choice of a quantitative predictive model for representing the binding specificity. Our choice: a profile HMM
2. Choice of an experimental method to generate data for estimating the model parameters. Our choice: a SELEX experiment
3. Choice of a machine learning algorithm to estimate the model parameters from the data. Our choice: the Baum-Welch HMM training algorithm
4. Validation of the approach and optimization of the experimental parameters by a computer simulation of steps 2 and 3
5. Adjustment of the experimental protocol to produce the necessary data as suggested by the computer simulation
6. Generation of the experimental data
7. Building a binding site model from the data
8. A posteriori validation of the model by cross-validation and comparison with independent experimental results

Study Object: Transcription Factor CTF/NFI
- Dimeric DNA-binding protein recognizing a palindromic sequence motif with consensus sequence TTGGC(N5)GCCAA
- First isolated as a replication factor of Adenovirus type 2
- Later independently isolated as a CCAAT-box binding transcription factor
- Can activate transcription of a reporter gene in transfected cells
- Recently shown to be implicated in regulatory pathways related to tumor progression and immune response
- Biochemical mechanism of gene regulation still elusive
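Step 4 of the approach relies on a computer simulation of the selection experiment. A very reduced sketch of such a simulation is given here: a population is described by the fraction of sequences in each binding-energy class, and each cycle keeps sequences in proportion to an assumed binding probability 1 / (1 + exp(E - mu)). The binding probability function, the parameter mu, and the starting population are assumptions made for illustration and are not taken from the slides.

```python
import math


def bound_fraction(energy: float, mu: float) -> float:
    """Assumed probability that a sequence with energy E (RT units) is bound."""
    return 1.0 / (1.0 + math.exp(energy - mu))


def selex_cycle(population: dict, mu: float) -> dict:
    """One selection-plus-amplification cycle on energy-class fractions."""
    selected = {e: f * bound_fraction(e, mu) for e, f in population.items()}
    total = sum(selected.values())
    return {e: f / total for e, f in selected.items()}


def mean_energy(population: dict) -> float:
    return sum(e * f for e, f in population.items())


# Hypothetical starting library: mostly weak binders (high energy).
population = {0.0: 0.01, 2.0: 0.09, 4.0: 0.30, 6.0: 0.60}
history = [mean_energy(population)]
for _ in range(4):
    population = selex_cycle(population, mu=2.0)
    history.append(mean_energy(population))

print([round(value, 3) for value in history])
```

In this toy model the mean energy drops with every cycle, so the population is progressively enriched in high-affinity sequences; how fast it drops depends on mu, which stands in for the stringency of the selection.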
8 Old CTF/NFI Binding Site Profile
[Figure: old scoring profile. Example: GGGCAAAGCCAC, score = 88]
9 Random sequence library and selection protocol
[Figure: HTP SELEX/SAGE workflow. Panel labels: random sequence library (random core N between constant flanks); second strand synthesis by PCR (Primer 1, Bgl II sites); selection of binding sequences (gel shift); amplification (Primer 2); selection cycles; digestion with Bgl II; concatemerization and cloning (site 1, site 2, site 3); HTS sequencing.]

Principle of the Baum-Welch hidden Markov model training algorithm
[Figure: initial model, training sequences, trained model]
How does it work?
1. The initial model serves as the current model.
2. Training sequences are aligned to the current model.
3. New base and transition frequencies are estimated from the multiple alignment generated in step 2. The new model becomes the current model.
4. Steps 2 and 3 are repeated until convergence is reached.
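The align-then-re-estimate loop described above can be sketched in a few lines. This is a deliberately simplified stand-in, not the training procedure used for the published model: it handles only a gapless motif of fixed length on one strand, and it uses the single best alignment of each sequence (Viterbi-style "hard" training), whereas full Baum-Welch training accumulates expected counts over all possible alignments and also re-estimates transition frequencies. The training sequences, initial model, and pseudocount are hypothetical.

```python
import math

BASES = "ACGT"


def best_window(seq, freqs, background=0.25):
    """Start position of the window with the highest log-odds score."""
    n = len(freqs)

    def log_odds(i):
        return sum(math.log(freqs[j][seq[i + j]] / background) for j in range(n))

    return max(range(len(seq) - n + 1), key=log_odds)


def reestimate(seqs, starts, n, pseudocount=0.5):
    """New base frequencies from the aligned windows (with pseudocounts)."""
    freqs = []
    for j in range(n):
        counts = {b: pseudocount for b in BASES}
        for seq, start in zip(seqs, starts):
            counts[seq[start + j]] += 1
        total = sum(counts.values())
        freqs.append({b: c / total for b, c in counts.items()})
    return freqs


def train(seqs, initial, max_iter=50):
    """Repeat alignment and re-estimation until the alignment stops changing."""
    freqs, starts = initial, None
    for _ in range(max_iter):
        new_starts = [best_window(seq, freqs) for seq in seqs]
        if new_starts == starts:
            break
        starts = new_starts
        freqs = reestimate(seqs, starts, len(freqs))
    return freqs, starts


# Hypothetical training set with a shared 4-bp motif, and a weak initial model.
seqs = ["AAGGCATT", "TGGCAAAT", "CCTAGGCA", "ATGGCATA"]
initial = [
    {b: (0.4 if b == favored else 0.2) for b in BASES} for favored in "GGCA"
]
freqs, starts = train(seqs, initial)
consensus = "".join(max(column, key=column.get) for column in freqs)
print(starts, consensus)
```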
12 Doing the Experiment
13 Results CTF/NFI
[Table: sequencing statistics per SELEX cycle and in total (SUM). Recoverable labels: sequence reads, sites; clone statistics (colonies, clones with detectable inserts, clones); site statistics (different sites, sites with err < 0.01/bp). Recoverable value: 1481.]
New CTF/NFI model
[Figure: hidden Markov model (frequencies given in %) and scoring profile (relative energy units)]
14 Predicted and observed evolution of SELEX populations
[Figure: theoretically predicted affinity profiles of successive SELEX cycles (Djordjevic & Sengupta 2006), and weight matrix scores for successive CTF/NFI HTP SELEX populations (Roulet et al. 2002). Axes run from low to high affinity.]

Major Differences between New and Old CTF/NFI Binding Site Models
- The new model contains a sixth half-site position, reducing the major spacer length class to 3. This extends the consensus half-site motif to TTGGCA.
- Alternative spacer length classes N4 and N5 (N6 and N7 according to the old numbering system) receive much more severe penalties in the new profile. Based on the estimated frequencies, it is not certain whether these binding modes occurred at all during SELEX amplification.
- The G mismatch at the first position of the half-site weight matrix has a much lower weight in the new model.
16 Quality Assessment of the New Model: Comparison of Predicted Binding Scores with in vitro Measured Binding Constants
Data from Meisterernst et al. (1988). Nucl. Acids Res. 16.
17 Beyond simple weight matrices: correlated dinucleotide analysis
HTP SELEX sequencing totals for members of the TCF family
[Table: one column per SELEX library (LEF1 libraries LEF1_2 through LEF1_7 with SUM, LBC libraries with SUM, TCF4_3), grouped as LEF1/TCF-1α, LEF1/TCF-1α with β-catenin, and TCF4. Rows: total number of sites, total number of unique sites, % of sites with error rate < 0.01% per bp.]
18 Base frequency tables for DNA binding sites of TCF family members derived by HTP SELEX
[Tables: PSSM of LEF1/TCF-1α, SELEX cycle 3, and PSSM of LEF1/TCF-1α, SELEX cycle 6. One row per base, one column per site position.]
19 Sequence Logos for binding sites of TCF family proteins
[Figure: sequence logos for Lef-1, Lef-1/beta-catenin, and Tcf-4]

Comparison of our TCF4 binding site with the motif obtained by affinity measurements
[Figure: sequence logo pasted from Hallikas et al. (2006). Cell 124:21.]
Motif obtained by competition assays with a complete single-base-substitution series. Note: at least one significant position is missing because of the a priori restriction of the motif extension.
20 Overview of HTPSELEX Database
Contents, from raw data to HMMs:
- Single-read sequencing chromatograms
- Clone sequences (assembled by Phred/Phrap)
- Site sequences with estimated sequencing errors
- HMMs for binding sites in two formats (decodeanhmm, MAMOT)
Additional features:
- Quality-controlled sequence download
- Access to selected low-throughput SELEX data
- Experimental and computational protocols
21 Example of an HTPSELEX clone entry
ID   LBC standard; DNA; UNC; 1023 BP.
XX
AC   LBC
XX
DT   -Jun-200
XX
DE   ' Sequence of SELEX/SAGE Clone : LBC of cycle
XX
KW   HTP SELEX/SAGE, invitro transcription factor binding sites
XX
OS   unidentified
OC   unidentified
XX
RN   [1]
RA   Emmanuelle Roulet, Stephane Busso, Anamaria A.Camargo, Andrew J.G Simpson,
RA   Nicolas Mermod, and Philipp Bucher.
RT   High-throughput SELEX-SAGE method for quantitative modelling of
RT   transcription-factor binding sites.
RL   Nature Biotechnology 20:831-83(2000)
XX
DR   TRACES;LBC 003F.scf
XX
FH   Key             Location/Qualifiers
FH
FT   source
FT                   /mol_type="unassigned DNA"
FT                   /organism="unidentified"
FT                   /tissue_type="selex"
FT   misc_binding
FT                   /bound_moiety ="LEF1/TCF with beta catenin "
FT                   /label="lbc 00003_1"
FT                   /note="base quality score is e-03"
FT   misc_binding
FT                   /bound_moiety ="LEF1/TCF with beta catenin "
FT                   /label="lbc 00003_2"
FT                   /note="base quality score is e-03"
XX
SQ   Sequence 1023 BP; 230 A; 291 C; 260 G; 242 T; 0 other;
     AAAACCAA AAAGGGGCA GAAGGGCC CCCGAGC GCCGAGCG GCCGCCAGG
     GAGGAA CGCAGAA CCAGCACAC GGCGGCCG ACAGGGA CAGGCGG
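Entries in this EMBL-style flat-file layout can be read with a few lines of code, because every line starts with a two-letter code. The sketch below is not an official HTPSELEX parser: the function names are made up for illustration, the sample entry is hypothetical and abbreviated, and sequence lines (which start with spaces) are simply skipped.

```python
def parse_entry(text: str) -> dict:
    """Group the lines of an EMBL-style entry by their two-letter line code."""
    fields = {}
    for line in text.splitlines():
        code, _, value = line.partition(" ")
        if not code or code == "XX":
            continue  # skip blank lines, sequence lines, and XX spacers
        fields.setdefault(code, []).append(value.strip())
    return fields


def bound_moieties(fields: dict) -> list:
    """Values of all /bound_moiety qualifiers found on FT lines."""
    result = []
    for value in fields.get("FT", []):
        if value.startswith("/bound_moiety"):
            raw = value.split("=", 1)[1]
            result.append(raw.strip().strip('"').strip())
    return result


# Hypothetical, abbreviated entry used only to exercise the functions.
ENTRY = """\
ID   EXAMPLE1 standard; DNA; UNC; 12 BP.
XX
DE   Hypothetical SELEX clone
XX
FT   misc_binding
FT                   /bound_moiety ="LEF1/TCF with beta catenin "
FT                   /label="example_1"
XX
SQ   Sequence 12 BP;
     ACGTACGTAC GT
"""

fields = parse_entry(ENTRY)
print(fields["ID"][0])
print(bound_moieties(fields))
```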
More informationSupplementary Information for:
Supplementary Information for: A streamlined and high-throughput targeting approach for human germline and cancer genomes using Oligonucleotide-Selective Sequencing Samuel Myllykangas 1, Jason D. Buenrostro
More informationTranscription Gene regulation
Transcription Gene regulation The machine that transcribes a gene is composed of perhaps 50 proteins, including RNA polymerase, the enzyme that converts DNA code into RNA code. A crew of transcription
More informationVALLIAMMAI ENGINEERING COLLEGE
VALLIAMMAI ENGINEERING COLLEGE SRM Nagar, Kattankulathur 603 203 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK VII SEMESTER BM6005 BIO INFORMATICS Regulation 2013 Academic Year 2018-19 Prepared
More informationSequence Based Function Annotation. Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University
Sequence Based Function Annotation Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University Usage scenarios for sequence based function annotation Function prediction of newly cloned
More informationSite directed mutagenesis, Insertional and Deletion Mutagenesis. Mitesh Shrestha
Site directed mutagenesis, Insertional and Deletion Mutagenesis Mitesh Shrestha Mutagenesis Mutagenesis (the creation or formation of a mutation) can be used as a powerful genetic tool. By inducing mutations
More informationProductInformation INTRODUCTION TO THE VECTORETTE SYSTEM
INTRODUCTION TO THE VECTORETTE SYSTEM ProductInformation The following is background information on the Vectorette System, included to familiarize the researcher with the Vectorette Unit and its function
More informationBurrH: a new modular DNA binding protein for genome engineering
Supplementary information for: BurrH: a new modular protein for genome engineering Alexandre Juillerat, Claudia Bertonati, Gwendoline Dubois, Valérie Guyot, Séverine Thomas, Julien Valton, Marine Beurdeley,
More informationSequence Analysis. II: Sequence Patterns and Matrices. George Bell, Ph.D. WIBR Bioinformatics and Research Computing
Sequence Analysis II: Sequence Patterns and Matrices George Bell, Ph.D. WIBR Bioinformatics and Research Computing Sequence Patterns and Matrices Multiple sequence alignments Sequence patterns Sequence
More informationAdmission Exam for the Graduate Course in Bioinformatics. November 17 th, 2017 NAME:
1 Admission Exam for the Graduate Course in Bioinformatics November 17 th, 2017 NAME: This exam contains 30 (thirty) questions divided in 3 (three) areas (maths/statistics, computer science, biological
More informationGene Expression Technology
Gene Expression Technology Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu Gene expression Gene expression is the process by which information from a gene
More informationThe Biotechnology Toolbox
Chapter 15 The Biotechnology Toolbox Cutting and Pasting DNA Cutting DNA Restriction endonuclease or restriction enzymes Cellular protection mechanism for infected foreign DNA Recognition and cutting specific
More informationNon-coding Function & Variation, MPRAs II. Mike White Bio /5/18
Non-coding Function & Variation, MPRAs II Mike White Bio 5488 3/5/18 MPRA Review Problem 1: Where does your CRE DNA come from? DNA synthesis Genomic fragments Targeted regulome capture Problem 2: How do
More informationIncorporating Molecular ID Technology. Accel-NGS 2S MID Indexing Kits
Incorporating Molecular ID Technology Accel-NGS 2S MID Indexing Kits Molecular Identifiers (MIDs) MIDs are indices used to label unique library molecules MIDs can assess duplicate molecules in sequencing
More informationSome types of Mutagenesis
Mutagenesis What Is a Mutation? Genetic information is encoded by the sequence of the nucleotide bases in DNA of the gene. The four nucleotides are: adenine (A), thymine (T), guanine (G), and cytosine
More information3 Designing Primers for Site-Directed Mutagenesis
3 Designing Primers for Site-Directed Mutagenesis 3.1 Learning Objectives During the next two labs you will learn the basics of site-directed mutagenesis: you will design primers for the mutants you designed
More informationSELECTED TECHNIQUES AND APPLICATIONS IN MOLECULAR GENETICS
SELECTED TECHNIQUES APPLICATIONS IN MOLECULAR GENETICS Restriction Enzymes 15.1.1 The Discovery of Restriction Endonucleases p. 420 2 2, 3, 4, 6, 7, 8 Assigned Reading in Snustad 6th ed. 14.1.1 The Discovery
More informationAuthors: Vivek Sharma and Ram Kunwar
Molecular markers types and applications A genetic marker is a gene or known DNA sequence on a chromosome that can be used to identify individuals or species. Why we need Molecular Markers There will be
More informationData Retrieval from GenBank
Data Retrieval from GenBank Peter J. Myler Bioinformatics of Intracellular Pathogens JNU, Feb 7-0, 2009 http://www.ncbi.nlm.nih.gov (January, 2007) http://ncbi.nlm.nih.gov/sitemap/resourceguide.html Accessing
More informationLecture Four. Molecular Approaches I: Nucleic Acids
Lecture Four. Molecular Approaches I: Nucleic Acids I. Recombinant DNA and Gene Cloning Recombinant DNA is DNA that has been created artificially. DNA from two or more sources is incorporated into a single
More informationGenetic Engineering & Recombinant DNA
Genetic Engineering & Recombinant DNA Chapter 10 Copyright The McGraw-Hill Companies, Inc) Permission required for reproduction or display. Applications of Genetic Engineering Basic science vs. Applied
More informationBioinformatics : Gene Expression Data Analysis
05.12.03 Bioinformatics : Gene Expression Data Analysis Aidong Zhang Professor Computer Science and Engineering What is Bioinformatics Broad Definition The study of how information technologies are used
More informationresequencing storage SNP ncrna metagenomics private trio de novo exome ncrna RNA DNA bioinformatics RNA-seq comparative genomics
RNA Sequencing T TM variation genetics validation SNP ncrna metagenomics private trio de novo exome mendelian ChIP-seq RNA DNA bioinformatics custom target high-throughput resequencing storage ncrna comparative
More information