Transcription factor binding site prediction in vivo using DNA sequence and shape features
1 Transcription factor binding site prediction in vivo using DNA sequence and shape features
Anthony Mathelier, Lin Yang, Tsu-Pei Chiu, Remo Rohs, and Wyeth Wasserman
REGSYSGEN 2015, Nov. 17th
Centre for Molecular Medicine and Therapeutics
2 Transcriptional regulation of gene expression
[Figure: schematic of transcriptional regulation, labelled with histone octamer, TFs, enhancer, nucleosome, RNA transcripts, cohesin, TSS, DNA, RNA PolII, regulatory proteins, and promoters. Source: A. Mathelier, W. Shi, and W.W. Wasserman, Trends in Genetics.]
Transcription of genes is turned on and off by transcription factors (TFs). TFs bind to DNA at transcription factor binding sites (TFBSs).
3 Modeling TFBS using position frequency matrices (PFMs)
Known binding sites:
GTAACAAT GTAAACAT GTAAACAA GTAAACAA GTAAACAT
GTAAACAA GTAAACAC GTCAACAG GTAAACAT GTAAACAA
GTAAACAT TTAAGTAA ATAAACAA CTAAACAG GTAAACAT
GTAAACAA GTAAACAT GTAAACAC GTAAACAT GTAAACAG
Position Frequency Matrix: [Matrix: one row per nucleotide (A, C, G, T), one column per site position; entries missing.]
From PFMs to PWMs: classically, position weight matrices (PWMs) are derived from PFMs to model TFBSs, assuming that nucleotides at different positions within a TFBS are independent.
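The PFM-to-PWM step can be sketched in a few lines of Python. This is a minimal illustration using the binding sites listed on the slide; the uniform background (0.25 per nucleotide), the pseudocount of 1, and the function names `build_pfm`, `pfm_to_pwm`, and `score` are assumptions made for the example, not details taken from the talk.

```python
import math

ALPHABET = "ACGT"

# The 20 known binding sites listed on the slide.
SITES = (
    "GTAACAAT GTAAACAT GTAAACAA GTAAACAA GTAAACAT "
    "GTAAACAA GTAAACAC GTCAACAG GTAAACAT GTAAACAA "
    "GTAAACAT TTAAGTAA ATAAACAA CTAAACAG GTAAACAT "
    "GTAAACAA GTAAACAT GTAAACAC GTAAACAT GTAAACAG"
).split()


def build_pfm(sites):
    """Count how often each nucleotide occurs at each site position."""
    length = len(sites[0])
    pfm = {base: [0] * length for base in ALPHABET}
    for site in sites:
        if len(site) != length:
            raise ValueError("all sites must have the same length")
        for position, base in enumerate(site):
            pfm[base][position] += 1
    return pfm


def pfm_to_pwm(pfm, background=0.25, pseudocount=1.0):
    """Turn counts into log2-odds weights against a uniform background.

    The background and pseudocount are illustrative assumptions.
    """
    n_sites = sum(pfm[base][0] for base in ALPHABET)
    pwm = {}
    for base in ALPHABET:
        pwm[base] = [
            math.log2(
                (count + pseudocount * background)
                / (n_sites + pseudocount)
                / background
            )
            for count in pfm[base]
        ]
    return pwm


def score(sequence, pwm):
    """Sum per-position weights; this is where independence is assumed."""
    return sum(pwm[base][position] for position, base in enumerate(sequence))


pfm = build_pfm(SITES)
pwm = pfm_to_pwm(pfm)
for base in ALPHABET:
    print(base, pfm[base])
print("GTAAACAT:", round(score("GTAAACAT", pwm), 2))
print("CCCCCCCC:", round(score("CCCCCCCC", pwm), 2))
```

Because the score is a plain sum over positions, the model cannot express that a nucleotide at one position changes the preference at a neighbouring position, which is the limitation the following slides address.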
4 Modeling TFBS using Transcription Factor Flexible Models (TFFMs)
[Figure: ChIP-seq sequences (HNF4A examples such as ...AGTTCAAAGTTCA...) are used to build TFFMs, drawn as a chain of states E0, E1, E2, ..., En covering the background (bg) and motif positions 1 to n, each with a dinucleotide table (AA, AC, ..., TT); the models are displayed as logos (bits). Source: A. Mathelier and W.W. Wasserman, PLoS Computational Biology.]
TFFMs model the sequence properties of TFBSs from ChIP-seq data by capturing dependencies between successive nucleotides (dinucleotides).
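To make the idea of successive dinucleotide dependencies concrete, here is a simplified sketch: a position-specific first-order Markov model estimated from aligned sites, where each position is conditioned on the previous nucleotide. This is a stand-in chosen for illustration, not the TFFM training procedure itself; the aligned toy sites (taken from the PFM slide), the pseudocount, and the function names are all assumptions.

```python
import math

ALPHABET = "ACGT"

# Aligned example sites (a subset of those on the PFM slide), used as toy input.
SITES = ["GTAACAAT", "GTAAACAT", "GTAAACAA", "GTCAACAG",
         "TTAAGTAA", "ATAAACAA", "CTAAACAG", "GTAAACAC"]


def train_first_order(sites, pseudocount=1.0):
    """Estimate P(x_0) and, for each later position i, P(x_i | x_{i-1})."""
    length = len(sites[0])
    # Position 0 has no predecessor, so it gets plain nucleotide frequencies.
    start_counts = {base: pseudocount for base in ALPHABET}
    for site in sites:
        start_counts[site[0]] += 1
    total = sum(start_counts.values())
    start = {base: count / total for base, count in start_counts.items()}

    transitions = []  # transitions[i - 1][previous][current] for position i
    for i in range(1, length):
        counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
        for site in sites:
            counts[site[i - 1]][site[i]] += 1
        table = {}
        for previous in ALPHABET:
            row_total = sum(counts[previous].values())
            table[previous] = {
                current: counts[previous][current] / row_total
                for current in ALPHABET
            }
        transitions.append(table)
    return start, transitions


def log_likelihood(sequence, model):
    """Log-probability of a sequence, one dinucleotide step at a time."""
    start, transitions = model
    total = math.log(start[sequence[0]])
    for i in range(1, len(sequence)):
        total += math.log(transitions[i - 1][sequence[i - 1]][sequence[i]])
    return total


model = train_first_order(SITES)
print(log_likelihood("GTAAACAT", model))
print(log_likelihood("CCCCCCCC", model))
```

Unlike the PWM score, each term here depends on a pair of adjacent nucleotides, so a site-like sequence is rewarded for carrying the right neighbours and not only the right letters.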
5 DNA shape features
The DNAshape tool predicts the DNA shape features of a DNA sequence. The genome-wide DNA shape features available on GBshape are:
- Minor Groove Width (MGW)
- Roll
- Propeller Twist (ProT)
- Helix Twist (HelT)
T. Zhou et al., Nucl. Acids Res.; T.P. Chiu et al., Nucl. Acids Res.
6-7 Using DNA shape to model TFBSs
Studies have shown the importance of DNA shape for modeling TFBSs from:
- SELEX-seq experiments.
- Protein-binding microarray experiments.
- BunDLE-seq experiments.
N. Abe et al., Cell; T. Zhou et al., PNAS; M. Levo et al., Genome Res.
Aims of our study:
- Construct computational models from large-scale in vivo data (ChIP-seq) by combining DNA sequence and shape features.
- Show TFBS prediction improvements on in vivo data.
- Analyze whether the improvements due to DNA shape are TF family specific.
- Analyze the position-specific importance of DNA shape at TFBSs.
8 Combining TFFMs and DNA shapes at TFBSs
[Figure: feature vector for a TFBS hit, made of the hit score followed by the MGW, ProT, Roll, and HelT values.]
We used an ensemble machine learning approach to combine DNA sequence and shape features.
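The feature vector on this slide, and the idea of feeding it to an ensemble classifier, can be sketched as follows. The slide does not name the ensemble method, so bagged decision stumps are swapped in here as a deliberately simple stand-in. The shape values, the three-position motif length, and every function name are invented for illustration only.

```python
import random

SHAPE_FEATURES = ("MGW", "ProT", "Roll", "HelT")


def feature_vector(hit_score, shape_profiles):
    """Concatenate the sequence hit score with per-position shape values."""
    vector = [hit_score]
    for name in SHAPE_FEATURES:
        vector.extend(shape_profiles[name])
    return vector


def fit_stump(X, y):
    """Find the single-feature threshold rule with the best training accuracy."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        thresholds = [values[0] - 1.0]
        thresholds += [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            for sign in (1, -1):
                hits = sum(
                    (1 if sign * (row[j] - t) >= 0 else 0) == label
                    for row, label in zip(X, y)
                )
                if best is None or hits > best[0]:
                    best = (hits, j, t, sign)
    return best[1:]


def predict_stump(stump, row):
    j, t, sign = stump
    return 1 if sign * (row[j] - t) >= 0 else 0


def fit_bagged_stumps(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_proba(stumps, row):
    """Fraction of stumps voting 'bound'."""
    return sum(predict_stump(s, row) for s in stumps) / len(stumps)


def toy_example(hit_score, shift):
    # Made-up shape values for a 3-position motif; not real DNAshape output.
    shapes = {name: [shift + 0.1 * k for k in range(3)] for name in SHAPE_FEATURES}
    return feature_vector(hit_score, shapes)


# Toy training set: 6 bound (label 1) and 6 unbound (label 0) examples.
X = [toy_example(8 + 0.4 * i, 1.0) for i in range(6)]
X += [toy_example(0.4 * i, 0.0) for i in range(6)]
y = [1] * 6 + [0] * 6

model = fit_bagged_stumps(X, y)
print(predict_proba(model, toy_example(9.0, 1.0)))
print(predict_proba(model, toy_example(1.0, 0.0)))
```

In practice a library ensemble (for example gradient boosting or a random forest) would replace the hand-written stumps; the part that mirrors the slide is the layout of the feature vector, with one sequence score followed by four shape profiles.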
9 DNA shape features improve TFBS prediction in vivo
[Figure: panels A and B.]
Results on 400 human ENCODE ChIP-seq data sets: combining TFFM scores and DNA shape features improves the discriminative power. The AUROC difference is greater than 0.05 in 107 cases.
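The comparison reported here rests on the area under the ROC curve (AUROC). As a hedged sketch of how such a difference could be computed for one data set, the snippet below uses the pairwise (Mann-Whitney) definition of AUROC; the scores and labels are toy values, and `auroc` is a hypothetical helper, not code from the study.

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (Mann-Whitney formulation of the AUROC).
    """
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data set: 1 = bound region, 0 = background region.
labels = [1, 1, 1, 0, 0, 0]
sequence_only = [0.9, 0.4, 0.6, 0.5, 0.3, 0.7]       # e.g. sequence score alone
sequence_and_shape = [0.9, 0.6, 0.8, 0.5, 0.3, 0.7]  # e.g. combined classifier

difference = auroc(sequence_and_shape, labels) - auroc(sequence_only, labels)
print(round(difference, 3), difference > 0.05)
```

Repeating this per data set and counting how often the difference exceeds 0.05 gives a tally of the kind quoted on the slide.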
10 DNA shape features are important for specific TF families
[Figure: panels B and C.]
Data sets from the E2F and MADS-domain TF families are enriched for strong improvements when considering DNA shape features.
11 Validation on independent plant MADS-domain TFs
Incorporating DNA shape features significantly improves TFBS prediction for plant MADS-domain TFs.
12 ProT position-specific importance for MADS-domain TFs
[Figure: panel A, AGL15 sequence logo (bits); panel B.]
ProT is of critical importance for predicting TFBSs associated with plant MADS-domain TFs in a position-specific manner.
13 Conclusions
- Our analyses of ChIP-seq data represent the in vivo counterpart of the published in vitro studies.
- We can construct computational models combining DNA sequence and shape features from ChIP-seq data to improve TFBS prediction in vivo.
- Incorporating DNA shape information is most beneficial when applied to the E2F and MADS-domain TF families.
- ProT is critical for MADS-domain TF binding specificity in a position-specific manner.
14 Acknowledgements
Wyeth Wasserman, Remo Rohs, Lin Yang, Tsu-Pei Chiu, François Parcy, Oriol Fornes, Chih-Yu Chen
Centre for Molecular Medicine and Therapeutics
15 Thank you
[Figure: recap panels A to C showing the feature vector (hit score, MGW, ProT, Roll, HelT) and the AGL15 logo (bits).]