Many transcription factors! recognize DNA shape
|
|
- Kathryn Whitehead
- 6 years ago
- Views:
Transcription
1 Many transcription factors! recognize DN shape Katie Pollard! Gladstone Institutes USF Division of Biostatistics, Institute for Human Genetics, and Institute for omputational Health Sciences ENODE Users Meeting - Stanford, June 10, 2016
2 Most disease associated mutations are outside coding regions Hypothesis 1: Non-coding variants alter transcription factor sequence motifs. DN G Gene Gene B Schizophrenia Gene associated variant T
3 DN Most disease associated mutations are outside coding regions Hypothesis 1: Non-coding variants alter transcription factor sequence motifs. _ G Gene Gene B Schizophrenia Gene associated variant T pproach: Map variants to correct pathways by predicting enhancers and their target genes. Score variants for changes in binding affinity.
4 DN EnhancerFinder distinguishes biologically active enhancers G Gene Gene B Schizophrenia Gene associated variant Training Data VIST Enhancers Performance on held out data T 80% validate in vivo Yes No Func1onal Genomics U=0.96 Power=85% at 10% FPR DN Sequences,,G,T,.. Erwin et al. (2014) apra et al. (2014)
5 DN EnhancerFinder distinguishes biologically active enhancers G Gene Gene B Schizophrenia Gene associated variant Training Data VIST Enhancers Performance on held out data T 80% validate in vivo Yes No Func1onal Genomics U=0.96 Power=85% at 10% FPR DN Sequences,,G,T,.. Erwin et al. (2014) apra et al. (2014)
6 MotifDiverge quantifies loss/gain of TF binding sites G T _ DN Gene Gene B Schizophrenia Gene associated variant Statistical model for TFBS evolution with turnover Predicts change of function P-value for net change in binding One or many TFs lignment-free Evolutionary model Motif specific Detects loss/gain of function mutations with high accuracy Better than conservation scores In vivo and MPRs in cell lines Ritter et al. (2010) Kostka et al. (2015)
7 DN TargetFinder maps distal regulatory elements to genes G Gene Gene B Schizophrenia Gene associated variant T Training Data c1ve enhancers Expressed genes Hi- interac1ons Func1onal Genomics losest gene usually wrong Reveals distinct genomic signature of looping DN Heterochromatin on loop ohesin within 6Kb of enhancer and promoter but not mid-loop TFs bound with TF improve predictions! Whalen et al. (2016)
8 Summary and hallenges Machine-learning on biologically validated enhancers identifies non-coding variants most likely to affect gene regulation and the targeted genes.! - Massive integration of Heart functional (28) genomics data Limb (41) 239 Predicted HR Enhancers enables cell type specific predictions - Many enhancer-like regions are minimally active and not consistently looping to a target gene Brain (96)
9 Summary and hallenges Machine-learning on biologically validated enhancers identifies non-coding variants most likely to affect gene regulation and the targeted genes. 239 Predicted HR Enhancers - Massive integration of Heart functional (28) genomics data enables cell type specific predictions - Many enhancer-like regions are minimally active and not consistently looping to a target gene But much remains to be explained Limb (41) - Functional variants outside enhancers Brain (96)
10 DN TargetFinder maps distal regulatory elements to genes G Gene Gene B Schizophrenia Gene associated variant T Hypothesis 2: Non-coding variants alter binding sites of structural proteins and chromatin modifiers. Reveals distinct genomic signature of looping DN Heterochromatin on loop ohesin within 6Kb of enhancer and promoter but not mid-loop TFs bound with cohesin improve predictions! Whalen et al. (2016)
11 DN TargetFinder maps distal regulatory elements to genes G Gene Gene B Schizophrenia Gene associated variant T Hypothesis 2: Non-coding variants alter binding sites of structural proteins and chromatin modifiers. Reveals distinct genomic signature of looping DN Heterochromatin on loop ohesin within 6Kb of enhancer and promoter but not mid-loop TFs bound with cohesin improve predictions! Whalen et al. (2016)
12 DN TargetFinder maps distal regulatory elements to genes G Gene Gene B Schizophrenia Gene associated variant T Hypothesis 2: Non-coding variants alter binding sites of structural proteins and chromatin modifiers. pproach: RISPR edit sites identified by TargetFinder, then test chromatin and expression. Reveals distinct genomic signature of looping DN Heterochromatin on loop ohesin within 6Kb of enhancer and promoter but not mid-loop TFs bound with cohesin improve predictions! Whalen et al. (2016)
13 Summary and hallenges Machine-learning on biologically validated enhancers identifies non-coding variants most likely to affect gene regulation and the targeted genes. - Massive integration of Heart functional (28) genomics data Limb (41) 239 Predicted HR Enhancers enables cell type specific predictions - Many enhancer-like regions are minimally active and not consistently looping to a target gene But much remains to be explained Brain (96) - Functional variants outside enhancers - Enhancer variants outside sequence motifs
14 Summary and hallenges Machine-learning on biologically validated enhancers identifies non-coding variants most likely to affect gene regulation and the targeted genes. - Massive integration of Heart functional (28) genomics data Limb (41) 239 Predicted HR Enhancers enables cell type specific predictions - Many enhancer-like regions are minimally active and not consistently looping to a target gene But much remains to be explained Brain (96) - Functional variants outside enhancers - Enhancer variants outside sequence motifs For a typical ENODE TF 23% of the top 2000 hipseq peaks have no sequence motif (range = 1%-63%)
15 Many enhancer mutations are outside known or de novo sequence motifs Hypothesis 3: Non-coding variants alter enhancer function by changing DN shape. DN G Gene Gene B Schizophrenia Gene associated variant T
16 Many enhancer mutations are outside known or de novo sequence motifs DN Hypothesis 3: Non-coding variants alter enhancer function by changing DN shape. Gene Gene B Schizophrenia Gene associated variant TFs can recognize shape in addition to sequence. DN shape differentiates similar sequence motifs. Distinct sequences can encode same shape. G T
17 Many enhancer mutations are outside known or de novo sequence motifs DN Hypothesis 3: Non-coding variants alter enhancer function by changing DN shape. Gene Gene B Schizophrenia Gene associated variant TFs can recognize shape in addition to sequence. DN shape differentiates similar sequence motifs. Distinct sequences can encode same shape. G pproach: lgorithm to learn shape motifs de novo for all ENODE TFs, predict shape motif hits in hipseq peaks, compare to sequence motifs T
18 De novo shape motif discovery 1. Estimate DN structure: DNshape (Zhou et al. 2013) Maps 5-mer sequences to structural features. Based on molecular dynamics simulations. MGW ProT Roll HelT [Figures from Blackburn and Gait, 1996]
19 De novo shape motif discovery 1. Estimate DN structure: DNshape (Zhou et al. 2013) Maps 5-mer sequences to structural features Based on molecular dynamics simulations 2. Learn TF shape motifs: Search hip-seq peaks for windows with similar values of a shape feature. Gibbs sampling with scores ~ exp( Dij) Vary window size 5-25bp Train on 1000 of top 2000 peaks for ~250 TFs!
20 De novo shape motif discovery 1. Estimate DN structure: DNshape (Zhou et al. 2013) Maps 5-mer sequences to structural features Based on molecular dynamics simulations 2. Learn TF shape motifs: Search hip-seq peaks for windows with similar values of a shape feature. Gibbs sampling with scores ~ exp( Dij) Vary window size 5-25bp Train on 1000 of top 2000 peaks for ~250 TFs 3. all hits: Scan hip-seq peaks with shape motifs. Null distribution on distance from mean shape feature value at each position
21 De novo shape motif discovery 1. Estimate DN structure: DNshape (Zhou et al. 2013) Maps 5-mer sequences to structural features Based on molecular dynamics simulations 2. Learn TF shape motifs: Search hip-seq peaks for windows with similar values of a shape feature. Gibbs sampling with scores ~ exp( Dij) Vary window size 5-25bp Train on 1000 of top 2000 peaks for ~250 TFs 3. all hits: Scan hip-seq peaks with shape motifs. Null distribution on distance from mean shape feature value at each position pply to remaining 1000 of top 2000 peaks and flanking non-peak regions for each TF
22 De novo shape motif discovery 1. Estimate DN structure: DNshape (Zhou et al. 2013) Maps 5-mer sequences to structural features Based on molecular dynamics simulations 2. Learn TF shape motifs: Search hip-seq peaks for windows with similar values of a shape feature. Gibbs sampling with scores ~ exp( Dij) Vary window size 5-25bp Train on 1000 of top 2000 peaks for ~250 TFs 3. all hits: Scan hip-seq peaks with shape motifs. Null distribution on distance from mean shape feature value at each position pply to remaining 1000 of top 2000 peaks and flanking non-peak regions for each TF 4. Enrichment test: Hypergeometric p-value.
23 Shape motifs are common Percentage TFs with a motif for each feature in each cell line
24 Shape complements sequence motifs Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center.!!
25 Shape complements sequence motifs Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. ~25% of peaks have sequence and shape motifs.!!
26 Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. ~25% of peaks have sequence and shape motifs. - These can be similar,! Shape complements sequence motifs Underlying sequence logo Nrsf Roll motif in K562 Nrsf FactorBook sequence motif
27 Shape motifs are complementary Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. Many peaks have sequence and shape motifs. - These can be similar, - Extensions or refinements of one another,!! Underlying! sequence fos ProT motif in K562 FactorBook! sequence! motif
28 Shape motifs are complementary Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. Many peaks have sequence and shape motifs. - These can be similar, - Extensions or refinements of one another, - Or very different Underlying! sequence Maff HelT motif in K562 FactorBook! sequence! motif
29 Shape motifs are complementary Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. Many peaks have sequence and shape motifs. - These can be similar, - Extensions or refinements of one another, - Or very different Shape motifs can flank sequence motifs Underlying! sequence Nfya HelT motif in K562 FactorBook! sequence! motif is 3bp! upstream
30 Shape motifs are complementary Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. Many peaks have sequence and shape motifs. - These can be similar, - Extensions or refinements of one another, - Or very different Shape motifs can flank sequence motifs Underlying! sequence fos Roll motif in K562 FactorBook! sequence! motif is 30bp! away
31 Shape motifs are complementary Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. Many peaks have sequence and shape motifs. - These can be similar, - Extensions or refinements of one another, - Or very different Shape motifs can flank sequence motifs Shape motifs can differ between TFs with similar sequence motifs and/or the same protein fold.! Fosl1 has a HelT motif tf3 has a Roll motif
32 Ongoing Work Hierarchical or mixture model of TF binding with sequence and shape motifs - Decompose sequence motifs by shape types! - Spectrum of recognition Heart modes (28) 239 Predicted HR Enhancers Limb (41) Brain (96)
33 Ongoing Work Hierarchical or mixture model of TF binding with sequence and shape motifs - Decompose sequence motifs by shape types - Spectrum of recognition Heart modes (28) Shape motifs in different contexts - o-factors and complexes - Weak hip-seq peaks! Limb (41) 239 Predicted HR Enhancers Brain (96)
34 Ongoing Work Hierarchical or mixture model of TF binding with sequence and shape motifs - Decompose sequence motifs by shape types - Spectrum of recognition Heart modes (28) Shape motifs in different contexts - o-factors and complexes - Weak hip-seq peaks Role of shape in ectopic binding of TFs when cofactors are absent Limb [Luna-Zurita (41) et al. 2016] 239 Predicted HR Enhancers! Brain (96)
35 Ongoing Work Hierarchical or mixture model of TF binding with sequence and shape motifs - Decompose sequence motifs by shape types - Spectrum of recognition Heart modes (28) Shape motifs in different contexts - o-factors and complexes - Weak hip-seq peaks Role of shape in ectopic binding of TFs when cofactors are absent Limb [Luna-Zurita (41) et al. 2016] 239 Predicted HR Enhancers Brain (96) Evolutionary modeling of DN shape - onservation of shape without sequence - Scoring SNPs for effects on shape motifs
36 ollaborators EnhancerFinder Tony apra Gen Haliburton DN Shape Hassan Samee TargetFinder Rebecca Truty Sean Whalen MotifDiverge Dennis Kostka Functional ssays Hane Ryu lex Pollen Nadav hituv rnold Kriegstein Funding from NIMH, NIGMS, NHLBI, PhRM Foundation, Gladstone Institutes
Integrating diverse datasets improves developmental enhancer prediction
1 Integrating diverse datasets improves developmental enhancer prediction Genevieve D. Erwin 1, Rebecca M. Truty 1, Dennis Kostka 2, Katherine S. Pollard 1,3, John A. Capra 4 1 Gladstone Institute of Cardiovascular
More informationIntegrating Diverse Datasets Improves Developmental Enhancer Prediction
Integrating Diverse Datasets Improves Developmental Enhancer Prediction Genevieve D. Erwin 1,2, Nir Oksenberg 2,3, Rebecca M. Truty 1, Dennis Kostka 4, Karl K. Murphy 2,3, Nadav Ahituv 2,3, Katherine S.
More informationTranscription factor binding site prediction in vivo using DNA sequence and shape features
Transcription factor binding site prediction in vivo using DNA sequence and shape features Anthony Mathelier, Lin Yang, Tsu-Pei Chiu, Remo Rohs, and Wyeth Wasserman anthony.mathelier@gmail.com @AMathelier
More informationWhole Transcriptome Analysis of Illumina RNA- Seq Data. Ryan Peters Field Application Specialist
Whole Transcriptome Analysis of Illumina RNA- Seq Data Ryan Peters Field Application Specialist Partek GS in your NGS Pipeline Your Start-to-Finish Solution for Analysis of Next Generation Sequencing Data
More informationSupplementary Fig. S1. Building a training set of cardiac enhancers. (A-E) Empirical validation of candidate enhancers containing matches to Twi and
Supplementary Fig. S1. Building a training set of cardiac enhancers. (A-E) Empirical validation of candidate enhancers containing matches to Twi and Tin TFBS motifs and located in the flanking or intronic
More informationCS273B: Deep learning for Genomics and Biomedicine
CS273B: Deep learning for Genomics and Biomedicine Lecture 2: Convolutional neural networks and applications to functional genomics 09/28/2016 Anshul Kundaje, James Zou, Serafim Batzoglou Outline Anatomy
More information1 Najafabadi, H. S. et al. C2H2 zinc finger proteins greatly expand the human regulatory lexicon. Nat Biotechnol doi: /nbt.3128 (2015).
F op-scoring motif Optimized motifs E Input sequences entral 1 bp region Dinucleotideshuffled seqs B D ll B1H-R predicted motifs Enriched B1H- R predicted motifs L!=!7! L!=!6! L!=5! L!=!4! L!=!3! L!=!2!
More informationPersonal and population genomics of human regulatory variation
Personal and population genomics of human regulatory variation Benjamin Vernot,, and Joshua M. Akey Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA Ting WANG 1 Brief
More informationNon-coding Function & Variation, MPRAs. Mike White Bio5488 3/5/18
Non-coding Function & Variation, MPRAs Mike White Bio5488 3/5/18 Outline MONDAY Non-coding function and variation The barcode Basic versions of MRPA technology WEDNESDAY More varieties of MRPAs Some key
More informationDecoding Chromatin States with Epigenome Data Advanced Topics in Computa8onal Genomics
Decoding Chromatin States with Epigenome Data 02-715 Advanced Topics in Computa8onal Genomics HMMs for Decoding Chromatin States Epigene8c modifica8ons of the genome have been associated with Establishing
More informationComparative eqtl analyses within and between seven tissue types suggest mechanisms underlying cell type specificity of eqtls
Comparative eqtl analyses within and between seven tissue types suggest mechanisms underlying cell type specificity of eqtls, Duke University Christopher D Brown, University of Pennsylvania November 9th,
More informationDiscovering gene regulatory control using ChIP-chip and ChIP-seq. Part 1. An introduction to gene regulatory control, concepts and methodologies
Discovering gene regulatory control using ChIP-chip and ChIP-seq Part 1 An introduction to gene regulatory control, concepts and methodologies Ian Simpson ian.simpson@.ed.ac.uk http://bit.ly/bio2links
More informationThe genetic code of gene regulatory elements
The genetic code of gene regulatory elements Ivan Ovcharenko Computational Biology Branch National Center for Biotechnology Information National Institutes of Health October 23, 2008 Outline Gene deserts
More informationCharacterizing DNA binding sites high throughput approaches Biol4230 Tues, April 24, 2018 Bill Pearson Pinn 6-057
Characterizing DNA binding sites high throughput approaches Biol4230 Tues, April 24, 2018 Bill Pearson wrp@virginia.edu 4-2818 Pinn 6-057 Reviewing sites: affinity and specificity representation binding
More informationChIP-seq analysis 2/28/2018
ChIP-seq analysis 2/28/2018 Acknowledgements Much of the content of this lecture is from: Furey (2012) ChIP-seq and beyond Park (2009) ChIP-seq advantages + challenges Landt et al. (2012) ChIP-seq guidelines
More informationSupplementary Data for DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.
Supplementary Data for DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding. Wenxiu Ma 1, Lin Yang 2, Remo Rohs 2, and William Stafford Noble 3 1 Department of Statistics,
More informationNGS Approaches to Epigenomics
I519 Introduction to Bioinformatics, 2013 NGS Approaches to Epigenomics Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Contents Background: chromatin structure & DNA methylation Epigenomic
More informationChIP-seq data analysis with Chipster. Eija Korpelainen CSC IT Center for Science, Finland
ChIP-seq data analysis with Chipster Eija Korpelainen CSC IT Center for Science, Finland chipster@csc.fi What will I learn? Short introduction to ChIP-seq Analyzing ChIP-seq data Central concepts Analysis
More informationYear III Pharm.D Dr. V. Chitra
Year III Pharm.D Dr. V. Chitra 1 Genome entire genetic material of an individual Transcriptome set of transcribed sequences Proteome set of proteins encoded by the genome 2 Only one strand of DNA serves
More informationCharles Girardot, Furlong Lab. MACS, CisGenome, SISSRs and other peak calling algorithms: differences and practical use
Charles Girardot, Furlong Lab MACS, CisGenome, SISSRs and other peak calling algorithms: differences and practical use ChIP-Seq signal properties Only 5 ends of ChIPed fragments are sequenced Shifted read
More informationActivation of a Floral Homeotic Gene in Arabidopsis
Activation of a Floral Homeotic Gene in Arabidopsis By Maximiliam A. Busch, Kirsten Bomblies, and Detlef Weigel Presentation by Lis Garrett and Andrea Stevenson http://ucsdnews.ucsd.edu/archive/graphics/images/image5.jpg
More informationPIP-seq. Cells. Permanganate ChIP-Seq
PIP-seq ells Formaldehyde Permanganate 5 Harvest Lyse Sonicate First dapter Ligation 3 3 5 hip Elute Reverse rosslinks Piperidine cleavage 5 3 3 5 Primer Extension Second dapter Ligation 5 3 3 5 Deep Sequencing
More informationChIP. November 21, 2017
ChIP November 21, 2017 functional signals: is DNA enough? what is the smallest number of letters used by a written language? DNA is only one part of the functional genome DNA is heavily bound by proteins,
More informationChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday 15 June 2015
ChIP-Seq Data Analysis J Fass UCD Genome Center Bioinformatics Core Wednesday 15 June 2015 What s the Question? Where do Transcription Factors (TFs) bind genomic DNA 1? (Where do other things bind DNA
More informationChIP-Seq Tools. J Fass UCD Genome Center Bioinformatics Core Wednesday September 16, 2015
ChIP-Seq Tools J Fass UCD Genome Center Bioinformatics Core Wednesday September 16, 2015 What s the Question? Where do Transcription Factors (TFs) bind genomic DNA 1? (Where do other things bind DNA or
More informationChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday December 17, 2014
ChIP-Seq Data Analysis J Fass UCD Genome Center Bioinformatics Core Wednesday December 17, 2014 What s the Question? Where do Transcription Factors (TFs) bind genomic DNA 1? (Where do other things bind
More informationDiscovering gene regulatory control using ChIP-chip and ChIP-seq. An introduction to gene regulatory control, concepts and methodologies
Discovering gene regulatory control using ChIP-chip and ChIP-seq An introduction to gene regulatory control, concepts and methodologies Ian Simpson ian.simpson@.ed.ac.uk bit.ly/bio2_2012 The Central Dogma
More information: Genomic Regions Enrichment of Annotations Tool
http://great.stanford.edu/ : Genomic Regions Enrichment of Annotations Tool Gill Bejerano Dept. of Developmental Biology & Dept. of Computer Science Stanford University 1 Human Gene Regulation 10 13 different
More informationEpigenetics and DNase-Seq
Epigenetics and DNase-Seq BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2018 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by Anthony
More informationIntroduction to ChIP Seq data analyses. Acknowledgement: slides taken from Dr. H
Introduction to ChIP Seq data analyses Acknowledgement: slides taken from Dr. H Wu @Emory ChIP seq: Chromatin ImmunoPrecipitation it ti + sequencing Same biological motivation as ChIP chip: measure specific
More informationand Promoter Sequence Data
: Combining Gene Expression and Promoter Sequence Data Outline 1. Motivation Functionally related genes cluster together genes sharing cis-elements cluster together transcriptional regulation is modular
More information2/10/17. Contents. Applications of HMMs in Epigenomics
2/10/17 I529: Machine Learning in Bioinformatics (Spring 2017) Contents Applications of HMMs in Epigenomics Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2017 Background:
More informationRNA-Seq Now What? BIS180L Professor Maloof May 24, 2018
RNA-Seq Now What? BIS180L Professor Maloof May 24, 2018 We have differentially expressed genes, what do we want to know about them? We have differentially expressed genes, what do we want to know about
More informationNext- genera*on Sequencing. Lecture 13
Next- genera*on Sequencing Lecture 13 ChIP- seq Applica*ons iden%fy sequence varia%ons DNA- seq Iden%fy Pathogens RNA- seq Kahvejian et al, 2008 Protein-DNA interaction DNA is the informa*on carrier of
More informationSUPPLEMENTAL MATERIALS
SUPPLEMENL MERILS Eh-seq: RISPR epitope tagging hip-seq of DN-binding proteins Daniel Savic, E. hristopher Partridge, Kimberly M. Newberry, Sophia. Smith, Sarah K. Meadows, rian S. Roberts, Mark Mackiewicz,
More informationBiology 644: Bioinformatics
Processes Activation Repression Initiation Elongation.... Processes Splicing Editing Degradation Translation.... Transcription Translation DNA Regulators DNA-Binding Transcription Factors Chromatin Remodelers....
More informationNEXT GENERATION SEQUENCING. Farhat Habib
NEXT GENERATION SEQUENCING HISTORY HISTORY Sanger Dominant for last ~30 years 1000bp longest read Based on primers so not good for repetitive or SNPs sites HISTORY Sanger Dominant for last ~30 years 1000bp
More informationComputational Analysis of Ultra-high-throughput sequencing data: ChIP-Seq
Computational Analysis of Ultra-high-throughput sequencing data: ChIP-Seq Philipp Bucher Wednesday January 21, 2009 SIB graduate school course EPFL, Lausanne Data flow in ChIP-Seq data analysis Level 1:
More informationBunDLE-seq (Binding to Designed Library, Extracting and Sequencing) -
Protocol BunDLE-seq (Binding to Designed Library, Extracting and Sequencing) - A quantitative investigation of various determinants of TF binding; going beyond the characterization of core site Einat Zalckvar*
More informationNon-coding Function & Variation, MPRAs II. Mike White Bio /5/18
Non-coding Function & Variation, MPRAs II Mike White Bio 5488 3/5/18 MPRA Review Problem 1: Where does your CRE DNA come from? DNA synthesis Genomic fragments Targeted regulome capture Problem 2: How do
More informationMachine learning applications in genomics: practical issues & challenges. Yuzhen Ye School of Informatics and Computing, Indiana University
Machine learning applications in genomics: practical issues & challenges Yuzhen Ye School of Informatics and Computing, Indiana University Reference Machine learning applications in genetics and genomics
More informationDeep learning frameworks for regulatory genomics and epigenomics
Deep learning frameworks for regulatory genomics and epigenomics Chuan Sheng Foo Avanti Shrikumar Nicholas Sinnott- Armstrong ANSHUL KUNDAJE Genetics, Computer science Stanford University Johnny Israeli
More informationAnalyzing ChIP-seq data. R. Gentleman, D. Sarkar, S. Tapscott, Y. Cao, Z. Yao, M. Lawrence, P. Aboyoun, M. Morgan, L. Ruzzo, J. Davison, H.
Analyzing ChIP-seq data R. Gentleman, D. Sarkar, S. Tapscott, Y. Cao, Z. Yao, M. Lawrence, P. Aboyoun, M. Morgan, L. Ruzzo, J. Davison, H. Pages Biological Motivation Chromatin-immunopreciptation followed
More informationDeep learning sequence-based ab initio prediction of variant effects on expression and disease risk
Summer Review 7 Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk Jian Zhou 1,2,3, Chandra L. Theesfeld 1, Kevin Yao 3, Kathleen M. Chen 3, Aaron K. Wong
More information14 March, 2016: Introduction to Genomics
14 March, 2016: Introduction to Genomics Genome Genome within Ensembl browser http://www.ensembl.org/homo_sapiens/location/view?db=core;g=ensg00000139618;r=13:3231547432400266 Genome within Ensembl browser
More informationWhat Can the Epigenome Teach Us About Cellular States and Diseases?
What Can the Epigenome Teach Us About Cellular States and Diseases? (a computer scientist s view) Luca Pinello Outline Epigenetic: the code over the code What can we learn from epigenomic data? Resources
More informationL8: Downstream analysis of ChIP-seq and ATAC-seq data
L8: Downstream analysis of ChIP-seq and ATAC-seq data Shamith Samarajiwa CRUK Bioinformatics Autumn School September 2017 Summary Downstream analysis for extracting meaningful biology : Normalization and
More informationIntroduction to Transcription Factor Binding Sites (TFBS) Cells control the expression of genes using Transcription Factors.
Identification of Functional Transcription Factor Binding Sites using Closely Related Saccharomyces species Scott W. Doniger 1, Juyong Huh 2, and Justin C. Fay 1,2 1 Computation Biology Program and 2 Department
More informationLearning Methods for DNA Binding in Computational Biology
Learning Methods for DNA Binding in Computational Biology Mark Kon Dustin Holloway Yue Fan Chaitanya Sai Charles DeLisi Boston University IJCNN Orlando August 16, 2007 Outline Background on Transcription
More informationNature Genetics: doi: /ng Supplementary Figure 1. H3K27ac HiChIP enriches enhancer promoter-associated chromatin contacts.
Supplementary Figure 1 H3K27ac HiChIP enriches enhancer promoter-associated chromatin contacts. (a) Schematic of chromatin contacts captured in H3K27ac HiChIP. (b) Loop call overlap for cohesin HiChIP
More informationReview of Transposable elements have rewired the core regulatory network of human embryonic stem cells
Review of Transposable elements have rewired the core regulatory network of human embryonic stem cells RNA structure Mutagenic screens using transposons Nature Genetics, 42(7), 631-635 (2010). Bradly Alicea
More informationGenome 541 Gene regulation and epigenomics Lecture 3 Integrative analysis of genomics assays
Genome 541 Gene regulation and epigenomics Lecture 3 Integrative analysis of genomics assays Please consider both the forward and reverse strands (i.e. reverse compliment sequence). You do not need to
More informationThe Next Generation of Transcription Factor Binding Site Prediction
The Next Generation of Transcription Factor Binding Site Prediction Anthony Mathelier*, Wyeth W. Wasserman* Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department
More informationPioneering Clinical Omics
Pioneering Clinical Omics Clinical Genomics Strand NGS An analysis tool for data generated by cutting-edge Next Generation Sequencing(NGS) instruments. Strand NGS enables read alignment and analysis of
More informationTitle: Genome-Wide Predictions of Transcription Factor Binding Events using Multi- Dimensional Genomic and Epigenomic Features Background
Title: Genome-Wide Predictions of Transcription Factor Binding Events using Multi- Dimensional Genomic and Epigenomic Features Team members: David Moskowitz and Emily Tsang Background Transcription factors
More informationGenome editing: Principles, Current and Future Uses
enome editing: Principles, urrent and Future Uses Dr ndrew Wood University of Edinburgh alton Institute dvance in enetics onference June 28 th 2017 enome editing defined: a type of genetic engineering
More informationLecture 11 Microarrays and Expression Data
Introduction to Bioinformatics for Medical Research Gideon Greenspan gdg@cs.technion.ac.il Lecture 11 Microarrays and Expression Data Genetic Expression Data Microarray experiments Applications Expression
More informationDNA. bioinformatics. epigenetics methylation structural variation. custom. assembly. gene. tumor-normal. mendelian. BS-seq. prediction.
Epigenomics T TM activation SNP target ncrna validation metagenomics genetics private RRBS-seq de novo trio RIP-seq exome mendelian comparative genomics DNA NGS ChIP-seq bioinformatics assembly tumor-normal
More informationMethods and tools for exploring functional genomics data
Methods and tools for exploring functional genomics data William Stafford Noble Department of Genome Sciences Department of Computer Science and Engineering University of Washington Outline Searching for
More informationMicroarrays & Gene Expression Analysis
Microarrays & Gene Expression Analysis Contents DNA microarray technique Why measure gene expression Clustering algorithms Relation to Cancer SAGE SBH Sequencing By Hybridization DNA Microarrays 1. Developed
More informationMotifs. BCH339N - Systems Biology / Bioinformatics Edward Marcotte, Univ of Texas at Austin
Motifs BCH339N - Systems Biology / Bioinformatics Edward Marcotte, Univ of Texas at Austin An example transcriptional regulatory cascade Here, controlling Salmonella bacteria multidrug resistance Sequencespecific
More informationThe ENCODE Encyclopedia. & Variant Annotation Using RegulomeDB and HaploReg
The ENCODE Encyclopedia & Variant Annotation Using RegulomeDB and HaploReg Jill E. Moore Weng Lab University of Massachusetts Medical School October 10, 2015 Where s the Encyclopedia? ENCODE: Encyclopedia
More informationComputational Systems Biology Deep Learning in the Life Sciences
Computational Systems Biology Deep Learning in the Life Sciences 6.802 6.874 20.390 20.490 HST.506 Christina Ji April 6, 2017 DanQ: a hybrid convolutional and recurrent deep neural network for quantifying
More informationBio5488 Practice Midterm (2018) 1. Next-gen sequencing
1. Next-gen sequencing 1. You have found a new strain of yeast that makes fantastic wine. You d like to sequence this strain to ascertain the differences from S. cerevisiae. To accurately call a base pair,
More informationSupplementary Figure 1 Strategy for parallel detection of DHSs and adjacent nucleosomes
Supplementary Figure 1 Strategy for parallel detection of DHSs and adjacent nucleosomes DNase I cleavage DNase I DNase I digestion Sucrose gradient enrichment Small Large F1 F2...... F9 F1 F1 F2 F3 F4
More informationLecture 5: Regulation
Machine Learning in Computational Biology CSC 2431 Lecture 5: Regulation Instructor: Anna Goldenberg Central Dogma of Biology Transcription DNA RNA protein Process of producing RNA from DNA Constitutive
More informationÜbung V. Einführung, Teil 1. Transktiptionelle Regulation TFBS
Übung V Einführung, Teil 1 Transktiptionelle Regulation TFBS Transcription Factors These proteins promote transcription 1. Bind DNA 2. Activate Transcription These two functions usually reside on separate
More informationDIAMANTINA INSTITUTE for Cancer, Immunology and Metabolic Medicine
DIAMANTINA INSTITUTE for Cancer, Immunology and Metabolic Medicine Defining MYB Transcriptional Network by Genome-wide Chromatin Occupancy Profiling (ChIP-Seq) 2010 E.Glazov, L. Zhao Transcription Factors:
More informationFunctional Genomics and Single-Cell Genomics. Christine Queitsch Department of Genome Sciences
Functional Genomics and Single-Cell Genomics Christine Queitsch Department of Genome Sciences queitsch@uw.edu 1 Outline Challenges Combining methods for better genome assembly Functional genomics let s
More informationECS 234: Genomic Data Integration ECS 234
: Genomic Data Integration Heterogeneous Data Integration DNA Sequence Microarray Proteomics >gi 12004594 gb AF217406.1 Saccharomyces cerevisiae uridine nucleosidase (URH1) gene, complete cds ATGGAATCTGCTGATTTTTTTACCTCACGAAACTTATTAAAACAGATAATTTCCCTCATCTGCAAGGTTG
More informationres2)=)colocalisation(eqtl_file="icoslg_eqtl.txt",) gwas_file="uc_gwas_icoslg_locus.txt",)p1)=)1e+4,)p2)=)1e+4,)p12)=)1e+5,) region)=)200000)$
Homework 7 res1)=)colocalisation(eqtl_file="ptk2b_eqtl.txt",) gwas_file="ad_gwas_ptk2b_locus.txt",)p1)=)1e+4,)p2)=)1e+4,)p12)=)1e+5,) region)=)200000)$ ##))1.02e+06))2.43e+05))6.80e+04))1.51e+02))9.84e+01)$
More informationClassifying human promoters by occupancy patterns identifies recurring sequence elements, combinatorial binding, and spatial interactions
Yang and Vingron BMC Biology (2018) 16:138 https://doi.org/10.1186/s12915-018-0585-5 RESEARCH ARTICLE Open Access Classifying human promoters by occupancy patterns identifies recurring sequence elements,
More informationHuijuan Feng, Shining Ma,Chao Ye & Zhixing Feng
Huijuan Feng, Shining Ma,Chao Ye & Zhixing Feng Background-Author introduction Research interest: Methods for gene mapping of complex traits Inference of population structure from genetic data Genome variation
More informationThe ENCODE Encyclopedia. Zhiping Weng University of Massachusetts Medical School ENCODE 2016: Research Applications and Users Meeting June 8, 2016
The ENCODE Encyclopedia Zhiping Weng University of Massachusetts Medical School ENCODE 2016: Research Applications and Users Meeting June 8, 2016 The Encyclopedia Of DNA Elements Consortium Goals: - Catalog
More informationDOUBLE-STRAND DNA BREAK PREDICTION USING EPIGENOME MARKS AT KILOBASE RESOLUTION
DOUBLE-STRAND DNA BREAK PREDICTION USING EPIGENOME MARKS AT KILOBASE RESOLUTION Raphaël MOURAD, Assist. Prof. Centre de Biologie Intégrative Université Paul Sabatier, Toulouse III INTRODUCTION Double-strand
More informationSynthetic Biology. Sustainable Energy. Therapeutics Industrial Enzymes. Agriculture. Accelerating Discoveries, Expanding Possibilities. Design.
Synthetic Biology Accelerating Discoveries, Expanding Possibilities Sustainable Energy Therapeutics Industrial Enzymes Agriculture Design Build Generate Solutions to Advance Synthetic Biology Research
More informationMolecular Cell Biology - Problem Drill 06: Genes and Chromosomes
Molecular Cell Biology - Problem Drill 06: Genes and Chromosomes Question No. 1 of 10 1. Which of the following statements about genes is correct? Question #1 (A) Genes carry the information for protein
More informationNature Genetics: doi: /ng Supplementary Figure 1. ChIP-seq genome browser views of BRM occupancy at previously identified BRM targets.
Supplementary Figure 1 ChIP-seq genome browser views of BRM occupancy at previously identified BRM targets. Gene structures are shown underneath each panel. Supplementary Figure 2 pref6::ref6-gfp complements
More information3/15/2016. Genome = Genes + Gene Regulation. Personal Transcription Factor. Binding Site Mutations Point to. Personal Medical Histories
Personal Transcription Factor GGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTTGAACTTCCACGTGGTATTTACTCAGAGCAATTGGTGCCAGAG GCTCAGGGCCCTGGAGTATAAAGCAGAATGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCGAAAGACCTGTTGGAGGCTATGAATGC
More informationPredicting tissue specific cis-regulatory modules in the human genome using pairs of co-occurring motifs
BMC Bioinformatics This Provisional PDF corresponds to the article as it appeared upon acceptance. Fully formatted PDF and full text (HTML) versions will be made available soon. Predicting tissue specific
More informationA Brief History. Bootstrapping. Bagging. Boosting (Schapire 1989) Adaboost (Schapire 1995)
A Brief History Bootstrapping Bagging Boosting (Schapire 1989) Adaboost (Schapire 1995) What s So Good About Adaboost Improves classification accuracy Can be used with many different classifiers Commonly
More informationStructural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction
Structural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction Conformational Analysis 2 Conformational Analysis Properties of molecules depend on their three-dimensional
More informationIntroduction to Bioinformatics Online Course: IBT
Introduction to Bioinformatics Online Course: IBT Multiple Sequence Alignment Building Multiple Sequence Alignment Lec5: Interpreting your MSA Using Logos Using Logos - Logos are a terrific way to generate
More informationUsing GREAT.stanford.edu to interpret cis-regulatory rich datasets including ChIP-Seq, Epi markers, GWAS etc.
Using GREAT.stanford.edu to interpret cis-regulatory rich datasets including ChIP-Seq, Epi markers, GWAS etc. Gill Bejerano Dept. of Developmental Biology & Dept. of Computer Science Stanford University
More informationIdentifying Cooperativity Among Transcription Factors Controlling the Cell Cycle In Yeast
Identifying Cooperativity Among Transcription Factors Controlling the Cell Cycle In Yeast Nilanjana Banerjee 1,2 and Michael Q. Zhang 1 Nucleic Acids Research, 2003, Vol.31, No.23 1 Cold Spring Harbor
More informationGene Regulation Biology
Gene Regulation Biology Potential and Limitations of Cell Re-programming in Cancer Research Eric Blanc KCL April 13, 2010 Eric Blanc (KCL) Gene Regulation Biology April 13, 2010 1 / 21 Outline 1 The Central
More informationSUPPLEMENTARY INFORMATION
SUPPLEMENRY INFORMION doi:.38/nature In vivo nucleosome mapping D4+ Lymphocytes radient-based and I-bead cell sorting D8+ Lymphocytes ranulocytes Lyse the cells Isolate and sequence mononucleosome cores
More informationIntroduction to BIOINFORMATICS
COURSE OF BIOINFORMATICS a.a. 2016-2017 Introduction to BIOINFORMATICS What is Bioinformatics? (I) The sinergy between biology and informatics What is Bioinformatics? (II) From: http://www.bioteach.ubc.ca/bioinfo2010/
More informationAlexander Statnikov, Ph.D.
Alexander Statnikov, Ph.D. Director, Computational Causal Discovery Laboratory Benchmarking Director, Best Practices Integrative Informatics Consultation Service Assistant Professor, Department of Medicine,
More informationIllumina s Suite of Targeted Resequencing Solutions
Illumina s Suite of Targeted Resequencing Solutions Colin Baron Sr. Product Manager Sequencing Applications 2011 Illumina, Inc. All rights reserved. Illumina, illuminadx, Solexa, Making Sense Out of Life,
More informationBiochemistry 674, Fall, 1995: Nucleic Acids Exam II: November 16, 1995 Your Name Here: PCR A C
CHM 674 Exam II, 1995 1 iochemistry 674, Fall, 1995: Nucleic cids Prof. Jason Kahn Exam II: November 16, 1995 Your Name Here: This exam has five questions worth 20 points each. nswer all five. You do not
More informationExplicit DNase sequence bias modeling enables high-resolution transcription factor footprint detection
Published online 7 October 2014 Nucleic Acids Research, 2014, Vol. 42, No. 19 11865 11878 doi: 10.1093/nar/gku810 Explicit DNase sequence bias modeling enables high-resolution transcription factor footprint
More informationGenome 541! Unit 4, lecture 3! Genomics assays
Genome 541! Unit 4, lecture 3! Genomics assays I d like a bit more background on the assays and bioterminology.!! The phantom peak concept was confusing.! I didn t quite understand what the phantom peak
More informationresequencing storage SNP ncrna metagenomics private trio de novo exome ncrna RNA DNA bioinformatics RNA-seq comparative genomics
RNA Sequencing T TM variation genetics validation SNP ncrna metagenomics private trio de novo exome mendelian ChIP-seq RNA DNA bioinformatics custom target high-throughput resequencing storage ncrna comparative
More informationConvolutional Kitchen Sinks for Transcription Factor Binding Site Prediction
Convolutional Kitchen Sinks for Transcription Factor Binding Site Prediction Alyssa Morrow*, Vaishaal Shankar* Anthony Joseph, Benjamin Recht, Nir Yosef Transcription Factor A protein that binds to DNA
More informationChapter 13 - Regulation of Gene Expression
Chapter 13 - Regulation of Gene Expression 1. Describe the typical components of an operon in an E. coli (prokaryotic) cell. (p. 238-239) a. regulator gene - b. promoter - c. operator - d. structural gene
More informationFile S1. Program overview and features
File S1 Program overview and features Query list filtering. Further filtering may be applied through user selected query lists (Figure. 2B, Table S3) that restrict the results and/or report specifically
More informationNext-generation sequencing technologies
Next-generation sequencing technologies Illumina: Summary https://www.youtube.com/watch?v=fcd6b5hraz8 Illumina platforms: Benchtop sequencers https://www.illumina.com/systems/sequencing-platforms.html
More informationLong-range gene regulation
Long-range gene regulation Short Course in Medical Genetics Melbourne, June 2011 Question Why should long-range regulation of gene expression be of interest to Clinical Scientists and Pathologists interested
More informationNature Biotechnology: doi: /nbt Supplementary Figure 1
Supplementary Figure 1 An extended version of Figure 2a, depicting multi-model training and reverse-complement mode To use the GPU s full computational power, we train several independent models in parallel
More information