Many transcription factors! recognize DNA shape

Size: px
Start display at page:

Download "Many transcription factors! recognize DNA shape"

Transcription

1 Many transcription factors! recognize DN shape Katie Pollard! Gladstone Institutes USF Division of Biostatistics, Institute for Human Genetics, and Institute for omputational Health Sciences ENODE Users Meeting - Stanford, June 10, 2016

2 Most disease associated mutations are outside coding regions Hypothesis 1: Non-coding variants alter transcription factor sequence motifs. DN G Gene Gene B Schizophrenia Gene associated variant T

3 DN Most disease associated mutations are outside coding regions Hypothesis 1: Non-coding variants alter transcription factor sequence motifs. _ G Gene Gene B Schizophrenia Gene associated variant T pproach: Map variants to correct pathways by predicting enhancers and their target genes. Score variants for changes in binding affinity.

4 DN EnhancerFinder distinguishes biologically active enhancers G Gene Gene B Schizophrenia Gene associated variant Training Data VIST Enhancers Performance on held out data T 80% validate in vivo Yes No Func1onal Genomics U=0.96 Power=85% at 10% FPR DN Sequences,,G,T,.. Erwin et al. (2014) apra et al. (2014)

5 DN EnhancerFinder distinguishes biologically active enhancers G Gene Gene B Schizophrenia Gene associated variant Training Data VIST Enhancers Performance on held out data T 80% validate in vivo Yes No Func1onal Genomics U=0.96 Power=85% at 10% FPR DN Sequences,,G,T,.. Erwin et al. (2014) apra et al. (2014)

6 MotifDiverge quantifies loss/gain of TF binding sites G T _ DN Gene Gene B Schizophrenia Gene associated variant Statistical model for TFBS evolution with turnover Predicts change of function P-value for net change in binding One or many TFs lignment-free Evolutionary model Motif specific Detects loss/gain of function mutations with high accuracy Better than conservation scores In vivo and MPRs in cell lines Ritter et al. (2010) Kostka et al. (2015)

7 DN TargetFinder maps distal regulatory elements to genes G Gene Gene B Schizophrenia Gene associated variant T Training Data c1ve enhancers Expressed genes Hi- interac1ons Func1onal Genomics losest gene usually wrong Reveals distinct genomic signature of looping DN Heterochromatin on loop ohesin within 6Kb of enhancer and promoter but not mid-loop TFs bound with TF improve predictions! Whalen et al. (2016)

8 Summary and hallenges Machine-learning on biologically validated enhancers identifies non-coding variants most likely to affect gene regulation and the targeted genes.! - Massive integration of Heart functional (28) genomics data Limb (41) 239 Predicted HR Enhancers enables cell type specific predictions - Many enhancer-like regions are minimally active and not consistently looping to a target gene Brain (96)

9 Summary and hallenges Machine-learning on biologically validated enhancers identifies non-coding variants most likely to affect gene regulation and the targeted genes. 239 Predicted HR Enhancers - Massive integration of Heart functional (28) genomics data enables cell type specific predictions - Many enhancer-like regions are minimally active and not consistently looping to a target gene But much remains to be explained Limb (41) - Functional variants outside enhancers Brain (96)

10 DN TargetFinder maps distal regulatory elements to genes G Gene Gene B Schizophrenia Gene associated variant T Hypothesis 2: Non-coding variants alter binding sites of structural proteins and chromatin modifiers. Reveals distinct genomic signature of looping DN Heterochromatin on loop ohesin within 6Kb of enhancer and promoter but not mid-loop TFs bound with cohesin improve predictions! Whalen et al. (2016)

11 DN TargetFinder maps distal regulatory elements to genes G Gene Gene B Schizophrenia Gene associated variant T Hypothesis 2: Non-coding variants alter binding sites of structural proteins and chromatin modifiers. Reveals distinct genomic signature of looping DN Heterochromatin on loop ohesin within 6Kb of enhancer and promoter but not mid-loop TFs bound with cohesin improve predictions! Whalen et al. (2016)

12 DN TargetFinder maps distal regulatory elements to genes G Gene Gene B Schizophrenia Gene associated variant T Hypothesis 2: Non-coding variants alter binding sites of structural proteins and chromatin modifiers. pproach: RISPR edit sites identified by TargetFinder, then test chromatin and expression. Reveals distinct genomic signature of looping DN Heterochromatin on loop ohesin within 6Kb of enhancer and promoter but not mid-loop TFs bound with cohesin improve predictions! Whalen et al. (2016)

13 Summary and hallenges Machine-learning on biologically validated enhancers identifies non-coding variants most likely to affect gene regulation and the targeted genes. - Massive integration of Heart functional (28) genomics data Limb (41) 239 Predicted HR Enhancers enables cell type specific predictions - Many enhancer-like regions are minimally active and not consistently looping to a target gene But much remains to be explained Brain (96) - Functional variants outside enhancers - Enhancer variants outside sequence motifs

14 Summary and hallenges Machine-learning on biologically validated enhancers identifies non-coding variants most likely to affect gene regulation and the targeted genes. - Massive integration of Heart functional (28) genomics data Limb (41) 239 Predicted HR Enhancers enables cell type specific predictions - Many enhancer-like regions are minimally active and not consistently looping to a target gene But much remains to be explained Brain (96) - Functional variants outside enhancers - Enhancer variants outside sequence motifs For a typical ENODE TF 23% of the top 2000 hipseq peaks have no sequence motif (range = 1%-63%)

15 Many enhancer mutations are outside known or de novo sequence motifs Hypothesis 3: Non-coding variants alter enhancer function by changing DN shape. DN G Gene Gene B Schizophrenia Gene associated variant T

16 Many enhancer mutations are outside known or de novo sequence motifs DN Hypothesis 3: Non-coding variants alter enhancer function by changing DN shape. Gene Gene B Schizophrenia Gene associated variant TFs can recognize shape in addition to sequence. DN shape differentiates similar sequence motifs. Distinct sequences can encode same shape. G T

17 Many enhancer mutations are outside known or de novo sequence motifs DN Hypothesis 3: Non-coding variants alter enhancer function by changing DN shape. Gene Gene B Schizophrenia Gene associated variant TFs can recognize shape in addition to sequence. DN shape differentiates similar sequence motifs. Distinct sequences can encode same shape. G pproach: lgorithm to learn shape motifs de novo for all ENODE TFs, predict shape motif hits in hipseq peaks, compare to sequence motifs T

18 De novo shape motif discovery 1. Estimate DN structure: DNshape (Zhou et al. 2013) Maps 5-mer sequences to structural features. Based on molecular dynamics simulations. MGW ProT Roll HelT [Figures from Blackburn and Gait, 1996]

19 De novo shape motif discovery 1. Estimate DN structure: DNshape (Zhou et al. 2013) Maps 5-mer sequences to structural features Based on molecular dynamics simulations 2. Learn TF shape motifs: Search hip-seq peaks for windows with similar values of a shape feature. Gibbs sampling with scores ~ exp( Dij) Vary window size 5-25bp Train on 1000 of top 2000 peaks for ~250 TFs!

20 De novo shape motif discovery 1. Estimate DN structure: DNshape (Zhou et al. 2013) Maps 5-mer sequences to structural features Based on molecular dynamics simulations 2. Learn TF shape motifs: Search hip-seq peaks for windows with similar values of a shape feature. Gibbs sampling with scores ~ exp( Dij) Vary window size 5-25bp Train on 1000 of top 2000 peaks for ~250 TFs 3. all hits: Scan hip-seq peaks with shape motifs. Null distribution on distance from mean shape feature value at each position

21 De novo shape motif discovery 1. Estimate DN structure: DNshape (Zhou et al. 2013) Maps 5-mer sequences to structural features Based on molecular dynamics simulations 2. Learn TF shape motifs: Search hip-seq peaks for windows with similar values of a shape feature. Gibbs sampling with scores ~ exp( Dij) Vary window size 5-25bp Train on 1000 of top 2000 peaks for ~250 TFs 3. all hits: Scan hip-seq peaks with shape motifs. Null distribution on distance from mean shape feature value at each position pply to remaining 1000 of top 2000 peaks and flanking non-peak regions for each TF

22 De novo shape motif discovery 1. Estimate DN structure: DNshape (Zhou et al. 2013) Maps 5-mer sequences to structural features Based on molecular dynamics simulations 2. Learn TF shape motifs: Search hip-seq peaks for windows with similar values of a shape feature. Gibbs sampling with scores ~ exp( Dij) Vary window size 5-25bp Train on 1000 of top 2000 peaks for ~250 TFs 3. all hits: Scan hip-seq peaks with shape motifs. Null distribution on distance from mean shape feature value at each position pply to remaining 1000 of top 2000 peaks and flanking non-peak regions for each TF 4. Enrichment test: Hypergeometric p-value.

23 Shape motifs are common Percentage TFs with a motif for each feature in each cell line

24 Shape complements sequence motifs Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center.!!

25 Shape complements sequence motifs Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. ~25% of peaks have sequence and shape motifs.!!

26 Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. ~25% of peaks have sequence and shape motifs. - These can be similar,! Shape complements sequence motifs Underlying sequence logo Nrsf Roll motif in K562 Nrsf FactorBook sequence motif

27 Shape motifs are complementary Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. Many peaks have sequence and shape motifs. - These can be similar, - Extensions or refinements of one another,!! Underlying! sequence fos ProT motif in K562 FactorBook! sequence! motif

28 Shape motifs are complementary Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. Many peaks have sequence and shape motifs. - These can be similar, - Extensions or refinements of one another, - Or very different Underlying! sequence Maff HelT motif in K562 FactorBook! sequence! motif

29 Shape motifs are complementary Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. Many peaks have sequence and shape motifs. - These can be similar, - Extensions or refinements of one another, - Or very different Shape motifs can flank sequence motifs Underlying! sequence Nfya HelT motif in K562 FactorBook! sequence! motif is 3bp! upstream

30 Shape motifs are complementary Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. Many peaks have sequence and shape motifs. - These can be similar, - Extensions or refinements of one another, - Or very different Shape motifs can flank sequence motifs Underlying! sequence fos Roll motif in K562 FactorBook! sequence! motif is 30bp! away

31 Shape motifs are complementary Most peaks without sequence motifs have at least one shape motif. It is typically at the peak center. Many peaks have sequence and shape motifs. - These can be similar, - Extensions or refinements of one another, - Or very different Shape motifs can flank sequence motifs Shape motifs can differ between TFs with similar sequence motifs and/or the same protein fold.! Fosl1 has a HelT motif tf3 has a Roll motif

32 Ongoing Work Hierarchical or mixture model of TF binding with sequence and shape motifs - Decompose sequence motifs by shape types! - Spectrum of recognition Heart modes (28) 239 Predicted HR Enhancers Limb (41) Brain (96)

33 Ongoing Work Hierarchical or mixture model of TF binding with sequence and shape motifs - Decompose sequence motifs by shape types - Spectrum of recognition Heart modes (28) Shape motifs in different contexts - o-factors and complexes - Weak hip-seq peaks! Limb (41) 239 Predicted HR Enhancers Brain (96)

34 Ongoing Work Hierarchical or mixture model of TF binding with sequence and shape motifs - Decompose sequence motifs by shape types - Spectrum of recognition Heart modes (28) Shape motifs in different contexts - o-factors and complexes - Weak hip-seq peaks Role of shape in ectopic binding of TFs when cofactors are absent Limb [Luna-Zurita (41) et al. 2016] 239 Predicted HR Enhancers! Brain (96)

35 Ongoing Work Hierarchical or mixture model of TF binding with sequence and shape motifs - Decompose sequence motifs by shape types - Spectrum of recognition Heart modes (28) Shape motifs in different contexts - o-factors and complexes - Weak hip-seq peaks Role of shape in ectopic binding of TFs when cofactors are absent Limb [Luna-Zurita (41) et al. 2016] 239 Predicted HR Enhancers Brain (96) Evolutionary modeling of DN shape - onservation of shape without sequence - Scoring SNPs for effects on shape motifs

36 ollaborators EnhancerFinder Tony apra Gen Haliburton DN Shape Hassan Samee TargetFinder Rebecca Truty Sean Whalen MotifDiverge Dennis Kostka Functional ssays Hane Ryu lex Pollen Nadav hituv rnold Kriegstein Funding from NIMH, NIGMS, NHLBI, PhRM Foundation, Gladstone Institutes

Integrating diverse datasets improves developmental enhancer prediction

Integrating diverse datasets improves developmental enhancer prediction 1 Integrating diverse datasets improves developmental enhancer prediction Genevieve D. Erwin 1, Rebecca M. Truty 1, Dennis Kostka 2, Katherine S. Pollard 1,3, John A. Capra 4 1 Gladstone Institute of Cardiovascular

More information

Integrating Diverse Datasets Improves Developmental Enhancer Prediction

Integrating Diverse Datasets Improves Developmental Enhancer Prediction Integrating Diverse Datasets Improves Developmental Enhancer Prediction Genevieve D. Erwin 1,2, Nir Oksenberg 2,3, Rebecca M. Truty 1, Dennis Kostka 4, Karl K. Murphy 2,3, Nadav Ahituv 2,3, Katherine S.

More information

Transcription factor binding site prediction in vivo using DNA sequence and shape features

Transcription factor binding site prediction in vivo using DNA sequence and shape features Transcription factor binding site prediction in vivo using DNA sequence and shape features Anthony Mathelier, Lin Yang, Tsu-Pei Chiu, Remo Rohs, and Wyeth Wasserman anthony.mathelier@gmail.com @AMathelier

More information

Whole Transcriptome Analysis of Illumina RNA- Seq Data. Ryan Peters Field Application Specialist

Whole Transcriptome Analysis of Illumina RNA- Seq Data. Ryan Peters Field Application Specialist Whole Transcriptome Analysis of Illumina RNA- Seq Data Ryan Peters Field Application Specialist Partek GS in your NGS Pipeline Your Start-to-Finish Solution for Analysis of Next Generation Sequencing Data

More information

Supplementary Fig. S1. Building a training set of cardiac enhancers. (A-E) Empirical validation of candidate enhancers containing matches to Twi and

Supplementary Fig. S1. Building a training set of cardiac enhancers. (A-E) Empirical validation of candidate enhancers containing matches to Twi and Supplementary Fig. S1. Building a training set of cardiac enhancers. (A-E) Empirical validation of candidate enhancers containing matches to Twi and Tin TFBS motifs and located in the flanking or intronic

More information

CS273B: Deep learning for Genomics and Biomedicine

CS273B: Deep learning for Genomics and Biomedicine CS273B: Deep learning for Genomics and Biomedicine Lecture 2: Convolutional neural networks and applications to functional genomics 09/28/2016 Anshul Kundaje, James Zou, Serafim Batzoglou Outline Anatomy

More information

1 Najafabadi, H. S. et al. C2H2 zinc finger proteins greatly expand the human regulatory lexicon. Nat Biotechnol doi: /nbt.3128 (2015).

1 Najafabadi, H. S. et al. C2H2 zinc finger proteins greatly expand the human regulatory lexicon. Nat Biotechnol doi: /nbt.3128 (2015). F op-scoring motif Optimized motifs E Input sequences entral 1 bp region Dinucleotideshuffled seqs B D ll B1H-R predicted motifs Enriched B1H- R predicted motifs L!=!7! L!=!6! L!=5! L!=!4! L!=!3! L!=!2!

More information

Personal and population genomics of human regulatory variation

Personal and population genomics of human regulatory variation Personal and population genomics of human regulatory variation Benjamin Vernot,, and Joshua M. Akey Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA Ting WANG 1 Brief

More information

Non-coding Function & Variation, MPRAs. Mike White Bio5488 3/5/18

Non-coding Function & Variation, MPRAs. Mike White Bio5488 3/5/18 Non-coding Function & Variation, MPRAs Mike White Bio5488 3/5/18 Outline MONDAY Non-coding function and variation The barcode Basic versions of MRPA technology WEDNESDAY More varieties of MRPAs Some key

More information

Decoding Chromatin States with Epigenome Data Advanced Topics in Computa8onal Genomics

Decoding Chromatin States with Epigenome Data Advanced Topics in Computa8onal Genomics Decoding Chromatin States with Epigenome Data 02-715 Advanced Topics in Computa8onal Genomics HMMs for Decoding Chromatin States Epigene8c modifica8ons of the genome have been associated with Establishing

More information

Comparative eqtl analyses within and between seven tissue types suggest mechanisms underlying cell type specificity of eqtls

Comparative eqtl analyses within and between seven tissue types suggest mechanisms underlying cell type specificity of eqtls Comparative eqtl analyses within and between seven tissue types suggest mechanisms underlying cell type specificity of eqtls, Duke University Christopher D Brown, University of Pennsylvania November 9th,

More information

Discovering gene regulatory control using ChIP-chip and ChIP-seq. Part 1. An introduction to gene regulatory control, concepts and methodologies

Discovering gene regulatory control using ChIP-chip and ChIP-seq. Part 1. An introduction to gene regulatory control, concepts and methodologies Discovering gene regulatory control using ChIP-chip and ChIP-seq Part 1 An introduction to gene regulatory control, concepts and methodologies Ian Simpson ian.simpson@.ed.ac.uk http://bit.ly/bio2links

More information

The genetic code of gene regulatory elements

The genetic code of gene regulatory elements The genetic code of gene regulatory elements Ivan Ovcharenko Computational Biology Branch National Center for Biotechnology Information National Institutes of Health October 23, 2008 Outline Gene deserts

More information

Characterizing DNA binding sites high throughput approaches Biol4230 Tues, April 24, 2018 Bill Pearson Pinn 6-057

Characterizing DNA binding sites high throughput approaches Biol4230 Tues, April 24, 2018 Bill Pearson Pinn 6-057 Characterizing DNA binding sites high throughput approaches Biol4230 Tues, April 24, 2018 Bill Pearson wrp@virginia.edu 4-2818 Pinn 6-057 Reviewing sites: affinity and specificity representation binding

More information

ChIP-seq analysis 2/28/2018

ChIP-seq analysis 2/28/2018 ChIP-seq analysis 2/28/2018 Acknowledgements Much of the content of this lecture is from: Furey (2012) ChIP-seq and beyond Park (2009) ChIP-seq advantages + challenges Landt et al. (2012) ChIP-seq guidelines

More information

Supplementary Data for DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.

Supplementary Data for DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding. Supplementary Data for DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding. Wenxiu Ma 1, Lin Yang 2, Remo Rohs 2, and William Stafford Noble 3 1 Department of Statistics,

More information

NGS Approaches to Epigenomics

NGS Approaches to Epigenomics I519 Introduction to Bioinformatics, 2013 NGS Approaches to Epigenomics Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Contents Background: chromatin structure & DNA methylation Epigenomic

More information

ChIP-seq data analysis with Chipster. Eija Korpelainen CSC IT Center for Science, Finland

ChIP-seq data analysis with Chipster. Eija Korpelainen CSC IT Center for Science, Finland ChIP-seq data analysis with Chipster Eija Korpelainen CSC IT Center for Science, Finland chipster@csc.fi What will I learn? Short introduction to ChIP-seq Analyzing ChIP-seq data Central concepts Analysis

More information

Year III Pharm.D Dr. V. Chitra

Year III Pharm.D Dr. V. Chitra Year III Pharm.D Dr. V. Chitra 1 Genome entire genetic material of an individual Transcriptome set of transcribed sequences Proteome set of proteins encoded by the genome 2 Only one strand of DNA serves

More information

Charles Girardot, Furlong Lab. MACS, CisGenome, SISSRs and other peak calling algorithms: differences and practical use

Charles Girardot, Furlong Lab. MACS, CisGenome, SISSRs and other peak calling algorithms: differences and practical use Charles Girardot, Furlong Lab MACS, CisGenome, SISSRs and other peak calling algorithms: differences and practical use ChIP-Seq signal properties Only 5 ends of ChIPed fragments are sequenced Shifted read

More information

Activation of a Floral Homeotic Gene in Arabidopsis

Activation of a Floral Homeotic Gene in Arabidopsis Activation of a Floral Homeotic Gene in Arabidopsis By Maximiliam A. Busch, Kirsten Bomblies, and Detlef Weigel Presentation by Lis Garrett and Andrea Stevenson http://ucsdnews.ucsd.edu/archive/graphics/images/image5.jpg

More information

PIP-seq. Cells. Permanganate ChIP-Seq

PIP-seq. Cells. Permanganate ChIP-Seq PIP-seq ells Formaldehyde Permanganate 5 Harvest Lyse Sonicate First dapter Ligation 3 3 5 hip Elute Reverse rosslinks Piperidine cleavage 5 3 3 5 Primer Extension Second dapter Ligation 5 3 3 5 Deep Sequencing

More information

ChIP. November 21, 2017

ChIP. November 21, 2017 ChIP November 21, 2017 functional signals: is DNA enough? what is the smallest number of letters used by a written language? DNA is only one part of the functional genome DNA is heavily bound by proteins,

More information

ChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday 15 June 2015

ChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday 15 June 2015 ChIP-Seq Data Analysis J Fass UCD Genome Center Bioinformatics Core Wednesday 15 June 2015 What s the Question? Where do Transcription Factors (TFs) bind genomic DNA 1? (Where do other things bind DNA

More information

ChIP-Seq Tools. J Fass UCD Genome Center Bioinformatics Core Wednesday September 16, 2015

ChIP-Seq Tools. J Fass UCD Genome Center Bioinformatics Core Wednesday September 16, 2015 ChIP-Seq Tools J Fass UCD Genome Center Bioinformatics Core Wednesday September 16, 2015 What s the Question? Where do Transcription Factors (TFs) bind genomic DNA 1? (Where do other things bind DNA or

More information

ChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday December 17, 2014

ChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday December 17, 2014 ChIP-Seq Data Analysis J Fass UCD Genome Center Bioinformatics Core Wednesday December 17, 2014 What s the Question? Where do Transcription Factors (TFs) bind genomic DNA 1? (Where do other things bind

More information

Discovering gene regulatory control using ChIP-chip and ChIP-seq. An introduction to gene regulatory control, concepts and methodologies

Discovering gene regulatory control using ChIP-chip and ChIP-seq. An introduction to gene regulatory control, concepts and methodologies Discovering gene regulatory control using ChIP-chip and ChIP-seq An introduction to gene regulatory control, concepts and methodologies Ian Simpson ian.simpson@.ed.ac.uk bit.ly/bio2_2012 The Central Dogma

More information

: Genomic Regions Enrichment of Annotations Tool

: Genomic Regions Enrichment of Annotations Tool http://great.stanford.edu/ : Genomic Regions Enrichment of Annotations Tool Gill Bejerano Dept. of Developmental Biology & Dept. of Computer Science Stanford University 1 Human Gene Regulation 10 13 different

More information

Epigenetics and DNase-Seq

Epigenetics and DNase-Seq Epigenetics and DNase-Seq BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2018 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by Anthony

More information

Introduction to ChIP Seq data analyses. Acknowledgement: slides taken from Dr. H

Introduction to ChIP Seq data analyses. Acknowledgement: slides taken from Dr. H Introduction to ChIP Seq data analyses Acknowledgement: slides taken from Dr. H Wu @Emory ChIP seq: Chromatin ImmunoPrecipitation it ti + sequencing Same biological motivation as ChIP chip: measure specific

More information

and Promoter Sequence Data

and Promoter Sequence Data : Combining Gene Expression and Promoter Sequence Data Outline 1. Motivation Functionally related genes cluster together genes sharing cis-elements cluster together transcriptional regulation is modular

More information

2/10/17. Contents. Applications of HMMs in Epigenomics

2/10/17. Contents. Applications of HMMs in Epigenomics 2/10/17 I529: Machine Learning in Bioinformatics (Spring 2017) Contents Applications of HMMs in Epigenomics Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2017 Background:

More information

RNA-Seq Now What? BIS180L Professor Maloof May 24, 2018

RNA-Seq Now What? BIS180L Professor Maloof May 24, 2018 RNA-Seq Now What? BIS180L Professor Maloof May 24, 2018 We have differentially expressed genes, what do we want to know about them? We have differentially expressed genes, what do we want to know about

More information

Next- genera*on Sequencing. Lecture 13

Next- genera*on Sequencing. Lecture 13 Next- genera*on Sequencing Lecture 13 ChIP- seq Applica*ons iden%fy sequence varia%ons DNA- seq Iden%fy Pathogens RNA- seq Kahvejian et al, 2008 Protein-DNA interaction DNA is the informa*on carrier of

More information

SUPPLEMENTAL MATERIALS

SUPPLEMENTAL MATERIALS SUPPLEMENL MERILS Eh-seq: RISPR epitope tagging hip-seq of DN-binding proteins Daniel Savic, E. hristopher Partridge, Kimberly M. Newberry, Sophia. Smith, Sarah K. Meadows, rian S. Roberts, Mark Mackiewicz,

More information

Biology 644: Bioinformatics

Biology 644: Bioinformatics Processes Activation Repression Initiation Elongation.... Processes Splicing Editing Degradation Translation.... Transcription Translation DNA Regulators DNA-Binding Transcription Factors Chromatin Remodelers....

More information

NEXT GENERATION SEQUENCING. Farhat Habib

NEXT GENERATION SEQUENCING. Farhat Habib NEXT GENERATION SEQUENCING HISTORY HISTORY Sanger Dominant for last ~30 years 1000bp longest read Based on primers so not good for repetitive or SNPs sites HISTORY Sanger Dominant for last ~30 years 1000bp

More information

Computational Analysis of Ultra-high-throughput sequencing data: ChIP-Seq

Computational Analysis of Ultra-high-throughput sequencing data: ChIP-Seq Computational Analysis of Ultra-high-throughput sequencing data: ChIP-Seq Philipp Bucher Wednesday January 21, 2009 SIB graduate school course EPFL, Lausanne Data flow in ChIP-Seq data analysis Level 1:

More information

BunDLE-seq (Binding to Designed Library, Extracting and Sequencing) -

BunDLE-seq (Binding to Designed Library, Extracting and Sequencing) - Protocol BunDLE-seq (Binding to Designed Library, Extracting and Sequencing) - A quantitative investigation of various determinants of TF binding; going beyond the characterization of core site Einat Zalckvar*

More information

Non-coding Function & Variation, MPRAs II. Mike White Bio /5/18

Non-coding Function & Variation, MPRAs II. Mike White Bio /5/18 Non-coding Function & Variation, MPRAs II Mike White Bio 5488 3/5/18 MPRA Review Problem 1: Where does your CRE DNA come from? DNA synthesis Genomic fragments Targeted regulome capture Problem 2: How do

More information

Machine learning applications in genomics: practical issues & challenges. Yuzhen Ye School of Informatics and Computing, Indiana University

Machine learning applications in genomics: practical issues & challenges. Yuzhen Ye School of Informatics and Computing, Indiana University Machine learning applications in genomics: practical issues & challenges Yuzhen Ye School of Informatics and Computing, Indiana University Reference Machine learning applications in genetics and genomics

More information

Deep learning frameworks for regulatory genomics and epigenomics

Deep learning frameworks for regulatory genomics and epigenomics Deep learning frameworks for regulatory genomics and epigenomics Chuan Sheng Foo Avanti Shrikumar Nicholas Sinnott- Armstrong ANSHUL KUNDAJE Genetics, Computer science Stanford University Johnny Israeli

More information

Analyzing ChIP-seq data. R. Gentleman, D. Sarkar, S. Tapscott, Y. Cao, Z. Yao, M. Lawrence, P. Aboyoun, M. Morgan, L. Ruzzo, J. Davison, H.

Analyzing ChIP-seq data. R. Gentleman, D. Sarkar, S. Tapscott, Y. Cao, Z. Yao, M. Lawrence, P. Aboyoun, M. Morgan, L. Ruzzo, J. Davison, H. Analyzing ChIP-seq data R. Gentleman, D. Sarkar, S. Tapscott, Y. Cao, Z. Yao, M. Lawrence, P. Aboyoun, M. Morgan, L. Ruzzo, J. Davison, H. Pages Biological Motivation Chromatin-immunopreciptation followed

More information

Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk

Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk Summer Review 7 Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk Jian Zhou 1,2,3, Chandra L. Theesfeld 1, Kevin Yao 3, Kathleen M. Chen 3, Aaron K. Wong

More information

14 March, 2016: Introduction to Genomics

14 March, 2016: Introduction to Genomics 14 March, 2016: Introduction to Genomics Genome Genome within Ensembl browser http://www.ensembl.org/homo_sapiens/location/view?db=core;g=ensg00000139618;r=13:3231547432400266 Genome within Ensembl browser

More information

What Can the Epigenome Teach Us About Cellular States and Diseases?

What Can the Epigenome Teach Us About Cellular States and Diseases? What Can the Epigenome Teach Us About Cellular States and Diseases? (a computer scientist s view) Luca Pinello Outline Epigenetic: the code over the code What can we learn from epigenomic data? Resources

More information

L8: Downstream analysis of ChIP-seq and ATAC-seq data

L8: Downstream analysis of ChIP-seq and ATAC-seq data L8: Downstream analysis of ChIP-seq and ATAC-seq data Shamith Samarajiwa CRUK Bioinformatics Autumn School September 2017 Summary Downstream analysis for extracting meaningful biology : Normalization and

More information

Introduction to Transcription Factor Binding Sites (TFBS) Cells control the expression of genes using Transcription Factors.

Introduction to Transcription Factor Binding Sites (TFBS) Cells control the expression of genes using Transcription Factors. Identification of Functional Transcription Factor Binding Sites using Closely Related Saccharomyces species Scott W. Doniger 1, Juyong Huh 2, and Justin C. Fay 1,2 1 Computation Biology Program and 2 Department

More information

Learning Methods for DNA Binding in Computational Biology

Learning Methods for DNA Binding in Computational Biology Learning Methods for DNA Binding in Computational Biology Mark Kon Dustin Holloway Yue Fan Chaitanya Sai Charles DeLisi Boston University IJCNN Orlando August 16, 2007 Outline Background on Transcription

More information

Nature Genetics: doi: /ng Supplementary Figure 1. H3K27ac HiChIP enriches enhancer promoter-associated chromatin contacts.

Nature Genetics: doi: /ng Supplementary Figure 1. H3K27ac HiChIP enriches enhancer promoter-associated chromatin contacts. Supplementary Figure 1 H3K27ac HiChIP enriches enhancer promoter-associated chromatin contacts. (a) Schematic of chromatin contacts captured in H3K27ac HiChIP. (b) Loop call overlap for cohesin HiChIP

More information

Review of Transposable elements have rewired the core regulatory network of human embryonic stem cells

Review of Transposable elements have rewired the core regulatory network of human embryonic stem cells Review of Transposable elements have rewired the core regulatory network of human embryonic stem cells RNA structure Mutagenic screens using transposons Nature Genetics, 42(7), 631-635 (2010). Bradly Alicea

More information

Genome 541 Gene regulation and epigenomics Lecture 3 Integrative analysis of genomics assays

Genome 541 Gene regulation and epigenomics Lecture 3 Integrative analysis of genomics assays Genome 541 Gene regulation and epigenomics Lecture 3 Integrative analysis of genomics assays Please consider both the forward and reverse strands (i.e. reverse compliment sequence). You do not need to

More information

The Next Generation of Transcription Factor Binding Site Prediction

The Next Generation of Transcription Factor Binding Site Prediction The Next Generation of Transcription Factor Binding Site Prediction Anthony Mathelier*, Wyeth W. Wasserman* Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department

More information

Pioneering Clinical Omics

Pioneering Clinical Omics Pioneering Clinical Omics Clinical Genomics Strand NGS An analysis tool for data generated by cutting-edge Next Generation Sequencing(NGS) instruments. Strand NGS enables read alignment and analysis of

More information

Title: Genome-Wide Predictions of Transcription Factor Binding Events using Multi- Dimensional Genomic and Epigenomic Features Background

Title: Genome-Wide Predictions of Transcription Factor Binding Events using Multi- Dimensional Genomic and Epigenomic Features Background Title: Genome-Wide Predictions of Transcription Factor Binding Events using Multi- Dimensional Genomic and Epigenomic Features Team members: David Moskowitz and Emily Tsang Background Transcription factors

More information

Genome editing: Principles, Current and Future Uses

Genome editing: Principles, Current and Future Uses enome editing: Principles, urrent and Future Uses Dr ndrew Wood University of Edinburgh alton Institute dvance in enetics onference June 28 th 2017 enome editing defined: a type of genetic engineering

More information

Lecture 11 Microarrays and Expression Data

Lecture 11 Microarrays and Expression Data Introduction to Bioinformatics for Medical Research Gideon Greenspan gdg@cs.technion.ac.il Lecture 11 Microarrays and Expression Data Genetic Expression Data Microarray experiments Applications Expression

More information

DNA. bioinformatics. epigenetics methylation structural variation. custom. assembly. gene. tumor-normal. mendelian. BS-seq. prediction.

DNA. bioinformatics. epigenetics methylation structural variation. custom. assembly. gene. tumor-normal. mendelian. BS-seq. prediction. Epigenomics T TM activation SNP target ncrna validation metagenomics genetics private RRBS-seq de novo trio RIP-seq exome mendelian comparative genomics DNA NGS ChIP-seq bioinformatics assembly tumor-normal

More information

Methods and tools for exploring functional genomics data

Methods and tools for exploring functional genomics data Methods and tools for exploring functional genomics data William Stafford Noble Department of Genome Sciences Department of Computer Science and Engineering University of Washington Outline Searching for

More information

Microarrays & Gene Expression Analysis

Microarrays & Gene Expression Analysis Microarrays & Gene Expression Analysis Contents DNA microarray technique Why measure gene expression Clustering algorithms Relation to Cancer SAGE SBH Sequencing By Hybridization DNA Microarrays 1. Developed

More information

Motifs. BCH339N - Systems Biology / Bioinformatics Edward Marcotte, Univ of Texas at Austin

Motifs. BCH339N - Systems Biology / Bioinformatics Edward Marcotte, Univ of Texas at Austin Motifs BCH339N - Systems Biology / Bioinformatics Edward Marcotte, Univ of Texas at Austin An example transcriptional regulatory cascade Here, controlling Salmonella bacteria multidrug resistance Sequencespecific

More information

The ENCODE Encyclopedia. & Variant Annotation Using RegulomeDB and HaploReg

The ENCODE Encyclopedia. & Variant Annotation Using RegulomeDB and HaploReg The ENCODE Encyclopedia & Variant Annotation Using RegulomeDB and HaploReg Jill E. Moore Weng Lab University of Massachusetts Medical School October 10, 2015 Where s the Encyclopedia? ENCODE: Encyclopedia

More information

Computational Systems Biology Deep Learning in the Life Sciences

Computational Systems Biology Deep Learning in the Life Sciences Computational Systems Biology Deep Learning in the Life Sciences 6.802 6.874 20.390 20.490 HST.506 Christina Ji April 6, 2017 DanQ: a hybrid convolutional and recurrent deep neural network for quantifying

More information

Bio5488 Practice Midterm (2018) 1. Next-gen sequencing

Bio5488 Practice Midterm (2018) 1. Next-gen sequencing 1. Next-gen sequencing 1. You have found a new strain of yeast that makes fantastic wine. You d like to sequence this strain to ascertain the differences from S. cerevisiae. To accurately call a base pair,

More information

Supplementary Figure 1 Strategy for parallel detection of DHSs and adjacent nucleosomes

Supplementary Figure 1 Strategy for parallel detection of DHSs and adjacent nucleosomes Supplementary Figure 1 Strategy for parallel detection of DHSs and adjacent nucleosomes DNase I cleavage DNase I DNase I digestion Sucrose gradient enrichment Small Large F1 F2...... F9 F1 F1 F2 F3 F4

More information

Lecture 5: Regulation

Lecture 5: Regulation Machine Learning in Computational Biology CSC 2431 Lecture 5: Regulation Instructor: Anna Goldenberg Central Dogma of Biology Transcription DNA RNA protein Process of producing RNA from DNA Constitutive

More information

Übung V. Einführung, Teil 1. Transktiptionelle Regulation TFBS

Übung V. Einführung, Teil 1. Transktiptionelle Regulation TFBS Übung V Einführung, Teil 1 Transktiptionelle Regulation TFBS Transcription Factors These proteins promote transcription 1. Bind DNA 2. Activate Transcription These two functions usually reside on separate

More information

DIAMANTINA INSTITUTE for Cancer, Immunology and Metabolic Medicine

DIAMANTINA INSTITUTE for Cancer, Immunology and Metabolic Medicine DIAMANTINA INSTITUTE for Cancer, Immunology and Metabolic Medicine Defining MYB Transcriptional Network by Genome-wide Chromatin Occupancy Profiling (ChIP-Seq) 2010 E.Glazov, L. Zhao Transcription Factors:

More information

Functional Genomics and Single-Cell Genomics. Christine Queitsch Department of Genome Sciences

Functional Genomics and Single-Cell Genomics. Christine Queitsch Department of Genome Sciences Functional Genomics and Single-Cell Genomics Christine Queitsch Department of Genome Sciences queitsch@uw.edu 1 Outline Challenges Combining methods for better genome assembly Functional genomics let s

More information

ECS 234: Genomic Data Integration ECS 234

ECS 234: Genomic Data Integration ECS 234 : Genomic Data Integration Heterogeneous Data Integration DNA Sequence Microarray Proteomics >gi 12004594 gb AF217406.1 Saccharomyces cerevisiae uridine nucleosidase (URH1) gene, complete cds ATGGAATCTGCTGATTTTTTTACCTCACGAAACTTATTAAAACAGATAATTTCCCTCATCTGCAAGGTTG

More information

res2)=)colocalisation(eqtl_file="icoslg_eqtl.txt",) gwas_file="uc_gwas_icoslg_locus.txt",)p1)=)1e+4,)p2)=)1e+4,)p12)=)1e+5,) region)=)200000)$

res2)=)colocalisation(eqtl_file=icoslg_eqtl.txt,) gwas_file=uc_gwas_icoslg_locus.txt,)p1)=)1e+4,)p2)=)1e+4,)p12)=)1e+5,) region)=)200000)$ Homework 7 res1)=)colocalisation(eqtl_file="ptk2b_eqtl.txt",) gwas_file="ad_gwas_ptk2b_locus.txt",)p1)=)1e+4,)p2)=)1e+4,)p12)=)1e+5,) region)=)200000)$ ##))1.02e+06))2.43e+05))6.80e+04))1.51e+02))9.84e+01)$

More information

Classifying human promoters by occupancy patterns identifies recurring sequence elements, combinatorial binding, and spatial interactions

Classifying human promoters by occupancy patterns identifies recurring sequence elements, combinatorial binding, and spatial interactions Yang and Vingron BMC Biology (2018) 16:138 https://doi.org/10.1186/s12915-018-0585-5 RESEARCH ARTICLE Open Access Classifying human promoters by occupancy patterns identifies recurring sequence elements,

More information

Huijuan Feng, Shining Ma,Chao Ye & Zhixing Feng

Huijuan Feng, Shining Ma,Chao Ye & Zhixing Feng Huijuan Feng, Shining Ma,Chao Ye & Zhixing Feng Background-Author introduction Research interest: Methods for gene mapping of complex traits Inference of population structure from genetic data Genome variation

More information

The ENCODE Encyclopedia. Zhiping Weng University of Massachusetts Medical School ENCODE 2016: Research Applications and Users Meeting June 8, 2016

The ENCODE Encyclopedia. Zhiping Weng University of Massachusetts Medical School ENCODE 2016: Research Applications and Users Meeting June 8, 2016 The ENCODE Encyclopedia Zhiping Weng University of Massachusetts Medical School ENCODE 2016: Research Applications and Users Meeting June 8, 2016 The Encyclopedia Of DNA Elements Consortium Goals: - Catalog

More information

DOUBLE-STRAND DNA BREAK PREDICTION USING EPIGENOME MARKS AT KILOBASE RESOLUTION

DOUBLE-STRAND DNA BREAK PREDICTION USING EPIGENOME MARKS AT KILOBASE RESOLUTION DOUBLE-STRAND DNA BREAK PREDICTION USING EPIGENOME MARKS AT KILOBASE RESOLUTION Raphaël MOURAD, Assist. Prof. Centre de Biologie Intégrative Université Paul Sabatier, Toulouse III INTRODUCTION Double-strand

More information

Synthetic Biology. Sustainable Energy. Therapeutics Industrial Enzymes. Agriculture. Accelerating Discoveries, Expanding Possibilities. Design.

Synthetic Biology. Sustainable Energy. Therapeutics Industrial Enzymes. Agriculture. Accelerating Discoveries, Expanding Possibilities. Design. Synthetic Biology Accelerating Discoveries, Expanding Possibilities Sustainable Energy Therapeutics Industrial Enzymes Agriculture Design Build Generate Solutions to Advance Synthetic Biology Research

More information

Molecular Cell Biology - Problem Drill 06: Genes and Chromosomes

Molecular Cell Biology - Problem Drill 06: Genes and Chromosomes Molecular Cell Biology - Problem Drill 06: Genes and Chromosomes Question No. 1 of 10 1. Which of the following statements about genes is correct? Question #1 (A) Genes carry the information for protein

More information

Nature Genetics: doi: /ng Supplementary Figure 1. ChIP-seq genome browser views of BRM occupancy at previously identified BRM targets.

Nature Genetics: doi: /ng Supplementary Figure 1. ChIP-seq genome browser views of BRM occupancy at previously identified BRM targets. Supplementary Figure 1 ChIP-seq genome browser views of BRM occupancy at previously identified BRM targets. Gene structures are shown underneath each panel. Supplementary Figure 2 pref6::ref6-gfp complements

More information

3/15/2016. Genome = Genes + Gene Regulation. Personal Transcription Factor. Binding Site Mutations Point to. Personal Medical Histories

3/15/2016. Genome = Genes + Gene Regulation. Personal Transcription Factor. Binding Site Mutations Point to. Personal Medical Histories Personal Transcription Factor GGTGCCAGGGAAAGGGCAGGAGGTGAGTGCTGGGAGGCAGCTGAGGTCAACTTCTTTTGAACTTCCACGTGGTATTTACTCAGAGCAATTGGTGCCAGAG GCTCAGGGCCCTGGAGTATAAAGCAGAATGTCTGCTCTCTGTGCCCAGACGTGAGCAGGTGAGCAGCTGGGGCGAAAGACCTGTTGGAGGCTATGAATGC

More information

Predicting tissue specific cis-regulatory modules in the human genome using pairs of co-occurring motifs

Predicting tissue specific cis-regulatory modules in the human genome using pairs of co-occurring motifs BMC Bioinformatics This Provisional PDF corresponds to the article as it appeared upon acceptance. Fully formatted PDF and full text (HTML) versions will be made available soon. Predicting tissue specific

More information

A Brief History. Bootstrapping. Bagging. Boosting (Schapire 1989) Adaboost (Schapire 1995)

A Brief History. Bootstrapping. Bagging. Boosting (Schapire 1989) Adaboost (Schapire 1995) A Brief History Bootstrapping Bagging Boosting (Schapire 1989) Adaboost (Schapire 1995) What s So Good About Adaboost Improves classification accuracy Can be used with many different classifiers Commonly

More information

Structural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction

Structural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction Structural Bioinformatics (C3210) Conformational Analysis Protein Folding Protein Structure Prediction Conformational Analysis 2 Conformational Analysis Properties of molecules depend on their three-dimensional

More information

Introduction to Bioinformatics Online Course: IBT

Introduction to Bioinformatics Online Course: IBT Introduction to Bioinformatics Online Course: IBT Multiple Sequence Alignment Building Multiple Sequence Alignment Lec5: Interpreting your MSA Using Logos Using Logos - Logos are a terrific way to generate

More information

Using GREAT.stanford.edu to interpret cis-regulatory rich datasets including ChIP-Seq, Epi markers, GWAS etc.

Using GREAT.stanford.edu to interpret cis-regulatory rich datasets including ChIP-Seq, Epi markers, GWAS etc. Using GREAT.stanford.edu to interpret cis-regulatory rich datasets including ChIP-Seq, Epi markers, GWAS etc. Gill Bejerano Dept. of Developmental Biology & Dept. of Computer Science Stanford University

More information

Identifying Cooperativity Among Transcription Factors Controlling the Cell Cycle In Yeast

Identifying Cooperativity Among Transcription Factors Controlling the Cell Cycle In Yeast Identifying Cooperativity Among Transcription Factors Controlling the Cell Cycle In Yeast Nilanjana Banerjee 1,2 and Michael Q. Zhang 1 Nucleic Acids Research, 2003, Vol.31, No.23 1 Cold Spring Harbor

More information

Gene Regulation Biology

Gene Regulation Biology Gene Regulation Biology Potential and Limitations of Cell Re-programming in Cancer Research Eric Blanc KCL April 13, 2010 Eric Blanc (KCL) Gene Regulation Biology April 13, 2010 1 / 21 Outline 1 The Central

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION SUPPLEMENRY INFORMION doi:.38/nature In vivo nucleosome mapping D4+ Lymphocytes radient-based and I-bead cell sorting D8+ Lymphocytes ranulocytes Lyse the cells Isolate and sequence mononucleosome cores

More information

Introduction to BIOINFORMATICS

Introduction to BIOINFORMATICS COURSE OF BIOINFORMATICS a.a. 2016-2017 Introduction to BIOINFORMATICS What is Bioinformatics? (I) The sinergy between biology and informatics What is Bioinformatics? (II) From: http://www.bioteach.ubc.ca/bioinfo2010/

More information

Alexander Statnikov, Ph.D.

Alexander Statnikov, Ph.D. Alexander Statnikov, Ph.D. Director, Computational Causal Discovery Laboratory Benchmarking Director, Best Practices Integrative Informatics Consultation Service Assistant Professor, Department of Medicine,

More information

Illumina s Suite of Targeted Resequencing Solutions

Illumina s Suite of Targeted Resequencing Solutions Illumina s Suite of Targeted Resequencing Solutions Colin Baron Sr. Product Manager Sequencing Applications 2011 Illumina, Inc. All rights reserved. Illumina, illuminadx, Solexa, Making Sense Out of Life,

More information

Biochemistry 674, Fall, 1995: Nucleic Acids Exam II: November 16, 1995 Your Name Here: PCR A C

Biochemistry 674, Fall, 1995: Nucleic Acids Exam II: November 16, 1995 Your Name Here: PCR A C CHM 674 Exam II, 1995 1 iochemistry 674, Fall, 1995: Nucleic cids Prof. Jason Kahn Exam II: November 16, 1995 Your Name Here: This exam has five questions worth 20 points each. nswer all five. You do not

More information

Explicit DNase sequence bias modeling enables high-resolution transcription factor footprint detection

Explicit DNase sequence bias modeling enables high-resolution transcription factor footprint detection Published online 7 October 2014 Nucleic Acids Research, 2014, Vol. 42, No. 19 11865 11878 doi: 10.1093/nar/gku810 Explicit DNase sequence bias modeling enables high-resolution transcription factor footprint

More information

Genome 541! Unit 4, lecture 3! Genomics assays

Genome 541! Unit 4, lecture 3! Genomics assays Genome 541! Unit 4, lecture 3! Genomics assays I d like a bit more background on the assays and bioterminology.!! The phantom peak concept was confusing.! I didn t quite understand what the phantom peak

More information

resequencing storage SNP ncrna metagenomics private trio de novo exome ncrna RNA DNA bioinformatics RNA-seq comparative genomics

resequencing storage SNP ncrna metagenomics private trio de novo exome ncrna RNA DNA bioinformatics RNA-seq comparative genomics RNA Sequencing T TM variation genetics validation SNP ncrna metagenomics private trio de novo exome mendelian ChIP-seq RNA DNA bioinformatics custom target high-throughput resequencing storage ncrna comparative

More information

Convolutional Kitchen Sinks for Transcription Factor Binding Site Prediction

Convolutional Kitchen Sinks for Transcription Factor Binding Site Prediction Convolutional Kitchen Sinks for Transcription Factor Binding Site Prediction Alyssa Morrow*, Vaishaal Shankar* Anthony Joseph, Benjamin Recht, Nir Yosef Transcription Factor A protein that binds to DNA

More information

Chapter 13 - Regulation of Gene Expression

Chapter 13 - Regulation of Gene Expression Chapter 13 - Regulation of Gene Expression 1. Describe the typical components of an operon in an E. coli (prokaryotic) cell. (p. 238-239) a. regulator gene - b. promoter - c. operator - d. structural gene

More information

File S1. Program overview and features

File S1. Program overview and features File S1 Program overview and features Query list filtering. Further filtering may be applied through user selected query lists (Figure. 2B, Table S3) that restrict the results and/or report specifically

More information

Next-generation sequencing technologies

Next-generation sequencing technologies Next-generation sequencing technologies Illumina: Summary https://www.youtube.com/watch?v=fcd6b5hraz8 Illumina platforms: Benchtop sequencers https://www.illumina.com/systems/sequencing-platforms.html

More information

Long-range gene regulation

Long-range gene regulation Long-range gene regulation Short Course in Medical Genetics Melbourne, June 2011 Question Why should long-range regulation of gene expression be of interest to Clinical Scientists and Pathologists interested

More information

Nature Biotechnology: doi: /nbt Supplementary Figure 1

Nature Biotechnology: doi: /nbt Supplementary Figure 1 Supplementary Figure 1 An extended version of Figure 2a, depicting multi-model training and reverse-complement mode To use the GPU s full computational power, we train several independent models in parallel

More information