ChIP. November 21, 2017

1 ChIP November 21, 2017

2 functional signals: is DNA enough? what is the smallest number of letters used by a written language?

3 DNA is only one part of the functional genome
DNA is heavily bound by proteins, in any cell:
- nucleosomes
- transcription factors
- transcriptional repressors
- scaffolding
Proteins can bind to specific DNA sequences.
Some proteins have fairly nonspecific binding (e.g. nucleosomes).

4 chromatin immunoprecipitation (ChIP) workflow:
1) crosslink DNA and proteins
2) shear or digest DNA into fragments
3) use tagged antibody to isolate protein of interest
4) reverse protein-DNA crosslinks
5) sequence DNA and align to a reference genome, or hybridize to a microarray

5 cross-link, then sonicate or digest
- add antibody: immunoprecipitate, reverse cross-links, purify DNA
- no antibody: total input

6 after aligning to a reference genome

7 pathognomonic ChIP peak shape forward strand sequenced from this side only reverse strand sequenced from this side only

8 after aligning to a reference genome: IGV, with plus strand in pink, minus strand in blue

9 characteristic fragments Kharchenko et al, Nat Biotech 2008

10 MACS (Model-based Analysis of ChIP-Seq)
two issues in peak calling:
- resolution (how finely can the peak be defined): ChIP-seq tags only come from the ends of a fragment, so the exact position of a bound protein must be inferred to a resolution smaller than the fragment size
- detection above background noise: because of sequencing biases, chromatin structure, copy number variation, and mapping biases, the baseline isn't flat

11 MACS (Model-based Analysis of ChIP-Seq)

12 MACS (Model-based Analysis of ChIP-Seq)
assume there is no strand bias (tags are no more likely to come from one strand than the other)
then sample 1000 regions with more than mfold enrichment relative to a random distribution, and compare Watson vs. Crick peak positions

13 MACS (Model-based Analysis of ChIP-Seq)

14 MACS (Model-based Analysis of ChIP-Seq)
cross-correlation of the signals from the two strands is highest when the shift distance equals the distance d between the Watson and Crick peaks, which reflects the size of the DNA fragments surrounding the binding site
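The strand-shift idea on this slide can be illustrated with a small sketch. It is not the MACS model itself: it computes a raw, unnormalized cross-correlation between plus-strand and minus-strand tag positions and reports the lag with the highest score. The function name `best_shift`, the `max_shift` parameter, and the toy tag positions are assumptions made for illustration.

```python
from collections import Counter

def best_shift(plus_starts, minus_starts, max_shift=300):
    """Return the lag (bp) at which plus-strand tags, moved downstream,
    overlap minus-strand tags the most (hypothetical helper)."""
    plus_counts = Counter(plus_starts)
    minus_counts = Counter(minus_starts)
    best, best_score = 0, -1
    for shift in range(max_shift + 1):
        # Unnormalized cross-correlation at this lag:
        # sum over positions x of plus(x) * minus(x + shift).
        score = sum(n * minus_counts.get(pos + shift, 0)
                    for pos, n in plus_counts.items())
        if score > best_score:
            best, best_score = shift, score
    return best

# Toy data (assumed): every minus-strand tag lies 150 bp downstream
# of a plus-strand tag, as expected around a punctate binding site.
plus = [1000, 1005, 1010, 5000, 5003]
minus = [p + 150 for p in plus]
d = best_shift(plus, minus)
print("estimated shift d:", d)
```

Real data would need binned or smoothed coverage and a normalized correlation; exact-position matching is used here only to keep the sketch short.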

15 MACS (Model-based Analysis of ChIP-Seq)
options for removing background noise:
1) use a Poisson distribution (λbg) to define a cutoff number of tags
2) use the total input to estimate local background
MACS uses a dynamic Poisson parameter, λlocal, defined separately for each candidate peak as:
λlocal = max(λbg, [λ1k,] λ5k, λ10k)
where λ1k, λ5k and λ10k are λ estimated from the 1 kb, 5 kb or 10 kb window centered at the peak location in the control sample
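A minimal sketch of the dynamic Poisson background, under stated assumptions: control tag counts in each window are rescaled to the width of the candidate peak before taking the maximum, and the upper-tail Poisson probability is computed directly. The helper names (`poisson_sf`, `lambda_local`), the rescaling step, and all numbers are illustrative, not taken from the MACS source code.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    term = math.exp(-lam)      # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term            # add P(X = i)
        term *= lam / (i + 1)  # advance to P(X = i + 1)
    return max(0.0, 1.0 - cdf)

def lambda_local(lam_bg, control_counts, peak_width):
    """control_counts maps window size in bp (e.g. 1000, 5000, 10000) to the
    number of control tags in that window centered on the candidate peak."""
    # Assumption: rescale each window count to a peak-sized region.
    scaled = [count * peak_width / window
              for window, count in control_counts.items()]
    return max([lam_bg] + scaled)

# Hypothetical candidate peak: 200 bp wide with 20 ChIP tags.
lam = lambda_local(lam_bg=2.0,
                   control_counts={1000: 30, 5000: 60, 10000: 80},
                   peak_width=200)
p_value = poisson_sf(20, lam)
print(f"lambda_local = {lam}, p = {p_value:.2e}")
```

Taking the maximum means a locally elevated control signal raises the bar for calling a peak, which is the point of using the total input.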

16 λ1k, λ5k, λ10k

17 common ChIP problems
- size range of the binding phenomenon is unknown (e.g. some repressive histone marks can occupy many kb of DNA)
- sequencing depth in control and IP samples influences peak finding

18 histoneHMM: expression data indicate that the huge repressive peak is real!

19 histoneHMM
Classifies data into four states:
- modified in both samples
- unmodified in both samples
- only sample A is modified
- only sample B is modified
where the read counts are presumed to come from a mixture of background & signal
what are the observed and hidden states?

20 what next? after finding ChIP peaks...
- look for motifs, to figure out the binding site
- correlate binding with structural or functional assays (gene expression, chromatin conformation)
- use ChIP peaks for different marks to profile genes

21

22 Meta-clustering identifies combinatorial subprofiles for chromatin marks.

23 Meta-clustering identifies combinatorial subprofiles for chromatin marks.

24

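25 viewing and describing motifs: PWM (position weight matrix)
aligned sites: ACCGCTG AGCGCTG TCCGCAG TCCCGTG ACCGCTG AGCGCTG AGCGCTG TCCGCAG
counts per position, tallied from the sites above:
pos.  A  C  G  T
1     5  0  0  3
2     0  5  3  0
3     0  8  0  0
4     0  1  7  0
5     0  7  1  0
6     2  0  0  6
7     0  0  8  0
consensus sequence: ACCGCTG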
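The count matrix and consensus for the eight sites listed on this slide can be tallied directly. The sketch below assumes equal-length, ungapped sites over the DNA alphabet; the function names are illustrative.

```python
SITES = ["ACCGCTG", "AGCGCTG", "TCCGCAG", "TCCCGTG",
         "ACCGCTG", "AGCGCTG", "AGCGCTG", "TCCGCAG"]
ALPHABET = "ACGT"

def count_matrix(sites):
    """One dict of base counts per motif position."""
    counts = [{b: 0 for b in ALPHABET} for _ in range(len(sites[0]))]
    for site in sites:
        for pos, base in enumerate(site):
            counts[pos][base] += 1
    return counts

def frequency_matrix(counts):
    """Convert counts to per-position base frequencies."""
    return [{b: col[b] / sum(col.values()) for b in ALPHABET}
            for col in counts]

def consensus(counts):
    """Most frequent base at each position."""
    return "".join(max(ALPHABET, key=lambda b: col[b]) for col in counts)

counts = count_matrix(SITES)
freqs = frequency_matrix(counts)
print("pos  " + "  ".join(ALPHABET))
for i, col in enumerate(counts, start=1):
    print(f"{i:<4} " + "  ".join(str(col[b]) for b in ALPHABET))
print("consensus:", consensus(counts))
```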

26 viewing and describing motifs [plot of the matrix above: probability vs. position; consensus sequence: ACCGCTG]

27 viewing and describing motifs [plot of test_sequences: information content vs. position; consensus sequence: ACCGCTG]

28 viewing and describing motifs [seqLogo plot of test_sequences: information content vs. position]
Information content: measure of tolerance to substitutions.
IC of 2 means only one nucleotide is allowed at that position.
IC of 0 means that all nucleotides occur with equal frequency at that position.

29 seqLogo
information content for position w in the motif, where J is the size of the alphabet for the motif (4 for DNA, 20 for protein):
IC(w) = log2(J) - entropy(w)
AAAAAAAAAAA has zero entropy
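As a worked example of the formula, the sketch below computes IC for three columns, taking entropy(w) to be the Shannon entropy in bits of the base frequencies at position w (an assumption consistent with the log2 in the formula). No small-sample correction is applied, and the function names are illustrative.

```python
import math

def entropy(freqs):
    """Shannon entropy in bits; zero-frequency symbols contribute nothing."""
    return -sum(p * math.log2(p) for p in freqs if p > 0)

def information_content(freqs, alphabet_size=4):
    """IC(w) = log2(J) - entropy(w) for one motif position."""
    return math.log2(alphabet_size) - entropy(freqs)

ic_fixed = information_content([1.0, 0.0, 0.0, 0.0])      # only one base allowed
ic_uniform = information_content([0.25, 0.25, 0.25, 0.25])  # all bases equally likely
ic_mixed = information_content([5 / 8, 0.0, 0.0, 3 / 8])    # 5 A and 3 T out of 8 sites
print(ic_fixed, ic_uniform, round(ic_mixed, 3))
```

The first two cases reproduce the limits stated for sequence logos: 2 bits when a single nucleotide is allowed and 0 bits when all four are equally frequent.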

30 short side trip into entropy
Measure of how close to uniform the distribution is (~unpredictability)...
Like variance in a way, but not the same thing.
Entropy of random DNA (Wootton and Federhen definition), for two example sequences:
ACAGGTTTCT
AAAAAAAAAA

31 Entropy: most useful when calculated in windows
ACTGACTGATCGACGTACGTACGTACGTACGT
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

32 Entropy
Computing in windows is critical to assessing landscape
ACTGACTGAAAAACGTACGTATTTCCCGTACGT
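A minimal sketch of window-based entropy, using plain Shannon entropy in bits of the base composition. This is an assumption: the Wootton and Federhen measure named earlier may be defined or normalized differently. The window and step sizes are arbitrary choices for illustration.

```python
import math
from collections import Counter

def shannon_entropy(seq):
    """Entropy in bits of the base composition of seq."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())

def windowed_entropy(seq, window=8, step=4):
    """List of (window start, entropy) along the sequence."""
    return [(start, shannon_entropy(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]

seq = "ACTGACTGAAAAACGTACGTATTTCCCGTACGT"
profile = windowed_entropy(seq)
for start, h in profile:
    print(f"{start:>3} {seq[start:start + 8]} {h:.2f}")
```

A single whole-sequence value would hide the low-complexity stretch (the run of A) that the per-window profile exposes.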

33 motif finding workflow
get a bunch of sequences
- predicted
- experimental
find candidate motifs, de novo or starting from a known motif
- word-based algorithms
- probabilistic algorithms
test whether motifs are functional
- chromatin binding assay
- reporter assay
- phylogeny
- gene set analysis

34 sequence sources
predicted
- transcription factors typically bind within 1 kb of a promoter, so search in those intervals for motifs
- genes in a pathway are often regulated by the same transcription factors, so their upstream sequences may have the same motifs
experimental
- bind a transcription factor to DNA and digest all DNA that is not bound (DNA footprinting)
- collect sequences near or bound by particular proteins (ChIP-seq)
as a control, include sequences not known to be bound by the factor or not known to have the effect

35 word-based algorithms
simplest approach: assume that the binding site is n bp. Count occurrences of all n-bp sequences in the dataset and compare to the expected distribution:
AAAAAA AAAAAC AAAAAG ... CGCCCT CGCCGA ... TTTTTT
obs: exp:
The expected distribution is based on the GC/AT content of the sequences.
Calculate a z-score for the observed frequency of a motif.
If it is significantly overrepresented, look at all 1-base edits, e.g. for CGCCCT: AGCCCT, GGCCCT, TGCCCT, ...
Are these motifs overrepresented, as a group?
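The word-counting approach can be sketched as follows. The expected count of each word is taken from the base composition of the input sequences, and the z-score uses a binomial approximation for the spread; both are assumptions, since the slide does not specify the null model in detail. The toy sequences and function names are made up for illustration.

```python
import math
from collections import Counter

def kmer_zscores(sequences, k):
    """z-score of each observed k-mer against a base-composition null."""
    base_counts = Counter("".join(sequences))
    total = sum(base_counts.values())
    base_freq = {b: c / total for b, c in base_counts.items()}
    observed = Counter(seq[i:i + k] for seq in sequences
                       for i in range(len(seq) - k + 1))
    n_words = sum(observed.values())
    scores = {}
    for word, obs in observed.items():
        p = math.prod(base_freq[b] for b in word)  # probability under the null
        sd = math.sqrt(n_words * p * (1 - p))      # binomial standard deviation
        if sd > 0:
            scores[word] = (obs - n_words * p) / sd
    return scores

def one_base_edits(word, alphabet="ACGT"):
    """All words that differ from `word` at exactly one position."""
    return [word[:i] + b + word[i + 1:]
            for i in range(len(word)) for b in alphabet if b != word[i]]

# Toy input (assumed): CGCCCT planted once in each AT-rich sequence.
sequences = ["ATATCGCCCTATAT", "TTACGCCCTAATTA", "ATTACGCCCTTATA"]
scores = kmer_zscores(sequences, k=6)
top_word = max(scores, key=scores.get)
neighbors = one_base_edits(top_word)
print(top_word, round(scores[top_word], 1), len(neighbors))
```

Summing observed and expected counts over `neighbors` would give the group-level test for 1-base edits described above.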

36 PWM-based algorithms
use publicly available position weight matrices, look for
- range of scores of alignments, then scores above the distribution
- multiple matches in one sequence
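A hedged sketch of PWM scanning: build a log-odds matrix from aligned sites, slide it along a sequence, and keep windows that score above a threshold. The pseudocount, the uniform background of 0.25, the threshold, and the test sequence are all assumptions for illustration. A matrix from a public database would be loaded rather than built from sites.

```python
import math

SITES = ["ACCGCTG", "AGCGCTG", "TCCGCAG", "TCCCGTG",
         "ACCGCTG", "AGCGCTG", "AGCGCTG", "TCCGCAG"]

def log_odds_pwm(sites, pseudocount=0.5, background=0.25):
    """Per-position log2(frequency / background), with a pseudocount."""
    pwm = []
    for pos in range(len(sites[0])):
        column = [s[pos] for s in sites]
        denom = len(sites) + 4 * pseudocount
        pwm.append({b: math.log2((column.count(b) + pseudocount) / denom / background)
                    for b in "ACGT"})
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[base] for col, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits

pwm = log_odds_pwm(SITES)
hits = scan("TTTACCGCTGAAAGCGCTGTT", pwm, threshold=5.0)
for hit in hits:
    print(hit)
```

Scanning shuffled or unbound control sequences with the same matrix gives the score distribution against which a threshold can be chosen, and more than one hit in a sequence corresponds to the "multiple matches" case.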

37 probabilistic algorithms: de novo motif finding
simplest: from the set of sequences, find the n-bp motif with the highest information content (greedy approach)
look for similarities to the motif in the other sequences; usually require that every instance of the motif has at least one site in common with the first motif
sometimes useful to allow one and only one match of the motif per sequence, to minimize bad matches and overweighting by long sequences

38 probabilistic algorithms: de novo motif finding
Gibbs sampler: get the best motif from a set of sequences
1) select random short subsequences from the set; call these the patterns
2) choose another short subsequence at random; its score is p(generated by the patterns) / p(generated by background)
add high-scoring subsequences to the patterns
By starting with a known pattern specification, this can not only find other instances of the pattern but also improve the pattern specification.
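The sampler can be sketched with a common one-site-per-sequence variant, which may differ in detail from the scheme outlined on the slide. One sequence is held out, a profile is built from the current sites in the others, and a new site is drawn in the held-out sequence with probability proportional to p(profile) / p(background). The function name, parameters, pseudocount, and toy sequences are assumptions for illustration.

```python
import random

def gibbs_motif_search(seqs, width, iterations=200, pseudocount=1.0, seed=0):
    """Return one sampled motif start per sequence.
    Assumes sequences contain only A, C, G, T and are at least `width` long."""
    rng = random.Random(seed)
    alphabet = "ACGT"
    joined = "".join(seqs)
    background = {b: (joined.count(b) + 1) / (len(joined) + 4) for b in alphabet}
    # 1) Random initial site in every sequence: these are "the patterns".
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for held_out, seq in enumerate(seqs):
            # Profile from the current sites in all other sequences.
            others = [s[st:st + width]
                      for i, (s, st) in enumerate(zip(seqs, starts)) if i != held_out]
            profile = []
            for pos in range(width):
                column = [site[pos] for site in others]
                profile.append({b: (column.count(b) + pseudocount)
                                / (len(others) + 4 * pseudocount) for b in alphabet})
            # 2) Score every candidate site: p(patterns) / p(background).
            weights = []
            for st in range(len(seq) - width + 1):
                ratio = 1.0
                for pos, base in enumerate(seq[st:st + width]):
                    ratio *= profile[pos][base] / background[base]
                weights.append(ratio)
            # Sample the new site in proportion to its score.
            starts[held_out] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts

# Toy sequences (assumed), each containing a copy or near-copy of ACCGCTG.
seqs = ["TTATACCGCTGATTAT", "ATACCGCTGTTATATA", "TATATTAGCGCTGATA",
        "AATTATATACCGCTGA", "TACCGCAGTTATATAT"]
starts = gibbs_motif_search(seqs, width=7)
for s, st in zip(seqs, starts):
    print(st, s[st:st + 7])
```

Because sites are sampled rather than chosen greedily, a run can leave a local optimum, but the result still depends on the seed; in practice several restarts are compared.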

39 extensions to Gibbs sampling
- explicitly account for AT% of DNA from the organism
- consider both strands of DNA
- mask motifs that have been found so that other motifs can be uncovered
- use specific models to look for dyad sites and palindromes
- add random jumps to avoid local maxima
- add structure-based constraints
- account for motif families
- allow gapped motifs

40 testing motifs
- use a test set, if available
- ChIP in another cell type or another organism
- reporter assay
- phylogenetic comparisons
- gene set analysis

41 reporter assay

42 phylogeny

43 gene set/pathway analysis
looking for enhancers!
- no well-defined location
- no well-defined binding sites

44

45

46

47

48 resources JASPAR (free) and TRANSFAC (licensed) databases Both are collections of experimentally validated transcription factor binding sites and PWMs.

49 MEME older, well-used program, now part of a suite of motif finding tools uses Expectation Maximization (Multiple EM for Motif Elicitation)

50

51 MEME older, well-used program, now part of a suite of motif finding tools uses Expectation Maximization (Multiple EM for Motif Elicitation)

52 input: promoter sequences for all yeast genes (~6000)
>chr1.fa
TTAATGCTTTTGATAAAATGTATATAAAGGCTGTCGTAATGTGCAGTAGTAAGGACCTGA
CTGTGTTTGTGGTTCTCTTCATTCTTGAACCTTGTCATTGGTAAAAGACCATCGTCAAGA
TATTTGAAAGTTAATAGACAGTTAACAATAATAACAACAGCAATAAGAATAACAATAAAT
TCATTGAACATATTTCAGAAT
>chr1.fa
TGTTTCTCTTGATATGATAATAGGTGGAAACGTAGAAAAAAAAATCGACATATAAAAGTG
GGGCAGATACTTCGTGTGACAATGGCCAATTCAAGCCCTTTGGGCAGATGTTGCCCTTCT
TCTTTCTTAAAAAGTCTTAGTACGATTGACCAAGTCAGAAAAAAAAAAAAAAAGGAACTA
AAAAAAGTTTTAATTAATTAT
>chr1.fa
AATAATATTTGGGGCCCCTCGCGGCTCATTTGTAGTATCTAAGATTATGTATTTTCTTTT
ATAATATTTGTTGTTATGAAACAGACAGAAGTAAGTTTCTGCGACTATATTATTTTTTTT
TTTCTTCTTTTTTTTTCCTTTATTCAACTTGGCGATGAGCTGAAAATTTTTTTGGTTAAG
GACCCTTTAGAAGTATTGAAT
>chr1.fa
TTTTTTATATATCTGGATGTATACTATTATTGAAAAACTTCATTAATAGTTACAACTTTT
TCAATATCAAGTTGATTAAGAAAAAGAAAATTATTATGGGTTAGCTGAAAACCGTGTGAT
GCATGTCGTTTAAGGATTGTGTAAAAAAGTGAACGGCAACGCATTTCTAATATAGATAAC
GGCCACACAAAGTAGTACTAT

53 MEME older, well-used program, now part of a suite of motif finding tools uses Expectation Maximization (Multiple EM for Motif Elicitation)

54

55 ChIPMunk
May 2012 release optimized for lots of very long sequences; searches for the motif with the highest information content, then aligns to motifs with high information content and constructs a PWM from that alignment. A series of PWMs is tested within ChIP-seq peaks, taking peak shape into consideration.

56

57

58

59 other algorithms: main variations
- add background k-mer frequency in genome
- add negative set information
- for the training set, weight the sequences
- account for related motifs from other TF family members
- rely on known signals (TRANSFAC & JASPAR)
- look for low-probability clusters of signals
- look for repeated clusters in a set of sequences

60 useful sites (older programs) (newer programs & databases)

ChIP-seq analysis 2/28/2018

ChIP-seq analysis 2/28/2018 ChIP-seq analysis 2/28/2018 Acknowledgements Much of the content of this lecture is from: Furey (2012) ChIP-seq and beyond Park (2009) ChIP-seq advantages + challenges Landt et al. (2012) ChIP-seq guidelines

More information

ChIP-seq data analysis with Chipster. Eija Korpelainen CSC IT Center for Science, Finland

ChIP-seq data analysis with Chipster. Eija Korpelainen CSC IT Center for Science, Finland ChIP-seq data analysis with Chipster Eija Korpelainen CSC IT Center for Science, Finland chipster@csc.fi What will I learn? Short introduction to ChIP-seq Analyzing ChIP-seq data Central concepts Analysis

More information

Charles Girardot, Furlong Lab. MACS, CisGenome, SISSRs and other peak calling algorithms: differences and practical use

Charles Girardot, Furlong Lab. MACS, CisGenome, SISSRs and other peak calling algorithms: differences and practical use Charles Girardot, Furlong Lab MACS, CisGenome, SISSRs and other peak calling algorithms: differences and practical use ChIP-Seq signal properties Only 5 ends of ChIPed fragments are sequenced Shifted read

More information

ChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday 15 June 2015

ChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday 15 June 2015 ChIP-Seq Data Analysis J Fass UCD Genome Center Bioinformatics Core Wednesday 15 June 2015 What s the Question? Where do Transcription Factors (TFs) bind genomic DNA 1? (Where do other things bind DNA

More information

ChIP-Seq Tools. J Fass UCD Genome Center Bioinformatics Core Wednesday September 16, 2015

ChIP-Seq Tools. J Fass UCD Genome Center Bioinformatics Core Wednesday September 16, 2015 ChIP-Seq Tools J Fass UCD Genome Center Bioinformatics Core Wednesday September 16, 2015 What s the Question? Where do Transcription Factors (TFs) bind genomic DNA 1? (Where do other things bind DNA or

More information

ChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday December 17, 2014

ChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday December 17, 2014 ChIP-Seq Data Analysis J Fass UCD Genome Center Bioinformatics Core Wednesday December 17, 2014 What s the Question? Where do Transcription Factors (TFs) bind genomic DNA 1? (Where do other things bind

More information

Introduction to ChIP Seq data analyses. Acknowledgement: slides taken from Dr. H

Introduction to ChIP Seq data analyses. Acknowledgement: slides taken from Dr. H Introduction to ChIP Seq data analyses Acknowledgement: slides taken from Dr. H Wu @Emory ChIP seq: Chromatin ImmunoPrecipitation it ti + sequencing Same biological motivation as ChIP chip: measure specific

More information

Motif Finding: Summary of Approaches. ECS 234, Filkov

Motif Finding: Summary of Approaches. ECS 234, Filkov Motif Finding: Summary of Approaches Lecture Outline Flashback: Gene regulation, the cis-region, and tying function to sequence Motivation Representation simple motifs weight matrices Problem: Finding

More information

Gene expression analysis. Biosciences 741: Genomics Fall, 2013 Week 5. Gene expression analysis

Gene expression analysis. Biosciences 741: Genomics Fall, 2013 Week 5. Gene expression analysis Gene expression analysis Biosciences 741: Genomics Fall, 2013 Week 5 Gene expression analysis From EST clusters to spotted cdna microarrays Long vs. short oligonucleotide microarrays vs. RT-PCR Methods

More information

Discovering gene regulatory control using ChIP-chip and ChIP-seq. An introduction to gene regulatory control, concepts and methodologies

Discovering gene regulatory control using ChIP-chip and ChIP-seq. An introduction to gene regulatory control, concepts and methodologies Discovering gene regulatory control using ChIP-chip and ChIP-seq An introduction to gene regulatory control, concepts and methodologies Ian Simpson ian.simpson@.ed.ac.uk bit.ly/bio2_2012 The Central Dogma

More information

Discovering gene regulatory control using ChIP-chip and ChIP-seq. Part 1. An introduction to gene regulatory control, concepts and methodologies

Discovering gene regulatory control using ChIP-chip and ChIP-seq. Part 1. An introduction to gene regulatory control, concepts and methodologies Discovering gene regulatory control using ChIP-chip and ChIP-seq Part 1 An introduction to gene regulatory control, concepts and methodologies Ian Simpson ian.simpson@.ed.ac.uk http://bit.ly/bio2links

More information

Computational Analysis of Ultra-high-throughput sequencing data: ChIP-Seq

Computational Analysis of Ultra-high-throughput sequencing data: ChIP-Seq Computational Analysis of Ultra-high-throughput sequencing data: ChIP-Seq Philipp Bucher Wednesday January 21, 2009 SIB graduate school course EPFL, Lausanne Data flow in ChIP-Seq data analysis Level 1:

More information

Introduction to genome biology

Introduction to genome biology Introduction to genome biology Lisa Stubbs Deep transcritpomes for traditional model species from ENCODE (and modencode) Deep RNA-seq and chromatin analysis on 147 human cell types, as well as tissues,

More information

Caroline Townsend December 2012 Biochem 218 A critical review of ChIP-seq enrichment analysis tools

Caroline Townsend December 2012 Biochem 218 A critical review of ChIP-seq enrichment analysis tools Caroline Townsend December 2012 Biochem 218 A critical review of ChIP-seq enrichment analysis tools Introduction Transcriptional regulation, chromatin states, and genome stability pathways are largely

More information

Characterizing DNA binding sites high throughput approaches Biol4230 Tues, April 24, 2018 Bill Pearson Pinn 6-057

Characterizing DNA binding sites high throughput approaches Biol4230 Tues, April 24, 2018 Bill Pearson Pinn 6-057 Characterizing DNA binding sites high throughput approaches Biol4230 Tues, April 24, 2018 Bill Pearson wrp@virginia.edu 4-2818 Pinn 6-057 Reviewing sites: affinity and specificity representation binding

More information

Figure 7.1: PWM evolution: The sequence affinity of TFBSs has evolved from single sequences, to PWMs, to larger and larger databases of PWMs.

Figure 7.1: PWM evolution: The sequence affinity of TFBSs has evolved from single sequences, to PWMs, to larger and larger databases of PWMs. Chapter 7 Discussion This thesis presents dry and wet lab techniques to elucidate the involvement of transcription factors (TFs) in the regulation of the cell cycle and myogenesis. However, the techniques

More information

Introduction to genome biology

Introduction to genome biology Introduction to genome biology Lisa Stubbs We ve found most genes; but what about the rest of the genome? Genome size* 12 Mb 95 Mb 170 Mb 1500 Mb 2700 Mb 3200 Mb #coding genes ~7000 ~20000 ~14000 ~26000

More information

Next- genera*on Sequencing. Lecture 13

Next- genera*on Sequencing. Lecture 13 Next- genera*on Sequencing Lecture 13 ChIP- seq Applica*ons iden%fy sequence varia%ons DNA- seq Iden%fy Pathogens RNA- seq Kahvejian et al, 2008 Protein-DNA interaction DNA is the informa*on carrier of

More information

ChIP-seq/Functional Genomics/Epigenomics. CBSU/3CPG/CVG Next-Gen Sequencing Workshop. Josh Waterfall. March 31, 2010

ChIP-seq/Functional Genomics/Epigenomics. CBSU/3CPG/CVG Next-Gen Sequencing Workshop. Josh Waterfall. March 31, 2010 ChIP-seq/Functional Genomics/Epigenomics CBSU/3CPG/CVG Next-Gen Sequencing Workshop Josh Waterfall March 31, 2010 Outline Introduction to ChIP-seq Control data sets Peak/enriched region identification

More information

Sequence Analysis. II: Sequence Patterns and Matrices. George Bell, Ph.D. WIBR Bioinformatics and Research Computing

Sequence Analysis. II: Sequence Patterns and Matrices. George Bell, Ph.D. WIBR Bioinformatics and Research Computing Sequence Analysis II: Sequence Patterns and Matrices George Bell, Ph.D. WIBR Bioinformatics and Research Computing Sequence Patterns and Matrices Multiple sequence alignments Sequence patterns Sequence

More information

Sequence Motif Analysis

Sequence Motif Analysis Sequence Motif Analysis Lecture in M.Sc. Biomedizin, Module: Proteinbiochemie und Bioinformatik Jonas Ibn-Salem Andrade group Johannes Gutenberg University Mainz Institute of Molecular Biology March 7,

More information

Analyzing ChIP-seq data. R. Gentleman, D. Sarkar, S. Tapscott, Y. Cao, Z. Yao, M. Lawrence, P. Aboyoun, M. Morgan, L. Ruzzo, J. Davison, H.

Analyzing ChIP-seq data. R. Gentleman, D. Sarkar, S. Tapscott, Y. Cao, Z. Yao, M. Lawrence, P. Aboyoun, M. Morgan, L. Ruzzo, J. Davison, H. Analyzing ChIP-seq data R. Gentleman, D. Sarkar, S. Tapscott, Y. Cao, Z. Yao, M. Lawrence, P. Aboyoun, M. Morgan, L. Ruzzo, J. Davison, H. Pages Biological Motivation Chromatin-immunopreciptation followed

More information

ChIP-seq and RNA-seq. Farhat Habib

ChIP-seq and RNA-seq. Farhat Habib ChIP-seq and RNA-seq Farhat Habib fhabib@iiserpune.ac.in Biological Goals Learn how genomes encode the diverse patterns of gene expression that define each cell type and state. Protein-DNA interactions

More information

Lecture 7: April 7, 2005

Lecture 7: April 7, 2005 Analysis of Gene Expression Data Spring Semester, 2005 Lecture 7: April 7, 2005 Lecturer: R.Shamir and C.Linhart Scribe: A.Mosseri, E.Hirsh and Z.Bronstein 1 7.1 Promoter Analysis 7.1.1 Introduction to

More information

ECS 234: Genomic Data Integration ECS 234

ECS 234: Genomic Data Integration ECS 234 : Genomic Data Integration Heterogeneous Data Integration DNA Sequence Microarray Proteomics >gi 12004594 gb AF217406.1 Saccharomyces cerevisiae uridine nucleosidase (URH1) gene, complete cds ATGGAATCTGCTGATTTTTTTACCTCACGAAACTTATTAAAACAGATAATTTCCCTCATCTGCAAGGTTG

More information

ChIP-seq and RNA-seq

ChIP-seq and RNA-seq ChIP-seq and RNA-seq Biological Goals Learn how genomes encode the diverse patterns of gene expression that define each cell type and state. Protein-DNA interactions (ChIPchromatin immunoprecipitation)

More information

L8: Downstream analysis of ChIP-seq and ATAC-seq data

L8: Downstream analysis of ChIP-seq and ATAC-seq data L8: Downstream analysis of ChIP-seq and ATAC-seq data Shamith Samarajiwa CRUK Bioinformatics Autumn School September 2017 Summary Downstream analysis for extracting meaningful biology : Normalization and

More information

Transcription Gene regulation

Transcription Gene regulation Transcription Gene regulation The machine that transcribes a gene is composed of perhaps 50 proteins, including RNA polymerase, the enzyme that converts DNA code into RNA code. A crew of transcription

More information

Epigenetics and DNase-Seq

Epigenetics and DNase-Seq Epigenetics and DNase-Seq BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2018 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by Anthony

More information

CS273B: Deep learning for Genomics and Biomedicine

CS273B: Deep learning for Genomics and Biomedicine CS273B: Deep learning for Genomics and Biomedicine Lecture 2: Convolutional neural networks and applications to functional genomics 09/28/2016 Anshul Kundaje, James Zou, Serafim Batzoglou Outline Anatomy

More information

Analysis of ChIP-seq data with R / Bioconductor

Analysis of ChIP-seq data with R / Bioconductor Analysis of ChIP-seq data with R / Bioconductor Martin Morgan Bioconductor / Fred Hutchinson Cancer Research Center Seattle, WA, USA 8-10 June 2009 ChIP-seq Chromatin immunopreciptation to enrich sample

More information

ChIP-Seq Data Analysis: Identification of Protein DNA Binding Sites with SISSRs Peak-Finder

ChIP-Seq Data Analysis: Identification of Protein DNA Binding Sites with SISSRs Peak-Finder Chapter 20 ChIP-Seq Data Analysis: Identification of Protein DNA Binding Sites with SISSRs Peak-Finder Leelavati Narlikar and Raja Jothi Abstract Protein DNA interactions play key roles in determining

More information

Computational Investigation of Gene Regulatory Elements. Ryan Weddle Computational Biosciences Internship Presentation 12/15/2004

Computational Investigation of Gene Regulatory Elements. Ryan Weddle Computational Biosciences Internship Presentation 12/15/2004 Computational Investigation of Gene Regulatory Elements Ryan Weddle Computational Biosciences Internship Presentation 12/15/2004 1 Table of Contents Introduction.... 3 Goals..... 9 Methods.... 12 Results.....

More information

Genome 541! Unit 4, lecture 3! Genomics assays

Genome 541! Unit 4, lecture 3! Genomics assays Genome 541! Unit 4, lecture 3! Genomics assays I d like a bit more background on the assays and bioterminology.!! The phantom peak concept was confusing.! I didn t quite understand what the phantom peak

More information

Genome 541 Gene regulation and epigenomics Lecture 3 Integrative analysis of genomics assays

Genome 541 Gene regulation and epigenomics Lecture 3 Integrative analysis of genomics assays Genome 541 Gene regulation and epigenomics Lecture 3 Integrative analysis of genomics assays Please consider both the forward and reverse strands (i.e. reverse compliment sequence). You do not need to

More information

CSC 2427: Algorithms in Molecular Biology Lecture #14

CSC 2427: Algorithms in Molecular Biology Lecture #14 CSC 2427: Algorithms in Molecular Biology Lecture #14 Lecturer: Michael Brudno Scribe Note: Hyonho Lee Department of Computer Science University of Toronto 03 March 2006 Microarrays Revisited In the last

More information

Genome 373: High- Throughput DNA Sequencing. Doug Fowler

Genome 373: High- Throughput DNA Sequencing. Doug Fowler Genome 373: High- Throughput DNA Sequencing Doug Fowler Tasks give ML unity We learned about three tasks that are commonly encountered in ML Models/Algorithms Give ML Diversity Classification Regression

More information

Chapter 1 Analysis of ChIP-Seq Data with Partek Genomics Suite 6.6

Chapter 1 Analysis of ChIP-Seq Data with Partek Genomics Suite 6.6 Chapter 1 Analysis of ChIP-Seq Data with Partek Genomics Suite 6.6 Overview ChIP-Sequencing technology (ChIP-Seq) uses high-throughput DNA sequencing to map protein-dna interactions across the entire genome.

More information

Chromatin immunoprecipitation: five steps to great results

Chromatin immunoprecipitation: five steps to great results Chromatin immunoprecipitation: five steps to great results Introduction The discovery and use of antibodies in life science research has been critical to many advancements across applications, including

More information

DIAMANTINA INSTITUTE for Cancer, Immunology and Metabolic Medicine

DIAMANTINA INSTITUTE for Cancer, Immunology and Metabolic Medicine DIAMANTINA INSTITUTE for Cancer, Immunology and Metabolic Medicine Defining MYB Transcriptional Network by Genome-wide Chromatin Occupancy Profiling (ChIP-Seq) 2010 E.Glazov, L. Zhao Transcription Factors:

More information

Deep Sequencing technologies

Deep Sequencing technologies Deep Sequencing technologies Gabriela Salinas 30 October 2017 Transcriptome and Genome Analysis Laboratory http://www.uni-bc.gwdg.de/index.php?id=709 Microarray and Deep-Sequencing Core Facility University

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION SUPPLEMENRY INFORMION doi:.38/nature In vivo nucleosome mapping D4+ Lymphocytes radient-based and I-bead cell sorting D8+ Lymphocytes ranulocytes Lyse the cells Isolate and sequence mononucleosome cores

More information

Bayesian Variable Selection and Data Integration for Biological Regulatory Networks

Bayesian Variable Selection and Data Integration for Biological Regulatory Networks Bayesian Variable Selection and Data Integration for Biological Regulatory Networks Shane T. Jensen Department of Statistics The Wharton School, University of Pennsylvania stjensen@wharton.upenn.edu Gary

More information

MCAT: Motif Combining and Association Tool

MCAT: Motif Combining and Association Tool MCAT: Motif Combining and Association Tool Yanshen Yang Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree

More information

File S1. Program overview and features

File S1. Program overview and features File S1 Program overview and features Query list filtering. Further filtering may be applied through user selected query lists (Figure. 2B, Table S3) that restrict the results and/or report specifically

More information

ChIP-seq data analysis

ChIP-seq data analysis hip-seq data analysis Harri Lähdesmäki Department of omputer Science Aalto University January 8, 2015 Motivation: transcription factor binding site (TBS) prediction Last time we studied computational methods

More information

The ChIP-Seq project. Giovanna Ambrosini, Philipp Bucher. April 19, 2010 Lausanne. EPFL-SV Bucher Group

The ChIP-Seq project. Giovanna Ambrosini, Philipp Bucher. April 19, 2010 Lausanne. EPFL-SV Bucher Group The ChIP-Seq project Giovanna Ambrosini, Philipp Bucher EPFL-SV Bucher Group April 19, 2010 Lausanne Overview Focus on technical aspects Description of applications (C programs) Where to find binaries,

More information

Intracellular receptors specify complex patterns of gene expression that are cell and gene

Intracellular receptors specify complex patterns of gene expression that are cell and gene SUPPLEMENTAL RESULTS AND DISCUSSION Some HPr-1AR ARE-containing Genes Are Unresponsive to Androgen Intracellular receptors specify complex patterns of gene expression that are cell and gene specific. For

More information

Übung V. Einführung, Teil 1. Transktiptionelle Regulation TFBS

Übung V. Einführung, Teil 1. Transktiptionelle Regulation TFBS Übung V Einführung, Teil 1 Transktiptionelle Regulation TFBS Transcription Factors These proteins promote transcription 1. Bind DNA 2. Activate Transcription These two functions usually reside on separate

More information

2/10/17. Contents. Applications of HMMs in Epigenomics

2/10/17. Contents. Applications of HMMs in Epigenomics 2/10/17 I529: Machine Learning in Bioinformatics (Spring 2017) Contents Applications of HMMs in Epigenomics Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2017 Background:

More information

Sequence logos for DNA sequence alignments

Sequence logos for DNA sequence alignments Sequence logos for DNA sequence alignments Oliver Bembom Division of Biostatistics, University of California, Berkeley October, 202 Introduction An alignment of DNA or amino acid sequences is commonly

More information

RNAseq Applications in Genome Studies. Alexander Kanapin, PhD Wellcome Trust Centre for Human Genetics, University of Oxford

RNAseq Applications in Genome Studies. Alexander Kanapin, PhD Wellcome Trust Centre for Human Genetics, University of Oxford RNAseq Applications in Genome Studies Alexander Kanapin, PhD Wellcome Trust Centre for Human Genetics, University of Oxford RNAseq Protocols Next generation sequencing protocol cdna, not RNA sequencing

More information

Genome-Wide Survey of MicroRNA - Transcription Factor Feed-Forward Regulatory Circuits in Human. Supporting Information

Genome-Wide Survey of MicroRNA - Transcription Factor Feed-Forward Regulatory Circuits in Human. Supporting Information Genome-Wide Survey of MicroRNA - Transcription Factor Feed-Forward Regulatory Circuits in Human Angela Re #, Davide Corá #, Daniela Taverna and Michele Caselle # equal contribution * corresponding author,

More information

ChIP-seq experimental design and analysis

ChIP-seq experimental design and analysis ChIP-seq experimental design and analysis Martin Morgan (mtmorgan@fhcrc.org) Fred Hutchinson Cancer Research Center Seattle, WA, USA 19 November, 2009 Classical ChIP-chip Biological context Punctuations,

More information
