Deep learning frameworks for regulatory genomics and epigenomics
|
|
- Colleen Arnold
- 6 years ago
- Views:
Transcription
1 Deep learning frameworks for regulatory genomics and epigenomics Chuan Sheng Foo Avanti Shrikumar Nicholas Sinnott- Armstrong ANSHUL KUNDAJE Genetics, Computer science Stanford University Johnny Israeli 1
2 Local chromatin architecture of histone marks regulatory elements TFs nucleosomes sequence motifs Adapted from Shlyueva et al. (2014) Nature Reviews Genetics. 2
3 Chromatin states Combinatorial chromatin states define broad classes of elements Promoter Transcribed Enhancer CTCF Heterochromatin Poised Repressed Quiescent Thurman et al. (2012) Nature
4 ATAC-seq: genome-wide chromatin accessibility from low input material Paired-end sequencing ATAC-seq peaks identify chromatin accessible regulatory elements
5 ATAC-seq reveals chromatin architecture in genome-wide fragment length distributions Buenrostro et al. (2013) Nature Methods.
6 Chromatin architecture reflects chromatin state Fragment lengths Promoter CTCF Enhancer Bivalent Transcribed Heterochromatin Represesed Quiescent
7 Fraglen Position-aware 2D fragment length distributions (V-plots) 400 bp Aggregate plot for CTCF state Positioned nucleosome 50 bp -1000bp Accessible binding site Peak summit (Midpoint of fragment) +1000bp Plot at single CTCF site sparse and noisy V-plots were first introduced by Henikoff et al. 2011, PNAS
8 Can we predict chromatin states/histone marks at ATAC-peaks? Which of 8 chromatin states? 2kb around ATAC-peak Image classification task! Which histone mark is present?
9 Deep neural networks (DNNs) for image classification Output label: Trump? Or Sanders? Input: image pixel values Lee et. al. (2009), ICML 9
10 Deep neural networks (DNNs) for V- plot classification Output label: Chromatin state/histone mark Input: V-plots Chromatin state/histone mark Nucleosome arrays, Large regions of open chromatin +1 / -1 nucleosome, Segments of open chromatin Nucleosomes, Open chromatin 10
11 An artificial neuron Sigmoid ReLu (Rectified Linear Unit)
12 Convolutional filters Input image Conv. Filter Feature map
13 Convolutional filters Input image Feature map
14 Convolutional layer: multiple filters learn distinct features
15 Pooling layers: locally smooth signal
16 How does a deep conv. neural network transform the raw V-plot input at each layer -1Kb 500 Pure CTCF Promoter Enhancer Kb Chromatin State Class Probabilities 2nd Fully Connected Layer 1st Fully Connected Layer 3rd Smoothing 2nd set of Convolutional Maps 2nd Smoothing 1st set of Convolutional Maps Initial Smoothing V-Plot Input (300 x 2001)
17 After initial pooling (smoothing) Pure CTCF Chromatin State Class Probabilities 2nd Fully Connected Layer 1st Fully Connected Layer Promoter 3rd Smoothing 2nd set of Convolutional Maps 2nd Smoothing Enhancer 1st set of Convolutional Maps Initial Smoothing V-Plot Input (300 x 2001)
18 Second set of convolutional maps Pure CTCF Chromatin State Class Probabilities 2nd Fully Connected Layer Promoter 1st Fully Connected Layer 3rd Smoothing 2nd set of Convolutional Maps 2nd Smoothing Enhancer 1st set of Convolutional Maps Initial Smoothing V-Plot Input (300 x 2001)
19 Learning from multiple 1D functional data (e.g. DNase, MNase) Chromatin State Class Probabilities 2 nd FC Layer 1 st FC Layer 3 rd Convolution Layer 2 nd Convolution Layer 1 st Convolution Layer 1D MNase signal (1 x 2001) 3 rd Convolution Layer 2 nd Convolution Layer 1 st Convolution Layer 1D DNase signal (1 x 2001) Scan DNase profile using filter 19
20 Learning from raw DNA sequence Class Probabilities Higher layers learn motif combinations Score sequence using filters Convolutional layers learn motif (PWM) like filters
21 The Chromputer Integrating multiple inputs (1D, 2D signals, sequence) to simulatenously predict multiple outputs H3K9me 3 H2A.Z H3K36me3 H3K27me3 H3K4me1 Class Probabilities 2 nd FC Layer 1 st FC Layer H3K4me3 TF Binding Chromatin State Multi-task learning 2nd Combined FC Layer 1st Combined FC Layer 3rd Smoothing 2nd set of Convolutional Maps 2nd Smoothing 1st set of Convolutional Maps 2nd Combined FC Layer 1st Combined FC Layer 3rd Smoothing 2nd set of Convolutional Maps 2nd Smoothing 1st set of Convolutional Maps 2nd Combined FC Layer 1st Combined FC Layer 3rd Smoothing 2nd set of Convolutional Maps 2nd Smoothing 1st set of Convolutional Maps Initial Smoothing Initial Smoothing Initial Smoothing V-Plot Input (300 x 2001)
22 Chromatin architecture can predict chromatin state in held out chromosome (same cell type) Model + Input data types Majority class (baseline) 42% Gene proximity 59% Random Forest: ATAC-seq (150M reads) 61% Chromputer: DNase (60M reads) 68.1% Chromputer: Mnase (1.5B reads) 69.3% Chromputer: ATAC-seq (150M reads) 75.9% Chromputer: DNase + MNase 81.6% Chromputer: ATAC-seq + sequence 83.5% Chromputer: DNase + MNase + sequence 86.2% Label accuracy across replicates (upper bound) 88% 8-class chromatin state accuracy (%)
23 High cross cell-type chromatin state prediction Learn model on DNase and MNase only Learn on GM12878, predict on K562 (and vice versa) Requires local normalization to make signal comparable 8 class chromatin state accuracy Train / Test GM12878 K562 GM K
24 Predicting individual histone marks from ATAC/DNase/MNase/Sequence Area under Precision recall curve CTCF H3K27ac H3K4me3 H3K4me1 H3K9ac H2Az H3K36me3 H3K27me3 H3K9me3
25 Chromputer trained on TF ChIP-seq predicts cross cell-type in-vivo TF binding with high accuracy Area under Precision recall curve Chromputer Inputs: Seq + DNA shape + DNase profile Positives: Reproducible ChIP-seq peaks Negatives: All other DNase peaks + flanks + matched random sites c-myc YY1 CTCF Test sets: Held out chromosomes in held out cell types 25
26 DeepLIFT: Scoring predictive power of features in Deep Neural Networks LIFT: Linear Importance Feature Tracker, or LIFTing the top off the black box. Provides a predictive importance score for any raw input feature (e.g. pixels in V-Plot images, each nucleotide in sequence) intermediate learned features (e.g. convolutional filters) Linear breakdown of contribution of each input to immediate outputs Recursively apply to get contribution of any input to any output Can be computed efficiently with a single backpropagation (unlike insilico mutagenesis) Less susceptible to buffering effects than in-silico mutagenesis Technical details: ReLU networks: equivalent to Taylor approximation of change in softmax/sigmoid logit if input eliminated. i.e. gradient (w.r.t logit) * input
27 What architecture properties of the ATAC-seq V- plots predict different chromatin states? CTCF state: centered binding, symmetric phased nucleosomes Enhancer state: localized signal, heterogeneity Promoter state: broad regions of accessible chromatin what is the change in classification probability relative to an unbiased classifier if we *only* consider the contributions from each pixel
28 Architectural heterogeneity of accessible elements in different chromatin states Low dimensional t-sne embedding (like PCA) of accessible sites (colored by chromatin state) Distinct classes of enhancers 28
29 Top scoring MNase filters and activating input patterns for CTCF state Top scoring MNase filters Maximally activating input MNase profiles Well phased arrays of nucleosomes 29
30 Top scoring MNase filters and activating input patterns for promoter state Top scoring MNase filters Maximally activating input MNase profiles Asymmetric positioning with well positioned +1/-1 nucleosomes 30
31 What useful patterns can we extract from raw DNA sequence models? Gata1 Motif discovery! What PWM like filters are predictive? G A T A A G C A T T A C C G A T A A Which nucleotides in input sequence are contributing to binding!
32 Top sequence filters for CTCF state Canonical motif 32
33 High resolution point binding events and sequence grammars at CTCF peaks MNase-seq DNase-seq CTCF ChIP-seq Nuc. level importance (height of letter) shows coordination of multiple point binding events 33
34 Context-specific reuse of regulatory sequence in chromatin accessibility changes during hematopoiesis Ryan Corces-Zimmerman Jason Buenrostro Will Greenleaf Howard Chang Ravi Majeti ATAC-seq
35 Deep learning sequence determinants of chromatin accessibility Output: Accessible (+1) vs. not accessible (0) HSC B-cells Erythroid Peyton Greenside G C A T T A C C G A T A A Input: Raw DNA sequence
36 Context-specific re-use of regulatory sequence in HSC, B-cells and Erythroblasts Gata (Rc) Position along sequence Gata Gata (Rc) SPI1 Peyton Greenside
37 Importance in HSC s Context-specific re-use of regulatory sequence in HSC, B-cells and Erythroblasts ATAC-seq SPI1 ChIP-seq GATA1 ChIP-seq?? Position along sequence SPI1 Peyton Greenside
38 Importance in B-cells Context-specific re-use of regulatory sequence in HSC, B-cells and Erythroblasts ATAC-seq SPI1 ChIP-seq GATA1 ChIP-seq No peak No peak Not expressed Position along sequence Peyton Greenside
39 Importance in Erythroid Context-specific re-use of regulatory sequence in HSC, B-cells and Erythroblasts ATAC-seq SPI1 ChIP-seq GATA1 ChIP-seq Gata (Rc) Position along sequence Gata Gata (Rc) SPI1 Peyton Greenside
40 YY1 & GATA and much, much more GATA, SPI1, RUNX2 STAT1 & GATA TEAD4 & GATA AP1 in B-cells only ETS & GATA
41 Summary and ongoing work New predictive deep learning framework (Chromputer) for integrative genomics New interpretation engine for deep learning models. We can extract predictive features (motifs, grammars, footprints, architecture features) from the deep neural networks Local chromatin architecture is predictive of chromatin state and histone marks within and across cell types We can predict in-vivo binding profiles of TFs in new cell types from sequence + shape + DNase/ATAC-seq with high accuracy Context-specific reuse of sequence grammars in accessible sites Extensions: From binary to continuous signal prediction Extensions: Functional variant (QTL, GWAS, rare variant) prediction from raw sequence models 41 bioarxiv paper and code coming soon!
42 Acknowledgements Kundaje Lab members Chuan Sheng Foo Funding Avanti Shrikumar Nicholas Sinnott- Armstrong U01HG (GGR) U41-HG S1 R01ES Johnny Israeli Conflict of Interest: Deep Genomics (SAB), Epinomics (SAB) Rahul Mohan Nathan Boley Peyton Greenside Will Greenleaf 42
43 Guess the element from the V-plot AI vs. human What is this regulatory element? Pure CTCF, Promoter, or Enhancer?
44 Its an enhancer! 1st Fully Connected Layer 2nd Fully Connected Layer Enhancer (correct) 44
Discovery of Transcription Factor Binding Sites with Deep Convolutional Neural Networks
Discovery of Transcription Factor Binding Sites with Deep Convolutional Neural Networks Reesab Pathak Dept. of Computer Science Stanford University rpathak@stanford.edu Abstract Transcription factors are
More informationSupplementary Table 1: Oligo designs. A list of ATAC-seq oligos used for PCR.
Ad1_noMX: Ad2.1_TAAGGCGA Ad2.2_CGTACTAG Ad2.3_AGGCAGAA Ad2.4_TCCTGAGC Ad2.5_GGACTCCT Ad2.6_TAGGCATG Ad2.7_CTCTCTAC Ad2.8_CAGAGAGG Ad2.9_GCTACGCT Ad2.10_CGAGGCTG Ad2.11_AAGAGGCA Ad2.12_GTAGAGGA Ad2.13_GTCGTGAT
More informationThe ENCODE Encyclopedia. & Variant Annotation Using RegulomeDB and HaploReg
The ENCODE Encyclopedia & Variant Annotation Using RegulomeDB and HaploReg Jill E. Moore Weng Lab University of Massachusetts Medical School October 10, 2015 Where s the Encyclopedia? ENCODE: Encyclopedia
More informationDecoding Chromatin States with Epigenome Data Advanced Topics in Computa8onal Genomics
Decoding Chromatin States with Epigenome Data 02-715 Advanced Topics in Computa8onal Genomics HMMs for Decoding Chromatin States Epigene8c modifica8ons of the genome have been associated with Establishing
More informationThe ChIP-Seq project. Giovanna Ambrosini, Philipp Bucher. April 19, 2010 Lausanne. EPFL-SV Bucher Group
The ChIP-Seq project Giovanna Ambrosini, Philipp Bucher EPFL-SV Bucher Group April 19, 2010 Lausanne Overview Focus on technical aspects Description of applications (C programs) Where to find binaries,
More informationChIP-Seq Data Analysis. J Fass UCD Genome Center Bioinformatics Core Wednesday 15 June 2015
ChIP-Seq Data Analysis J Fass UCD Genome Center Bioinformatics Core Wednesday 15 June 2015 What s the Question? Where do Transcription Factors (TFs) bind genomic DNA 1? (Where do other things bind DNA
More informationEnsembl Funcgen: A Database and API for Epigenomics and Gene Regulation Data.
Ensembl Funcgen: A Database and API for Epigenomics and Gene Regulation Data. Nathan Johnson Ensembl Regulation EBI is an Outstation of the European Molecular Biology Laboratory.! Workshop Overview http://www.ebi.ac.uk/~njohnson/courses/23.05.2013-
More informationMachine learning applications in genomics: practical issues & challenges. Yuzhen Ye School of Informatics and Computing, Indiana University
Machine learning applications in genomics: practical issues & challenges Yuzhen Ye School of Informatics and Computing, Indiana University Reference Machine learning applications in genetics and genomics
More informationWhole Transcriptome Analysis of Illumina RNA- Seq Data. Ryan Peters Field Application Specialist
Whole Transcriptome Analysis of Illumina RNA- Seq Data Ryan Peters Field Application Specialist Partek GS in your NGS Pipeline Your Start-to-Finish Solution for Analysis of Next Generation Sequencing Data
More informationChIP-seq/Functional Genomics/Epigenomics. CBSU/3CPG/CVG Next-Gen Sequencing Workshop. Josh Waterfall. March 31, 2010
ChIP-seq/Functional Genomics/Epigenomics CBSU/3CPG/CVG Next-Gen Sequencing Workshop Josh Waterfall March 31, 2010 Outline Introduction to ChIP-seq Control data sets Peak/enriched region identification
More informationPerm-seq: Mapping Protein-DNA Interactions in Segmental Duplication and Highly Repetitive Regions of Genomes with Prior- Enhanced Read Mapping
RESEARCH ARTICLE Perm-seq: Mapping Protein-DNA Interactions in Segmental Duplication and Highly Repetitive Regions of Genomes with Prior- Enhanced Read Mapping Xin Zeng 1,BoLi 2, Rene Welch 1, Constanza
More informationGreen Center Computational Core ChIP- Seq Pipeline, Just a Click Away
Green Center Computational Core ChIP- Seq Pipeline, Just a Click Away Venkat Malladi Computational Biologist Computational Core Cecil H. and Ida Green Center for Reproductive Biology Science Introduc
More informationNature Biotechnology: doi: /nbt Supplementary Figure 1
Supplementary Figure 1 An extended version of Figure 2a, depicting multi-model training and reverse-complement mode To use the GPU s full computational power, we train several independent models in parallel
More informationChampionChIP Quick, High Throughput Chromatin Immunoprecipitation Assay System
ChampionChIP Quick, High Throughput Chromatin Immunoprecipitation Assay System Liyan Pang, Ph.D. Application Scientist 1 Topics to be Covered Introduction What is ChIP-qPCR? Challenges Facing Biological
More informationIntroduction to genome biology
Introduction to genome biology Lisa Stubbs We ve found most genes; but what about the rest of the genome? Genome size* 12 Mb 95 Mb 170 Mb 1500 Mb 2700 Mb 3200 Mb #coding genes ~7000 ~20000 ~14000 ~26000
More informationIntroduction to ChIP Seq data analyses. Acknowledgement: slides taken from Dr. H
Introduction to ChIP Seq data analyses Acknowledgement: slides taken from Dr. H Wu @Emory ChIP seq: Chromatin ImmunoPrecipitation it ti + sequencing Same biological motivation as ChIP chip: measure specific
More informationPIP-seq. Cells. Permanganate ChIP-Seq
PIP-seq ells Formaldehyde Permanganate 5 Harvest Lyse Sonicate First dapter Ligation 3 3 5 hip Elute Reverse rosslinks Piperidine cleavage 5 3 3 5 Primer Extension Second dapter Ligation 5 3 3 5 Deep Sequencing
More informationCS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016
CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016 Topics Genetic variation Population structure Linkage disequilibrium Natural disease variants Genome Wide Association Studies Gene
More informationIdentifying and mitigating bias in next-generation sequencing methods for chromatin biology
STUDY DESIGNS Identifying and mitigating bias in next-generation sequencing methods for chromatin biology Clifford A. Meyer and X. Shirley Liu Abstract Next-generation sequencing (NGS) technologies have
More informationOccupancy by key transcription factors is a more accurate predictor of enhancer activity than histone modifications or chromatin accessibility
Dogan et al. Epigenetics & Chromatin (2015) 8:16 DOI 10.1186/s13072-015-0009-5 RESEARCH Open Access Occupancy by key transcription factors is a more accurate predictor of enhancer activity than histone
More informationBioinformatics of Transcriptional Regulation
Bioinformatics of Transcriptional Regulation Carl Herrmann IPMB & DKFZ c.herrmann@dkfz.de Wechselwirkung von Maßnahmen und Auswirkungen Einflussmöglichkeiten in einem Dialog From genes to active compounds
More informationNature Structural & Molecular Biology: doi: /nsmb Supplementary Figure 1
Supplementary Figure 1 Origin use and efficiency are similar among WT, rrm3, pif1-m2, and pif1-m2; rrm3 strains. A. Analysis of fork progression around confirmed and likely origins (from cerevisiae.oridb.org).
More informationWORKSHOP. Transcriptional circuitry and the regulatory conformation of the genome. Ofir Hakim Faculty of Life Sciences
WORKSHOP Transcriptional circuitry and the regulatory conformation of the genome Ofir Hakim Faculty of Life Sciences Chromosome conformation capture (3C) Most GR Binding Sites Are Distant From Regulated
More informationSNPs - GWAS - eqtls. Sebastian Schmeier
SNPs - GWAS - eqtls s.schmeier@gmail.com http://sschmeier.github.io/bioinf-workshop/ 17.08.2015 Overview Single nucleotide polymorphism (refresh) SNPs effect on genes (refresh) Genome-wide association
More informationBayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations
Published online 21 July 2015 Nucleic Acids Research, 2015, Vol. 43, No. 21 e147 doi: 10.1093/nar/gkv733 BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations Junbai
More informationWHY STUDY GENOMICS OF GENE REGULATION? Genomics of Gene Regulation 1. Motifs, Conservation, and Epigenetic Features
Genomics of Gene Regulation 1. Motifs, Conservation, and Epigenetic Features CSHL Course in Computational and Comparative Genomics 2016 Ross Hardison 10/29/16 1 WHY STUDY GENOMICS OF GENE REGULATION? 10/29/16
More informationGenome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions
Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions The MIT Faculty has made this article openly available. Please share how this access benefits you. Your
More informationUniversiteit Leiden Opleiding Informatica
Universiteit Leiden Opleiding Informatica Convolutional Neural Networks for Regulatory Genomics Name: Niels ten Dijke Date: 17/06/2017 1st supervisor: dr. Wojtek Kowalczyk 2nd supervisor: dr. Gerard van
More informationChromaSig: A Probabilistic Approach to Finding Common Chromatin Signatures in the Human Genome
: A Probabilistic Approach to Finding Common Chromatin Signatures in the Human Genome Gary Hon 1,2, Bing Ren 1,2,3 *, Wei Wang 1,4 * 1 Bioinformatics Program, University of California San Diego, La Jolla,
More informationMentor: Dr. Bino John
MAT Shiksha Mantri Mentor: Dr. Bino John Model-based Analysis y of Tiling-arrays g y for ChIP-chip X. Shirley Liu et al., PNAS (2006) vol. 103 no. 33 12457 12462 Tiling Arrays Subtype of microarray chips
More informationNeural Networks and Applications in Bioinformatics. Yuzhen Ye School of Informatics and Computing, Indiana University
Neural Networks and Applications in Bioinformatics Yuzhen Ye School of Informatics and Computing, Indiana University Contents Biological problem: promoter modeling Basics of neural networks Perceptrons
More informationMany transcription factors! recognize DNA shape
Many transcription factors! recognize DN shape Katie Pollard! Gladstone Institutes USF Division of Biostatistics, Institute for Human Genetics, and Institute for omputational Health Sciences ENODE Users
More informationThe SVM2CRM data User s Guide
The SVM2CRM data User s Guide Guidantonio Malagoli Tagliazucchi, Silvio Bicciato Center for genome research University of Modena and Reggio Emilia, Modena October 31, 2017 October 31, 2017 Contents 1 Overview
More informationLecture 21: Epigenetics Nurture or Nature? Chromatin DNA methylation Histone Code Twin study X-chromosome inactivation Environemnt and epigenetics
Lecture 21: Epigenetics Nurture or Nature? Chromatin DNA methylation Histone Code Twin study X-chromosome inactivation Environemnt and epigenetics Epigenetics represents the science for the studying heritable
More informationSupplemental Information. Antagonistic Activities of Sox2. and Brachyury Control the Fate Choice. of Neuro-Mesodermal Progenitors
Developmental Cell, Volume 42 Supplemental Information Antagonistic Activities of Sox2 and Brachyury Control the Fate Choice of Neuro-Mesodermal Progenitors Frederic Koch, Manuela Scholze, Lars Wittler,
More informationNucleosome Positioning and Organization Advanced Topics in Computa8onal Genomics
Nucleosome Positioning and Organization 02-715 Advanced Topics in Computa8onal Genomics Nucleosome Core Nucleosome Core and Linker 147 bp DNA wrapping around nucleosome core Varying lengths of linkers
More informationWednesday, November 22, 17. Exons and Introns
Exons and Introns Introns and Exons Exons: coded regions of DNA that get transcribed and translated into proteins make up 5% of the genome Introns and Exons Introns: non-coded regions of DNA Must be removed
More informationGenomics xxx (2011) xxx xxx. Contents lists available at SciVerse ScienceDirect. Genomics. journal homepage:
YGENO-08338; No. of pages: 11; 4C: 4, 5, 7 Genomics xxx (2011) xxx xxx Contents lists available at SciVerse ScienceDirect Genomics journal homepage: www.elsevier.com/locate/ygeno Distant cis-regulatory
More informationMicroarray Gene Expression Analysis at CNIO
Microarray Gene Expression Analysis at CNIO Orlando Domínguez Genomics Unit Biotechnology Program, CNIO 8 May 2013 Workflow, from samples to Gene Expression data Experimental design user/gu/ubio Samples
More information1 Najafabadi, H. S. et al. C2H2 zinc finger proteins greatly expand the human regulatory lexicon. Nat Biotechnol doi: /nbt.3128 (2015).
F op-scoring motif Optimized motifs E Input sequences entral 1 bp region Dinucleotideshuffled seqs B D ll B1H-R predicted motifs Enriched B1H- R predicted motifs L!=!7! L!=!6! L!=5! L!=!4! L!=!3! L!=!2!
More informationDifferential Gene Expression
Biology 4361 Developmental Biology Differential Gene Expression September 28, 2006 Chromatin Structure ~140 bp ~60 bp Transcriptional Regulation: 1. Packing prevents access CH 3 2. Acetylation ( C O )
More informationAssay Standards Working Group Nov 2012 Assay Standards Working Group Recommendations, November 2012
Assay Standards Working Group Recommendations, November 2012 Contents Assay Standards Working Group Recommendations, August 2012... 1 Contents... 1 Introduction... 2 1: Reference Epigenome Criteria...
More informationQuality Control Assessment in Genotyping Console
Quality Control Assessment in Genotyping Console Introduction Prior to the release of Genotyping Console (GTC) 2.1, quality control (QC) assessment of the SNP Array 6.0 assay was performed using the Dynamic
More informationIntroductory Next Gen Workshop
Introductory Next Gen Workshop http://www.illumina.ucr.edu/ http://www.genomics.ucr.edu/ Workshop Objectives Workshop aimed at those who are new to Illumina sequencing and will provide: - a basic overview
More informationBiology 201 (Genetics) Exam #3 120 points 20 November Read the question carefully before answering. Think before you write.
Name KEY Section Biology 201 (Genetics) Exam #3 120 points 20 November 2006 Read the question carefully before answering. Think before you write. You will have up to 50 minutes to take this exam. After
More informationDifferential Gene Expression
Biology 4361 Developmental Biology Differential Gene Expression June 19, 2008 Differential Gene Expression Overview Chromatin structure Gene anatomy RNA processing and protein production Initiating transcription:
More informationStructure of nucleic acids II Biochemistry 302. January 20, 2006
Structure of nucleic acids II Biochemistry 302 January 20, 2006 Intrinsic structural flexibility of RNA antiparallel A-form Fig. 4.19 High Temp Denaturants In vivo conditions Base stacking w/o base pairing/h-bonds
More informationIntroduction to Microarray Analysis
Introduction to Microarray Analysis Methods Course: Gene Expression Data Analysis -Day One Rainer Spang Microarrays Highly parallel measurement devices for gene expression levels 1. How does the microarray
More informationRegulation of eukaryotic transcription:
Promoter definition by mass genome annotation data: in silico primer extension EMBNET course Bioinformatics of transcriptional regulation Jan 28 2008 Christoph Schmid Regulation of eukaryotic transcription:
More informationWhat is Evolutionary Computation? Genetic Algorithms. Components of Evolutionary Computing. The Argument. When changes occur...
What is Evolutionary Computation? Genetic Algorithms Russell & Norvig, Cha. 4.3 An abstraction from the theory of biological evolution that is used to create optimization procedures or methodologies, usually
More informationChIP-seq analysis of protein DNA interactions on the Ion Proton System
APPLICATION NOTE Ion Proton System ChIP-seq analysis of protein DNA interactions on the Ion Proton System Key findings Optimized ChIP-seq protocol for limited cell populations The protocol combines the
More informationBayesian Variable Selection and Data Integration for Biological Regulatory Networks
Bayesian Variable Selection and Data Integration for Biological Regulatory Networks Shane T. Jensen Department of Statistics The Wharton School, University of Pennsylvania stjensen@wharton.upenn.edu Gary
More informationLearning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens
Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens Sheila M. Reynolds 1, Jeff A. Bilmes 1,2, William Stafford
More informationB&DA Committee Bioinformatics and Data Analysis. PAG January 2016
B&DA Committee Bioinformatics and Data Analysis PAG January 2016 B&DA progress so far Aim: to define standard bfx pipelines for FAANG data Group has skyped many times We have a Wiki: hosted by EBI Sub-groups
More informationChapter 5 DNA and Chromosomes
Chapter 5 DNA and Chromosomes DNA as the genetic material Heat-killed bacteria can transform living cells S Smooth R Rough Fred Griffith, 1920 DNA is the genetic material Oswald Avery Colin MacLeod Maclyn
More informationIntroduction to the UCSC genome browser
Introduction to the UCSC genome browser Dominik Beck NHMRC Peter Doherty and CINSW ECR Fellow, Senior Lecturer Lowy Cancer Research Centre, UNSW and Centre for Health Technology, UTS SYDNEY NSW AUSTRALIA
More informationAbout Strand NGS. Strand Genomics, Inc All rights reserved.
About Strand NGS Strand NGS-formerly known as Avadis NGS, is an integrated platform that provides analysis, management and visualization tools for next-generation sequencing data. It supports extensive
More informationBioinformatics Advice on Experimental Design
Bioinformatics Advice on Experimental Design Where do I start? Please refer to the following guide to better plan your experiments for good statistical analysis, best suited for your research needs. Statistics
More informationPackage DBChIP. May 12, 2018
Type Package Package May 12, 2018 Title Differential Binding of Transcription Factor with ChIP-seq Version 1.24.0 Date 2012-06-21 Author Kun Liang Maintainer Kun Liang Depends R
More informationMocap: large-scale inference of transcription factor binding sites from chromatin accessibility
Published online 15 March 2017 Nucleic Acids Research, 2017, Vol. 45, No. 8 4315 4329 doi: 10.1093/nar/gkx174 Mocap: large-scale inference of transcription factor binding sites from chromatin accessibility
More informationSept 2. Structure and Organization of Genomes. Today: Genetic and Physical Mapping. Sept 9. Forward and Reverse Genetics. Genetic and Physical Mapping
Sept 2. Structure and Organization of Genomes Today: Genetic and Physical Mapping Assignments: Gibson & Muse, pp.4-10 Brown, pp. 126-160 Olson et al., Science 245: 1434 New homework:due, before class,
More informationChapter 1 Analysis of ChIP-Seq Data with Partek Genomics Suite 6.6
Chapter 1 Analysis of ChIP-Seq Data with Partek Genomics Suite 6.6 Overview ChIP-Sequencing technology (ChIP-Seq) uses high-throughput DNA sequencing to map protein-dna interactions across the entire genome.
More informationLecture 2: Biology Basics Continued
Lecture 2: Biology Basics Continued Central Dogma DNA: The Code of Life The structure and the four genomic letters code for all living organisms Adenine, Guanine, Thymine, and Cytosine which pair A-T and
More informationSupplementary Data for DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.
Supplementary Data for DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding. Wenxiu Ma 1, Lin Yang 2, Remo Rohs 2, and William Stafford Noble 3 1 Department of Statistics,
More informationFunctional Genomics Overview RORY STARK PRINCIPAL BIOINFORMATICS ANALYST CRUK CAMBRIDGE INSTITUTE 18 SEPTEMBER 2017
Functional Genomics Overview RORY STARK PRINCIPAL BIOINFORMATICS ANALYST CRUK CAMBRIDGE INSTITUTE 18 SEPTEMBER 2017 Agenda What is Functional Genomics? RNA Transcription/Gene Expression Measuring Gene
More informationChapter 24: Promoters and Enhancers
Chapter 24: Promoters and Enhancers A typical gene transcribed by RNA polymerase II has a promoter that usually extends upstream from the site where transcription is initiated the (#1) of transcription
More informationChromatin. Structure and modification of chromatin. Chromatin domains
Chromatin Structure and modification of chromatin Chromatin domains 2 DNA consensus 5 3 3 DNA DNA 4 RNA 5 ss RNA forms secondary structures with ds hairpins ds forms 6 of nucleic acids Form coiling bp/turn
More informationConvolutional Kitchen Sinks for Transcription Factor Binding Site Prediction
Convolutional Kitchen Sinks for Transcription Factor Binding Site Prediction Alyssa Morrow*, Vaishaal Shankar* Anthony Joseph, Benjamin Recht, Nir Yosef Transcription Factor A protein that binds to DNA
More informationPeakSeq enables systematic scoring of ChIP-seq experiments relative to controls
PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls Joel Rozowsky, Ghia Euskirchen 2, Raymond K Auerbach 3, Zhengdong D Zhang, Theodore Gibson, Robert Bjornson 4, Nicholas Carriero
More informationGene Expression Technology
Gene Expression Technology Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu Gene expression Gene expression is the process by which information from a gene
More informationGenetics and Genomics in Medicine Chapter 3. Questions & Answers
Genetics and Genomics in Medicine Chapter 3 Multiple Choice Questions Questions & Answers Question 3.1 Which of the following statements, if any, is false? a) Amplifying DNA means making many identical
More informationMolecular Markers CRITFC Genetics Workshop December 9, 2014
Molecular Markers CRITFC Genetics Workshop December 9, 2014 Molecular Markers Tools that allow us to collect information about an individual, a population, or a species Application in fisheries mating
More informationIntro. ANN & Fuzzy Systems. Lecture 36 GENETIC ALGORITHM (1)
Lecture 36 GENETIC ALGORITHM (1) Outline What is a Genetic Algorithm? An Example Components of a Genetic Algorithm Representation of gene Selection Criteria Reproduction Rules Cross-over Mutation Potential
More informationEinführung in die Genetik
Einführung in die Genetik Prof. Dr. Kay Schneitz (EBio Pflanzen) http://plantdev.bio.wzw.tum.de schneitz@wzw.tum.de Twitter: @PlantDevTUM, #genetiktum FB: Plant Development TUM Prof. Dr. Claus Schwechheimer
More informationIntroduction to Bioinformatics and Gene Expression Technology
Vocabulary Introduction to Bioinformatics and Gene Expression Technology Utah State University Spring 2014 STAT 5570: Statistical Bioinformatics Notes 1.1 Gene: Genetics: Genome: Genomics: hereditary DNA
More informationPrimePCR Assay Validation Report
Gene Information Gene Name SRY (sex determining region Y)-box 6 Gene Symbol Organism Gene Summary Gene Aliases RefSeq Accession No. UniGene ID Ensembl Gene ID SOX6 Human This gene encodes a member of the
More informationEpigenetic CHaracterization and Observation (ECHO) Proposers Day
Epigenetic CHaracterization and Observation (ECHO) Proposers Day Eric Van Gieson, Ph.D. Program Manager Biological Technologies Office (BTO) February 23, 2018 ECHO Program Overview Determining Exposure
More informationChIPnorm: A Statistical Method for Normalizing and Identifying Differential Regions in Histone Modification ChIP-seq Libraries
ChIPnorm: A Statistical Method for Normalizing and Identifying Differential Regions in Histone Modification ChIP-seq Libraries Nishanth Ulhas Nair., Avinash Das Sahu 2., Philipp Bucher 3,4 *, Bernard M.
More informationComplete Sample to Analysis Solutions for DNA Methylation Discovery using Next Generation Sequencing
Complete Sample to Analysis Solutions for DNA Methylation Discovery using Next Generation Sequencing SureSelect Human/Mouse Methyl-Seq Kyeong Jeong PhD February 5, 2013 CAG EMEAI DGG/GSD/GFO Agilent Restricted
More informationInterpreting RNA-seq data (Browser Exercise II)
Interpreting RNA-seq data (Browser Exercise II) In previous exercises, you spent some time learning about gene pages and examining genes in the context of the GBrowse genome browser. It is important to
More informationMulti-omics in biology: integration of omics techniques
31/07/17 Летняя школа по биоинформатике 2017 Multi-omics in biology: integration of omics techniques Konstantin Okonechnikov Division of Pediatric Neurooncology German Cancer Research Center (DKFZ) 2 Short
More informationChroMoS Guide (version 1.2)
ChroMoS Guide (version 1.2) Background Genome-wide association studies (GWAS) reveal increasing number of disease-associated SNPs. Since majority of these SNPs are located in intergenic and intronic regions
More informationAnalysis of Biological Sequences SPH
Analysis of Biological Sequences SPH 140.638 swheelan@jhmi.edu nuts and bolts meet Tuesdays & Thursdays, 3:30-4:50 no exam; grade derived from 3-4 homework assignments plus a final project (open book,
More informationAssociation Mapping in Plants PLSC 731 Plant Molecular Genetics Phil McClean April, 2010
Association Mapping in Plants PLSC 731 Plant Molecular Genetics Phil McClean April, 2010 Traditional QTL approach Uses standard bi-parental mapping populations o F2 or RI These have a limited number of
More informationarxiv: v1 [cs.lg] 7 Jul 2016
Bioinformatics doi.10.1093/bioinformatics/xxxxxx Advance Access Publication Date: Day Month Year Manuscript Category Subject Section DeepChrome: Deep-learning for predicting gene expression from histone
More informationIntroduction to Bioinformatics and Gene Expression Technologies
Introduction to Bioinformatics and Gene Expression Technologies Utah State University Fall 2017 Statistical Bioinformatics (Biomedical Big Data) Notes 1 1 Vocabulary Gene: hereditary DNA sequence at a
More informationIn silico prediction of novel therapeutic targets using gene disease association data
In silico prediction of novel therapeutic targets using gene disease association data, PhD, Associate GSK Fellow Scientific Leader, Computational Biology and Stats, Target Sciences GSK Big Data in Medicine
More informationNature Biotechnology: doi: /nbt Supplementary Figure 1. Number and length distributions of the inferred fosmids.
Supplementary Figure 1 Number and length distributions of the inferred fosmids. Fosmid were inferred by mapping each pool s sequence reads to hg19. We retained only those reads that mapped to within a
More informationMATH 5610, Computational Biology
MATH 5610, Computational Biology Lecture 2 Intro to Molecular Biology (cont) Stephen Billups University of Colorado at Denver MATH 5610, Computational Biology p.1/24 Announcements Error on syllabus Class
More informationLeveraging biological replicates to improve analysis in ChIP-seq experiments
, http://dx.doi.org/10.5936/csbj.201401002 CSBJ Leveraging biological replicates to improve analysis in ChIP-seq experiments Yajie Yang a,b, Justin Fear a,b, Jianhong Hu c, Irina Haecker d, Lei Zhou a,
More informationEnhancing motif finding models using multiple sources of genome-wide data
Enhancing motif finding models using multiple sources of genome-wide data Heejung Shim 1, Oliver Bembom 2, Sündüz Keleş 3,4 1 Department of Human Genetics, University of Chicago, 2 Division of Biostatistics,
More informationGenome Sequence Assembly
Genome Sequence Assembly Learning Goals: Introduce the field of bioinformatics Familiarize the student with performing sequence alignments Understand the assembly process in genome sequencing Introduction:
More informationWhat Are the Chemical Structures and Functions of Nucleic Acids?
THE NUCLEIC ACIDS What Are the Chemical Structures and Functions of Nucleic Acids? Nucleic acids are polymers specialized for the storage, transmission, and use of genetic information. DNA = deoxyribonucleic
More informationFACTORS CONTRIBUTING TO VARIABILITY IN DNA MICROARRAY RESULTS: THE ABRF MICROARRAY RESEARCH GROUP 2002 STUDY
FACTORS CONTRIBUTING TO VARIABILITY IN DNA MICROARRAY RESULTS: THE ABRF MICROARRAY RESEARCH GROUP 2002 STUDY K. L. Knudtson 1, C. Griffin 2, A. I. Brooks 3, D. A. Iacobas 4, K. Johnson 5, G. Khitrov 6,
More informationChromatin Structure. a basic discussion of protein-nucleic acid binding
Chromatin Structure 1 Chromatin DNA packaging g First a basic discussion of protein-nucleic acid binding Questions to answer: How do proteins bind DNA / RNA? How do proteins recognize a specific nucleic
More informationPersonal Genomics Platform White Paper Last Updated November 15, Executive Summary
Executive Summary Helix is a personal genomics platform company with a simple but powerful mission: to empower every person to improve their life through DNA. Our platform includes saliva sample collection,
More informationReconstruction of histone modification network from next-generation sequencing data
Reconstruction of histone modification network from next-generation sequencing data Ngoc Tu Le Japan Advanced Institute of Science and Technology 1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan email: ngoctule@jaist.ac.jp
More informationDNA. bioinformatics. epigenetics methylation structural variation. custom. assembly. gene. tumor-normal. mendelian. BS-seq. prediction.
Epigenomics T TM activation SNP target ncrna validation metagenomics genetics private RRBS-seq de novo trio RIP-seq exome mendelian comparative genomics DNA NGS ChIP-seq bioinformatics assembly tumor-normal
More informationThe Human Protein PRR14 Tethers Heterochromatin to the Nuclear Lamina During Interphase and Mitotic Exit
Cell Reports, Volume 5 Supplemental Information The Human Protein PRR14 Tethers Heterochromatin to the Nuclear Lamina During Interphase and Mitotic Exit Andrey Poleshko, Katelyn M. Mansfield, Caroline
More information