Non-coding Function & Variation, MPRAs. Mike White Bio5488 3/5/18

1 Non-coding Function & Variation, MPRAs Mike White Bio5488 3/5/18

2 Outline. MONDAY: Non-coding function and variation; The barcode; Basic versions of MPRA technology. WEDNESDAY: More varieties of MPRAs; Some key results

3 Key Questions: What counts as good evidence that something is an enhancer/causal non-coding variant? Is [MPRA experiment x] good evidence of enhancer function/non-coding variant effect?

4 Question 1: Which elements of the genome are functional? Two numbers:

5 ~8% of the Human Genome is Under Selective Constraint Ponting C, Biological function in the twilight zone of sequence conservation. BMC Biol Aug 16;15(1):71 Rands CM, et al, 8.2% of the Human genome is constrained: variation in rates of turnover across functional element classes in the human lineage. PLoS Genet Jul 24;10(7):e

6 The vast majority (80.4%) of the human genome participates in at least one biochemical RNA and/or chromatin associated event in at least one cell type. The ENCODE Project Consortium, An Integrated Encyclopedia of DNA Elements in the Human Genome. Nature Sep 6; 489(7414):

7 Two Numbers: ~8% of the human genome is under selective constraint (~1.1% protein coding). 80.4% of the human genome is active in an ENCODE assay in 147 cell types. Which genome sequences encode enhancers?

8 Genome is packed with potential TF binding sites In a 1-kb segment of human DNA it is predicted that a new 7-8 bp protein-binding motif arises, by neutral evolution, on average every 60,000 years. Ponting C, Biological function in the twilight zone of sequence conservation. BMC Biol Aug 16;15(1):71

9 Characteristics of an Enhancer Heinz, et al. Nature Reviews Molecular Cell Biology volume 16, p (2015).

10 Beware of affirming the consequent! A classic logical fallacy. Premise: If A then B. Observation: B is true. Conclusion: A is true. Example: Enhancers exhibit bidirectional transcription. This sequence is bidirectionally transcribed. Therefore this sequence is an enhancer.

11 Bidirectional Transcription Not Specific to Enhancers [Figure panels: enhancers; non-enhancer state; no chromatin marks] Young RS, et al. Bidirectional transcription initiation marks accessible chromatin and is not specific to enhancers.

12 Question 2: Which genetic variants are causal?

13 How do you test the effect of a variant on enhancer function?

14 We need a method to: (1) test the function of tens of thousands of putative enhancer sequences; (2) test the functional effect of >> tens of thousands of putative enhancer variants

15 Reporter genes!

16 Classic Reporter Gene [Diagram labels: cis-regulatory element (CRE); minimal promoter; reporter gene (DsRed)]

17 The Barcode [Diagram labels: cis-regulatory element (CRE); minimal promoter; reporter gene (DsRed); unique DNA barcode (BC, 9 bp); mRNA]

18 Barcoded Reporter Library & MPRA [Diagram: many CRE-DsRed-BC constructs] Pooled library of 10^5 or more distinct reporters. What are the advantages and limitations?

19 Two Technical Problems. Problem 1: Where does your CRE DNA come from? DNA synthesis; genomic fragments; targeted regulome capture. Problem 2: How do you read out the reporter? Synthetic barcodes; self-transcribing enhancers; Sort-seq (fluorescence/flow cytometry + DNA sequencing)

20 Problem 1: Source of DNA

21 DNA Synthesis

22 Custom Array Oligo Synthesis [Diagram labels: 180 bp oligos carrying CRE and BC; SpeI and SphI sites; DsRed] Patwardhan, et al. Nat Biotechnol Feb 26;30(3): Melnikov, et al. Nat Biotechnol Feb 26;30(3):271-7 Kwasnieski, Mogno, et al. PNAS 109: (2012)

23 DNA Synthesis Extra sequencing step to link barcodes with CREs
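The linking step on this slide can be sketched as a simple read-counting pass. This is a minimal illustration, not the method of any cited paper: the function name `build_barcode_map`, the input format (one `(barcode, cre_id)` pair per read) and the two filtering thresholds are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def build_barcode_map(read_pairs, min_reads=2, min_fraction=0.9):
    """Assign each barcode to the CRE it is most often sequenced with.

    read_pairs: iterable of (barcode, cre_id) tuples, one per read
                (hypothetical input format).
    A barcode is kept only if it has at least `min_reads` reads and at
    least `min_fraction` of them agree on a single CRE (assumed thresholds).
    """
    counts = defaultdict(Counter)
    for barcode, cre_id in read_pairs:
        counts[barcode][cre_id] += 1

    barcode_to_cre = {}
    for barcode, cre_counts in counts.items():
        total = sum(cre_counts.values())
        best_cre, best_n = cre_counts.most_common(1)[0]
        if total >= min_reads and best_n / total >= min_fraction:
            barcode_to_cre[barcode] = best_cre
    return barcode_to_cre

# Toy reads: 9 bp barcodes paired with made-up CRE identifiers.
reads = [
    ("ACGTACGTA", "CRE_1"), ("ACGTACGTA", "CRE_1"), ("ACGTACGTA", "CRE_1"),
    ("TTGCAATGC", "CRE_2"), ("TTGCAATGC", "CRE_2"),
    ("GGATCCGAT", "CRE_1"), ("GGATCCGAT", "CRE_3"),  # ambiguous: dropped
    ("CATCATCAT", "CRE_3"),                          # single read: dropped
]
barcode_map = build_barcode_map(reads)
```

Dropping ambiguous barcodes rather than guessing keeps a barcode that is shared between two CREs from blending their signals downstream.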

24 Synthesized DNA Advantages? Disadvantages?

25 Sheared Genomic DNA (STARR-seq) Arnold, et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq, Science Mar 1;339(6123):1074-7

26 Sheared Genomic DNA (STARR-seq) Arnold, et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq, Science Mar 1;339(6123):1074-7

27 Sheared Genomic DNA Advantages? Disadvantages?

28 Targeted CRE Capture 1) Design biotinylated bait sequences 2) Pull down & clone targets Shen, et al. Genome Res Feb;26(2):238-55

29 Problem 2: Measuring Reporter Activity

30 The Barcode [Diagram labels: cis-regulatory element (CRE); minimal promoter; reporter gene (DsRed); unique DNA barcode (BC, 9 bp); mRNA]

31 RNA-seq on Synthetic Barcodes [Workflow diagram: array-synthesized library (CRE-DsRed-BC); transfection into cells; sequence BCs; expression = BC RNA / BC DNA for each reporter]
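The barcode read-out (BC RNA relative to BC DNA) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `cre_activity`, the depth normalization, the pseudocount and the `min_dna` filter are illustrative choices, not details given on the slides.

```python
import math
from collections import defaultdict
from statistics import mean

def cre_activity(rna_counts, dna_counts, barcode_to_cre,
                 pseudocount=1.0, min_dna=10):
    """Return {cre_id: mean log2(RNA/DNA)} across that CRE's barcodes.

    rna_counts, dna_counts: {barcode: read count} from the RNA and DNA
    barcode libraries. Counts are converted to fractions of the library
    so that sequencing depth cancels out (an assumed normalization).
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    per_cre = defaultdict(list)
    for barcode, cre in barcode_to_cre.items():
        dna = dna_counts.get(barcode, 0)
        if dna < min_dna:
            continue  # too little plasmid DNA to trust the ratio
        rna = rna_counts.get(barcode, 0)
        rna_frac = (rna + pseudocount) / rna_total
        dna_frac = (dna + pseudocount) / dna_total
        per_cre[cre].append(math.log2(rna_frac / dna_frac))
    return {cre: mean(values) for cre, values in per_cre.items()}

# Toy counts for three barcodes (made-up numbers).
rna = {"ACGTACGTA": 400, "TTGCAATGC": 100, "GGATCCGAT": 500}
dna = {"ACGTACGTA": 100, "TTGCAATGC": 100, "GGATCCGAT": 300}
bc_map = {"ACGTACGTA": "CRE_1", "TTGCAATGC": "CRE_2", "GGATCCGAT": "CRE_3"}
activity = cre_activity(rna, dna, bc_map)
```

Dividing by the DNA count corrects for unequal representation of each reporter in the pooled library, which is why both libraries are sequenced.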

32 CRE serves as its own barcode Arnold, et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq, Science Mar 1;339(6123):1074-7

33 FACS & Sequencing Sharon, et al. Nat Biotechnol Jun; 30(6): Kinney, et al. Proc Natl Acad Sci U S A May 18; 107(20):
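One simple way to summarize a sort-then-sequence read-out is a read-weighted mean over the sorted bins. The sketch below is an illustration only and is not taken from Sharon et al. or Kinney et al.; the function name `sortseq_mean_expression` and the input layout are assumptions, and it ignores the fraction of cells that fell into each bin, which a real analysis would likely need.

```python
def sortseq_mean_expression(bin_counts, bin_means):
    """Estimate each variant's expression from its reads per sorted bin.

    bin_counts: {variant: [reads in bin 0, reads in bin 1, ...]}
    bin_means:  mean fluorescence (e.g. log scale) of each sorted bin.
    Reads are first divided by the bin's total reads so that bins
    sequenced more deeply do not dominate (assumed normalization).
    """
    n_bins = len(bin_means)
    bin_totals = [sum(counts[i] for counts in bin_counts.values())
                  for i in range(n_bins)]
    result = {}
    for variant, counts in bin_counts.items():
        fractions = [counts[i] / bin_totals[i] if bin_totals[i] else 0.0
                     for i in range(n_bins)]
        total = sum(fractions)
        if total == 0:
            continue  # variant never observed in any bin
        weighted = sum(f * m for f, m in zip(fractions, bin_means))
        result[variant] = weighted / total
    return result

# Toy data: two variants, three bins with made-up mean fluorescence values.
counts = {"low_variant": [90, 10, 0], "high_variant": [10, 90, 100]}
expression = sortseq_mean_expression(counts, bin_means=[1.0, 2.0, 3.0])
```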

34 What We've Covered So Far: Need for an assay to directly test function of non-coding sequences & variants at scale. The trick is to use barcodes. Different ways to construct reporter libraries. Three primary ways of reading out reporter activity.

35 Three MPRA Examples

36 Example 1: Do enhancer marks predict reporter activity? Test of 1200 cell-type specific ENCODE-predicted enhancers in K562 cells How would you design this experiment?

37 Key Results [Plot: activity (log2 RNA/DNA) for enhancers and weak enhancers] 26% of predicted enhancers showed activity above scrambled controls. Weak enhancers more likely to be active.
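A comparison against scrambled controls could be sketched with an empirical p-value. The slide does not state the criterion that was used, so the cutoff `alpha`, the function name `call_active` and the toy values below are all assumptions, and no multiple-testing correction is applied.

```python
def call_active(test_activity, control_activity, alpha=0.05):
    """Flag sequences whose activity exceeds the scrambled-control range.

    test_activity:    {name: log2(RNA/DNA)} for candidate enhancers.
    control_activity: list of log2(RNA/DNA) values for scrambled controls.
    The empirical p-value is the share of controls at least as active as
    the candidate, with +1 in numerator and denominator so it is never 0.
    """
    n = len(control_activity)
    calls = {}
    for name, value in test_activity.items():
        n_as_active = sum(1 for c in control_activity if c >= value)
        p_value = (n_as_active + 1) / (n + 1)
        calls[name] = p_value <= alpha
    return calls

# Toy data: 20 scrambled controls spanning -1.0 to 0.9 (made-up values).
controls = [x / 10 for x in range(-10, 10)]
candidates = {"enh_A": 2.5, "enh_B": 0.35, "enh_C": -0.5}
calls = call_active(candidates, controls)
fraction_active = sum(calls.values()) / len(calls)
```

With this scheme the smallest reachable p-value is 1/(n+1), so the number of scrambled controls limits how stringent the cutoff can be.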

38 Example 2: Which Non-coding Enhancer Variants Affect Function? Test ~2700 putative CREs & variants near 75 GWAS tag SNPs. Goal is to find SNPs that alter CRE activity. How would you design this experiment?

39 Key Results 4% of tested CREs were highly active

40 Key Results 32 out of 2,756 variants in 23 out of 75 GWAS regions showed statistically significant fold change between major and minor alleles.
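The major-versus-minor allele comparison can be sketched as a per-variant log2 fold change with a Welch t statistic across replicates. This is a hedged illustration: the slide does not say which test was used, the function name `allelic_effect` and the replicate values are made up, and turning the statistic into a p-value (plus a multiple-testing correction across thousands of variants) is left out.

```python
import math
from statistics import mean, variance

def allelic_effect(major_log2, minor_log2):
    """Compare replicate activities of the two alleles of one variant.

    major_log2, minor_log2: lists of log2(RNA/DNA) values, one per
    replicate, each with at least two replicates.
    Returns (log2 fold change minor vs major, Welch t statistic).
    """
    diff = mean(minor_log2) - mean(major_log2)
    standard_error = math.sqrt(
        variance(major_log2) / len(major_log2)
        + variance(minor_log2) / len(minor_log2)
    )
    if standard_error == 0:
        # Identical replicates: no noise estimate is available.
        t_stat = 0.0 if diff == 0 else math.copysign(math.inf, diff)
    else:
        t_stat = diff / standard_error
    return diff, t_stat

# Toy replicates for one variant (made-up numbers).
major = [1.0, 1.2, 0.8]
minor = [2.0, 2.2, 1.8]
log2_fc, t_stat = allelic_effect(major, minor)
```

Because the inputs are already log2 ratios, the difference of means is itself the log2 fold change between alleles.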

41 Example 3: Directly Measure Enhancer Activity Across Genomes. STARR-seq to produce tracks of enhancer activity across the small Drosophila genome. How would you design the experiment?

42 Key Results

43 Key Results: 5499 enhancer activity peaks in the 169 Mb Drosophila genome. 69% of strong peaks are in DHS sites.