Nature Genetics: doi: /ng Supplementary Figure 1

Size: px
Start display at page:

Download "Nature Genetics: doi: /ng Supplementary Figure 1"

Transcription

1 Supplementary Figure 1 Coding and noncoding transcription at the vg locus. (a,b) qpcr analysis of developmental and tissue-specific vg mrna (a) and PRE/TRE (b) transcription, shown as percentage of TBP levels. (c) Time course of vg mrna (diamonds) and PRE/TRE transcription (circles) during embryonic development. Left, scale for mrna; right, scale for the PRE/TRE transcript, both are shown as the percentage of TBP levels. Error bars show the s.e.m. of four independent cdnas prepared from at least two independent RNA preparations for each tissue or time point. 1

2 Supplementary Figure 2 Mapping of vg PRE/TRE transcripts. 2

3 (a) Transcript summary reproduced from Figure 1b. The length of each transcript is given. Colored bars indicate PRE/TRE DNA motifs. Gray box: the 1.6-kb core PRE/TRE 26 used for the transgenes (Figs. 2 4 and 6g). (b) Comparison of PRE/TRE prediction score and ChIP enrichments across the PRE/TRE and flanking regions. Genomic coordinates according to Drosophila release Dm3/R5 are shown below. The sequence included in the transgene is highlighted in gray. Top, PRE/TRE prediction score plot generated by jpredictor (Nucleic Acids Res. 34, W546 W550, 2006). Below are shown ChIP tracks for PC, E(Z) and H3K27me3. Dark gray highlights ChIP-seq results (this study) from 0- to 16-h wild-type embryos. Experiments were performed at least three times for each antibody, and representative tracks from one experiment are shown. Light gray highlights ChIP-chip results from the Sg4 derivative of Drosophila S2 cultured cells 25. (c) Key to the annotation in d of 5 ends mapped by 5 RACE for the transcripts shown in a. (d) Sequence of the 3.1-kb vg PRE/TRE region shown in a, with 5 ends shown for each transcript according to the color code in c. Each transcript has several 5 ends. The diagram in a shows the transcript with the most 5 start site (i.e., the longest transcript) in each case. These start sites are marked with the symbols outlined in black in d. 5 RACE mapping was performed at least twice on two independent cdnas for each tissue. The integrity of each transcript was confirmed by additional PCR analysis (see Supplementary Fig. 3). The 1.6-kb core PRE/TRE used for the transgenes is shown in bold type and outlined in gray. Transgenic transcripts were mapped in homozygous embryos carrying pkc27vg.fwd without induced transcription by GAL4, thus defining the transcripts that are generated by the PRE/TRE promoters carried on the transgene. Mapping of PRE/TRE transcripts in early (0- to 5-h) embryos in which no transcription from the endogenous vg PRE/TRE locus is detectable confirmed that the transgenes generate the two embryonic transcripts whose promoters are within the 1.6-kb transgene (embryo 798 and embryo 1194; see also Supplementary Fig. 3). Mapping was performed at least twice on two independent cdnas. However, in all other tissues, the background levels of endogenous transcripts precluded unambiguous mapping of the transgenic transcripts. Owing to their highly cell type specific expression, the transcripts were not detectable by genome-wide profiling methods such as RNA-seq, even in staged embryos (data not shown). They were also not detectable in published genome-wide profiling data sets, presumably for the same reasons. However, four independent methods agree on the identity and presence of transcripts: in situ hybridization (Fig. 1), 5 RACE (Supplementary Fig. 2d), primer walking (Supplementary Fig. 3b,c) and cdna cloning (Supplementary Fig. 3d). 3

4 Supplementary Figure 3 Primer walking analysis of vg PRE/TRE transcripts. (a) Transcript summary reproduced from Figure 1b. The length of each transcript is given. Colored bars indicate PRE/TRE DNA motifs. The gray box highlights the 1.6-kb core PRE/TRE 26 used for the transgenes (Figs. 2 4 and 6g). (b) The positions and names of the primers used for primer walking analysis. See Supplementary Table 1 for primer sequences. (c) Primer walking example. Left, mapping the 5 end of the endogenous embryo 798 transcript. RNA was prepared from 10- to 14-h wild-type embryos, and cdna was primed with primer 10R, to specifically amplify forward-strand transcripts. PCR was performed with 10R in combination with different reverse primers, as indicated and, as a control, on wild-type genomic DNA. Products are seen on cdna with forward primers 5F and 6F, indicating that the transcript starts downstream of primer 4F. Right, mapping the 5 end of the endogenous embryo 796 transcript. RNA was prepared from 10- to 14-h wild-type embryos, and cdna was primed with primer 12F, to specifically amplify reverse-strand transcripts. PCR was performed with primer 12F in combination with different forward primers, as indicated, and, as a control, on wildtype genomic DNA. Products are seen on cdna with forward primers 14R and 15R, indicating that the transcript starts upstream of primer 16R. 5 RACE analysis confirmed the results of primer walking and revealed several TSSs for each transcript (see Supplementary Fig. 2). The 3 ends were defined by performing the above analysis using different primers to prime the first cdna strand for each transcript. Mapping was performed at least twice on two independent cdnas. (d) The primers giving the longest transcript defined in each tissue on each strand are shown. In wing and brain, no transcripts were detected from the reverse strand. The transgenic transcripts were mapped in homozygous pkc27vg.fwd transgenic embryos in the absence of the GAL4 driver, as described in the legend to Supplementary Figure 2. 4

5 Supplementary Figure 4 Coding and noncoding vg transcription in late embryos. Double in situ hybridization on embryos showing vg mrna (red), PRE/TRE reverse-strand transcript (green) and DAPI (blue). Anterior is to the right, posterior is to the left. Scale bar, 20 µm. (a f) Single optical slices at high magnification through muscle precursors (m), wing (w) and haltere (h) primordia at 12 h (stage 15; a c) and at 14 h (stage 16; d f). vg mrna accumulates in all cells of the wing and haltere disc primordia (a,d) 15,16, while the PRE/TRE transcript is absent from wing and haltere discs but highly expressed in segment T2 and T3 muscles (m). (g l) Single optical slices at high magnification through abdominal body wall muscles in segments A1 A3, at 12 h (stage 15; g i) and at 14 h (stage 16; j l). At 12 h, vg mrna (g) and the PRE/TRE transcript (h) colocalize in nerves (arrows). The PRE/TRE is expressed strongly in muscles (m) at 12 h and 14 h (h,k), while vg mrna becomes strongly expressed in muscles at 14 h (j). Embryo in situ hybridizations were performed at least three times, and embryos were analyzed. 5

6 Supplementary Figure 5 vg mrna and PRE/TRE forward-strand transcription in embryos. Double in situ hybridization on 10-h (stage 13) embryo showing (a,c) vg mrna (red), (b,c) PRE/TRE forward-strand transcript (green) and (a c) DAPI (blue). Anterior is to the right, posterior is to the left. Lateral view, dorsal at top, ventral at bottom. Wing (w) and haltere (h) primordial, muscle precursors (m) and central nervous system (CNS) are indicated in c. vg mrna is expressed in specific cells (see also Fig. 1); the PRE/TRE forward strand is strongly detected in the head segments and in a diffuse pattern throughout the rest of the embryo. Embryo in situ hybridizations were performed at least three times, and embryos were analyzed. 6

7 Supplementary Figure 6 Coding and noncoding vg transcription in larval brain. Double in situ hybridization on 3rd instar larval brains, showing vg mrna (red), PRE/TRE forward-strand transcript (green) and DAPI (blue) (see Fig. 1b for PRE/TRE transcript orientation and in situ probe). (a c) Single optical slices showing that vg mrna and the PRE/TRE transcript are expressed in a complementary pattern. (d f) High magnification of the boxes in a c. (e) The PRE/TRE transcript is visible both in the cytoplasm and as nuclear dots (arrows). In situ hybridizations were performed at least three times, and at least five brains were analyzed. 7

8 Supplementary Figure 7 Three-dimensional DNA FISH in 3rd instar larval wing discs. The figure shows data from the individual discs that are shown as merged data in Figure 3a,b as the distribution of three-dimensional distances between the transgenic locus at site 1 and the endogenous vg locus. Transgenic flies (a) carrying a control construct (pkc27mw), (b) expressing the reverse strand (pkc27vg.rev ) or (c) expressing the forward strand (pkc27vg.fwd ) of the vg PRE/TRE were crossed to the line with the engal4 driver. engal4 is expressed in the posterior half of the wing disc (P). The anterior half of the wing disc (A) serves as a control (see Fig. 3g,l for expression patterns). For each genotype, data were normalized between discs. Note that, in Figure 3a,b, the data were further normalized between genotypes (for normalization procedures, see additional methods). The y axis shows the three-dimensional distance between the transgenic locus and the endogenous vg locus. Average nuclear radius: 3.6 µm. For two of three discs, induced transcription of the forward strand but not the reverse strand leads to a significant decrease in distances between the transgene and the endogenous locus in the posterior compartment in comparison to the anterior compartment (NS, P value 0.1, Wilcoxon rank-sum test). Note that all three discs from c are combined in Fig. 3a,b). 8

9 Supplementary Figure 8 In situ hybridization showing the effects of vg RNAi on vg mrna. In situ hybridization to detect vg mrna in the wing discs of transgenic 3rd instar larvae expressing UAS RNAi against vg under the control of the engal4 driver. Posterior is to the left, anterior is to the right. Scale bar, 100 µm. engal4 is expressed in the posterior half of each disc; see also Figure 3g,l. (a,b) Control UAS transgene without RNAi. (c f) Two examples of UAS RNAi. Red, vg mrna; blue, DAPI. Arrows indicate the borders of the domain of downregulation of vg mrna. vg mrna is only downregulated in the central domain of the disc (presumably the domain of highest expression of engal4), and the size of the domain in which maximum vg mrna downregulation occurs is different between discs. The experiment was performed once, and 10 discs were analyzed. Two representative discs are shown to illustrate the variability in vg downregulation. 9

10 Supplementary Figure 9 DELFIA assay for HMTase activity. (a) Detection of H3 peptide mono-, di-, tri- or unmethylated at lysine 27 (500 nm peptide) by DELFIA assay. The y axis shows raw fluorescence measurements. The addition of 380 nm RNA (control RNA 1 in b) leads to a slight decrease in fluorescence (columns 4 and 6) but cannot account for the decrease observed in the presence of enzyme and RNA in b. (b) Comparison of inhibition of human PRC2 HMTase activity by different molecules by DELFIA assay. Unmodified H peptide (500 nm) was incubated with 155 nm human PRC2 complex and either 10 mm of the EZH2 inhibitor GSK126 or 380 nm of different nucleic acids as indicated. RNA fwd, rev: a 1.6-kb RNA in vitro transcribed from the forward or reverse strand of the PRE/TRE shown in Figures 1b (gray box) and 2a or control bacterial RNAs of the same length (ctr1, ctr2; see additional methods). DNA vg: a 1.6-kb double-stranded DNA corresponding to the vg PRE/TRE as shown in Figures 1b (gray box) and 2a. Ctr1 and 2 DNAs: double-stranded DNAs containing the same sequence as the ctr1 and 2 RNAs. (a,b) The mean and s.e.m. of three replicates are shown. 10

11 Supplementary Figure 10 ChIP-seq on the endogenous vg locus in the presence of overexpression of PRE/TRE transcripts. ChIP-seq analysis of E(Z) and H3K27me3 in 0- to 16-h embryos overexpressing the forward- (top; pkc27vg.fwd ) or reverse- (bottom; pkc27vg.rev ) strand from the transgene at site 1. No difference in E(Z) or H3K27 profiles was detectable at the endogenous vg locus. The experiment was repeated three times for H3K27me3 on pkc27vg.fwd, once for H3K27me3 on pkc27vg.rev and once for E(Z) on both pkc27vg.fwd and pkc27vg.rev. Tracks from one representative ChIP experiment are shown. 11

12 Supplementary Table 1. Primer sequences Primers used for 5' RACE, primer walking, strand-specific cdna synthesis, qpcr and DNA FISH probes are shown. See Online Methods for details. Primers for 5' RACE and 3' primer walking gene primer name primer sequence vg PRE/TRE 1F ATTACCACATCACGTCACATCG 1R GCAACGACATGCCTCTGTG 2F CACACACAGAGGCATGTCGTT 2R CCACCCACCTCGCTCTCTA 3F TAGAGAGCGAGGTGGGTGG 3R CATGCCTGAAATGAATTGTTTG 4F CAAACAATTCATTTCAGGCATG 4R GGTGCGTGTGTGTGTGTGTT 4.1F GCCCAATAATGCAAAGAGGTT 5F AACACACACACACACGCACC 5R CCCTCCCCATCCTACATTTTT 5.3F GCCAGGTCGCTCCTAAATAC 6F AAAAATGTAGGATGGGGAGGG 6R AGCGACGACAGTGGCATTC 7F GAATGCCACTGTCGTCGCT 7R GCTCTCTGCCTCTCCCTCTC 8F GAGAGGGAGAGGCAGAGAGC 8R CGAGTGGATTTACGTGGAATTG 9F AGGATTGGGGTCGCAGAAG 9R TGCCAAGTGTGTGTGAGTGTG 10F ACATGAGCTTGCGTTTGTCAC 10R TCATTACCATATCGGGTTGTCA 11bF TTGACAACCCGATATGGTAATGA 11bR CCCGTATAAATCGAGTGGCTCT 11F TGACAACCCGATATGGTAATGA 11R CTGTAAATGGCTGTGGATTGG 12F AAAAGCAAAGTCCGAGGCAAT 12R GATGGCTGATGGCTAAAGTTGA 13F TCAACTTTAGCCATCAGCCATC 13R TTCACTCGCTTTTCTTTTCCACT 14F AGTGGAAAAGAAAAGCGAGTGAA 14R GCAACCTTATGCCCATTGAAC 15F GTTCAATGGGCATAAGGTTGC 15R CTCCCAAGGTGCTCTACTTCCA 16F TGGAAGTAGAGCACCTTGGGAG 16R CCGGCTTACCTGATTTACCTG 17F CAGGTAAATCAGGTAAGCCGG 17R TTAGGCGAGGGGTAAAAAGGA Primers for strand specific cdna synthesis gene primer name primer sequence anneals to

13 TBP TBP_rev GAAACCGAGCTTTTGGATGAT + strand vg PRE/TRE core PRE_fwd TTGACAACCCGATATGGTAATGA - strand vg PRE/TRE core PRE_rev CCCGTATAAATCGAGTGGCTCT + strand vg PRE/TRE flank Flank_fwd TTTCCGTCAGTGTACGCTCTT - strand miniwhite reproter mw_rev ATTTGATGTTGCAATCGCAGT + strand vg mrna vg_mrna_rev GAGGGAAGGGGGCTAAAACTT + strand Timp Timp_rev AATCCACGCGATTTGGTAAG + strand GFP GFP_rev GAACTCCAGCAGGACCATGT + strand Primers for qpcr gene primer name primer sequence TBP TBP_fwd CATCGTGTCCACGGTTAATCT TBP_rev GAAACCGAGCTTTTGGATGAT vg PRE/TRE core PRE_fwd TTGACAACCCGATATGGTAATGA PRE_rev CCCGTATAAATCGAGTGGCTCT vg PRE/TRE flank Flank_fwd TTTCCGTCAGTGTACGCTCTT Flank_rev CTCAGGTCCCGCGAGTTAT miniwhite reporter mw_fwd TAAGCAGGGGAAAGTGTGAAA mw_rev ATTTGATGTTGCAATCGCAGT vg mrna vg_mrna_fwd AAATTACCCACGAACGGGAAT vg_mrna_rev GAGGGAAGGGGGCTAAAACTT Timp Timp_fwd AGCAATCCGTAAACCAATGC Timp_rev AATCCACGCGATTTGGTAAG GFP GFP_fwd TATATCATGGCCGACAAGCA GFP_rev GAACTCCAGCAGGACCATGT primers for DNA FISH probes locus primer name primer sequence vestigial locus 1F ATT TTG TCG CTG GTT CGT CT 1R ATA ATC ACG CCG CAA TCT TC 2F TCG CGT TTT CCA TCA ACA TA 2R GAA CGA ACG CGT GTA AAC AA 3F GGA AGA GCG CGA TAG TCA TC 3R AGT TTG GGG AAG CAC AAG AA 4F CTG GCG ATG CCT CTT AAT GT 4R CCT ACC GCT GCT ATC TCG TC 5F GTA ACC GAG CCA CAG AGT CC 5R TGG TGC AGC CTG TAA GAC AC 6F TAA TAA GGC GCC TGG CTA AA 6R ATG GCA GGT GAG CCT ATG AG 7F TCC TGT GAC TTC TGG CAT TG 7R GCC GAG TGG AGT CAA AGT TC site 1 1F TGA AAA CCC ACT CGT CTT CC 1R CGT AGG GAA CTG GTG TGT CC 2F CCA GAG ATC AGC GAG ATT CA 2R CTC CAT TCG CGC TTT ATG AT 3F AAG GCT TTA AGC ACC TGC AA

14 3R 4F 4R 5F 5R 6F 6R 7F 7R 8F 8R CAC CAG CGT GAA GTA GAC GA CGA TCA CGA ATT GCT CTC AA AAA ATG CAC TCG GGA TTG AC GCT GAG GAT TCC CAG AAT GA TCG ATC TTT GGC ACA GTG AG TTT CCG CGT AGA ATC AAA CC GCA GTT CTT CCC TGT TCA GC CAG CTG ACT TTA GCG GTT CC ACT TTT CCA AAA TGC CAA CG GGC ATC GAC ATT GAA CAC AC TAA ATC CGC CCA AGA GTG TC site 2 1F TCC GTA TCC GAC TGA CAT GA 1R GCA CCG AAG ATT ATC CCG TA 2F TCG ATT GAG GAA TGG TAG CC 2R CTG TCA TGA AGC AGG AAC GA 3F CTG CAT GAG CAT CGG TCT TA 3R ATA AAC CGG CAG CTG TTG TC 4F AAG CAA AGA TGG GAA AGC AA 4R TTA ACC TGA ACC CCA ACT GC 5F TTA GCT GAC GGC TCA TGT TG 5R GAG GTT GTG GTG GTC TGG AT 6F ATG TTT TCG TAT CGC CCA AG 6R TCA CCA CCA CCC CAT AGT TT 7F GAA AAG CAT GCG TTG CTA CA 7R CAG CGT ATC CCT AGG CTG AG primers for RNA in situ probes locus primer name primer sequence vg mrna vg.fwd AATGTTTGCTGCTCGTTT vg mrna vg.rev TCAAGTCAAGGCGTGTTT vg PRE/TRE PRE.fwd AAAAATGTAGGATGGGGAGGG vg PRE/TRE PRE.rev GATGGCTGATGGCTAAAGTTGA primers to generate plasmid templates for RNA in vitro transcription plasmid of origin primer name primer sequence pcr-xl-topo plasmid ctr1.fwd TAGCTTGCAGTGGGCTTACA pcr-xl-topo plasmid ctr1.rev ACGTGTCAGTCCTGCTCCTC pwalium20 plasmid ctr2.fwd TACGAGCCGGAAGCATAAAG pwalium20 plasmid ctr2.rev ACACTGCGGCCAACTTACTT

15 Supplementary Table 2. PcG bound regions in fly and mouse that show convergent or divergent transcripts. The Table is available online as an excel sheet. Genomic coordinates of PcG bound regions with identity of closest gene. See Online Methods and legend in the Table for details. (a) mouse_cage. Mouse ES cell PcG peaks from 34, that overlap with convergent or divergent CAGE tags from the FANTOM3 CAGE dataset 32. In mouse, the CAGE data provides tags from multiple developmental stages and tissues, and the majority of PcG bound peaks have multiple tags. Individual tags are not listed in the table and the coordinates of these are available on request. (b) Fly_modENCODE_MACE. Fly PcG peaks from 21, that overlap with convergent or divergent MACE tags from 20. (c) Fly_modENCODE_MACE_tags. Fly PcG peaks from 21, that overlap with convergent or divergent MACE tags from 20, showing all MACE tags overlapping with each peak. (d) Fly_modENCODE_CAGE. Fly PcG peaks from 21, that overlap with convergent or divergent CAGE tags from 31. (e) Fly_modENCODE_CAGE_tags. Fly PcG peaks from 21, that overlap with convergent or divergent CAGE tags from 31, showing all CAGE tags overlapping with each peak. The source of tag data is listed as "peaks" (indicating data from CAGE peaks in supplemental Data file 4 of 31 ),or "file 3" (indicating RACE and other annotated transcription start sites from supplemental Data file 3 of 31 ). (f) Fly_Enderle_MACE. Fly PcG peaks from 20, that overlap with convergent or divergent MACE tags from the same study. (g) Fly_Enderle_MACE_tags. Fly PcG peaks from 20, that overlap with convergent or divergent MACE tags from the same study, showing all MACE tags overlapping with each peak. (h) Fly_Enderle_CAGE. Fly PcG peaks from 20, that overlap with convergent or divergent CAGE tags from 31. (i) Fly_Enderle_CAGE_tags. Fly PcG peaks from 20, that overlap with convergent or divergent CAGE tags from 31, showing all CAGE tags overlapping with each peak.

16 Supplementary Table 3. Comparison of ChIP and RNA tag overlaps a) Fly: ChIP peaks from 21 or 20 compared with MACE tags from 20 or CAGE tags from 31. Mouse: Suz12, Jarid1 and Jarid2 ChIP peaks from 34, Sox2 ChIP peaks from 33, compared with CAGE tags from the FANTOM3 CAGE dataset 32. % of total peaks overlapping with tags in the categories shown are indicated. Convergent tags: tags on opposite strands are transcribed towards each other. Divergent tags: tags on opposite strands are transcribed away from each other (see Online Methods for details). The mouse CAGE tag data contains information from multiple tissues. The fly MACE data contains information from embryos and S2 cells (combined). The fly CAGE data contains information from embryos only. See also Online Methods, and Supplementary Table 2. (b) CAGE tag status of PcG ChIP peaks at promoters was compared to that of non-pcg bound promoters. Fly: data as for (a). Mouse, combined Jarid2 and SUZ12 data from 34 were used for PcG peaks. c) Significance of overlaps shown in figure 7c. Fly: modencode vs. CAGE; mouse vs. CAGE. The frequency of occurrence of the category of interest (convergent or divergent) compared to all other categories including no transcription, was compared for pairs of data sets as indicated. p- values (Yates Chi-squared test, one sided). In all cases except those marked *, the category of interest was more abundant in PcG than non-pcg bound sites. See also Supplementary Table 2 and online methods. a) Overlap between ChIP peaks and MACE or CAGE tags. Fly vs MACE no mono divergent convergent total Caudal Hairy Enderele PcG modencode PcG Fly vs. CAGE no mono divergent convergent total Caudal Hairy Enderele PcG modencode PcG Mouse vs. CAGE no mono divergent convergent total Sox Jarid Suz Jarid b) Overlap between promoters and MACE or CAGE tags. Fly vs MACE no mono divergent convergent total Enderle, non PcG Enderle, PcG bound modencode, non PcG modencode, PcG bound

17 Fly vs. CAGE no mono divergent convergent total Enderle, non PcG Enderle, PcG bound modencode, non PcG modencode, PcG bound Mouse vs. CAGE no mono divergent convergent total non PcG PcG bound c) Significance of overlaps shown in Figure 7c. Category comparison divergent convergent All ChIP peaks: fly PcG - Hairy 2.6 x x 10-6 PcG - Caudal 8.3 x *1.4 x 10-7 All ChIP peaks: mouse Jarid2 - Sox2 6.4 x x Jarid2 - Jarid1 *4.4 x x promoters: fly PcG - non PcG 2.7 x x 10-1 promoters: mouse PcG - non PcG 7.6 x x 10-32

18 Supplementary Movie 1. 3D animation of vg mrna showing down-regulation upon ncrna expression The movie is available online. The animation shows 3D reconstruction of Z-stack from double in situ hybridizations shown in Figure 3l-n. The animation shows a transgenic 3rd instar larval wing disc over-expressing the forward strand from (pkc27vg.fwd') under the control of the engal4 driver, which expresses in the posterior half of the disc. At the start of the animation, posterior is to the left, anterior is to the right. The domain of engal4 expression was defined in 3D using the ncrna signal, and is shown in green at the start of the animation, with the remaining volume shown in blue. As the animation plays, first the blue, then the green mask is removed, showing the in situ signals: green, forward ncrna; red, vg mrna; blue, DAPI. Subsequently, the ncrna and DAPI channels are removed, leaving the vg mrna signal in the red channel. The image is then rotated to show the downregulation of vg mrna in the posterior part of the disc. See also Figure 3l-p.