Supplementary Figure 1 Infiltrating immune cells do not differ between T2E and non-T2E samples and represent a small fraction of total cellularity. (a) ESTIMATE immune score from mRNA expression array data for non-T2E and T2E samples. (b) Estimated percentage of tumor-infiltrating lymphocytes within each tumor sample as assessed by H&E staining.
Supplementary Figure 2 ChIP-seq signal distribution across the genome of primary prostate tissue samples. (a) Distribution of H3K27ac-enriched cis-regulatory elements across promoters, introns, exons, downstream/3′ UTR, and promoter/5′ UTR elements in each of the 19 samples profiled by ChIP-seq. Snapshot of H3K27ac levels surrounding (b) the housekeeping ACTB gene and (c) the prostate-specific KLK3 gene.
Supplementary Figure 3 Classification of samples into T2E and non-T2E on the basis of ERG expression. (a) Complete linkage hierarchical clustering of HOXB13, AR, ERG, and FOXA1 mRNA expression values in 17 CPCG tumor samples. (b) Immunohistochemical staining of ERG in 6 T2E and 5 non-T2E CPCG samples.
Supplementary Figure 4 T2E-up and T2E-down elements are proximal to known ERG targets and genomic regions previously not classified as T2E specific. Snapshots of H3K27ac levels in proximity to (a) LAMC2, (b) LINC00898, (c) POTEB, and (d) HTR7 genes.
Supplementary Figure 5 Promoter and enhancer acetylation changes between T2E and non-T2E primary prostate tumors are significantly predictive of one another and of associated gene expression changes. Volcano plots of log2 fold change vs -log2 FDR-corrected q-value for statistical significance of differential H3K27ac ChIP-seq signal between T2E and non-T2E tumors at enhancer regions linked to (a) T2E-up promoters or (b) T2E-down promoters. Volcano plots of log fold change vs -log FDR-corrected q-value for statistical significance of matched differential mRNA expression between T2E and non-T2E tumors for mRNA associated with genes whose promoters (c) or predicted enhancers (d) show differential H3K27ac ChIP-seq signal between T2E and non-T2E tumors.
Supplementary Figure 6 ERG recruits prostate master transcription factors to T2E-up elements. (a) Motif enrichment for selected motifs in cis-regulatory elements that were not significantly different between T2E and non-T2E tumors. (FKH: Forkhead; ARE: androgen response element; ETS: E26 transformation specific; HBOX: homeobox). (b) Western blot of ERG, FOXA1, and HOXB13 following depletion of ERG with siRNA in VCaP cells. Vinculin was blotted as a loading control. (siCtl: non-targeting siRNA; siERG: ERG-targeting siRNA). (c) ChIP-seq signal plots of average HOXB13 tag density in VCaP cells treated with control non-targeting siRNA or ERG-targeting siRNA at HOXB13/ERG co-bound T2E-up elements, (d) HOXB13-bound T2E-up elements not bound by ERG, (e) HOXB13-bound T2E-down elements, and (f) HOXB13 peaks that are neither T2E-up nor T2E-down and are not bound by ERG. (g) ChIP-seq plots of average FOXA1 tag density in VCaP cells treated with control non-targeting siRNA or ERG-targeting siRNA at FOXA1/ERG co-bound T2E-up elements, (h) FOXA1-bound T2E-up elements not bound by ERG, (i) FOXA1-bound T2E-down elements, and (j) FOXA1 peaks that are neither T2E-up nor T2E-down and are not bound by ERG. (k) Co-immunoprecipitation of FOXA1, ERG, and HOXB13 transcription factors in VCaP cells. Arrows indicate the expected size of each protein on the blot. (IP: immunoprecipitation; WB: western blot).
Supplementary Figure 7 ERG colocalizes with FOXA1, HOXB13 and AR at T2E-up elements. Snapshot depicting H3K27ac ChIP-seq signal in primary tumors and VCaP cells as well as FOXA1, HOXB13, AR and ERG binding in VCaP cells at seven loci tested (Fig. 3g) by ChIP-reChIP-qPCR.
Supplementary Figure 8 COREs surround prostate master transcription factors, and CORE deregulation in T2E prostate cancer corresponds to T2E-specific gene expression changes. Snapshot depicting H3K27ac ChIP-seq signal in T2E and non-T2E primary prostate cancer for (a) FOXA1 and (b) HOXB13. (c) Volcano plot of log2 fold change vs -log2 FDR-corrected q-value for statistical significance of differential expression from the MSKCC dataset of all genes overlapping a T2E-up or T2E-down CORE.
Supplementary Figure 9 CRISPR-mediated deletions of the TMPRSS2-ERG CORE. (a) Snapshot depicting H3K27ac ChIP-seq signal in primary tumors, VCaP cells, and HUVEC cells as well as master transcription factor binding in VCaP cells across the control AAVS1 locus in the PPP1R12C gene. (b) PCR blots to verify deletions of EC1, EC2, EC3, and control loci. PCR products running below the expected product size represent deletion fragments. (*: non-specific PCR product).
Supplementary Figure 10 Cis-regulatory elements affected by ERG enrich for genes involved in NOTCH and developmental signaling pathways. log10 FDR values for the top ten GO biological process terms in (a) VCaP siERG H3K27ac-down cis-regulatory elements, (b) primary tissue-defined T2E-up cis-regulatory elements, and (c) primary tissue-defined T2E-up COREs.
Supplementary Figure 11 Key NOTCH pathway genes associated with T2E-up cis-regulatory elements and COREs are overexpressed in T2E prostate cancer. Log2 mRNA array expression values for T2E and non-T2E samples within the entire CPCG cohort for (a) HES1, (b) JAG1, and (c) DLL1.
Supplementary Figure 12 BMP and Nodal signaling genes do not show essentiality in a T2E-specific manner. Average gene-level Z-score in the Achilles project dataset from (a) BMP and (b) Nodal signaling pathway genes in 22Rv1 and VCaP cells. Distributions were obtained from 1,000 random gene sets.
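The random gene set comparison mentioned in this legend can be illustrated with a minimal sketch. It assumes, hypothetically, that the pathway's average gene-level Z-score is compared with the averages of random gene sets of the same size; the legend does not state the set size, sampling scheme, or test direction, and the function names (`random_set_null`, `empirical_p_lower`) and input dictionary are illustrative only.

```python
import random
from statistics import mean


def random_set_null(scores, set_size, n_sets=1000, seed=0):
    """Average score of n_sets random gene sets, each of set_size genes.

    scores: dict mapping gene name -> gene-level Z-score (hypothetical input).
    Genes are drawn without replacement within each random set.
    """
    rng = random.Random(seed)
    genes = list(scores)
    return [
        mean(scores[g] for g in rng.sample(genes, set_size))
        for _ in range(n_sets)
    ]


def empirical_p_lower(observed, null):
    """Fraction of null averages at or below the observed average.

    The +1 terms keep the estimate away from exactly zero.
    Assumes lower (more negative) scores indicate stronger essentiality.
    """
    return (1 + sum(v <= observed for v in null)) / (1 + len(null))
```

A pathway would then be called essential in a given cell line only if its observed average falls in the lower tail of the random-set distribution; the threshold is a choice the legend does not specify.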
Supplementary Figure 13 Saturation analysis of called H3K27ac peaks stratified by T2E status suggests we have identified at least 95% of potential peaks. A non-linear regression model was fit to the number of new peaks obtained sequentially over all samples for 1,000 permutations of sample orders and stratified according to (a) T2E and (b) non-T2E status. Individual points show the mean number of new peaks per sample added and the associated standard error of the mean. The blue line shows the fitted model, with the saturation line (red) marked, as well as the estimated number of samples needed to identify 95, 97 and 99% of peaks (grey lines).
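The permutation step of this saturation analysis can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes each sample's peaks have already been mapped to shared peak identifiers (real peak sets would need interval merging first), and the function name `new_peaks_curve` and its input format are hypothetical. The regression fit is left out because the legend does not give the model form.

```python
import random
from statistics import mean, stdev


def new_peaks_curve(sample_peaks, n_perm=1000, seed=0):
    """Mean and SEM of newly discovered peaks at each position in the order.

    sample_peaks: dict mapping sample name -> set of peak identifiers
    (hypothetical input; assumes peaks are already matched across samples).
    Returns a list of (mean_new_peaks, sem) tuples, one per position.
    """
    rng = random.Random(seed)
    names = list(sample_peaks)
    counts = [[] for _ in names]
    for _ in range(n_perm):
        order = names[:]
        rng.shuffle(order)  # one random ordering of the samples
        seen = set()
        for pos, name in enumerate(order):
            new = sample_peaks[name] - seen  # peaks not seen in earlier samples
            counts[pos].append(len(new))
            seen |= new
    # SEM = sample standard deviation / sqrt(number of permutations)
    return [(mean(c), stdev(c) / len(c) ** 0.5) for c in counts]
```

The resulting per-position means are the points to which a saturating curve would be fitted; running the function separately on the T2E and non-T2E samples gives the stratified curves.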