Systematic comparison of CRISPR/Cas9 and RNAi screens for essential genes

CORRECTION NOTICE

Nat. Biotechnol. doi:10.1038/nbt.3567

Systematic comparison of CRISPR/Cas9 and RNAi screens for essential genes

David W Morgens, Richard M Deans, Amy Li & Michael C Bassik

In the version of this supplementary file Supplementary Data 2 originally posted online, the Cas9 plasmid sample counts were incorrectly labeled as Cas9 T14 replicate 2 and vice versa. The error has been corrected in this file as of 13 March 2018.

NATURE BIOTECHNOLOGY

Supplementary Figure 1 Distribution of targeting and control elements. (a) Distribution of negative controls for a single replicate of Cas9 and shRNA screens. Enrichments are calculated as a median-normalized log ratio of counts. (b,c) Distribution of targeting elements is shown in meta-gene plots for the top 50 (b) enriched and (c) disenriched genes found in a single replicate of the Cas9 and shRNA screens as identified by casTLE. To normalize, the enrichment of each individual element was divided by the effect size estimate for the gene generated by casTLE. The dotted line is placed at the estimated effect size, which is normalized to one.
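
The enrichment measure named in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the pseudocount, the log base, the depth normalization, and the choice to center on the median of the negative controls are all assumptions, since the caption only states "median-normalized log ratio of counts".

```python
import numpy as np


def median_normalized_log_ratio(before, after, is_control, pseudocount=1.0):
    """Log2 ratio of after/before counts, centered on the control median.

    All conventions here (pseudocount, log2, centering on controls) are
    assumptions of this sketch; the source does not specify them.
    """
    before = np.asarray(before, dtype=float) + pseudocount
    after = np.asarray(after, dtype=float) + pseudocount
    # Work with count fractions so that sequencing depth cancels out.
    log_ratio = np.log2(after / after.sum()) - np.log2(before / before.sum())
    # Shift so that the median negative control has enrichment zero.
    controls = np.asarray(is_control, dtype=bool)
    return log_ratio - np.median(log_ratio[controls])
```

With this convention, an element that drops to a quarter of its starting abundance relative to the controls has an enrichment of -2.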

Supplementary Figure 2 Distribution of targeting sgRNAs for top disenriched genes. (a-d) Enrichment of targeting elements and estimated effect size is shown for the top four disenriched genes from Cas9 data from a single replicate. Enrichments are calculated as a median-normalized log ratio of counts. Gray lines represent the smoothed distribution of non-targeting controls. Red vertical lines represent enrichment of individual targeting guides towards indicated genes. Vertical dotted line represents effect size estimate from casTLE. Red distribution is a smoothed distribution of guides targeting the genes indicated.

Supplementary Figure 3 Distribution of targeting sgRNAs for top disenriched genes. (a-d) Enrichment of targeting elements and estimated effect size is shown for the top four disenriched genes from shRNA data from a single replicate. Enrichments are calculated as a median-normalized log ratio of counts. Gray lines represent the smoothed distribution of non-targeting controls. Blue vertical lines represent enrichment of individual targeting hairpins towards indicated genes. Vertical dotted line represents effect size estimate from casTLE. Blue distribution is a smoothed distribution of hairpins targeting the genes indicated.

Supplementary Figure 4 casTLE provides a statistical framework to account for high-throughput screens. The unknown relationship between gene dosage and measured phenotype, as well as the unknown distribution of shRNA and Cas9 efficacies, restricts the predicted effect size of reagents to a bounded region, marked as the blue shaded region, between 0 and the maximum effect I, marked by the dotted line. Some fraction (1-θ) of the reagents have no on-target effect at all. The phenotype observed is thus the true effect obscured by noise, which is estimated using the distribution of non-targeting controls. The likelihoods of models for different values of I and θ are calculated, and by marginalizing θ the most likely effect size is selected. A likelihood ratio is then calculated by comparing to a null model where I is zero.

Supplementary Figure 5 Reanalysis of previous screens. (a) Results are shown for a previously published shRNA screen for ricin sensitivity reanalyzed with casTLE and compared to published results based on a MW test 1. (b) Previous CRISPR/Cas9 deletion screen for LPS-induced TNF expression in primary mouse bone-marrow-derived dendritic cells, analyzed with casTLE and the published DESeq results 16. (c) Previous CRISPRi screen for sensitivity to the fusion toxin CTx-DTA, analyzed with casTLE versus the average of the top three sgRNA effects 17. (d) Previous CRISPRa screen for sensitivity to the fusion toxin CTx-DTA, analyzed with casTLE versus the average of the top three sgRNA effects 17.

Supplementary Figure 6 Comparison of casTLE to other methods. (a-d) ROC curves indicate screen performance in identifying essential genes from changing composition between the plasmid library and two weeks growth. True positive rates and false positive rates are calculated using a previously established gold standard set of essential and nonessential genes 15. Genes are ranked by likelihood to be essential using the indicated methods, including casTLE. Highest effect heuristic was calculated by ranking the genes according to their most disenriched element. Data are shown from single replicates of the (a,c) Cas9 and (b,d) shRNA screens for (a,b) replicate 1 and (c,d) replicate 2.
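
The ROC construction described in this caption can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the function name is hypothetical, scores are assumed to be oriented so that higher means "more likely essential", only gold-standard genes are passed in, and tied scores are not given special treatment.

```python
import numpy as np


def roc_from_gold_standard(scores, is_essential):
    """ROC points and AUC for genes in a gold-standard set.

    scores: higher means more likely essential (an assumed orientation).
    is_essential: True for gold-standard essential genes, False for
    gold-standard nonessential genes.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(is_essential, dtype=bool)
    # Walk down the ranking from the most to the least confident call.
    labels = labels[np.argsort(-scores, kind="stable")]
    tpr = np.concatenate([[0.0], np.cumsum(labels) / labels.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(~labels) / (~labels).sum()])
    # Area under the curve by the trapezoid rule.
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc
```

A fixed false positive rate cutoff, such as the 10% used in the later figures, corresponds to reading off the deepest point in the ranking at which `fpr` is still at or below 0.1.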

Supplementary Figure 7 Performance of combination of shRNA and Cas9 data. (a) ROC curves from combination of different replicates of Cas9 and shRNA using casTLE. ROC curves indicate screen performance in identifying essential genes from changing composition between the plasmid library and two weeks growth. True positive rates and false positive rates are calculated using a previously established gold standard set of essential and nonessential genes 15. (b) Combination score has high reproducibility. A large positive casTLE score indicates a high confidence increase in growth rate, while a highly negative casTLE score indicates a high confidence decrease in growth rate, i.e. gene essentiality. The graphs compare replicate measurements of likelihood ratio between plasmid and T14 of the combination score based on replicates 1 for Cas9 and shRNA and replicates 2 for Cas9 and shRNA. Density is in log scale.

Supplementary Figure 8 Comparison of casTLE combination to casTLE analysis of single screens. (a) ROC curves indicate screen performance in identifying essential genes by comparing the library composition between the plasmid library and cells after two weeks growth. ROC curves for Cas9 (red) and shRNA (blue) screens based on duplicate data combined using casTLE. Alternatively, data from single replicates of both Cas9 and shRNA screens were combined using casTLE (purple). (b) The number of essential genes at 10% false positive rate and their overlap based on the duplicate data from Cas9 and shRNA screens, as well as combination of a single replicate from both screens. False positive rate was estimated using gold standard nonessential genes. (c) Precision recall curve for Cas9, shRNA, and combination data using casTLE.

Supplementary Figure 9 Comparison to an in silico four-shRNA-per-gene library. Results from the 25-shRNA-per-gene library were downsampled by only including four hairpins per gene, selected by previous computational ranking. (a) ROC curves indicate screen performance in identifying essential genes by comparing the library composition between the plasmid library and cells after two weeks growth. (b) The number of essential genes at 10% false positive rate and their overlap based on the duplicate data from Cas9 and shRNA screens, as well as combination of a single replicate from both screens. (c) Comparison of casTLE scores between single replicates of Cas9 and shRNA data. (d) Adjusted p-values for select GO terms for shRNA and Cas9 screens as well as for data from both screens combined with casTLE.

Supplementary Figure 10 Screen reproducibility and time-dependence of phenotypes. (a,b) shRNA and Cas9 screens have high reproducibility. A large positive casTLE score indicates a high confidence increase in growth rate, while a highly negative casTLE score indicates a high confidence decrease in growth rate, i.e. gene essentiality. The graphs compare replicate measurements of casTLE scores between plasmid and T14 for (a) Cas9 and (b) shRNA screens. Density is in log scale. (c,d) Time dependence of phenotypes. casTLE scores in different time-frames for (c) Cas9 and (d) shRNA screens.

Supplementary Figure 11 Analysis of gene expression and yeast essential homologs. Genesets are defined for Cas9, shRNA, and Combination by a 10% FPR cutoff. Genesets are defined for Cas9-combo and shRNA-combo by the genes present in the Cas9 or shRNA set and not in the Combination set. Overlap set is defined as genes present in both the Cas9 and shRNA set (see Supplementary Fig. 8b). (a,b) ~7,000 genes with detectable expression in K562 were binned by expression. The fraction of genes identified as essential in each bin is reported versus the average expression level of the bin. (c,d) Fraction of genes that are homologs of essential yeast genes versus genes that are homologs of nonessential yeast genes. P-values calculated using Fisher's exact test.

Supplementary discussion

Heterogeneity in screening reagents

Heterogeneity of reagents has historically been associated with poor performance in RNAi-based screens 9,10. That is, while a cell population with a given shRNA should have low phenotypic variability, two different shRNAs targeting the same gene are not expected to have the same knockdown efficiency 10. How this affects the measured phenotype depends on the relationship between gene-dosage and fitness, which is unknown and unique to each gene 9 (Supplementary Fig. 4). Cas9 screens also suffer from heterogeneity of sgRNA phenotypes 4,10 (Supplementary Fig. 1-3). In non-screening formats, this heterogeneity can be controlled for by creating clonal cell lines. Without single-cell selection, a population of cells with a given guide contains a mixture of true knockouts, heterozygotes, and wild-type cells 4,10; the array of genotypes that exist during the course of a screen likely depends on the efficiency of guide cutting as well as the relative fitness between these subpopulations. Over time the proportion of each subpopulation, and thus the measurable fitness, will change as on-going knockouts remove alleles and selection favors particular subpopulations. Even sgRNAs with high cutting efficiency generate a significant number of functional alleles via in-frame indels 28,36. This variability in both RNAi and Cas9 screens precludes a simple phenotype to measure for a given gene. Thus, to determine whether a gene is involved in a given process, we can instead measure the maximum phenotype possible, which may not correspond to the highest degree of knockdown or the null phenotype.

casTLE accounts for reagent heterogeneity

In order to calculate this maximum phenotype for each gene using a clear statistical framework, we have developed Cas9 high-Throughput maximum Likelihood Estimator (casTLE) (Supplementary Fig. 4), which we can use for direct comparison of shRNA and CRISPR/Cas9 screens. For each gene, casTLE gives an effect size estimate for the gene perturbation as well as a p-value associated with that effect, combining the advantages of previous approaches. Distribution-based methods, such as MW, RIGER, and RSA, provide robust statistical testing by comparing the distribution of phenotypes for elements targeting a given gene to the distribution of phenotypes from non-targeting control elements or all other elements 1,19,20,37; however, these methods do not give a readily interpretable estimate of gene effect. Alternative, heuristic approaches have easily interpretable effect size estimation, for example by taking the phenotype of the single most active shRNA/sgRNA as a proxy for the phenotype of the gene or the median phenotype 23,37–39; however, these methods lack a statistical framework to assign confidence to a hit. casTLE uses a semi-parametric approach to provide both a statistical framework and biologically-interpretable effect sizes. Intuitively, casTLE estimates the maximum possible phenotype such that targeting elements are most likely to be found between this phenotype and zero (Supplementary Fig. 2, 3 and see also Supplementary Methods). Additionally, casTLE allows data from multiple screen types or from replicates of the same screen type to be compared and combined by finding a single effect size consistent with all data. Here we apply casTLE to compare the abilities of shRNA and Cas9 screens to identify essential genes, though the framework should be widely applicable to detect positive and negative effects for other phenotypes.

Validation of casTLE

In order to validate casTLE as a broadly applicable tool, we used it to re-analyze data from several published screens. These include an shRNA screen for modulators of ricin toxicity in K562 cells 1, a CRISPR/Cas9 deletion screen for LPS-induced TNF expression in primary mouse dendritic cells 16, and both CRISPRi and CRISPRa screens for modulators of the fusion toxin CTx-DTA 17 (Supplementary Data 5-7). For each of these, casTLE produced results broadly consistent with previous findings, with even higher correlation for validated hits from each screen when available (Supplementary Fig. 5).
Consistent with the previously validated results of the TNF CRISPR/Cas9 deletion screen, 18 of the 20 top depleted hits identified by casTLE in the CRISPR/Cas9 deletion screen were identified in the previous study, with 16 successfully validating and two failing to validate 16. While the previous study reported difficulties in detecting enriched hits, we find that casTLE identified positive regulators of TNF expression (Supplementary Fig. 5b), and that the top enriched hit, TNFRSF9, has been recently implicated in TNF regulation 40. These analyses of previous data demonstrate casTLE can be widely used across screening technologies and phenotypes, simultaneously detecting positive and negative regulators.

All individual analyses of replicates of the shRNA and CRISPR/Cas9 screens show that casTLE performs well in the detection of essential genes (AUC of the ROC curve > 0.91). In addition, the same data was analyzed with the previously published statistical tools MAGeCK 18, Mann-Whitney 1, RSA 19, RIGER 20, and HiTSelect 21 as well as two commonly used heuristics, the highest effect and the median effect (Supplementary Fig. 6, Supplementary Data 2-4). Although some algorithms performed better in the high-error region, casTLE performs better than or on par with existing methods in the low-error region, which is most relevant for the selection of top hits for follow-up analysis. This quantitative analysis, along with the broad agreement with previous screen data, establishes casTLE as a valid tool for screen analysis.

Supplementary references

36. Bae, S., Kweon, J., Kim, H. S. & Kim, J.-S. Microhomology-based choice of Cas9 nuclease target sites. Nat. Methods 11, 705–706 (2014).
37. Birmingham, A. et al. Statistical methods for analysis of high-throughput RNA interference screens. Nat. Methods 6, 569–575 (2009).
38. Acosta-Alvear, D. et al. Paradoxical resistance of multiple myeloma to proteasome inhibitors by decreased levels of 19S proteasomal subunits. Elife 4, e08153 (2015).
39. Moffat, J. et al. A lentiviral RNAi library for human and mouse genes applied to an arrayed viral high-content screen. Cell 124, 1283–1298 (2006).
40. Haga, T. et al. Attenuation of experimental autoimmune myocarditis by blocking T cell activation through 4-1BB pathway. J. Mol. Cell. Cardiol. 46, 719–727 (2009).

SUPPLEMENTARY METHODS

1. CASTLE

We built Cas9 high-Throughput maximum Likelihood Estimator (casTLE), which uses an Empirical Bayesian framework to account for multiple sources of variability. For each gene, we have the phenotypes of multiple targeting reagents, measured as a median-normalized log ratio of counts. From this, and the phenotypes of negative controls, we obtain an effect size estimate for each gene and an associated log likelihood ratio. By shuffling targeting reagents, we can generate an expected negative distribution of log likelihood ratios, allowing hypothesis testing.

For a given gene, $i$, we have a set of elements, $j \in S_i$, each of which has an observed enrichment $\gamma_{ij}$. We can then define the relationship between the true phenotype of the element, $\xi_{ij}$, and the observed enrichment, $P(\gamma_{ij} \mid \xi_{ij})$, by taking advantage of the non-targeting controls, which represent both measurement noise and off-target effects. We can fit this distribution with a Gaussian kernel as $N(\gamma)$ and use the shifted distribution $P(\gamma_{ij} \mid \xi_{ij}) = N(\gamma_{ij} - \xi_{ij})$.

The distribution of true effects, $\xi_{ij}$, for a given gene, $i$, is bounded by a maximum effect, $I_i$, and by zero. Due to ineffective reagents, a certain percentage $1 - \theta_i$ of true effects will be zero. The true effects of the effective fraction $\theta_i$ are assumed to be uniformly distributed between $I_i$ and 0. This gives us a distribution of true effects for a given gene parametrized by a maximum effect $I_i$ and fraction effective $\theta_i$.

Together, this gives an Empirical Bayesian framework, where our observables, $\gamma_{ij}$, are drawn from the distribution $P(\gamma_{ij} \mid \xi_{ij}) = N(\gamma_{ij} - \xi_{ij})$, depending on the unknown element-specific parameter $\xi_{ij}$, which itself is distributed according to the gene-specific hyperparameters $I_i$ and $\theta_i$. It is these hyperparameters we need to estimate, which we can do with a Maximum Likelihood approach.

The probability of observing $\gamma_{ij}$ given $I_i$ and $\theta_i$ can be separated into three regions. Without loss of generality, let $I_i > 0$. With probability $\theta_i$, $\xi_{ij}$ was drawn from a uniform distribution on $[0, I_i]$. If $\gamma_{ij} < 0$, then the most likely estimate of $\xi_{ij}$ is 0. If $\gamma_{ij} \in [0, I_i]$, then the most likely estimate of $\xi_{ij}$ is $\gamma_{ij}$. If $\gamma_{ij} > I_i$, then the most likely estimate of $\xi_{ij}$ is $I_i$. In all regions, there is a $(1 - \theta_i)$ probability that $\xi_{ij} = 0$. Combining these gives us the probability of observing our data:

\[
\Pr(\gamma_{ij} \mid I_i, \theta_i) =
\begin{cases}
(1 - \theta_i)\, N(\gamma_{ij}) + \theta_i\, N(\gamma_{ij} - 0) & : \gamma_{ij} < 0 \\
(1 - \theta_i)\, N(\gamma_{ij}) + \theta_i\, N(\gamma_{ij} - \gamma_{ij}) & : 0 \le \gamma_{ij} \le I_i \\
(1 - \theta_i)\, N(\gamma_{ij}) + \theta_i\, N(\gamma_{ij} - I_i) & : \gamma_{ij} > I_i
\end{cases}
\]

or

\[
\Pr(\gamma_{ij} \mid I_i, \theta_i) =
\begin{cases}
N(\gamma_{ij}) & : \gamma_{ij} < 0 \\
(1 - \theta_i)\, N(\gamma_{ij}) + \theta_i\, N(0) & : 0 \le \gamma_{ij} \le I_i \\
(1 - \theta_i)\, N(\gamma_{ij}) + \theta_i\, N(\gamma_{ij} - I_i) & : \gamma_{ij} > I_i
\end{cases}
\]

A grid search is performed over possible values of $\theta_i$ and $I_i$, and the likelihood of the model is calculated with the above probability function. $\theta_i$ is then marginalized to calculate a maximum likelihood estimate $\hat{I}_i$ and an associated 95% credible interval. For hypothesis testing, a likelihood ratio test is performed against the following null model:

\[
\Pr(\gamma_{ij} \mid 0, 0) = N(\gamma_{ij})
\]

The distribution of the log-likelihood ratio is estimated by randomly drawing sets of targeting elements and calculating the log-likelihood ratio as above. To combine data from replicates or from disparate screens, the grid search is performed to find the likelihood of $(I_i, \theta_{i1})$ and $(I_i, \theta_{i2})$. Both $\theta_{i1}$ and $\theta_{i2}$ are then marginalized to find the most likely $I_i$.
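
The grid search described above can be illustrated with a short Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the function names, the Silverman rule-of-thumb bandwidth for the Gaussian kernel, averaging over the θ grid with equal weights as the marginalization, and breaking ties towards the smallest |I| (the plug-in likelihood is flat once I lies beyond the most extreme element). The credible interval and the permutation-based null distribution are left out. A clipping step covers both signs of I, which mirrors the "without loss of generality" case in the text.

```python
import numpy as np

TINY = 1e-300  # guards log(0) in the far tails of the kernel estimate


def make_noise_density(controls):
    """Gaussian-kernel estimate N(.) of the non-targeting control enrichments."""
    controls = np.asarray(controls, dtype=float)
    # Silverman's rule-of-thumb bandwidth: an assumption of this sketch.
    h = 1.06 * controls.std(ddof=1) * controls.size ** -0.2

    def density(x):
        z = (np.asarray(x, dtype=float)[..., None] - controls) / h
        norm = controls.size * h * np.sqrt(2.0 * np.pi)
        return np.exp(-0.5 * z**2).sum(axis=-1) / norm

    return density


def log_marginal_over_theta(gammas, density, effect_grid, theta_grid):
    """Log of the theta-averaged likelihood for each candidate maximum effect I."""
    gammas = np.asarray(gammas, dtype=float)
    effects = np.asarray(effect_grid, dtype=float)[:, None]      # (nI, 1)
    thetas = np.asarray(theta_grid, dtype=float)[None, :, None]  # (1, nT, 1)
    # Plug-in true effect: gamma clipped to the interval between 0 and I.
    lower, upper = np.minimum(0.0, effects), np.maximum(0.0, effects)
    xi_hat = np.clip(gammas[None, :], lower, upper)              # (nI, n)
    unshifted = density(gammas)[None, None, :]                   # N(gamma)
    shifted = density(gammas[None, :] - xi_hat)[:, None, :]      # N(gamma - xi_hat)
    mixture = (1.0 - thetas) * unshifted + thetas * shifted
    log_lik = np.log(mixture + TINY).sum(axis=2)                 # (nI, nT)
    # Numerically stable average over the theta grid.
    peak = log_lik.max(axis=1, keepdims=True)
    return peak[:, 0] + np.log(np.exp(log_lik - peak).mean(axis=1))


def castle_like_estimate(datasets, effect_grid, theta_grid):
    """Effect estimate and log-likelihood ratio against the null I = 0.

    datasets: list of (gammas, controls) pairs, one per replicate or screen;
    each pair gets its own theta, while I is shared across all of them.
    """
    effects = np.asarray(effect_grid, dtype=float)
    total = np.zeros(effects.size)
    log_null = 0.0
    for gammas, controls in datasets:
        density = make_noise_density(controls)
        total += log_marginal_over_theta(gammas, density, effects, theta_grid)
        log_null += np.log(density(np.asarray(gammas, dtype=float)) + TINY).sum()
    # Tie-break towards the smallest |I| among near-maximal grid points.
    near_max = np.flatnonzero(total >= total.max() - 1e-9)
    best = near_max[np.argmin(np.abs(effects[near_max]))]
    return float(effects[best]), float(total[best] - log_null)
```

Passing one `(gammas, controls)` pair analyzes a single screen; passing two pairs combines replicates or screen types by summing their θ-marginalized log-likelihoods at each shared value of I, in the spirit of the combination step described in the text.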