Nature Methods: doi: /nmeth Supplementary Figure 1. Pilot CrY2H-seq experiments to confirm strain and plasmid functionality.


Supplementary Figure 1 Pilot CrY2H-seq experiments to confirm strain and plasmid functionality. (a) RT-PCR on HIS3-positive diploid cell lysate containing the known interaction partners AT3G62420 (bZIP53) and AT5G28770 (bZIP63). Lanes 2-4, RT reaction with Cre primers; lanes 8-10, RT reaction with Gal4 DBD primers (expression control). Control lane, genomic DNA from HIS3-positive diploid lysate. M, 100 bp DNA ladder. (b) Colony PCR on HIS3-positive diploids containing the interacting pair bZIP53 and bZIP63, using primers to amplify Cre-recombined ORFs. Lanes 1, 3, 5 show lox plasmids in Y8800/CRY8930; lanes 2, 4, 6 show lox plasmids in Y8800/Y8930; lane 7 shows two non-recombined lox plasmids (negative PCR control); lane 8 shows Cre-recombined ORF products (positive PCR control). (c) Colony PCR to amplify Cre-recombined ORFs from cells containing non-interacting pairs. Lanes 1, 3, 5 and 9, 11, 13 show pADlox-ZTL and pDBlox-bZIP63 in Y8800/CRY8930; lanes 2, 4, 6 and 10, 12, 14 show pADlox-bZIP53 and pDBlox-ZTL in Y8800/CRY8930; lanes 7 and 15 show two non-recombined lox plasmids (negative PCR control); lanes 8 and 16 show Cre-recombined plasmid (positive PCR control).

Supplementary Figure 2 Comparing diploid strains Y8930/Y8800 and CRY8930/Y8800 using a range of interactions. (a) Plate locations of interactions screened in standard 1x1 yeast two-hybrid. Green boxes indicate expected positive interactions. CRY signifies the strain CRY8930/Y8800. Protein names Z53, Z63, ASK1, ZTL, ASK2, and PP2 correspond to the genes AT3G62420, AT5G28770, AT1G75950, AT5G57360, AT5G42190, and AT3G61060, respectively. (b) Diploid culture concentrations reported as OD600/ml. (c) Pictures of diploids after three days of selection on either 1x SC -Leu/-Trp (diploid selection) or 1x SC -Leu/-Trp/-His + 1 mM 3-AT (interaction selection); (-) indicates Y8930/Y8800 diploids and (+) indicates CRY8930/Y8800 diploids.

Supplementary Figure 3 CrY2H-seq bait, prey, and interaction libraries show high coverage and minimal bias. (a) Size distribution of transcription factor ORFs detected in the prey library (AD, blue) and bait library (DB, teal) compared to the size distribution of all transcription factor ORFs in the ORFeome (theoretical, pink). (b) Sizes of bait library ORFs plotted as a function of their log10 fragments detected. Pearson correlation, (c) Sizes of prey library ORFs plotted as a function of their log10 fragments detected. Pearson correlation, (d) Size distribution of all potential PCR products from 1.9 million non-redundant interactions that could have been detected (pink) compared to the size distribution of all PCR products that were detected in CrY2H-seq screens (teal). (e) CrY2H-seq detected PCR product sizes plotted as a function of their log10 NPIF values. Pearson correlation,
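
The size-bias check in panels b, c, and e can be sketched as a Pearson correlation between ORF (or PCR product) size and log10-transformed counts. The function names and the toy inputs below are illustrative assumptions, not the authors' code, and no correlation values from the figure are reproduced here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def size_vs_log_fragments(orf_sizes_bp, fragment_counts):
    """Correlate ORF size (bp) with log10 of the fragments detected per ORF.

    Counts must be positive, since log10 is undefined at zero.
    """
    log_counts = [math.log10(c) for c in fragment_counts]
    return pearson(orf_sizes_bp, log_counts)

# Hypothetical toy data: three ORFs and their detected fragment counts.
r = size_vs_log_fragments([500, 1000, 1500], [10, 100, 1000])
```

A coefficient near zero would indicate little size bias; a strongly negative value would suggest that longer ORFs are under-represented.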

Supplementary Figure 4 Determining optimal sequencing coverage and basal fragment threshold. (a) Only PPIs (protein-protein interactions) with 3 or more fragments in the 20 million read depth dataset showed the expected 4-fold or greater increase in coverage in the 80 million read depth dataset, while PPIs with only one or two fragments in the 20 million read depth dataset were frequently not detected at all (orange) at the 80 million read depth. (b) At 80 million read depth, the majority of new PPIs that were not detected at the 20 million read depth had fewer than 10 fragments, with over 90% of PPIs showing fewer than 3 fragments.
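
The "expected 4-fold" increase follows from the ratio of the two read depths (80 million / 20 million). A minimal sketch of that check is given below; the function name and default arguments are assumptions made for illustration.

```python
def scales_with_depth(frag_low, frag_high,
                      depth_low=20_000_000, depth_high=80_000_000):
    """Return True if a PPI's fragment count grew at least as much as read depth.

    frag_low  : fragments supporting the PPI at the lower read depth
    frag_high : fragments supporting the same PPI at the higher read depth
    """
    expected_fold = depth_high / depth_low  # 4.0 for 20M -> 80M reads
    if frag_low <= 0:
        return False  # PPI was not detected at the lower depth
    return frag_high / frag_low >= expected_fold

# Hypothetical examples: a PPI with 3 fragments at 20M and 12 at 80M scales
# as expected; a PPI with 2 fragments at 20M that vanishes at 80M does not.
print(scales_with_depth(3, 12))
print(scales_with_depth(2, 0))
```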

Supplementary Figure 5 The CrY2H-seq analysis pipeline. (a) Reads are first mapped to a custom genome composed of Arabidopsis TF ORF sequences, the S. cerevisiae genome, Gal4 AD and Gal4 DB domain sequences amplified by AD and DB primers (primer region), and empty plasmid sequence (not pictured). Overall alignments were as follows: 74.8% Arabidopsis, 16% primer region, 4.8% not aligned, 4.2% yeast, and 0.2% empty plasmid. (b) Paired-end high-quality mapped reads are paired by read IDs and clonal fragments are removed. (c) Fragments are filtered for DNA strandedness to remove reads mapping to only one ORF. Examples of one protein interaction fragment (blue, green, and orange fragment) and a fragment mapping to only one ORF (purple fragment) are shown. Fragments are filtered by fragment size based on a 300 bp size range (~220 bp - ~520 bp). (d) Fraction of fragments remaining after each filtering step in the analysis pipeline. Average and standard deviation were calculated from the ten replicate screen datasets. (e) Fragments for unique ORF combinations are totaled only if read 1 maps to a different ORF than read 2. Paired-end reads mapping to three example protein interaction PCR products are shown. (f) A basal fragment cutoff is applied for reasons described in Supplementary Figure 4 and Online Methods. This removes any interaction product with fewer than 3 fragments from the data, as shown by the carry-over of interaction products with more than 2 fragments (peach/lime and blue/green interaction products) and the absence of the blue/red interaction product, which had only 2 fragments. (g) Datasets from each screen are normalized by calculating the median total number of interaction fragments across screens and then calculating a scale factor based on the fold difference between each screen's total and the median total. Unique interaction fragments are then multiplied by the calculated scale factor and rounded down to the nearest integer. An example is shown for screen 8. See Online Methods for more details.
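
Steps c, f, and g can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names and data layout are hypothetical, the size bounds are the approximate values from the legend, and the scale factor is assumed to be the median total divided by the screen's own total (the legend states only that it is based on the fold difference between the two).

```python
import math
from statistics import median

def in_size_range(fragment_len_bp, low=220, high=520):
    """Step c: keep fragments inside the approximate 220-520 bp window."""
    return low <= fragment_len_bp <= high

def apply_basal_cutoff(counts, cutoff=3):
    """Step f: drop interaction products supported by fewer than `cutoff` fragments."""
    return {pair: n for pair, n in counts.items() if n >= cutoff}

def normalize_screens(screens):
    """Step g: scale every screen to the median total fragment count.

    `screens` maps a screen name to {(orf_a, orf_b): fragment_count}.
    Assumption: scale factor = median total / screen total.
    """
    totals = {name: sum(counts.values()) for name, counts in screens.items()}
    med = median(totals.values())
    normalized = {}
    for name, counts in screens.items():
        if totals[name] == 0:
            normalized[name] = {}
            continue
        scale = med / totals[name]
        # Multiply by the scale factor and round down to the nearest integer.
        normalized[name] = {pair: math.floor(n * scale) for pair, n in counts.items()}
    return normalized

# Hypothetical toy data for three screens.
raw = {
    "screen1": {("A", "B"): 10, ("C", "D"): 2},
    "screen2": {("A", "B"): 20},
    "screen3": {("A", "B"): 40},
}
filtered = {name: apply_basal_cutoff(c) for name, c in raw.items()}
normalized = normalize_screens(filtered)
```

Applying the cutoff before normalization mirrors the order of panels f and g.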

Supplementary Figure 6 Homodimers do not yield AD-DB PCR products from Cre-recombined plasmids. (a) Test DB-ORFs and AD-ORFs screened in CRY8930 and Y8800, respectively, using standard 1x1 yeast two-hybrid. (b) Yeast two-hybrid positive spots grown on selection media (1x SC -Leu/-Trp/-His + 1 mM 3-AT) that correspond to ORF pairs in a. (c) Colony PCR of yeast two-hybrid positive spots using AD and DB primers to detect the presence of Cre-recombined plasmids. (d) Model of an AD and DB primer PCR amplicon of a homodimer, which contains on average a 1.66 kilobase pair inverted repeat. Blue boxes indicate homodimer pairs tested.

Supplementary Figure 7 Predicted saturation curve for CrY2H-seq transcription factor screening. Predicted average number of interactions that would be detected after each CrY2H-seq screen, from a Michaelis-Menten model based on the average CrY2H-seq detection rate over ten screens. Error bars, standard deviations.
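
A Michaelis-Menten saturation curve has the form I(n) = I_max * n / (K + n), where n is the number of screens, I_max the plateau, and K the number of screens at which half of the plateau is reached. The sketch below shows the functional form only; the parameter values are hypothetical placeholders, since the fitted values are not given in this legend.

```python
def michaelis_menten(n_screens, i_max, k_half):
    """Predicted cumulative interactions after `n_screens` screens."""
    return i_max * n_screens / (k_half + n_screens)

def screens_for_fraction(fraction, k_half):
    """Number of screens needed to reach `fraction` (0 < fraction < 1) of the plateau.

    Derived by solving fraction = n / (k_half + n) for n.
    """
    return k_half * fraction / (1.0 - fraction)

# Hypothetical parameters, chosen only to illustrate the curve shape.
I_MAX = 100.0
K_HALF = 5.0
curve = [michaelis_menten(n, I_MAX, K_HALF) for n in range(0, 11)]
```

By construction, the curve reaches half of its plateau when the number of screens equals K.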

Supplementary Figure 8 Results from 1x1 yeast two-hybrid retest of AtTFIN-1 interactions detected in CrY2H-seq screens. (a) Representative scoring of retested interactions. (b) Fraction of AtTFIN-1 interactions positive in the 1x1 matrix-style Y2H retest screen (retest rate) as a function of normalized protein interaction fragments (NPIFs). Novel PPI bin sizes: 59 (1-2.5), 77 ( ), 62 (2.8-3), 78 (3-3.2), 64 ( ), 84 ( ), 82 ( ), 74 (> 4.1). Known + novel PPI bin sizes: 65 (1-2.5), 83 ( ), 71 (2.8-3), 91 (3-3.2), 71 ( ), 99 ( ), 92 ( ), 84 (> 4.1).

Supplementary Figure 9 Determining wNAPPA z-score threshold. Fraction of known and novel interactions detected by CrY2H-seq, and of random interactions, that scored positive in wNAPPA, plotted as a function of z-score threshold. Dashed line indicates the threshold at which the maximum number of known interactions and the minimum number of random interactions are detected.
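
One way to sketch the threshold choice is to scan candidate z-scores and pick the one that maximizes the gap between the fraction of known interactions scoring positive and the fraction of random pairs scoring positive. This criterion is an assumption made for illustration; the legend does not state how the trade-off was computed, and all names and data below are hypothetical.

```python
def fraction_positive(z_scores, threshold):
    """Fraction of pairs whose z-score meets or exceeds the threshold."""
    return sum(z >= threshold for z in z_scores) / len(z_scores)

def best_threshold(known_z, random_z, candidates):
    """Candidate threshold maximizing (known positive rate - random positive rate)."""
    return max(
        candidates,
        key=lambda t: fraction_positive(known_z, t) - fraction_positive(random_z, t),
    )

# Hypothetical toy z-scores for known interactions and random pairs.
known = [2.0, 2.5, 3.0, 1.8]
random_pairs = [0.1, 0.5, 1.6, -0.3]
threshold = best_threshold(known, random_pairs, [0.0, 1.0, 1.7, 3.0])
```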

Supplementary Figure 10 Overlap with known interactions. (a) Fraction of BioGRID interactions positive in AtTFIN-1 binned by experimental derivation, with the actual number of AtTFIN-1 interactions listed inside bars. (b) Overlap between interaction datasets used for evaluating AtTFIN-1 quality.

Supplementary Figure 11 Comparing CrY2H-seq to the array-based HT-Y2H approach that was used to generate Arabidopsis Interactome-1 (AI-1). (a) Fraction of unique TF interactions that were detected out of all TF interactions screened in CrY2H-seq and in HT-Y2H. 8,577 TF interactions were detected among the 1,880,232 unique TF interactions that were screened in CrY2H-seq, while 229 TF interactions were detected among the 270,480 unique TF interactions that were screened in HT-Y2H. (b) Coverage of commonly screened TF interactions between AtTFIN-1 and AI-1. (c) Fraction of 114 commonly screened LCI pairs recalled by CrY2H-seq and by HT-Y2H; CrY2H-seq recalled 38 interactions while HT-Y2H recalled 14.
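
The fractions in panels a and c follow directly from the counts quoted in the legend. The short calculation below uses only those counts; the variable and function names are illustrative.

```python
def detection_fraction(detected, screened):
    """Fraction of screened pairs that were detected."""
    return detected / screened

# Panel a: unique TF interactions detected out of those screened.
cry2h = detection_fraction(8_577, 1_880_232)
ht_y2h = detection_fraction(229, 270_480)
fold = cry2h / ht_y2h  # relative detection rate, CrY2H-seq vs HT-Y2H

# Panel c: recall of the 114 commonly screened LCI pairs.
lci_cry2h = detection_fraction(38, 114)
lci_ht_y2h = detection_fraction(14, 114)

print(f"CrY2H-seq detected {cry2h:.3%} of screened pairs")
print(f"HT-Y2H detected {ht_y2h:.3%} of screened pairs")
print(f"Fold difference: {fold:.2f}")
print(f"LCI recall: CrY2H-seq {lci_cry2h:.1%}, HT-Y2H {lci_ht_y2h:.1%}")
```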

Supplementary Figure 12 AtTFIN-1 is enriched for biologically relevant interactions. Distribution of Pearson correlation coefficients (PCC) for AtTFIN-1 interacting pairs and random pairs across 6057 mRNA expression arrays (ref. 22). Inset, percentage of AtTFIN-1 interacting pairs (9/5723) and random pairs (18/42262) with PCC > Error bars, standard error of proportion. P value, one-sided Fisher's exact test.
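
A one-sided Fisher's exact test on a 2x2 table can be sketched with the standard library alone by summing the hypergeometric tail. The counts below are taken as they appear in this legend (9 of 5723 interacting pairs, 18 of 42262 random pairs); the function name is an assumption, and this is not necessarily how the authors computed their P value.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the probability of
    seeing at least `a` positives in the first row given fixed margins.
    """
    row1 = a + b            # size of the first group
    col1 = a + c            # total positives across both groups
    total = a + b + c + d
    upper = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, row1)

# Rows: AtTFIN-1 interacting pairs, random pairs.
# Columns: pairs above the PCC cutoff, pairs at or below it.
p_value = fisher_exact_greater(9, 5723 - 9, 18, 42262 - 18)
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for tables of this size.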

Supplementary Figure 13 Intra- and interfamily interactions in AtTFIN-1. Discrete empirical P values of interactions observed more frequently in AtTFIN-1 than expected by random chance. Families are hierarchically clustered by common family interactions. Color key: ND = not detected, NS = not significant, * p < 0.05, ** p < 0.01, *** p < Examples of known intra-family and inter-family dimers are highlighted in green and purple, respectively.

Supplementary Figure 14 CrY2H-seq shows reduced screening costs and time compared to traditional Y2H and BFG-Y2H methods. (a) Screening cost estimates for two replicate screens of 1000 x 1000 proteins. (b) Screening cost estimates for one screen of 30,000 x 30,000 proteins. Costs for sequencing of Y2H-positive clones have been omitted here because we were unable to find a report of the number of reads required per screen size for traditional Y2H or for BFG-Y2H. (c) Screening time estimates. These cost estimates do not take into account preliminary cloning steps, as any cloning strategy could be applied to move ORFs into expression plasmids. The cost comparison is also conservative: the cost difference from traditional Y2H would be larger had labor and pipetting costs been accounted for.

Supplementary Figure 15 Determining bait and prey orientation of interacting proteins. (a) Bait and prey orientations can be derived from fragments mapping to interactions that contain at least 15 base pairs of Lox77 sequence. Blue and green squares outline the lox sequence parts that reveal prey and bait identity, respectively. The black line exemplifies an orientation-revealing fragment. Amplicon, example fragment, and reads are not drawn to scale; read 1 and read 2 are each a total of 100 bp. (b) Overview of fragments mapping to protein-protein interactions (PPIs).