Nature Methods: doi: /nmeth Supplementary Figure 1. Construction of a sensitive TetR mediated auxotrophic off-switch.


Supplementary Figure 1. Construction of a sensitive TetR-mediated auxotrophic off-switch. A Production of the Tet repressor in yeast when coupled to either the LexA4 or LexA8 promoter DNA-binding sequences. B Increasing the number of LexA8::TetR copies in the yeast correlates with increased repression of yeast growth in response to protein pairs that do (+) or do not (-) interact. C Identification of the minimal ADE2 promoter sequence that can maintain wild-type yeast growth in the absence of an external adenine source. Truncation of the TGCCTC boxes resulted in yeast that grew more slowly and with a red coloration, indicative of insufficient adenine production and unhealthy yeast. Further truncation towards the TATA box resulted in almost total cessation of growth. D Increasing the number of TetO sequences in the ADE2 promoter increases the TetR-mediated repression of ADE2. E Examples of the differential growth signals observed in the final Int-Seq strain in response to protein pairs that do (+) or do not (-) interact.

Supplementary Figure 2. Domain architecture and construct sizes of the BBSome components used in the Y2H screen.

Supplementary Figure 3. Coverage and percentage of total mutations sequenced in the Y2H vector libraries. 150mer sequence reads were grouped by the number of mismatch mutations they contained (1, 2, 3, 4 or 5). To generate high-confidence estimates, each individual mutation was counted only if 3 sequences containing it were observed in the sequencing results. Red circles in each diagram represent the libraries generated using a random mutagenesis protocol. A The proportion of amino acids targeted by mutagenesis that contained a programmed A, K or E mutation. B The proportion of all potential programmed AKE mutations present in the final mutant libraries. C The proportion of all other mutations contained in the final mutant libraries.
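The counting rule and the two coverage proportions in panels A and B can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `(position, amino_acid)` representation of a mutation, and the reading of the threshold as "at least 3 supporting reads" are all assumptions made for illustration.

```python
from collections import Counter

# Programmed substitutions named in the legend (A, K or E).
PROGRAMMED = ("A", "K", "E")


def supported_mutations(observed, min_reads=3):
    """Keep mutations seen in at least `min_reads` reads.

    `observed` is a list of (position, amino_acid) tuples, one per read.
    The "at least" reading of the threshold is an assumption.
    """
    counts = Counter(observed)
    return {mut for mut, n in counts.items() if n >= min_reads}


def programmed_coverage(wild_type, supported):
    """Return (fraction of positions hit, fraction of programmed mutations present).

    A programmed mutation is any A, K or E substitution that differs
    from the wild-type residue at that position (1-based positions).
    """
    possible = {
        (pos, aa)
        for pos, wt in enumerate(wild_type, start=1)
        for aa in PROGRAMMED
        if aa != wt
    }
    present = possible & supported
    positions_possible = {pos for pos, _ in possible}
    positions_hit = {pos for pos, _ in present}
    return (
        len(positions_hit) / len(positions_possible),
        len(present) / len(possible),
    )
```

Under these assumptions, the first returned value corresponds to the quantity plotted in panel A and the second to panel B; supported mutations outside the programmed set would fall into the "all other mutations" class of panel C.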

Supplementary Figure 4. Sequencing counts of the BBS7 mutant Y2H vector library. Box plots of each amino acid codon representing the number of times each codon was sequenced as a mutation across all positions in the Y2H construct.

Supplementary Figure 5. Comprehensive validation of the BBSome Int-Seq data. A The majority of individual point mutants generated in BBS1 reconfirm in both a single pairwise Y2H experiment and a LUMIER-style medium-throughput experiment. Y2H panel numbers correspond to the indicated mutants in the LUMIER experiment. * = BBS1 G516A was not tested in LUMIER experiments. B Comprehensive LUMIER-style experiments reconfirm the Int-Seq data in an orthogonal experimental system. LUMIER data are a representative result from two independent experiments. Each bar represents the average signal from triplicate transfections, with the error bars representing the highest and lowest values. C Expression analysis of protein A-tagged wild-type and mutant constructs used in LUMIER-style experiments. Constructs that show significantly decreased expression in comparison to the corresponding wild type are marked with a line underneath the lane. * represents a non-specific band in each lane.
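The bar heights and error bars described in panel B (average of triplicates, with whiskers at the highest and lowest replicate rather than a standard deviation) can be computed as in the following minimal sketch. The function name and the example values are hypothetical and are not taken from the data.

```python
from statistics import mean


def bar_with_range(replicates):
    """Return (average, lower error, upper error) for one bar.

    The error bars reach down to the lowest and up to the highest
    replicate value, so they can be asymmetric around the average.
    """
    avg = mean(replicates)
    return avg, avg - min(replicates), max(replicates) - avg
```

The two error values are returned as distances from the average, which is the form most plotting libraries expect for asymmetric error bars.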

Supplementary Figure 6. Detailed depiction of the BBS4-BBS18 interaction. Int-Seq identifies the central, highly mutated TPR domain-containing region of BBS4 as essential for the BBS18 interaction. A The C-terminal half of BBS4 interacted with the full-length BBS18 construct and was subject to Int-Seq analysis. BBS4 has seven annotated TPR domains, three of which, located C-terminal of the centre of the protein, are heavily annotated with disease-causing mutations. B While the mutations sequenced span the entire BBS4 construct, the majority of the signal is carried across the three central TPR domains. Inset: Two individual BBS4 mutations were validated as disrupting the BBS4-BBS18 interaction. C The full length of the small, 104 amino acid BBSome subunit BBS18 was subject to Int-Seq analysis. Two individual BBS18 mutations were validated as disrupting the BBS18-BBS4 interaction. D The secondary structure, predicted disorder, solvent accessibility and enriched mutagenic profile across the full length of BBS18.

Supplementary Figure 7. Detailed depiction of the BBS5 amino acids, defined by Int-Seq, that are required for maintenance of the BBS9 interaction. A The full-length clone of BBS5 interacted with the full-length BBS9 construct and was subject to Int-Seq analysis. BBS5 has one annotated N-terminal GLUE domain and one annotated C-terminal PH domain. B While the total number of mutations sequenced is strongly biased towards the C terminus of the BBS5 construct, the largest signal is carried towards the N terminus of the construct, with further clusters distributed towards the middle and C-terminal sections. Inset: Three individual mutations were validated as disrupting the BBS5-BBS9 interaction (MTs 1, 2, 4), with one null mutation showing no effect on the interaction (MT 3).

Supplementary Figure 8. Detailed depiction of the amino acids required to maintain the BBS2-7-9 core subnetwork. A The C-terminal domain of BBS7 is required for the BBS2 interaction and was subject to Int-Seq analysis. The majority of mutant enrichment is distributed across an alpha-helical region in the mid-section of the domain. Inset: Four individual mutations were validated as disrupting the BBS5-BBS9 interaction (MTs 1-3, 5), with one Int-Seq null mutation showing no effect on the interaction (MT 4). *For visualisation purposes the axis limits exclude a single mutation peak. B The C-terminal domain of BBS2 is required for the BBS9 and BBS7 interactions and was subject to Int-Seq analysis. The majority of mutant enrichment is distributed across an alpha-helical region in the mid-section of the domain. Inset: Five individual mutations disrupted both the BBS7 and BBS9 interactions. *For visualisation purposes the axis limits exclude a single mutation peak. C The C-terminal domain of BBS9 is required for the BBS1, BBS4 and BBS2 interactions and was subject to Int-Seq analysis. The majority of mutant enrichment is distributed across a beta-sheet region towards the C terminus of the domain. Inset: Three individual Int-Seq-identified mutations showed distinct interaction patterns with BBS1, BBS2 and BBS4.

Supplementary Figure 9. DNA constructs that comprise the Int-Seq system. A pAG25 construct containing 2 copies of LexA8::TetR_NLS in parallel for insertion into the MET2 locus of the yeast genome via homologous recombination. B Schematic diagram of the final synthetic ADE2 promoter with the inserted restriction sites annotated. The first two tet operators were inserted via overlap stitch PCR. For each of the other three tet operators, a restriction site (AvrII, SacI or NcoI) was first cloned into the promoter sequence. This allowed restriction digestion of the promoter, removal of the wild-type sequence and insertion of a synthetic promoter sequence containing the tet operator in its place.

Supplementary Figure 10. Enriched mutant identification from next-generation sequencing data. A To expedite the alignment of obtained sequences, unique 150mers were collated in parallel to unique sequence pairs. B Unique 150mers were aligned against wild-type BBS genes using the SHRiMP software package. C Enrichment scores were calculated using a linear model for each mutant codon across all positions using the R statistical analysis environment. D Unique paired-end sequences containing each identified enriched mutation were collated to identify co-segregating mutations. E The proportion of read pairs containing only the enriched mutation was plotted against the proportion of read pairs that contained the secondary mutation with the highest recall statistics. This enabled filtering of the data to remove sequences that contain co-segregating mutations (F), while retaining sequences that showed only a single enriched mutation across the gene body (G). H These sequences were then collated into a final profile by summing the enrichment across all identified mutations at any given residue.
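Steps A, D-E and H of this workflow can be sketched in a few lines. This is an illustrative sketch only, not the published pipeline: the function names and the `(position, codon)` representation are hypothetical, the enrichment scores from step C are taken as a given input, and the most frequent co-occurring mutation is used as a simple stand-in for "the secondary mutation with the highest recall statistics".

```python
from collections import Counter, defaultdict


def collapse_reads(reads):
    """Step A: collapse identical reads into unique sequences with counts."""
    return Counter(reads)


def cosegregation(read_pairs, enriched):
    """Steps D-E: summarise co-segregation for one enriched mutation.

    `read_pairs` is a list of frozensets, each holding the (position, codon)
    mutations found in one read pair. Returns a tuple of:
      - fraction of carrying pairs that contain only the enriched mutation,
      - the most frequent secondary mutation (None if there is none),
      - fraction of carrying pairs that contain that secondary mutation.
    """
    carrying = [muts for muts in read_pairs if enriched in muts]
    if not carrying:
        return 0.0, None, 0.0
    alone = sum(1 for muts in carrying if muts == {enriched})
    secondary = Counter(m for muts in carrying for m in muts if m != enriched)
    if secondary:
        top, n = secondary.most_common(1)[0]
    else:
        top, n = None, 0
    return alone / len(carrying), top, n / len(carrying)


def residue_profile(enrichment):
    """Step H: sum enrichment over all mutant codons at each residue.

    `enrichment` maps (position, codon) to a precomputed enrichment score.
    """
    profile = defaultdict(float)
    for (pos, _codon), score in enrichment.items():
        profile[pos] += score
    return dict(profile)
```

In this sketch, an enriched mutation with a high "alone" fraction and a low secondary fraction would be retained (panel G), while one that mostly appears together with a secondary mutation would be filtered out (panel F); the actual thresholds are not stated in the legend.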