

Table 1. Features of salient genes on chromosomes 13 and 17 with respect to the presence of alternatively spliced transcripts and non-synonymous single-nucleotide polymorphisms

Chromosome 13                        Chromosome 17
Gene (a)   AST (b)   nsSNPs (c)      Gene    AST   nsSNPs
BRCA2                                BRCA1
RB1        2         3               ERBB2
IRS2       1         3               TP53

a Ensembl protein and AST information.
b AST = alternative splice transcript.
c nsSNP = non-synonymous single-nucleotide polymorphism, assembled from data from the 1000 Genomes Project.

Supplemental Information

Supplemental Table 1: Additional Information for Table 1

Chromosome 13                        Chromosome 17
Gene    AST (a)   nsSNPs (b)         Gene    AST (a)   nsSNPs (b)
BRCA2                                BRCA1
RB1     2         3                  ERBB2
IRS2    1         3                  TP53

Chromosome 13 (a)
BRCA2 AST: ENSP ENSP ENSP
RB1 AST: ENSP ENSP
IRS2 AST: ENSP

Chromosome 17
BRCA1 AST: ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP
ERBB2 AST: ENSP ENSP ENSP ENSP ENSP ENSP
TP53 AST: ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP ENSP

Chromosome 13 (b)
BRCA2 nsSNPs:
  exon2: R18H
  exon3: P94S
  exon10: C315S; N289H; N372H; Q347R; Y600H
  exon11: 1290Y; 1902N; 1990A; D707N; D946E; 1895K; 2044V; 1561N; 2116R; 1364L; 1418V; 1903T; 1929V; L929S; M784V; 1459S; 1880K; N991D; Q713L; 2034C; 1074C; 1479T; 1733F; 1414M; 1915M; 2109I
  exon14: 2440R; 2339N; 2466A
  exon15: 2534V; 2490T; 2480V
  exon16: E2571G
  exon18: 2698T; 2729N; 2728I
  exon20: E2856A
  exon22: 2951T; 2944F; 2950N; 2969M
  exon24: V3079I
  exon26: K3196E
  exon27: 3412V; 3292L; 3374I; 3244I
RB1 nsSNPs:
  exon2: T58I
  exon17: A525G
  exon25: R876C
IRS2 nsSNPs:
  exon1: S667N; V999M; G1057D

Chromosome 17
BRCA1 nsSNPs:
  exon3: L201P
  exon4: H193R
  exon7: S160Y
  exon9: Q330R; D219Y; I353M
  exon10: E508G; S610G; K653R; R817G; N193D; K290E; P341L; P620S; S510N; V651I; D163N; C464R; Y326H
  exon11: S859N
  exon16: 1052I; 1143I; 1119T
ERBB2 nsSNPs:
  exon3: E79A
  exon7: E286K
  exon10: A386D
  exon12: W452C
  exon17: I654V; I655V
  exon25: G1015E
  exon26: P1135S; L1061P
  exon27: P1170A; A1216D; D1144H; E1244K
TP53 nsSNPs:
  exon4: P47S; P72R
  exon5: R181H
  exon8: R273H
  exon9: T312S
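The nsSNP labels above follow the usual one-letter amino-acid substitution shorthand: reference residue, position in the protein, variant residue (for example, P72R in TP53 is proline at position 72 replaced by arginine). As a minimal sketch of how such labels could be read programmatically, the Python below splits a label into its three parts and spells it out with three-letter codes. The function names `parse_substitution` and `describe` are hypothetical and not part of the original supplement; labels that lack a leading residue letter (such as 1915M) are deliberately rejected rather than guessed.

```python
import re

# Standard one-letter to three-letter amino-acid codes (20 canonical residues).
AA3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Reference residue, position, variant residue, e.g. "P72R".
_RESIDUE = "[" + "".join(AA3) + "]"
SUBSTITUTION = re.compile(rf"^({_RESIDUE})(\d+)({_RESIDUE})$")


def parse_substitution(label):
    """Split a label such as 'P72R' into (reference, position, variant)."""
    match = SUBSTITUTION.match(label.strip().upper())
    if match is None:
        # Incomplete labels (e.g. a missing reference residue) are not guessed.
        raise ValueError(f"not a complete substitution label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def describe(label):
    """Render a label with three-letter codes, e.g. 'P72R' as 'Pro72Arg'."""
    ref, pos, alt = parse_substitution(label)
    return f"{AA3[ref]}{pos}{AA3[alt]}"


if __name__ == "__main__":
    # TP53 labels as listed in the supplement.
    for label in ["P47S", "P72R", "R181H", "R273H", "T312S"]:
        print(label, "->", describe(label))
```

Rejecting incomplete labels keeps the sketch from inventing a reference residue that the list itself does not supply.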

Supplemental Table 2. Deliverables of the C-HPP and their applications

Deliverable category: Parts list
  Protein parts list: Gene based protein list; Observed protein list; Missing protein list
  Proteogenomic parts list: Known/observed nsSNP list; Known/observed alternative splicing variants
  Characterized proteomic parts list: Observed acetylated proteins; Observed glycosylated proteins; Observed phosphorylated proteins; Other PTMs (TBA)
  Other parts list: MRM/SRM peptides; Transcriptome
  Applications and significance:
    1. New biomarker candidates
    2. Novel functional attributes of proteins
    3. Complementary clues for disease
    4. New isoforms with new functions
    5. Integration of transcriptomics with proteomics
    6. Enlarged disease networks
    7. Use of genome information for more targeted studies
    8. Integrating classical genetics with modern biochemistry
    9. Expression of adjacent genes to proteins with related function

Deliverable category: Resources
  Labeled peptides: Proteotypic N-glycosite peptides; Phosphoproteotypic peptides; MRM/SRM peptides
  Antibody: Monoclonal/polyclonal antibodies (and antigens); Epitope characterization
  Samples: Representative tissues (normal and disease); Rare specimens (e.g., bone, hair, nasal, embryonic, fetal tissues); Disease cell lines/samples; Specific differentiation/cell cycle states
  SOPs: For sample preparation; For peptide MS data; For quantitation
  Applications and significance:
    1. Protein parts list
    2. Cell line/tissue bank
    3. Databases (genomics/proteomic)
    4. References for other researchers

Deliverable category: Functional study
  Annotation: Pathway mapping; Gene Ontology; Tissue localization
  Disease: Disease associated gene clusters; Protein's role(s) in disease(s)
  Functions: Confirmation of predicted functions; Unexpected protein function
  Applications and significance:
    1. New biological knowledge
    2. Provide missing links between gene and phenotype
    3. Transform classical genetics with modern biochemistry

Supplemental Figure 1. Organization of HPP Consortium. Shown here is the structure of the Human Proteome Project (HPP) consortium, which is composed of the chromosome-centric C-HPP and the biology and disease-driven B/D-HPP programs, together with the mass spectrometry, protein capture, and knowledge base resource pillars. In the C-HPP, multiple teams across a given country or countries will work together to analyze the protein parts list encoded by each chromosome, with respect to both functional analysis and identification of novel proteins and their isoforms. In the B/D-HPP, existing organ-based and biofluid-based HUPO initiatives will be joined by new participating laboratories committed to data sharing and standardization. Ab, antibody and affinity protein capture resource pillar; EC, executive committee; SSAB, senior scientific advisory board; KB, knowledge base resource pillar; MS, mass spectrometry resource pillar; PIC, principal investigator council.

Supplemental Figure 2. Components of the C-HPP research module. Representative working components of the C-HPP, in which genomics, proteomics and related molecular and cellular technologies are integrated according to the location of genes on each chromosome. The intended outcome of integrating these technologies is a better view of the relationship between genomic, transcriptomic and proteomic measurements and phenotypes.

Supplemental Figure 3. A web of bioinformatics designed for chromosome-centric proteomics studies. The HPP will construct a web portal that links to the major available proteomic and genomic databases built up over many years and that serves as a communications center for participating laboratories and the interested research community. The C-HPP portal will be a dynamic working interface for the laboratories and national teams participating in the C-HPP across all the chromosome assignments. It will be used to facilitate the overall progress and management of the C-HPP, and will become a central focal point of the C-HPP and the linked HPP portal for publicizing the project, its goals, and its results. It will provide links to all data and knowledge resources related to human proteins and provide access to proteomics identification and analysis tools, based on user requirements. Examples include PRIDE, Tranche, PeptideAtlas, and GPMDB for mass spectrometry data; UniProt, RefSeq, and Ensembl for reference sequence databases; IMEx for interactions; the Human Protein Atlas and Human Proteinpedia for organ and subcellular localization; SRMAtlas for spectra identifying proteotypic peptides; and the Protein Data Bank (PDB) for protein structures. This web will help the C-HPP interact in new ways with ongoing HUPO initiatives and other laboratories absorbed into the B/D-HPP component of the HPP.