Supplementary Methods

Microarray Data Analysis

Scanned chip images were analyzed in two ways:

1. The Affymetrix Microarray Suite (MAS) 5.0 was used to quantitate expression levels and obtain Presence/Absence (P/A) calls for targeted genes. The metrics sheets were exported from MAS and loaded into GeneSpring software (Silicon Genetics, Redwood City, CA). Genes were considered differentially expressed in each pair of EGFP-positive and EGFP-negative samples if they fulfilled the following criteria: i) they were called Present or Marginal in the EGFP-positive sample, and ii) the fold change (FC) in expression level between the EGFP-positive and EGFP-negative samples was above 2. Venn diagrams were used to generate a list of genes that were differentially expressed in all three pairs of samples (see Supplementary Tables 3 and 4 online).

2. dChip software (Wong Lab, Harvard) was used to obtain model-based gene expression indices (ref. 1). A total of 6 chips (from 3 separate EGFP-high and EGFP-negative cell isolations) were used to develop the model, and an invariant set normalization method was applied (ref. 2). Pairwise comparison was performed between each EGFP-positive sample and baseline (EGFP-negative sample). Genes with a 90% lower confidence bound (LCB) of the fold change above 1.2 in all three pairs of samples were considered to be differentially expressed (Supplementary Tables 3 and 4 online). The LCB is a conservative estimate of the FC that takes into account the standard errors and the levels of the expression indices. An LCB > 1.2 corresponds to an estimated fold change of 1.9 (refs. 3, 4).

The lists of differentially expressed genes generated from dChip were imported into GeneSpring and compared with the gene lists composed of genes with FC > 2. The overlapping genes from the dChip and GeneSpring lists were compiled. The original and combined lists are presented in Excel files in Supplementary Tables 3 and 4 online. All of the raw data are presented in an Excel file in Supplementary Table 6 online.

To analyze whether the probe sets from Affymetrix matched their annotated genes specifically, we first used the interactive query tool from the NetAffx Analysis Center to acquire the target sequence for the probe set, and then BLAST searched the target sequence in GenBank.

i) If the target sequence matched the annotated gene, we checked the full-length sequence (if applicable) of the gene using the probe match tool in the NetAffx Analysis Center to retrieve all possible matching probe sets on the same Affymetrix array. The genes that matched multiple probe sets were noted in Supplementary Tables 3 and 4 online. For example, CD34 matched probe sets _at and 97773_at. Within individual probe sets, we also noted the number of probes out of 16 that matched the gene sequence. We recorded this information in the "Affymetrix probe set analysis" column in Supplementary Tables 3 and 4 online. "Specific" refers to 16/16 matches. Probe sets with fewer matches are also noted.

ii) If the target sequence did not match the annotated gene or any other gene (for example, _at, Nephronectin), we removed the probe set from our final lists in Tables 1 and 2 and Supplementary Tables 1 and 2.

EST sequences were also BLAST searched, and updated information is provided in the tables. For example, the EST AW sequence detects the Disabled homolog 2 gene, and the AI sequence detects Actinin alpha 1.
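As an illustration of the gene selection logic described under Microarray Data Analysis, the following Python sketch applies both sets of criteria (MAS 5.0 call plus FC above 2, and dChip LCB above 1.2) to three sample pairs and intersects the resulting lists. The gene names, the numeric values, and the `PairResult` structure are hypothetical and are not taken from the study; the actual analysis was performed in MAS 5.0, dChip, and GeneSpring. The sketch also assumes that FC is expressed as the EGFP-positive over EGFP-negative ratio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairResult:
    """Results for one gene in one EGFP-positive / EGFP-negative pair (hypothetical layout)."""
    call: str    # MAS 5.0 detection call in the EGFP-positive sample: "P", "M" or "A"
    fc: float    # fold change, assumed here to be EGFP-positive / EGFP-negative
    lcb: float   # dChip 90% lower confidence bound of the fold change


def passes_mas(r: PairResult) -> bool:
    # Criteria i) and ii): Present or Marginal call, and FC above 2.
    return r.call in {"P", "M"} and r.fc > 2.0


def passes_dchip(r: PairResult) -> bool:
    # dChip criterion: LCB of the fold change above 1.2.
    return r.lcb > 1.2


def select(genes: dict, rule) -> set:
    # A gene is kept only if the rule holds in every sample pair.
    return {name for name, pairs in genes.items() if all(rule(r) for r in pairs)}


# Made-up values for three sample pairs per gene.
genes = {
    "geneA": [PairResult("P", 3.1, 1.8), PairResult("P", 2.6, 1.5), PairResult("M", 2.2, 1.3)],
    "geneB": [PairResult("P", 2.4, 1.3), PairResult("P", 2.1, 1.1), PairResult("P", 2.3, 1.4)],
    "geneC": [PairResult("P", 3.0, 1.6), PairResult("A", 2.8, 1.5), PairResult("P", 3.5, 1.9)],
    "geneD": [PairResult("P", 2.5, 1.4), PairResult("P", 1.7, 1.0), PairResult("P", 2.2, 1.3)],
}

mas_list = select(genes, passes_mas)
dchip_list = select(genes, passes_dchip)
overlap = mas_list & dchip_list

print(sorted(mas_list))    # ['geneA', 'geneB']
print(sorted(dchip_list))  # ['geneA', 'geneC']
print(sorted(overlap))     # ['geneA']
```

Requiring the rule to hold in all three pairs plays the role of the Venn diagram intersection, and the final set intersection corresponds to the overlapping dChip and GeneSpring lists.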

The genes in Tables 1 and 2 and Supplementary Tables 1 and 2 were classified according to information obtained from literature searches on PubMed. A different classification generated by the GeneSpring software is presented in Supplementary Table 5 online.

RNA Isolation and Quantitative Real Time PCR

The SYBR Green PCR assays were performed on an Opticon 2 DNA Engine (MJ Research). The QuantiTect SYBR Green PCR kit (Qiagen, Inc.) was used for all PCR reactions. GAPDH was used as an internal control to normalize each sample. A melting curve was generated for each gene product, and the reading temperature was chosen as 3-5 °C below the melting point to ensure the specificity of the PCR product. Each PCR cycle was then carried out as follows: 94 °C for 15 seconds, annealing for 30 seconds (see table below for annealing temperatures), 70 °C for 30 seconds, reading temperature for 15 seconds, and plate reading. This was repeated for 40 cycles. After 40 cycles, data were analyzed using Opticon software (MJ Research). The linearity of the fluorescence response for each sample and the baseline was checked to generate accurate Cts. The concentration curves were constructed using total RNA extracted from mouse back skin or positive control tissues, serially diluted 1:10, and each concentration was run in duplicate. The amounts of each gene and GAPDH in the samples were measured in duplicate, and their relative amounts were calculated by comparison to the concentration curve. The primer sequences and PCR temperatures used are summarized in the table below.
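The relative quantification against a concentration curve can be sketched in Python as follows. This is a minimal illustration, not the Opticon software's procedure: it assumes a linear relationship between Ct and log10 of the relative RNA amount, averages the duplicate measurements after converting them to amounts, and uses made-up Ct values. All function names are hypothetical.

```python
import math
from statistics import mean


def fit_standard_curve(points):
    """Least-squares fit of Ct = slope * log10(amount) + intercept.

    `points` is a list of (relative_amount, Ct) tuples from the serial dilution.
    """
    xs = [math.log10(amount) for amount, _ in points]
    ys = [ct for _, ct in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def amount_from_ct(ct, curve):
    """Read a relative amount off the fitted curve for a measured Ct."""
    slope, intercept = curve
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(gene_cts, gapdh_cts, gene_curve, gapdh_curve):
    """Mean gene amount divided by mean GAPDH amount for one sample."""
    gene_amount = mean(amount_from_ct(ct, gene_curve) for ct in gene_cts)
    gapdh_amount = mean(amount_from_ct(ct, gapdh_curve) for ct in gapdh_cts)
    return gene_amount / gapdh_amount


# Hypothetical 1:10 dilution series, each concentration in duplicate.
gene_curve = fit_standard_curve([
    (1.0, 20.0), (1.0, 20.0), (0.1, 23.3), (0.1, 23.3),
    (0.01, 26.6), (0.01, 26.6), (0.001, 29.9), (0.001, 29.9),
])
gapdh_curve = fit_standard_curve([
    (1.0, 16.0), (1.0, 16.0), (0.1, 19.3), (0.1, 19.3),
    (0.01, 22.6), (0.01, 22.6), (0.001, 25.9), (0.001, 25.9),
])

# Hypothetical duplicate Ct readings for two samples.
sample_1 = normalized_expression([20.0, 20.0], [19.3, 19.3], gene_curve, gapdh_curve)
sample_2 = normalized_expression([23.3, 23.3], [19.3, 19.3], gene_curve, gapdh_curve)

print(round(sample_1 / sample_2, 3))  # relative enrichment of sample 1 over sample 2
```

Dividing by the GAPDH amount from the same sample corrects for differences in input RNA, so the ratio of two normalized values gives the relative expression between samples.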

Gene                               Primer Pair                              Annealing Temp (°C)   Product Tm (°C)   Plate Read Temp (°C)
GAPDH                              5'-TGCCCAGAACATCATCCCTG-3'
                                   5'-ATCCACGACGGACACATTGG-3'
CCK                                5'-GGAAACAACCACACATACGACCC-3'
                                   5'-GGAGTCACTGAAGGAAACACTGCC-3'
FEX                                5'-CACCGCTTGTATCATTTGTTCAGC-3'
                                   5'-CATCCAACGAGTCTTCTCACTATG-3'
CD34                               5'-TGGGTCAAGTTGTGGTGGGAA-3'
                                   5'-GAAGAGGCGAGAGAGGAGAAATG-3'
Potassium Channel, subfamily K     5'-AAACCCACCGTGCTTGCTTC-3'
                                   5'-ATCTCCTGAGGCTGCTCCAATG-3'
Cysteine knot, BMP antagonist      5'-TCCCATAGCCCATCCCTTTC-3'
                                   5'-TCTGTCCCGTTTGCCATCAC-3'
TNF receptor superfamily, OPG      5'-ACCGAGTGTGTGAGTGTGAGGAAG-3'
                                   5'-ACCTGAGAAGAACCCATCTGGAC-3'
Fzd 2                              5'-CAAGGACATCGGCTACAACACC-3'
                                   5'-GCGAACACAGGAAGAAGCGAAG-3'
DKK 3                              5'-TACCTCTGAAAGCCAGTGCTCG-3'
                                   5'-CTTGGTTGTGACTTCTCGGTGTG-3'

We also used TaqMan probes and the Applied Biosystems Prism 7700 Sequence Detection System (PE Applied Biosystems) to study differential expression of the ID2 gene using previously described protocols (ref. 5). The ID2 primer sequences were 5'-GAGAACACGTTGAATGGACCTTT-3' and 5'-AAGTCTCTCATAAATAACGGTATCACAGTC-3'. The TaqMan probe sequence for ID2 was 5'-CGTCTTGCCCAGGTGTCTTGTTCTCC-3'.

References for Supplementary Methods:

1. Li, C. & Wong, W.H. Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc. Natl. Acad. Sci. 98, (2001).
2. Wong, S., Fares, M.A., Zimmermann, W., Butler, G. & Wolfe, K.H. Evidence from comparative genomics for a complete sexual cycle in the 'asexual' pathogenic yeast Candida glabrata. Genome Biol. 4, R10 (2003).

3. Ramalho-Santos, M., Yoon, S., Matsuzaki, Y., Mulligan, R.C. & Melton, D.A. "Stemness": transcriptional profiling of embryonic and adult stem cells. Science 298, (2002).
4. Yuen, T., Wurmbach, E., Pfeffer, R.L., Ebersole, B.J. & Sealfon, S.C. Accuracy and calibration of commercial oligonucleotide and custom cDNA microarrays. Nucleic Acids Res. 30, e48 (2002).
5. Xu, X., Lyle, S., Liu, Y., Solky, B. & Cotsarelis, G. Differential expression of cyclin D1 in the human hair follicle. Am. J. Pathol. 163, (2003).