Analyzing Affymetrix GeneChip SNP 6 Copy Number Data in Partek

This example data set consists of 20 selected HapMap samples, representing 10 females and 10 males, drawn from a mixed ethnic population of 6 CEPH (Caucasians with ancestry from northern and western Europe), 6 Yoruban (African), 4 Chinese, and 4 Japanese individuals. Some of the samples were selected because they have interesting common copy number variations. The data are a subset of the full 270-sample data set provided by Affymetrix Inc. and were run on Affymetrix GeneChip Human Genome-Wide SNP 6.0 Mapping Arrays.

This tutorial will illustrate how to:
- Import data from CEL files and generate copy number
- Normalize data to controls
- Visually and statistically identify regions of copy number aberration
- Generate a list of genes in the regions

Note: If you have not already done so, it is recommended that you go through the Pattern Visualization System chapter in the Partek On-line Help before starting this tutorial. This tutorial covers information specific to this workflow; for general information on a variety of subjects, see the Partek On-line Help.

The data and most library files for this experiment can be downloaded from Affymetrix's Support Materials and Data Resource Center. Links to the necessary files are also available on the Partek tutorials page, which you can reach by selecting Help > On-line Tutorials in the Partek main menu.

For this example, put the unzipped Data files (.CEL and sample information files) in the folder C:\Partek Example Data\SNP6. Put the unzipped Library Files (.cdf, annot.csv) and the Baseline Files (.pbf) in the folder C:\Microarray Libraries\Affymetrix\SNP6. For suggested guidelines on where to store these files on your computer, see the Guidelines for Storing Affymetrix Library Files section of Chapter 4 of the Partek On-line Help.
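Before importing, it can help to confirm that the data folder really holds the 20 CEL files this tutorial expects. The following is a minimal sketch in Python, not part of Partek; the function names list_cel_files and check_sample_count are hypothetical, and the only assumption taken from the tutorial is that files are recognized by a .cel extension regardless of case.

```python
from pathlib import Path


def list_cel_files(folder):
    """Return the CEL files in `folder`, matching the extension case-insensitively."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".cel"
    )


def check_sample_count(folder, expected=20):
    """Return (matches_expected, number_found) for the CEL files in `folder`."""
    found = len(list_cel_files(folder))
    return found == expected, found
```

For the layout used in this tutorial, the call would be check_sample_count(r"C:\Partek Example Data\SNP6"). A count other than 20 usually means the archive was unzipped into a different folder or only partially extracted.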
Importing Affymetrix CEL Files and Estimating Copy Number

- Select Copy Number from the Workflows drop-down menu on the right side of the main window
Figure 1: Selecting the Unpaired Copy Number option within the Partek GS main window

- Select Load Study Data in the Copy Number workflow (Figure 1); the dialog in Figure 2 will appear

Figure 2: Loading allele intensity

- Click Import CEL (Figure 2)
- Click the Browse button to select the folder C:\Partek Example Data\SNP6. By default, all files with a .cel extension are selected (Figure 3)
Figure 3: Selecting the folder that contains .cel files for the experiment

- Click the -> button to move all the .cel files to the right panel. 20 files will be processed
- Click Next > to specify the library files and the output file (Figure 4)
- Select C:\Partek Example Data\SNP6\sampleInfo20.txt.fmt as the Sample Information File

By default, the output file will be stored in the same folder as the .cel files. For other ways of creating and editing a sample information spreadsheet, see Chapter 4, Importing and Exporting Data, in the Partek On-line Help.

Figure 4: Specifying library files and output file

- Click Import (Figure 4)
NetAffx integration will automatically download and use any needed library files that are not already stored on your computer. You must specify a valid NetAffx account and a file directory in which to store your library files. If you would rather specify existing files, click Cancel when prompted for the location to store the Affymetrix library files.

Figure 5: Files downloaded from NetAffx

Note: If NetAffx was unable to fill in any of the files, or the files are unavailable through NetAffx, you must specify the library and annotation files for the Human Genome-Wide SNP 6 chip yourself. The Chip Sequence Summary and Fragment Length Summary files are distributed by Partek and are used to adjust the signal estimates based on sequence and fragment length. The remaining files can be found on Affymetrix's chip support page (http://www.affymetrix.com/support/technical/byproduct.affx?product=genomewidesnp_6).

- Click OK (Figure 5)
Figure 6: Viewing the spreadsheet of 20 samples with Human Genome-Wide SNP 6.0 intensity

Note: Once 20TutorialSamples.fmt has been generated by the import, it can be opened by selecting File > Open, so the data does not need to be imported again.

Now that the data has been imported successfully, it can be normalized to the baseline to create unpaired copy number.

- Select Create Copy Number from the Copy Number workflow. A dialog like the one shown in Figure 7 will appear

Figure 7: Selecting the Study Type

- Select Unpaired (Figure 7)
Figure 8: Specifying the baseline that will be used as a reference when computing copy number for the imported data

- Browse to the baseline that will be used to create copy number from your data
- Click OK (Figure 8)

Next, we will visually and statistically detect regions with copy number aberration.
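Before moving on, the idea behind normalizing to a baseline can be sketched in a few lines of Python. This is an illustrative assumption, not Partek's documented computation: it treats the baseline as the expected intensity of a normal two-copy locus and scales each sample intensity by it. The function names and the ploidy parameter are hypothetical, and the sketch omits the sequence and fragment-length adjustments mentioned earlier as well as any special handling of sex chromosomes.

```python
import math


def unpaired_copy_number(sample, baseline, ploidy=2.0):
    """Estimate copy number per marker as ploidy * sample / baseline.

    Assumes `baseline` holds the reference intensity for a normal
    two-copy locus at each marker (an assumption of this sketch).
    """
    if len(sample) != len(baseline):
        raise ValueError("sample and baseline must cover the same markers")
    return [ploidy * s / b for s, b in zip(sample, baseline)]


def log2_ratio(sample, baseline):
    """Return the log2 ratio of sample to baseline intensity per marker."""
    if len(sample) != len(baseline):
        raise ValueError("sample and baseline must cover the same markers")
    return [math.log2(s / b) for s, b in zip(sample, baseline)]
```

Under this assumption, a marker whose intensity matches the baseline comes out at 2 copies, one at half the baseline intensity at 1 copy (a deletion), and one at 1.5 times the baseline at 3 copies (an amplification).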
Figure 9: Configuring the HMM-based test for significant regions

This computation produces two result spreadsheets. The first, called hmm_regions, lists every region in every sample whose state corresponds to an amplification or a deletion. Its child spreadsheet, called regions, contains only the regions that showed variation in at least 2 of the 20 samples.

Figure 10: Viewing the regions spreadsheet

- To visualize the genomic location of all of the detected regions, select View > Region Genome View. A region view similar to the one in Figure 11 will appear
Figure 11: Viewing the region view

This plot shows the genomic location of all regions across the entire genome. Clicking on a region indicator selects the corresponding sample and all of the probes within the region. You can also select a region and right-click for a gene list.

- Close the Region View

The regions spreadsheet gives the union of all samples that have overlapping regions. It may be more informative to show the regions for each individual sample.

- To look at each individual sample's detected variations, select the hmm_regions spreadsheet (Figure 12)
Figure 12: Invoking the chromosome view on a region spreadsheet

- Select View > Region Chromosome View (Figure 13)

Figure 13: Chromosome 8 Region Chromosome View
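The relationship between the two result spreadsheets (per-sample calls in hmm_regions, and the union of overlapping calls seen in at least 2 samples in regions) can be sketched as an interval merge. This is a minimal sketch under stated assumptions, not Partek's implementation: the tuple layout, the function name recurrent_regions, and the sample labels are hypothetical, touching intervals are treated as overlapping, and amplifications and deletions are not separated.

```python
def recurrent_regions(calls, min_samples=2):
    """Merge overlapping per-sample regions and keep the recurrent ones.

    `calls` is an iterable of (sample, chromosome, start, end) tuples,
    one per detected region, similar in spirit to hmm_regions.
    Returns merged regions supported by at least `min_samples` samples.
    """
    merged = []
    # Sort by chromosome, then start, so overlapping calls are adjacent.
    for sample, chrom, start, end in sorted(calls, key=lambda c: (c[1], c[2], c[3])):
        last = merged[-1] if merged else None
        if last and last["chrom"] == chrom and start <= last["end"]:
            # Overlaps the current merged region: extend it (union).
            last["end"] = max(last["end"], end)
            last["samples"].add(sample)
        else:
            merged.append(
                {"chrom": chrom, "start": start, "end": end, "samples": {sample}}
            )
    # A region counts as recurrent only if distinct samples support it.
    return [r for r in merged if len(r["samples"]) >= min_samples]
```

Counting distinct samples rather than calls matters here: two overlapping calls from the same sample merge into one region but do not make it recurrent.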
End of Tutorial

This is the end of the mapping SNP 6.0 copy number tutorial. If you need additional assistance with this data set, you can call our technical support staff at +1-314-878-2329 or email them at support@partek.com.

Copyright 2007 by Partek Incorporated. All Rights Reserved. Reproduction of this material without expressed written consent from Partek Incorporated is strictly prohibited.