Source1  Source2  Target             f3  Std. Err.  Z  SNPs  Samples
Mala     CHB      BEB (Bengali)
Mala     CHB      Thakur
Mala     CHB      Hazara
Mala     CHB      Wan
Mala     CHB      Syon
Mala     CHB      Shah
Mala     CHB      Bink
Mala     CHB      Khasi
Mala     CHB      Tharu_Uttrakhand
Mala     CHB      Tharu_UP
Mala     CHB      Newar
Mala     CHB      Magar
Mala     CHB      Burusho
Mala     CHB      Minero
Mala     CHB      Sherpa
Mala     CHB      Chakehshanega
Mala     CHB      Nyshi
Mala     CHB      Poumainaga
Mala     CHB      Nagaseema
Mala     CHB      Rajbanshi
Mala     CHB      Vysya

Supplementary Table 1. Groups with significant evidence of East Asian admixture. Table of statistics of the form f3(Target; Mala, CHB). Negative statistics indicate that the target population descends from an admixture of Mala (proxy for ANI and ASI ancestry) and CHB (proxy for East Asian-like ancestry). Standard errors are based on a Block Jackknife (see Online Methods).
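As an illustration of how a statistic of the form f3(Target; Mala, CHB) and its jackknife standard error can be computed, the following is a minimal sketch rather than the pipeline used in the study. It assumes per-SNP allele frequencies are already available, uses the naive estimator (the mean of (c - a)(c - b) over SNPs) without the finite-sample bias correction that published estimators typically include, and uses equal-sized contiguous SNP blocks. The function names and the toy frequencies are hypothetical.

```python
import math


def f3_products(target, source1, source2):
    """Per-SNP terms (c - a) * (c - b) from allele frequencies."""
    return [(c - a) * (c - b) for c, a, b in zip(target, source1, source2)]


def f3_block_jackknife(target, source1, source2, n_blocks):
    """Return (f3, standard error, Z) from a delete-one-block jackknife.

    Assumes n_blocks >= 2 and equal-sized contiguous blocks; trailing SNPs
    that do not fill a block are dropped in this sketch.
    """
    products = f3_products(target, source1, source2)
    block_len = len(products) // n_blocks
    products = products[: block_len * n_blocks]
    total = sum(products)
    n = len(products)
    estimate = total / n

    # Recompute the statistic with each block left out in turn.
    leave_one_out = []
    for b in range(n_blocks):
        block_sum = sum(products[b * block_len:(b + 1) * block_len])
        leave_one_out.append((total - block_sum) / (n - block_len))

    # Delete-one jackknife variance for equal-sized blocks.
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out
    )
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z


# Toy frequencies (made up for illustration): the target is an exact
# 50/50 mixture of the two sources at every SNP.
mala = [0.1, 0.9, 0.2, 0.8]
chb = [0.9, 0.1, 0.8, 0.2]
target = [0.5, 0.5, 0.5, 0.5]
estimate, se, z = f3_block_jackknife(target, mala, chb, n_blocks=2)
print(estimate, se, z)  # negative f3 and negative Z for this mixture
```

When the target's frequencies lie between those of the two sources, each per-SNP product is negative, which is why a negative f3 is read as evidence of admixture.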

Genotyping Platform  Comparison Group                               Correlation (r)
Human Origins        IBD Score and F_ST Score
Human Origins        IBD Score and Population-Specific Drift
Human Origins        F_ST Score and Population-Specific Drift
Affymetrix 6.0       IBD Score and F_ST Score
Affymetrix 6.0       IBD Score and Population-Specific Drift
Affymetrix 6.0       F_ST Score and Population-Specific Drift
Illumina             IBD Score and F_ST Score
Illumina             IBD Score and Population-Specific Drift
Illumina             F_ST Score and Population-Specific Drift
Illumina_Omni        IBD Score and F_ST Score
Illumina_Omni        IBD Score and Population-Specific Drift
Illumina_Omni        F_ST Score and Population-Specific Drift

Supplementary Table 2. Measurements of founder event strength. For each group on the Indian Cline, we computed three measures of founder event strength: IBD Score, F_ST Score, and model-based population-specific drift. The three measures are highly correlated.
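The pairwise comparisons in this table can be sketched as follows. This is a minimal example, assuming each measure is stored as a list with one value per group in the same group order; the function names and the toy values are hypothetical and are not results from the study.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(measures):
    """measures: dict mapping measure name -> list of per-group values."""
    return {
        (a, b): pearson_r(measures[a], measures[b])
        for a, b in combinations(measures, 2)
    }


# Made-up values for four hypothetical groups, for illustration only.
toy = {
    "IBD Score": [0.5, 1.0, 2.0, 4.0],
    "F_ST Score": [0.01, 0.02, 0.04, 0.08],
    "Population-Specific Drift": [0.02, 0.03, 0.05, 0.09],
}
for pair, r in pairwise_correlations(toy).items():
    print(pair, round(r, 3))
```

Running the same function once per genotyping platform would give one row of the table per platform and comparison.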

Supplementary Table 3. IBD sharing across groups. Groups with more than one match for high shared IBD across groups (greater than an IBD score of 3, corresponding to ~1/3 the founder effect size of Ashkenazi Jews).

Supplementary Figure 1. Power calculation to determine the number of samples needed to accurately detect a strong founder event. In Chenchu (A), Finns (B), and Ashkenazi Jews (C), only 3-5 individuals were needed to attain enough power to calculate mean IBD sharing with relatively small standard errors. This indicates that in groups with a strong founder effect, one can determine the relative founder effect size by sampling a small number of individuals. Groups with smaller founder effects than those in Ashkenazi Jews and Finns will likely require larger sample sizes to detect founder events, but these groups are of less medical interest from the perspective of founder event disease gene mapping. Based on this analysis, we aimed to genotype 5 individuals per group for the new genotyping reported in this study.
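One way to picture a power calculation of this kind is to subsample individuals repeatedly and watch how the spread of the mean IBD sharing changes with sample size. The sketch below is an assumption about the general approach, not the procedure used for the figure: it takes a symmetric matrix of pairwise IBD sharing, draws k individuals at random many times, and reports the mean and standard deviation of the subsample means. The function names, the number of draws, and the toy matrix are all hypothetical.

```python
import itertools
import random
import statistics


def mean_pairwise(ibd, individuals):
    """Mean IBD sharing over all pairs among the chosen individuals."""
    pairs = list(itertools.combinations(individuals, 2))
    return sum(ibd[i][j] for i, j in pairs) / len(pairs)


def subsample_spread(ibd, k, n_draws=200, seed=0):
    """Mean and spread of mean pairwise IBD across random subsamples of size k."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    everyone = range(len(ibd))
    means = [
        mean_pairwise(ibd, rng.sample(everyone, k)) for _ in range(n_draws)
    ]
    return statistics.mean(means), statistics.pstdev(means)


# Toy matrix (made up): six individuals who all share the same amount of IBD.
toy_ibd = [[0.0 if i == j else 2.0 for j in range(6)] for i in range(6)]
for k in (2, 3, 5):
    print(k, subsample_spread(toy_ibd, k))
```

With real data the spread would be expected to shrink as k grows, and the smallest k giving an acceptably small spread would suggest a target sample size.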

Supplementary Figure 2. Principal Components Analysis of Indian Human Origins data along with CEU, CHB, and YRI.

Supplementary Figure 3. Principal Components Analyses subdivided by SNP array platform. (A) Affy 6.0, (B) Human Origins, (C) Illumina, and (D) Illumina_Omni datasets. These plots are used to separate the groups into different clusters for F_ST analyses.

Supplementary Figure 4. Phylogenetic tree of Indian groups. PHYLIP (ref. 25) was used to create a neighbor-joining tree based on F_ST distances with Yoruba as outgroup. iTOL (ref. 26) was used for visualization (branch lengths are ignored to allow more uniform spacing).

(A), (B) Comparison Group and Correlation (r^2) for each pair of platforms: Affymetrix 6.0 and Illumina; Affymetrix 6.0 and Human Origins; Affymetrix 6.0 and Illumina_Omni; Human Origins and Illumina; Human Origins and Illumina_Omni; Illumina and Illumina_Omni.

Supplementary Figure 5. Platform differences in raw IBD score. IBD Scores for 1000 Genomes populations in all 4 genotyping platforms without normalization (A) and after normalizing to CEU in each dataset (B).
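The normalization in panel (B) can be illustrated with a short sketch. It assumes that "normalizing to CEU" means dividing every population's IBD score by the CEU score from the same dataset, which is an interpretation rather than a documented formula. The function name, platform labels, and scores below are hypothetical.

```python
def normalize_to_reference(scores, reference="CEU"):
    """Divide each population's score by the reference population's score."""
    ref = scores[reference]
    if ref == 0:
        raise ValueError("reference score must be non-zero")
    return {pop: value / ref for pop, value in scores.items()}


# Made-up raw scores for two hypothetical platforms; the second platform
# reports values on twice the scale of the first.
raw_by_platform = {
    "Platform X": {"CEU": 2.0, "CHB": 3.0, "YRI": 1.0},
    "Platform Y": {"CEU": 4.0, "CHB": 6.0, "YRI": 2.0},
}
normalized = {
    platform: normalize_to_reference(scores)
    for platform, scores in raw_by_platform.items()
}
print(normalized)
```

In this toy case the two platforms differ only by a scale factor, so their normalized scores coincide; real platform differences need not be purely multiplicative.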

Supplementary Figure 6. Model used for estimating group-specific drift in Indian groups (Paniyas is an example Indian group). R=root; OoA=out of Africa; ASA=ancestral South Asian; ASI=ancestral Southern Indian; AWE=ancestral West Eurasian; ANI=ancestral North Indian; APOP=ancestral Indian population.