Supplementary Table 1 - Repetitive elements in X-degenerate sequences of chimpanzee and human Y chromosomes

                  Chimpanzee                      Human                           Genome average
Repeat family     # of elements   % of sequence   # of elements   % of sequence   % of sequence
Alu
L
Retroviral
DNA transposon
Total

Repeat content analyzed using RepeatMasker. Genome averages derive from the following reference: Human Genome Sequencing Consortium, Initial sequencing and analysis of the human genome. Nature 409 (2001).

Supplementary Table 2 - Lineage-specific repetitive element integrations into X-degenerate sequences of chimpanzee and human Y chromosomes

Repetitive element               Chimpanzee   Human
AluY                             21           9
AluYa
AluYa8                           0            1
AluYb8                           0            7
AluYb9                           0            1
AluYc2                           2            2
AluYd3a1                         1            0
AluYd8                           0            2
AluYe5                           0            2
All Alus
L1                               1            0
L1HS                             0            2
L1P                              0            1
L1PA
All L1s                          14           7
CERV1*                           18           0
CERV2                            3            0
HERV9                            1            1
HERV-K                           1            0
All ERVs                         23           1
Grand totals
Summed length of sequence (bp)

Repeat content analyzed using RepeatMasker.
*Six of the CERV1 elements are solo-LTRs, which represent integrations of full-length elements followed by LTR-LTR recombination resulting in deletion of the internal sequence.

Supplementary Table 3 - Human-chimpanzee nucleotide substitution rates in coding sequences and introns of X-degenerate genes and pseudogenes. Much of this data is represented graphically in Figure 2a.

Sequence classification   # of substitutions   # of sites   subst./site   p-value (vs. introns; z-test)
genes
  coding                                                                  < (a)
  non-synonymous                                                          < (b)
  introns
pseudogenes
  coding                                                                  < (c)
  introns

(a) For the 16 X-degenerate genes, the coding sequence substitution rate (K_coding) is significantly lower than the intron substitution rate (K_intron).
(b) For the 16 X-degenerate genes, the rate of non-synonymous substitutions within the coding sequence (K_a) is significantly lower than K_intron.
(c) For the 11 X-degenerate pseudogenes, K_coding is significantly higher than K_intron. The higher G+C content of (former) coding vs. intron sequence (50.9% vs. 40.3%) likely accounts for the elevated substitution rate.
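The table reports only that a z-test was used to compare each substitution rate with the intron rate. As a minimal sketch of how such a comparison could be computed, the Python below applies a pooled two-proportion z-test to substitution counts and site counts. The pooled-variance form is an assumption, not something the table specifies, and the function name `rate_z_test` and the counts in the usage lines are hypothetical placeholders rather than values from the table.

```python
import math


def rate_z_test(subs_a, sites_a, subs_b, sites_b):
    """Two-sided, pooled two-proportion z-test on per-site substitution rates.

    Assumption: each site is treated as an independent trial, so a rate is
    simply substitutions / sites. Returns (rate_a, rate_b, z, p_value).
    """
    rate_a = subs_a / sites_a
    rate_b = subs_b / sites_b
    # Pooled rate under the null hypothesis that both classes share one rate.
    pooled = (subs_a + subs_b) / (sites_a + sites_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sites_a + 1 / sites_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts for illustration only (not taken from the table).
k_coding, k_intron, z, p = rate_z_test(100, 20_000, 1_500, 100_000)
print(f"K_coding={k_coding:.4f}  K_intron={k_intron:.4f}  z={z:.2f}  p={p:.3g}")
```

A negative z here means the first class (for example, coding sequence) has the lower rate, which is the direction described in footnotes a and b; a positive z corresponds to the pseudogene case in footnote c.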

Supplementary Table 4 - Analyzing transcription of chimpanzee X-degenerate genes: RT-PCR primers and expected product sizes

Gene       Forward primer               Reverse primer               Product sizes (bp) (cDNA / genomic)
CYorf15A   TTGATGCAAGCCCTAAACAC         TGCATTTACAATATGGAGTTGC       358 / 828
CYorf15B   AGGAAACGGAAAAGCTGACA         CCATCAGTACAATGATCACACTTC     461
DDX3Y      ATCGTATTGGCCGTACAGGA         GCTTCTTAAAGGTAAGGGTGACTT     499 / 1557
EIF1AY     CCCACCTGCTGCATCTTAGT         TTCTTAATTGATGATGGCCAAA       605
JARID1D    CTTAGCATTAAGGCCCGAC          TGGTTGGCTCCAGACTGAA          537
NLGN4Y     GCAAGGCGAGTTCCTCAAT          GGCCACTTCTTGAAAGCGA          601
PRKY       CTACCGCCTGCAGGACTTC          CCCAAAGTCCGTGAGCTTG          448
RPS4Y1     GATGTCATCAGCATCGAGAAG        GAAAAAAAGATATGCTGCTACTGC     558
RPS4Y2     AACAAGAATTCATTGTTTATTTGTGC   GCAAGGTTCGAGTGGACA           628
SRY        ATGAACGCATTCTTCGTGTG         GGCCTTTATTAGCCAGAGAAAA       548
TBL1Y      ATAATCACTGAAAGCCAATGGAA      TTATTACTTGGGACGTCGTGC        250
TMSB4Y     GTCCCCTGCGTTTTGAAATA         GCCTGTTTAAGATTCGCCTG         358 / 1110
USP9Y      TGGGCAGTGGAATGGCTA           AGAGGCAGGTGAATGAGGATAT       268 / 1024
UTY        GAACACAACACAAAGATCATTCA      CATGCCCAGAATGGTATGC          227 / 796
ZFY        GAAGAGGCAGATGTATCTGAAAAT     CCATTTCTGATTCTGCATCG         451

All primer pairs are intron-spanning, with the exception of SRY, which lacks introns. For those genes whose cDNA and genomic product sizes differed by roughly 1000 bp or less, both sizes are given.
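Because the primer pairs span introns, the difference between the genomic and cDNA product sizes gives the amount of intronic sequence lying between the two primers. The short Python sketch below works through that arithmetic for the five genes that list both sizes. The product sizes are copied from the table; the names `PRODUCT_SIZES` and `intron_span` are illustrative choices, not part of the original analysis.

```python
# Product sizes in bp, (cDNA, genomic), copied from Supplementary Table 4
# for the genes that report both values.
PRODUCT_SIZES = {
    "CYorf15A": (358, 828),
    "DDX3Y": (499, 1557),
    "TMSB4Y": (358, 1110),
    "USP9Y": (268, 1024),
    "UTY": (227, 796),
}


def intron_span(cdna_bp, genomic_bp):
    """Intronic sequence between the primers: genomic minus cDNA product size."""
    return genomic_bp - cdna_bp


spans = {gene: intron_span(*sizes) for gene, sizes in PRODUCT_SIZES.items()}
for gene, span in spans.items():
    print(f"{gene}: {span} bp of intron between the primers")
```

The largest difference, 1058 bp for DDX3Y, is consistent with the table's note that both sizes are given only when they differ by roughly 1000 bp or less.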

Supplementary Table 5 - GenBank accession numbers for chimpanzee cDNA sequences

Gene       Accession number
CYorf15A   AY
CYorf15B   AY
DDX3Y      AY
EIF1AY     AY
JARID1D    AY
NLGN4Y     AY
PRKY       AY
RPS4Y1     AY
RPS4Y2     AY
SRY        AY
TBL1Y      AY
USP9Y      AY
UTY        AY
ZFY        AY679779

Supplementary Table 6 - GenBank accession numbers for chimpanzee BAC and fosmid clones

BAC (or fosmid) name: 0490L F O K K13 P010N A A E N F P M O M B K21 P057F F17 P007D P I K M A D14 P044N F15 P164C H P H F19 P051P N02 P097L E C15
Accession number: AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC147710

BAC (or fosmid) name: 0170A I K L J E02 C3184H06 (fosmid) C1187E02 (fosmid) 0417F A K L J L K N L B F D C M10 P021M J A N A06 C1322F05 (fosmid) C0653B12 (fosmid) C3278C11 (fosmid) C0124C12 (fosmid) C1825H08 (fosmid) 0065E I K N I19 P043D G02 P002P22 P135J09
Accession number: AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC AC142308

BAC (or fosmid) name: P012I19 P001K P H E E L D F21
Accession number: AC AC AC AC AC AC AC AC AC146168