SUPPLEMENTARY INFORMATION


doi: /nature17405

Supplementary Information 1

Determining a suitable lower size-cutoff for sequence alignments to the nuclear genome

Analyses of nuclear DNA sequences from archaic genomes have until now been restricted to fragments of length 35bp or longer. However, to maximize the yield of DNA fragments we have used fragments as short as 30bp when reconstructing mtDNA genomes from the SH specimens. Lowering the size-cutoff for the analysis of the nuclear genome of the SH specimens increases the number of informative sequences available for analysis, but also increases the risk of including incorrectly aligned microbial sequences that have spurious similarity to the much larger nuclear genome.

The problem of spurious alignments becomes apparent when plotting the percentage of sequences that map to the human genome as a function of their length (Supplementary Figure 1). With the alignment software and parameters chosen here and in previous studies [1,2] (BWA [3] with options -n 0.01 -o 2 -l), we observe a steep increase in the success of mapping for sequences shorter than 35bp. However, this is driven largely by alignments that include mismatches. The fraction of alignments without mismatches only increases at a much shorter length of around 25bp. Since neither evolutionary differences (approximately 1 in 700bp [1]), nor deamination, nor sequencing error would reduce the fraction of perfectly aligning sequences in a manner that is dependent on sequence length to the extent observed here, we deem spurious alignments the best explanation for the difference between perfect matches and alignments allowing mismatches.

To quantify the proportion of spurious alignments with size cutoffs of 30 and 35bp we identified 11,299 sites where the human reference genome differs from all present-day humans in the 1000 Genomes Project phase I data [4] as well as from the chimpanzee genome (panTro2) [5].
We expect that nearly 100% of endogenous hominin sequences overlapping these sites, which represent errors in the human reference genome sequence or rare polymorphisms observed with ~0.1% frequency in present-day humans, will carry the state of other humans and the chimpanzee (i.e. the ancestral state). In contrast, less than ~3% of evolutionarily unrelated microbial sequences are expected to do so due to random similarity to the human reference genome (Supplementary Figure 2). By counting how often DNA fragments overlapping these positions share the derived state we obtained an approximate estimate of the proportion of spurious alignments. To minimize the impact of cytosine deamination in this analysis, we disregarded alignments if either the ancestral or the derived state at a given position was C in the orientation sequenced, or G if the complementary strand was sequenced. Using this approach (Supplementary Figure 2), we estimate that at a fragment length cut-off of 30bp between 9% and 67% of all aligned DNA fragments from the five specimens are likely of microbial origin. When we increase the fragment length cut-off to 35bp we find no evidence for spurious microbial alignments, although the numbers are admittedly small (Supplementary Table 1).

Supplementary Table 1: The proportion of sequences aligned randomly to the human genome, estimated as 1 minus the frequency of sharing the ancestral state, using lower size cut-offs of 30 and 35bp. 95% binomial confidence intervals are provided in brackets. All sequences used in this analysis aligned to the genome with a map quality score
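The estimate described for Supplementary Table 1 can be illustrated with a minimal sketch. It assumes a Wilson score interval, because the text only says "95% binomial confidence intervals" without naming the method, and the counts and function names are hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Two-sided Wilson score interval for a binomial proportion (assumed method)."""
    if trials <= 0:
        raise ValueError("at least one observation is required")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def spurious_fraction(n_ancestral, n_total):
    """Proportion of spurious alignments, taken as 1 minus the ancestral frequency."""
    low_anc, high_anc = wilson_interval(n_ancestral, n_total)
    estimate = 1 - n_ancestral / n_total
    # The interval flips when converting ancestral frequency to spurious fraction.
    return estimate, 1 - high_anc, 1 - low_anc


# Hypothetical counts: 18 of 20 informative fragments carry the ancestral state.
est, low, high = spurious_fraction(18, 20)
print(f"spurious alignments: {est:.1%} (95% CI {low:.1%} to {high:.1%})")
```

With few informative fragments the interval stays wide even when every fragment is ancestral, which matches the caveat that the numbers at the 35bp cut-off are small.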

Supplementary Figure 1: Percentage of sequences from SH femur XIII that map to the human reference genome when mismatches are allowed (solid line) or perfect matches required (dashed line) as a function of their size. The alignment parameters chosen here decrease the number of allowed mismatches by one for reads shorter than 22bp, leading to the observed dip in the percentage of mapped sequences.

Supplementary Figure 2: Probability of sharing rare variants with the human reference genome. These variants (green) are identified by requiring that all of the present-day human genomes sequenced in phase I of the 1000 Genomes Project as well as the chimpanzee differ from the reference genome, i.e. share the ancestral state (red). With the parameters chosen for mapping, the number of mismatches between a DNA fragment and the reference is limited to 3 for fragments <=41bp, to 4 for fragments >=42bp, and to 5 for fragments >=64bp, etc., i.e. less than 10% of the bases are allowed to differ from the reference genome in any given alignment. Given that differences to the reference genome can be due to three different bases, less than 3.3% of spuriously aligned microbial DNA fragments are expected to share the ancestral state at these positions. In contrast, nearly all (>99.9%) hominin sequences are expected to be ancestral at these positions.
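The length thresholds quoted in the captions of Supplementary Figures 1 and 2 are consistent with a Poisson rule of the kind BWA is understood to apply when -n is given as a fraction. The sketch below is an illustration under that assumption: the 2% per-base error rate is not stated in this text, and both function names are hypothetical.

```python
import math


def max_mismatches(length, fnr=0.01, per_base_error=0.02):
    """Smallest k with P(more than k differences) < fnr under a Poisson model.

    fnr corresponds to the -n 0.01 setting; per_base_error is an assumed value.
    """
    lam = length * per_base_error
    term = math.exp(-lam)          # probability of 0 differences
    cumulative = term
    for k in range(1, 1000):
        term *= lam / k            # probability of exactly k differences
        cumulative += term
        if 1.0 - cumulative < fnr:
            return k
    raise ValueError("no mismatch limit found for this length")


def ancestral_share_bound(length):
    """Upper bound on spurious fragments carrying one specific non-reference base.

    At most max_mismatches/length of the bases differ from the reference, and a
    difference can be any of three bases, so the bound is that fraction over 3.
    """
    return max_mismatches(length) / length / 3


for n in (21, 22, 30, 35, 41, 42, 63, 64):
    print(n, max_mismatches(n), f"{ancestral_share_bound(n):.2%}")
```

Under these assumptions a 30bp fragment may carry three mismatches, which gives the bound of roughly 3.3% used in the caption of Supplementary Figure 2.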

Supplementary Information 2

Using conditional substitution frequencies for detecting and quantifying ancient DNA in mixtures with recent contamination

Library-based sample preparation techniques in combination with high-throughput sequencing have revealed that authentic ancient DNA sequences show a distinct elevation of C to T substitutions close to their ends (or G to A substitutions, depending on the method used for library preparation) [6]. These substitutions, which arise due to cytosine deamination, accumulate predominantly at molecule ends and are much less frequent in contaminant DNA that was introduced during or after excavation of fossils [7]. Elevated frequencies of damage-induced substitutions can thus be used to provide evidence for the presence of endogenous ancient DNA. Although there is no strict linear relationship between the strength of deamination signals and the age of specimens [8], nearly all samples that are thousands of years in age have been reported to show terminal C to T substitution frequencies exceeding 10% (see for example ref. 9 and 10). Some samples in which authentic ancient DNA may still be present, including some of the SH specimens described here, exhibit lower frequencies of C to T substitutions, not compatible with the presumed age of the sequences, due to an overwhelming background of present-day contamination that dilutes the deamination signal.

To increase our ability to detect ancient DNA in highly contaminated samples based on patterns of DNA damage, we have recently introduced the concept of conditional substitution frequencies in the analysis of mtDNA sequences of SH femur XIII [11]. Conditional substitution frequencies are obtained by isolating sequences that show a C to T difference to the reference at their 5′ ends and determining the frequency of C to T substitutions at their 3′ ends, and vice versa.
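The definition of regular and conditional substitution frequencies can be made concrete with a small sketch. It assumes ungapped alignments given as (reference, read) string pairs in the orientation sequenced, looks only at the terminal base on each side, and uses made-up sequences; the function names are hypothetical.

```python
def c_to_t(ref_base, read_base):
    """True if the reference has C and the read has T at this position."""
    return ref_base == "C" and read_base == "T"


def substitution_frequencies(alignments):
    """Regular and conditional terminal C to T frequencies.

    Keys: '5' and '3' are regular frequencies at the 5' and 3' ends;
    '5|3' is the 5' frequency among reads with a C to T at the 3' end,
    and '3|5' is the reverse.
    """
    counts = {key: [0, 0] for key in ("5", "3", "5|3", "3|5")}  # [C->T, C sites]
    for ref, read in alignments:
        sub5 = c_to_t(ref[0], read[0])
        sub3 = c_to_t(ref[-1], read[-1])
        if ref[0] == "C":
            counts["5"][0] += sub5
            counts["5"][1] += 1
            if sub3:
                counts["5|3"][0] += sub5
                counts["5|3"][1] += 1
        if ref[-1] == "C":
            counts["3"][0] += sub3
            counts["3"][1] += 1
            if sub5:
                counts["3|5"][0] += sub3
                counts["3|5"][1] += 1
    return {k: (hits / total if total else None) for k, (hits, total) in counts.items()}


# Made-up alignments: two reads damaged at both ends, two at one end, two undamaged.
example = [
    ("CAGTC", "TAGTT"), ("CAGTC", "TAGTC"), ("CAGTC", "CAGTC"),
    ("CAGTC", "CAGTC"), ("CAGTC", "CAGTT"), ("CAGTC", "TAGTT"),
]
print(substitution_frequencies(example))
```

In a mixture of damaged and undamaged molecules, conditioning on damage at one end enriches for the damaged population, so the conditional frequencies exceed the regular ones.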
In the case of femur XIII, C to T substitution frequencies increased from 12 and 17% at the 5′ and 3′ ends of all sequences to 55 and 62% in the population of sequences with a C to T substitution at the other end. The latter numbers are close to the deamination signals detected in a similar-age cave bear from the same site, suggesting that they might be a good proxy for the deamination frequencies of the fraction of endogenous DNA fragments in the specimen. However, it is also conceivable that decay processes leading to deamination at one end of a molecule may increase the likelihood of deamination at the other end, thereby reducing the power to detect and quantify mixtures of ancient and contaminant DNA based on conditional substitution frequencies.

To test whether a population of highly deaminated molecules may also be present among contaminant DNA fragments, we analyzed published mtDNA sequences from two ancient hominin specimens, SH femur XIII [11] and a Late Pleistocene Neanderthal (Mezmaiskaya 1) [12], that show high levels of modern human contamination. Using diagnostic positions that differentiate the mtDNA genomes of 311 present-day humans from the two archaic individuals (132 positions for femur XIII and 41 for Mezmaiskaya 1), we separated contaminant and endogenous sequences based on their sharing of either the human-derived or the ancestral state. To minimize erroneous assignments due to deamination we disregarded sequences that overlap a diagnostic site with their first or last 3 positions. We then determined the frequencies of C to T substitutions in both sets of sequences, both with and without conditioning on the presence of a C to T substitution on the other end (Supplementary Table 2). As expected for mixtures of ancient and contaminant sequences, conditional substitution frequencies increase compared to regular substitution frequencies prior to separation of the sequences. In the subset of contaminant sequences, conditional substitution frequencies increase slightly in femur XIII but remain well below 10%, providing little evidence for the presence of a highly deaminated population of DNA fragments among the contaminant DNA in the two specimens.

To test whether deamination occurs independently on both ends of molecules we determined regular and conditional substitution frequencies using larger data sets of nuclear sequences from six archaic individuals [1,2,13] that have been estimated to contain negligible levels of modern human contamination (1% or less). Regular and conditional substitution frequencies are expected to be equal if deamination occurs independently at either end of the molecules.
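The separation of sequences at diagnostic positions can be sketched as follows. This is an illustration only: it assumes ungapped alignments with 0-based reference coordinates, and the function name and the example bases are hypothetical.

```python
def classify_fragment(start, sequence, site, human_base, archaic_base, margin=3):
    """Assign one fragment at one diagnostic site.

    start: 0-based reference coordinate of the fragment's first base.
    Returns 'contaminant', 'endogenous', or None when the fragment does not
    cover the site, covers it with its first or last `margin` positions, or
    shows a base matching neither state.
    """
    offset = site - start
    if offset < margin or offset >= len(sequence) - margin:
        return None  # outside the fragment or too close to an end
    base = sequence[offset]
    if base == human_base:
        return "contaminant"
    if base == archaic_base:
        return "endogenous"
    return None


# Made-up 10-base fragment starting at reference position 100.
fragment = "ACGTACGTAC"
for site in (102, 104, 106, 107):
    print(site, classify_fragment(100, fragment, site, human_base="A", archaic_base="G"))
```

Excluding the terminal positions keeps a damage-induced T from being mistaken for a diagnostic state, at the cost of discarding some informative fragments.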
However, conditional substitution frequencies are between 10% lower and 14% higher than regular substitution frequencies, indicating that deamination frequencies at the ends are not entirely independent (Supplementary Table 3). A reduction in conditional substitution frequencies in some samples may be explained by mapping bias, i.e. the difficulty of mapping sequences with two terminal C to T substitutions and possibly other differences to the reference genome. Comparisons of regular and conditional substitution frequencies therefore do not allow for accurate estimation of modern human contamination, but they can be used to detect the presence of small amounts of ancient sequences in a background of human contamination, and provide rough estimates of contamination in heavily contaminated samples. When applied to the nuclear DNA sequences from the Sima de los Huesos specimens, we estimate that modern human DNA contamination is present in at least a 2.7-fold excess in all libraries, contributing 63% or more of all sequences (Supplementary Table 3). To minimize the impact of contamination on our analyses, we therefore restricted down-stream analyses to only those fragments showing evidence of deamination.
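The rough contamination estimate can be read as a simple two-component mixture. The sketch below assumes that contaminant fragments carry no terminal deamination and that the conditional frequency approximates the deamination frequency of the endogenous fragments; neither assumption holds exactly, as noted above, and the function name is hypothetical.

```python
def contamination_from_deamination(regular, conditional):
    """Rough contamination fraction from regular and conditional frequencies.

    Under the assumed model, regular = (1 - c) * conditional, so
    c = 1 - regular / conditional. Negative values are clipped to 0.
    """
    if regular <= 0 or conditional <= 0:
        raise ValueError("frequencies must be positive")
    return max(0.0, 1 - regular / conditional)


# A conditional frequency 2.7 times the regular one implies about 63% contamination.
print(f"{contamination_from_deamination(1.0, 2.7):.0%}")
```

Under this reading, the 2.7-fold figure and the 63% figure quoted in the text describe the same ratio of conditional to regular substitution frequencies.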

Supplementary Table 2: C to T substitution frequencies and conditional substitution frequencies in mtDNA sequences from two archaic individuals. Diagnostic sites that differentiate the respective archaic individual from 311 present-day humans were used to identify subsets of sequences that are putatively of endogenous origin or from recent human contamination. The percentage of sequences identified as contamination from diagnostic sites is provided in the second column.

Specimen | Human contamination based on diagnostic sites [%] | Population of sequences | C to T substitution frequency [%] (observations), 5′ end | 3′ end | Conditional, 5′ end | Conditional, 3′ end
SH femur XIII (714,601 sequences) | 94.9 | All sequences | 6.2 (9,406/151,703) | (10,099/112,213) | (1,081/2,335) | 54.7 (1,082/1,978)
SH femur XIII | | Human contamination | 1.6 (532/33,269) | 2.3 (539/23,453) | 3.8 (5/132) | 5.2 (5/96)
SH femur XIII | | Endogenous | 59.8 (1,354/2,264) | 54.2 (1,004/1,852) | 56.5 (152/269) | 67.0 (152/227)
Mezmaiskaya 1 (library B9687, 28,955 sequences) | 43.3 | All sequences | 8.1 (512/6,320) | 13.7 (723/5,277) | 15.5 (28/181) | 25.9 (28/108)
Mezmaiskaya 1 | | Human contamination | 0.9 (5/572) | 0.7 (4/550) | 0.0 (0/1) | 0.0 (0/1)
Mezmaiskaya 1 | | Endogenous | 13.5 (93/690) | 18.3 (104/568) | 15.6 (5/32) | 16.7 (5/30)

Supplementary Table 3: Damage-induced substitution frequencies and conditional substitution frequencies in 6 archaic genomes and the SH libraries sequenced in the present study. Substitution frequencies at the 3′ ends of sequences in the Vindija and Mezmaiskaya samples correspond to G to A substitutions, as the respective libraries were generated using a double-stranded method. Rough contamination estimates based on the comparison of the two substitution frequencies are provided in the final column.

Columns: Specimen; C to T (or G to A) substitution frequency [%] (5′ end, 3′ end); Conditional (5′ end, 3′ end); Conditional / regular substitution frequencies; Contamination estimate based on deamination.

Specimens: Altai Neanderthal (chromosome 21 only); Denisova manual phalanx (chromosome 21 only); Vindija (all sequences); Vindija (all sequences); Vindija (all sequences); Mezmaiskaya 1 (all sequences); SH Femur XIII (AT2944); SH Incisor (AT-5482); SH Femur frag. (AT-5431); SH Molar (AT-5444); SH Scapula (AT-6672).

Supplementary Information 3

Estimating present-day human contamination based on the sharing of alleles that have risen to high frequency in modern humans since the split from the archaic hominins

The use of diagnostic sites, i.e. sites that show fixed differences between the hominin group from which the sequence comes and the potential contaminant group, is well established for ancient mtDNA analysis. Here, we extend this strategy to the autosomes. We identified 28,552 diagnostic positions where all present-day human genomes sequenced in phase I of the 1000 Genomes Project differ from the high-coverage genome sequences of the Denisovan individual, the Altai Neanderthal and the chimpanzee, bonobo, gorilla, orangutan and rhesus macaque. The states of the outgroups were retrieved from the Ensembl Compara EPO 6 primate whole genome alignments [14,15]. All sites were required to be located within unique regions of the human genome as defined by the map35_50% criteria described in Prüfer et al. (Supplementary Section 5b) [2]. Sequences from the SH specimens showing the derived, i.e. the human, state at such sites are due to present-day human contamination, variants segregating in the archaic populations that are not present in any of the four archaic chromosomes used in this analysis, or sequencing error. The percentage of sequences sharing the derived state can therefore be used to estimate an upper limit on the present-day human contamination present in the libraries.

To test the suitability of this approach we first performed the analysis using low-coverage sequence data from four Neanderthal individuals that were previously determined to contain less than 1% human contamination [2,13]. To reduce errors due to cytosine deamination, we masked out thymines in the first three positions and adenines in the last three to account for the occurrence of damage-derived C to T substitutions at the 5′ end and G to A substitutions at the 3′ end.
Based on the percentage of sequences sharing the derived state our contamination estimate for these samples is between 6.3 and 8.6% (Supplementary Table 4). Because this approach measures not only contamination, but also unknown variation in the archaic hominins and sequencing error, these estimates are higher than the actual contamination estimates but allow us to exclude contamination above the 10% level. We thus performed the same analysis for the SH sequences, this time masking out only thymines within the first and last three sequence positions, as single-stranded library preparation fully retains the strand information of the sequenced molecules and does not induce artificial G to A substitutions. Estimates of contamination are high in all SH libraries (78% or more), but drop to 20% for incisor AT-5482 and 0% for both femur AT-5431 and molar AT-5444 after filtering for putatively deaminated sequences by requiring a C to T substitution at the 5′ or 3′ terminus. Since the confidence intervals of our estimates are extremely wide due to the limited data, we tried to further increase the power of the analysis by considering sites (totaling 128,004) where the derived allele has risen to 90% or more frequency in modern humans. While providing more power, this analysis may slightly under-estimate contamination as some contaminant sequences may share the archaic state. Overall, contamination estimates are similar to the first estimates but with narrower confidence intervals (Supplementary Table 4).
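The two masking rules and the resulting upper limit on contamination can be sketched as below. The tuple layout, the function names and the choice to count only fragments showing either the derived or the ancestral state are assumptions made for illustration; the observations are invented.

```python
def is_masked(base, offset, length, single_stranded, n_terminal=3):
    """True if a base may reflect deamination damage and should be ignored.

    Double-stranded libraries: T in the first three or A in the last three positions.
    Single-stranded libraries: T in the first or last three positions.
    """
    in_first = offset < n_terminal
    in_last = offset >= length - n_terminal
    if single_stranded:
        return base == "T" and (in_first or in_last)
    return (base == "T" and in_first) or (base == "A" and in_last)


def derived_fraction(observations, single_stranded):
    """Fraction of usable observations showing the derived (present-day human) state.

    observations: iterable of (base, offset, fragment_length, derived, ancestral).
    Returns None when no usable observation remains.
    """
    derived = ancestral = 0
    for base, offset, length, derived_base, ancestral_base in observations:
        if is_masked(base, offset, length, single_stranded):
            continue
        if base == derived_base:
            derived += 1
        elif base == ancestral_base:
            ancestral += 1
    total = derived + ancestral
    return derived / total if total else None


# Invented observations on 40-base fragments.
obs = [
    ("T", 1, 40, "T", "C"),   # terminal T: masked under both rules
    ("A", 38, 40, "A", "G"),  # terminal A: masked only for double-stranded libraries
    ("G", 20, 40, "G", "A"),  # derived
    ("C", 10, 40, "T", "C"),  # ancestral
    ("C", 15, 40, "T", "C"),  # ancestral
]
print(derived_fraction(obs, single_stranded=False))
print(derived_fraction(obs, single_stranded=True))
```

Because archaic variation and sequencing error also produce derived observations, the returned fraction is best read as an upper limit on contamination, not as a point estimate.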

Supplementary Table 4: Nuclear contamination estimates based on sites that are derived in all, or more than 90%, of present-day humans and ancestral in two Denisovan and two Neanderthal chromosomes. 95% binomial confidence intervals are provided in brackets.

Specimen Femur XIII Library A2021 Incisor AT-5482 Femur fragment AT-5431 Molar AT-5444 Scapula AT-6672 Filtered for sequences with terminal C to T substitution Derived allele frequency in present-day humans
#sites #present-day human state Present-day human contamination [%] (95% C.I.) 81.1 ( ) ( ) 83.9 ( ) 20.0 ( ) 68.1 ( ) 0.0 ( ) 76.7 ( ) 0.0 ( ) 92.7 ( )
#sites #present-day human state ,811 1, N/A 1 1 Vindija , Vindija , Vindija , Mezmaiskaya 1-13,068 1, ( ) 6.0 ( ) 6.3 ( ) 8.6 ( ) 36,470 1,931 38,037 1,676 33,972 1,528 59,783 4,100
Present-day human contamination [%] (95% C.I.) 87.5 ( ) 62.5 ( ) 85.7 ( ) 20.8 ( ) 76.1 ( ) 18.2 ( ) 88.0 ( ) 0.0 ( ) 97.6 ( ) ( ) 5.3 ( ) 4.4 ( ) 4.5 ( ) 6.9 ( )

References

1. Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, (2012).
2. Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43-9 (2014).
3. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, (2009).
4. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, (2010).
5. The Chimpanzee Sequencing and Analysis Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437, (2005).
6. Briggs, A.W. et al. Patterns of damage in genomic DNA sequences from a Neandertal. Proc Natl Acad Sci U S A 104, (2007).
7. Krause, J. et al. A complete mtDNA genome of an early modern human from Kostenki, Russia. Curr Biol 20, (2010).
8. Sawyer, S., Krause, J., Guschanski, K., Savolainen, V. & Paabo, S. Temporal patterns of nucleotide misincorporations and DNA fragmentation in ancient DNA. PLoS One 7, e34131 (2012).
9. Gamba, C. et al. Genome flux and stasis in a five millennium transect of European prehistory. Nat Commun 5, 5257 (2014).
10. Allentoft, M.E. et al. Population genomics of Bronze Age Eurasia. Nature 522, (2015).
11. Meyer, M. et al. A mitochondrial genome sequence of a hominin from Sima de los Huesos. Nature 505, (2014).
12. Gansauge, M.T. & Meyer, M. Selective enrichment of damaged DNA molecules for ancient genome sequencing. Genome Res 24, (2014).
13. Green, R.E. et al. A draft sequence of the Neandertal genome. Science 328, (2010).
14. Paten, B., Herrero, J., Beal, K., Fitzgerald, S. & Birney, E. Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs. Genome Res 18, (2008).
15. Paten, B. et al. Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res 18, (2008).

An early modern human from Romania with a recent Neanderthal ancestor

An early modern human from Romania with a recent Neanderthal ancestor An early modern human from Romania with a recent Neanderthal ancestor The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation

More information

Fossils From Vindija Cave, Croatia (38 44 kya) Admixture between Archaic and Modern Humans

Fossils From Vindija Cave, Croatia (38 44 kya) Admixture between Archaic and Modern Humans Fossils From Vindija Cave, Croatia (38 44 kya) Admixture between Archaic and Modern Humans Alan R Rogers February 12, 2018 1 / 63 2 / 63 Hominin tooth from Denisova Cave, Altai Mtns, southern Siberia (41

More information

Detecting ancient admixture using DNA sequence data

Detecting ancient admixture using DNA sequence data Detecting ancient admixture using DNA sequence data October 10, 2008 Jeff Wall Institute for Human Genetics UCSF Background Origin of genus Homo 2 2.5 Mya Out of Africa (part I)?? 1.6 1.8 Mya Further spread

More information

Inconsistencies in Neanderthal Genomic DNA Sequences

Inconsistencies in Neanderthal Genomic DNA Sequences Inconsistencies in Neanderthal Genomic DNA Sequences Jeffrey D. Wall *, Sung K. Kim Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America

More information

LETTER. An early modern human from Romania with a recent Neanderthal ancestor

LETTER. An early modern human from Romania with a recent Neanderthal ancestor doi:10.1038/nature14558 An early modern human from Romania with a recent Neanderthal ancestor Qiaomei Fu 1,2,3 *, Mateja ajdinjak 3 *, ana Teodora Moldovan 4, Silviu Constantin 5, Swapan Mallick 2,6,7,

More information

Supplementary information ATLAS

Supplementary information ATLAS Supplementary information ATLAS Vivian Link, Athanasios Kousathanas, Krishna Veeramah, Christian Sell, Amelie Scheu and Daniel Wegmann Section 1: Complete list of functionalities Sequence data processing

More information

Overview One of the promises of studies of human genetic variation is to learn about human history and also to learn about natural selection.

Overview One of the promises of studies of human genetic variation is to learn about human history and also to learn about natural selection. Technical design document for a SNP array that is optimized for population genetics Yontao Lu, Nick Patterson, Yiping Zhan, Swapan Mallick and David Reich Overview One of the promises of studies of human

More information

A mitochondrial genome sequence of a hominin from Sima de los Huesos

A mitochondrial genome sequence of a hominin from Sima de los Huesos LETTER doi:10.1038/nature12788 A mitochondrial genome sequence of a hominin from Sima de los Huesos Matthias Meyer 1, Qiaomei Fu 1,2, Ayinuer Aximu-Petri 1, Isabelle Glocke 1, Birgit Nickel 1, Juan-Luis

More information

Supporting Information

Supporting Information Supporting Information Eriksson and Manica 10.1073/pnas.1200567109 SI Text Analyses of Candidate Regions for Gene Flow from Neanderthals. The original publication of the draft Neanderthal genome (1) included

More information

File S1 Technical details of a SNP array optimized for population genetics

File S1 Technical details of a SNP array optimized for population genetics File S1 Technical details of a SNP array optimized for population genetics Yontao Lu, Nick Patterson, Yiping Zhan, Swapan Mallick and David Reich Overview One of the promises of studies of human genetic

More information

Learning about human population history from ancient and modern genomes

Learning about human population history from ancient and modern genomes APPLICATIONS OF NEXT-GENERATION SEQUENCING Learning about human population history from ancient and modern genomes Mark Stoneking* and Johannes Krause Abstract Genome-wide data, both from SNP arrays and

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION doi:138/nature10532 a Human b Platypus Density 0.0 0.2 0.4 0.6 0.8 Ensembl protein coding Ensembl lincrna New exons (protein coding) Intergenic multi exonic loci Density 0.0 0.1 0.2 0.3 0.4 0.5 0 5 10

More information

Quantifying and reducing spurious alignments for the analysis of ultra-short ancient DNA sequences

Quantifying and reducing spurious alignments for the analysis of ultra-short ancient DNA sequences de Filippo et al. BMC Biology (2018) 16:121 https://doi.org/10.1186/s12915-018-0581-9 RESEARCH ARTICLE Open Access Quantifying and reducing spurious alignments for the analysis of ultra-short ancient DNA

More information

Genome 373: Mapping Short Sequence Reads II. Doug Fowler

Genome 373: Mapping Short Sequence Reads II. Doug Fowler Genome 373: Mapping Short Sequence Reads II Doug Fowler The final Will be in this room on June 6 th at 8:30a Will be focused on the second half of the course, but will include material from the first half

More information

The Red Queen Model of Recombination Hotspots Evolution in the Light of Archaic and Modern Human Genomes

The Red Queen Model of Recombination Hotspots Evolution in the Light of Archaic and Modern Human Genomes The Red Queen Model of Recombination Hotspots Evolution in the Light of Archaic and Modern Human Genomes Yann Lesecque 1, Sylvain Glémin 2, Nicolas Lartillot 1, Dominique Mouchiroud 1, Laurent Duret 1

More information

Letter. The genome of the offspring of a Neanderthal mother and a Denisovan father

Letter. The genome of the offspring of a Neanderthal mother and a Denisovan father Letter https://doi.org/10.1038/s41586-018-0455-x The genome of the offspring of a Neanderthal mother and a Denisovan father Viviane Slon 1,7 *, Fabrizio Mafessoni 1,7, Benjamin Vernot 1,7, Cesare de Filippo

More information

Supplementary Figures

Supplementary Figures Supplementary Figures 1 Supplementary Figure 1. Analyses of present-day population differentiation. (A, B) Enrichment of strongly differentiated genic alleles for all present-day population comparisons

More information

Addressing Challenges of Ancient DNA Sequence Data Obtained with Next Generation Methods

Addressing Challenges of Ancient DNA Sequence Data Obtained with Next Generation Methods DISSERTATION Addressing Challenges of Ancient DNA Sequence Data Obtained with Next Generation Methods submitted in fulfillment of the requirements for the degree Doctorate of natural science doctor rerum

More information

Supplementary Information for The ratio of human X chromosome to autosome diversity

Supplementary Information for The ratio of human X chromosome to autosome diversity Supplementary Information for The ratio of human X chromosome to autosome diversity is positively correlated with genetic distance from genes Michael F. Hammer, August E. Woerner, Fernando L. Mendez, Joseph

More information

Supplementary Figures

Supplementary Figures Supplementary Figures A B Supplementary Figure 1. Examples of discrepancies in predicted and validated breakpoint coordinates. A) Most frequently, predicted breakpoints were shifted relative to those derived

More information

Variant calling workflow for the Oncomine Comprehensive Assay using Ion Reporter Software v4.4

Variant calling workflow for the Oncomine Comprehensive Assay using Ion Reporter Software v4.4 WHITE PAPER Oncomine Comprehensive Assay Variant calling workflow for the Oncomine Comprehensive Assay using Ion Reporter Software v4.4 Contents Scope and purpose of document...2 Content...2 How Torrent

More information

Course summary. Today. PCR Polymerase chain reaction. Obtaining molecular data. Sequencing. DNA sequencing. Genome Projects.

Course summary. Today. PCR Polymerase chain reaction. Obtaining molecular data. Sequencing. DNA sequencing. Genome Projects. Goals Organization Labs Project Reading Course summary DNA sequencing. Genome Projects. Today New DNA sequencing technologies. Obtaining molecular data PCR Typically used in empirical molecular evolution

More information

Supplemental Figures Supplemental Figure 1.

Supplemental Figures Supplemental Figure 1. Supplemental Material: Annu. Rev. Genom. Hum. Genet. 2017. 18:321-356 https://doi.org/10.1146/annurev-genom-091416-035526 A Robust Framework for Microbial Archaeology Warinner et al. Supplemental Figures

More information

After cell death, DNA steadily decays. If microbes do

After cell death, DNA steadily decays. If microbes do How reliable are genomes from ancient DNA? Brian Thomas and Jeffrey Tomkins Many reports of ancient DNA (adna) assert recovery from specimens with age assignments that greatly exceed Scripture s age of

More information

Supplement to: The Genomic Sequence of the Chinese Hamster Ovary (CHO)-K1 cell line

Supplement to: The Genomic Sequence of the Chinese Hamster Ovary (CHO)-K1 cell line Supplement to: The Genomic Sequence of the Chinese Hamster Ovary (CHO)-K1 cell line Table of Contents SUPPLEMENTARY TEXT:... 2 FILTERING OF RAW READS PRIOR TO ASSEMBLY:... 2 COMPARATIVE ANALYSIS... 2 IMMUNOGENIC

More information

Supplementary Material online Population genomics in Bacteria: A case study of Staphylococcus aureus

Supplementary Material online Population genomics in Bacteria: A case study of Staphylococcus aureus Supplementary Material online Population genomics in acteria: case study of Staphylococcus aureus Shohei Takuno, Tomoyuki Kado, Ryuichi P. Sugino, Luay Nakhleh & Hideki Innan Contents Estimating recombination

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION doi:10.1038/nature26136 We reexamined the available whole data from different cave and surface populations (McGaugh et al, unpublished) to investigate whether insra exhibited any indication that it has

More information

Supplementary Figure 1 Schematic view of phasing approach. A sequence-based schematic view of the serial compartmentalization approach.

Supplementary Figure 1 Schematic view of phasing approach. A sequence-based schematic view of the serial compartmentalization approach. Supplementary Figure 1 Schematic view of phasing approach. A sequence-based schematic view of the serial compartmentalization approach. First, barcoded primer sequences are attached to the bead surface

More information

Chang Xu Mohammad R Nezami Ranjbar Zhong Wu John DiCarlo Yexun Wang

Chang Xu Mohammad R Nezami Ranjbar Zhong Wu John DiCarlo Yexun Wang Supplementary Materials for: Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller Chang Xu Mohammad R Nezami Ranjbar Zhong Wu John

More information

REPORT. Complex History of Admixture between Modern Humans and Neandertals. Benjamin Vernot 1, * and Joshua M. Akey 1, *

REPORT. Complex History of Admixture between Modern Humans and Neandertals. Benjamin Vernot 1, * and Joshua M. Akey 1, * REPORT Complex History of Admixture between Modern Humans and Neandertals Benjamin Vernot 1, * and Joshua M. Akey 1, * Recent analyses have found that a substantial amount of the Neandertal genome persists

More information

C3BI. VARIANTS CALLING November Pierre Lechat Stéphane Descorps-Declère

C3BI. VARIANTS CALLING November Pierre Lechat Stéphane Descorps-Declère C3BI VARIANTS CALLING November 2016 Pierre Lechat Stéphane Descorps-Declère General Workflow (GATK) software websites software bwa picard samtools GATK IGV tablet vcftools website http://bio-bwa.sourceforge.net/

More information

ESTIMATING GENETIC VARIABILITY WITH RESTRICTION ENDONUCLEASES RICHARD R. HUDSON1

ESTIMATING GENETIC VARIABILITY WITH RESTRICTION ENDONUCLEASES RICHARD R. HUDSON1 ESTIMATING GENETIC VARIABILITY WITH RESTRICTION ENDONUCLEASES RICHARD R. HUDSON1 Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104 Manuscript received September 8, 1981

More information

CS262 Computation Genomics Winter 2015 Lecture 15 Human Population Genomics (02/24/2015) Scribed by: Junjie (Jason) Zhu Image Source: Lecture Notes

CS262 Computation Genomics Winter 2015 Lecture 15 Human Population Genomics (02/24/2015) Scribed by: Junjie (Jason) Zhu Image Source: Lecture Notes CS262 Computation Genomics Winter 2015 Lecture 15 Human Population Genomics (02/24/2015) Scribed by: Junjie (Jason) Zhu Image Source: Lecture Notes Introduction As the cost of sequencing individuals is

More information

HISTORICAL LINGUISTICS AND MOLECULAR ANTHROPOLOGY

HISTORICAL LINGUISTICS AND MOLECULAR ANTHROPOLOGY Third Pavia International Summer School for Indo-European Linguistics, 7-12 September 2015 HISTORICAL LINGUISTICS AND MOLECULAR ANTHROPOLOGY Brigitte Pakendorf, Dynamique du Langage, CNRS & Université

More information

Genome-Wide Survey of MicroRNA - Transcription Factor Feed-Forward Regulatory Circuits in Human. Supporting Information

Genome-Wide Survey of MicroRNA - Transcription Factor Feed-Forward Regulatory Circuits in Human. Supporting Information Genome-Wide Survey of MicroRNA - Transcription Factor Feed-Forward Regulatory Circuits in Human Angela Re #, Davide Corá #, Daniela Taverna and Michele Caselle # equal contribution * corresponding author,


Nature Biotechnology: doi: /nbt Supplementary Figure 1. Read Complexity Supplementary Figure 1 Read Complexity A) Density plot showing the percentage of read length masked by the dust program, which identifies low-complexity sequence (simple repeats). Scrappie outputs a significantly


Whole Human Genome Sequencing Report This is a technical summary report for PG DNA Whole Human Genome Sequencing Report This is a technical summary report for PG0002601-DNA Physician and Patient Information Physician name: Vinodh Naraynan Address: Suite 406 222 West Thomas Road Phoenix


Lecture 12. Genomics. Mapping. Definition Species sequencing ESTs. Why? Types of mapping Markers p & Types Lecture 12 Reading Lecture 12: p. 335-338, 346-353 Lecture 13: p. 358-371 Genomics Definition Species sequencing ESTs Mapping Why? Types of mapping Markers p.335-338 & 346-353 Types 222 omics Interpreting


Understanding Accuracy in SMRT Sequencing Understanding Accuracy in SMRT Sequencing Jonas Korlach, Chief Scientific Officer, Pacific Biosciences Introduction Single Molecule, Real-Time (SMRT ) DNA sequencing achieves highly accurate sequencing


Nature Genetics: doi: /ng.3254 Supplementary Figure 1 Comparing the inferred histories of the stairway plot and the PSMC method using simulated samples based on five models. (a) PSMC sim-1 model. (b) PSMC sim-2 model. (c) PSMC sim-3


Course Information. Introduction to Algorithms in Computational Biology Lecture 1. Relations to Some Other Courses Course Information Introduction to Algorithms in Computational Biology Lecture 1 Meetings: Lecture, by Dan Geiger: Mondays 16:30 18:30, Taub 4. Tutorial, by Ydo Wexler: Tuesdays 10:30 11:30, Taub 2. Grade:


Genetics 101. Prepared by: James J. Messina, Ph.D., CCMHC, NCC, DCMHS Assistant Professor, Troy University, Tampa Bay Site Genetics 101 Prepared by: James J. Messina, Ph.D., CCMHC, NCC, DCMHS Assistant Professor, Troy University, Tampa Bay Site Before we get started! Genetics 101 Additional Resources http://www.genetichealth.com/


2. True or False? The sequence of nucleotides in the human genome is 90.9% identical from one person to the next. 1. True or False? A typical chromosome can contain several hundred to several thousand genes, arranged in linear order along the DNA molecule present in the chromosome. 2. True or False? The sequence of


Mate-pair library data improves genome assembly De Novo Sequencing on the Ion Torrent PGM APPLICATION NOTE Mate-pair library data improves genome assembly Highly accurate PGM data allows for de Novo Sequencing and Assembly For a draft assembly, generate


Population Genetics, Systematics and Conservation of Endangered Species Population Genetics, Systematics and Conservation of Endangered Species Discuss Population Genetics and Systematics Describe how DNA is used in species management Wild vs. Captive populations Data Generation:


Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014 Single Nucleotide Variant Analysis H3ABioNet May 14, 2014 Outline What are SNPs and SNVs? How do we identify them? How do we call them? SAMTools GATK VCF File Format Let s call variants! Single Nucleotide


BST227 Introduction to Statistical Genetics. Lecture 8: Variant calling from high-throughput sequencing data BST227 Introduction to Statistical Genetics Lecture 8: Variant calling from high-throughput sequencing data 1 PC recap typical genome Differs from the reference genome at 4-5 million sites ~85% SNPs ~15%


Introduction to Algorithms in Computational Biology Lecture 1 Introduction to Algorithms in Computational Biology Lecture 1 Background Readings: The first three chapters (pages 1-31) in Genetics in Medicine, Nussbaum et al., 2001. This class has been edited from


Separating Population Structure from Recent Evolutionary History Separating Population Structure from Recent Evolutionary History Problem: Spatial Patterns Inferred Earlier Represent An Equilibrium Between Recurrent Evolutionary Forces Such as Gene Flow and Drift. E.g.,


Rev. Cell Biol. Mol. Medicine Paleogenomics 243 Rev. Cell Biol. Mol. Medicine Paleogenomics 243 Paleogenomics Peter D. Heintzman *, André E. R. Soares *, Dan Chang, and Beth Shapiro Department of Ecology and Evolutionary Biology, University of California


Supporting Online Material for www.sciencemag.org/cgi/content/full/314/5801/960/dc1 Supporting Online Material for The Transcriptome of the Sea Urchin Embryo Manoj P. Samanta, Waraporn Tongprasit, Sorin Istrail, R. Andrew Cameron, Qiang


Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1 Human SNP haplotypes Statistics 246, Spring 2002 Week 15, Lecture 1 Human single nucleotide polymorphisms The majority of human sequence variation is due to substitutions that have occurred once in the


SUPPLEMENTARY INFORMATION doi:10.1038/nature12886 Contents SI 1: Sampling, Library Preparation and Sequencing; SI 2a: Processing and Mapping; SI 2b: Altai Neandertal Mitochondrial Genome Sequence


Proceedings of the World Congress on Genetics Applied to Livestock Production, Genomics using the Assembly of the Mink Genome B. Guldbrandtsen, Z. Cai, G. Sahana, T.M. Villumsen, T. Asp, B. Thomsen, M.S. Lund Dept. of Molecular Biology and Genetics, Research Center Foulum, Aarhus




Deleterious mutations Deleterious mutations Mutation is the basic evolutionary factor which generates new versions of sequences. Some versions (e.g. those concerning genes, lets call them here alleles) can be advantageous,


Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA Published online 22 December 2009 Nucleic Acids Research, 2010, Vol. 38, No. 6 e87 doi:10.1093/nar/gkp1163 Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA Adrian W.



GENETICS - CLUTCH CH.15 GENOMES AND GENOMICS. !! www.clutchprep.com CONCEPT: OVERVIEW OF GENOMICS Genomics is the study of genomes in their entirety Bioinformatics is the analysis of the information content of genomes - Genes, regulatory sequences,


Genomic resources. for non-model systems Genomic resources for non-model systems 1 Genomic resources Whole genome sequencing reference genome sequence comparisons across species identify signatures of natural selection population-level resequencing


Lesson Overview DNA Replication 12.3 THINK ABOUT IT Before a cell divides, its DNA must first be copied. How might the double-helix structure of DNA make that possible? Copying the Code What role does DNA polymerase play in copying DNA?


Petar Pajic 1 *, Yen Lung Lin 1 *, Duo Xu 1, Omer Gokcumen 1 Department of Biological Sciences, University at Buffalo, Buffalo, NY. The psoriasis associated deletion of late cornified envelope genes LCE3B and LCE3C has been maintained under balancing selection since Human Denisovan divergence Petar Pajic 1 *, Yen Lung Lin 1 *, Duo


SUPPLEMENTARY FIGURES AND TABLES. Exploration of miRNA families for hypotheses generation SUPPLEMENTARY FIGURES AND TABLES Exploration of miRNA families for hypotheses generation Timothy K. K. Kamanu, Aleksandar Radovanovic, John A. C. Archer, and Vladimir B. Bajic King Abdullah University


CS 262 Lecture 14 Notes Human Genome Diversity, Coalescence and Haplotypes CS 262 Lecture 14 Notes Human Genome Diversity, Coalescence and Haplotypes Coalescence Scribe: Alex Wells 2/18/16 Whenever you observe two sequences that are similar, there is actually a single individual


Obtaining DNA from degraded samples for NGS sequencing Obtaining DNA from degraded samples for NGS sequencing A brief overview of Alexander (Sasha) Mikheyev s lecture at USC 03/13/14 Presented by Jacqueline Robinson 04/23/2014 NGS is great, but Standard protocols


ChIP-seq and RNA-seq. Farhat Habib ChIP-seq and RNA-seq Farhat Habib fhabib@iiserpune.ac.in Biological Goals Learn how genomes encode the diverse patterns of gene expression that define each cell type and state. Protein-DNA interactions


Supporting Online Material for www.sciencemag.org/cgi/content/full/328/5979/710/dc1 Supporting Online Material for A Draft Sequence of the Neandertal Genome Richard E. Green,* Johannes Krause, Adrian W. Briggs, Tomislav Maricic, Udo


Supplementary Figure 1 Nucleotide Content E. coli End. Neb. Tr. End. Neb. Tr. Supplementary Figure 1 Fragmentation Site Profiles CRW1 End. Son. Tr. End. Son. Tr. Human PA1 Position Fragmentation site profiles. Nucleotide content


More often heard about on television dramas than on the news, DNA is the key to solving crimes the scientific way. Although it has only been DNA Matching More often heard about on television dramas than on the news, DNA is the key to solving crimes the scientific way. Although it has only been relatively recent (compared the course of forensic


The complete genome sequence of a Neanderthal from the Altai Mountains ARTICLE doi:10.1038/nature12886 The complete genome sequence of a Neanderthal from the Altai Mountains Kay Prüfer 1, Fernando Racimo 2, Nick Patterson 3, Flora Jay 2, Sriram Sankararaman 3,4, Susanna Sawyer


BIOINFORMATICS ORIGINAL PAPER BIOINFORMATICS ORIGINAL PAPER Vol. 27 no. 21 2011, pages 2957 2963 doi:10.1093/bioinformatics/btr507 Genome analysis Advance Access publication September 7, 2011 : fast length adjustment of short reads


Title: A high-coverage Neandertal genome from Vindija Cave in Croatia Title: A high-coverage Neandertal genome from Vindija Cave in Croatia Authors: Kay Prüfer 1,*, Cesare de Filippo 1,, Steffi Grote 1,, Fabrizio Mafessoni 1,, Petra Korlević 1, Mateja Hajdinjak 1, Benjamin


Ancient DNA from pre-columbian South America Ancient DNA from pre-columbian South America Guido Marcelo Valverde Garnica Australian Centre for Ancient DNA Department of Genetics and Evolution School of Biological Sciences The University of Adelaide


Chapter 17. PCR the polymerase chain reaction and its many uses. Prepared by Woojoo Choi Chapter 17. PCR the polymerase chain reaction and its many uses Prepared by Woojoo Choi Polymerase chain reaction 1) Polymerase chain reaction (PCR): artificial amplification of a DNA sequence by repeated


Toward high-resolution population genomics using archaeological samples DNA Research, 2016, 23(4), 295 310 doi: 10.1093/dnares/dsw029 Advance Access Publication Date: 19 July 2016 Invited Review Invited Review Toward high-resolution population genomics using archaeological


Outline General NGS background and terms 11/14/2016 CONFLICT OF INTEREST. HLA region targeted enrichment. NGS library preparation methodologies Eric T. Weimer, PhD, D(ABMLI) Assistant Professor, Pathology & Laboratory Medicine, UNC School of Medicine Director, Molecular Immunology Associate Director, Clinical Flow Cytometry, HLA, and Immunology


SUPPLEMENTARY INFORMATION Contents: De novo assembly; Assembly statistics for all 150 individuals; HHV6b integration; Comparison of assemblers; Variant calling and genotyping; Protein truncating variants (PTV)


APPLICATION NOTE APPLICATION NOTE www.swiftbiosci.com Approaching Single-Cell Sequencing by Understanding NGS Library Complexity and Bias Abstract Demands are growing on genomics to deliver higher quality sequencing data


What Are the Chemical Structures and Functions of Nucleic Acids? THE NUCLEIC ACIDS What Are the Chemical Structures and Functions of Nucleic Acids? Nucleic acids are polymers specialized for the storage, transmission, and use of genetic information. DNA = deoxyribonucleic


The complete genome sequence of a Neandertal from the Altai Mountains The complete genome sequence of a Neandertal from the Altai Mountains The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation


Parts of a standard FastQC report FastQC FastQC, written by Simon Andrews of Babraham Bioinformatics, is a very popular tool used to provide an overview of basic quality control metrics for raw next generation sequencing data. There are


Whole genome sequencing in the UK Biobank Whole genome sequencing in the UK Biobank Part of the UK Government s Industrial Strategy Challenge Fund (ISCF) for the Data to Early Diagnosis and Precision Medicine initiative Aim to produce deep characterisation


Genetic Identification of Ancient Korean Remains Genetic Identification of Ancient Korean Remains Kyoung-Jin Shin, D.D.S., Ph.D. Department of Forensic Medicine Yonsei University College of Medicine, Seoul, Korea Merits and difficulties of ancient DNA


Single-cell genome sequencing at ultra-high-throughput with microfluidic droplet barcoding CORRECTION NOTICE Nat. Biotechnol. doi:10.1038/nbt.3880 Single-cell genome sequencing at ultra-high-throughput with microfluidic droplet barcoding Freeman Lan, Benjamin Demaree, Noorsher Ahmed & Adam R


12/8/09 Comp 590/Comp Fall 12/8/09 Comp 590/Comp 790-90 Fall 2009 1 One of the first, and simplest models of population genealogies was introduced by Wright (1931) and Fisher (1930). Model emphasizes transmission of genes from one


Incorporating Molecular ID Technology. Accel-NGS 2S MID Indexing Kits Incorporating Molecular ID Technology Accel-NGS 2S MID Indexing Kits Molecular Identifiers (MIDs) MIDs are indices used to label unique library molecules MIDs can assess duplicate molecules in sequencing


ANTH 6491: Anthropological Genetics Wednesdays pm, Purple Seminar Room, 6 th floor Science & Engineering Hall Fall 2015 ANTH 6491: Anthropological Genetics Wednesdays 6.10-8 pm, Purple Seminar Room, 6 th floor Science & Engineering Hall Fall 2015 Brief Summary: A detailed examination of molecular approaches to understanding


Comparison and Evaluation of Cotton SNPs Developed by Transcriptome, Genome Reduction on Restriction Site Conservation and RAD-based Sequencing Comparison and Evaluation of Cotton SNPs Developed by Transcriptome, Genome Reduction on Restriction Site Conservation and RAD-based Sequencing Hamid Ashrafi Amanda M. Hulse, Kevin Hoegenauer, Fei Wang,


Systematic evaluation of spliced alignment programs for RNA- seq data Systematic evaluation of spliced alignment programs for RNA- seq data Pär G. Engström, Tamara Steijger, Botond Sipos, Gregory R. Grant, André Kahles, RGASP Consortium, Gunnar Rätsch, Nick Goldman, Tim


Question 2: There are 5 retroelements (2 LINEs and 3 LTRs), 6 unclassified elements (XDMR and XDMR_DM), and 7 satellite sequences. Bio4342 Exercise 1 Answers: Detecting and Interpreting Genetic Homology (Answers prepared by Wilson Leung) Question 1: Low complexity DNA can be described as sequences that consist primarily of one or


DNA METHYLATION RESEARCH TOOLS SeqCap Epi Enrichment System Revolutionize your epigenomic research DNA METHYLATION RESEARCH TOOLS Methylated DNA The SeqCap Epi System is a set of target enrichment tools for DNA methylation assessment


UHT Sequencing Course Large-scale genotyping. Christian Iseli January 2009 UHT Sequencing Course Large-scale genotyping Christian Iseli January 2009 Overview Introduction Examples Base calling method and parameters Reads filtering Reads classification Detailed alignment Alignments


From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow Technical Overview Import VCF Introduction Next-generation sequencing (NGS) studies have created unanticipated challenges with


SUPPLEMENTARY INFORMATION doi:10.1038/nature10163 Supplementary Table 1 Efficiency of vector construction. Process wells recovered efficiency (%) Recombineering* 480 461 96 Intermediate plasmids 461 381 83 Recombineering efficiency


Introduction to Short Read Alignment. UCD Genome Center Bioinformatics Core Tuesday 14 June 2016 Introduction to Short Read Alignment UCD Genome Center Bioinformatics Core Tuesday 14 June 2016 From reads to molecules Why align? Individual A Individual B ATGATAGCATCGTCGGGTGTCTGCTCAATAATAGTGCCGTATCATGCTGGTGTTATAATCGCCGCATGACATGATCAATGG


Human Genetic Variation. Ricardo Lebrón Dpto. Genética UGR Human Genetic Variation Ricardo Lebrón rlebron@ugr.es Dpto. Genética UGR What is Genetic Variation? Origins of Genetic Variation Genetic Variation is the difference in DNA sequences between individuals.


Supplementary Methods 2. Supplementary Table 1: Bottleneck modeling estimates 5 Supplementary Information Accelerated genetic drift on chromosome X during the human dispersal out of Africa Keinan A, Mullikin JC, Patterson N, and Reich D Supplementary Methods 2 Supplementary Table


SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION doi:10.1038/nature24473 1. Supplementary Information Computational identification of neoantigens Neoantigens from the three datasets were inferred using a consistent pipeline


Unit 2: Biological basis of life, heredity, and genetics Unit 2: Biological basis of life, heredity, and genetics 1 Issues with Darwin's Evolutionary Theory??? 2 Cells - General Composition Organelles - substructures in the cell which do different things involved


ENGR 213 Bioengineering Fundamentals April 25, A very coarse introduction to bioinformatics A very coarse introduction to bioinformatics In this exercise, you will get a quick primer on how DNA is used to manufacture proteins. You will learn a little bit about how the building blocks of these
