A primer on SNPs - part 1
Single nucleotide polymorphisms (SNPs) offer the prospect of a systematic, predictive, genetically based approach to medicine and pharmaceutical development.
Dr Michael Phillips and Dr Michael Boyce-Jacino, Orchid BioSciences Inc
Unknown just a few years ago, SNPs (pronounced 'snips') are a focus of increasing pharmaceutical industry interest. But many people in the industry do not yet have a clear picture of precisely what SNPs are or why they are important. Yet the potential implications of SNPs for drug development, safety and marketing are so profound that they will fundamentally alter the way that the pharmaceutical industry operates in the coming years. Nothing illustrates the extraordinary nature of the SNP revolution better than the fact that most of the world's top pharma companies - companies that in other respects are tooth and nail competitors - are working together to bring the SNP revolution to fruition.
Figure 1. SNPs are single variations or polymorphic changes in the nucleotide sequences that are located at specific points throughout a person's genomic DNA.
What are SNPs?
Human DNA is composed of an ordered set of four nucleotide bases that spell out specific sets of instructions, as in a blueprint. These bases are represented by the letters A, C, T and G. The bases and the instructions they contain encode genes, the building blocks that make us human rather than frog, and make one person different from another. SNPs are variations in a single base pair, and are the most basic and common unit of genetic variation. For example, where most individuals have a C, some people might have a T (Figure 1). This would be considered a C to T SNP or polymorphism.
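The C to T example can be made concrete with a short sketch. It assumes two sequences that are already aligned and of equal length; the sequences themselves and the function name `find_snps` are invented for illustration and do not come from any real data set or tool.

```python
def find_snps(reference, sample):
    """Return (position, reference_base, sample_base) for each single-base difference.

    Assumes both sequences are already aligned and of equal length;
    positions are 1-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Hypothetical sequences: most individuals carry C at position 5, some carry T.
common = "AGTCCGTA"
variant = "AGTCTGTA"
print(find_snps(common, variant))  # [(5, 'C', 'T')]
```

The single tuple reported corresponds to what the text calls a C to T SNP at that position.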
It is estimated that there are 3-10 million SNPs in the present human population, representing only a fraction of a percent of the billions of base pairs that make up the human genome. Yet this tiny fraction of distinctive base pair changes accounts for virtually all the difference between individuals, including medically significant differences like variations in disease susceptibility or in response to medications.
54 Innovations in Pharmaceutical Technology
As the Human Genome Project nears completion, the entire DNA sequence for the human species is being painstakingly compiled. When finished, the raw genomic sequence will
represent the basic core plans of what constitutes a human being.
Figure 2. Life-cycle of a SNP.
The next phase of the Human Genome Project will focus on understanding the variation between individuals in the genome, including determining the number and types of genes present and identifying how these genes control and regulate biological processes. This process is referred to as annotation of the raw genetic information. One of the pivotal first steps in the annotation phase is the identification of SNPs at specific locations within the genomic sequence. To this end, The SNP Consortium (TSC) was established in 1999 as a coalition of leading pharmaceutical firms, including AstraZeneca, Aventis, Bayer, Bristol-Myers Squibb, Hoffmann-La Roche, Glaxo Wellcome, Novartis, Pfizer, Pharmacia and SmithKline Beecham, along with industrial partners Motorola, IBM and Amersham, academic genomic centres and the Wellcome Trust. Its aim is to promote the identification of SNPs and to ensure that they are kept in the public domain, where they will be available to biomedical researchers worldwide.
While the SNPs themselves will be freely available, opportunities for intellectual property around SNPs come into play when medical function or utility can be demonstrated.
Figure 3. Single-base primer extension - one of the more robust SNP scoring techniques.
Why do SNPs matter?
SNPs are of great interest because understanding their role in genetic variation will have a revolutionary impact on the way drugs are developed and medicine is practised. SNPs are genetic differences between individuals that manifest as differences in susceptibility to disease or in drug response, including variations in the efficacy or the safety of a compound for an individual patient. The study of this latter type of variation is referred to as pharmacogenetics.
Physicians and pharmacists have always been aware that some patients respond differently to drugs, but until now they have largely been unable to identify these individual differences prior to treatment, or even to understand them in a systematic way. Consequently, many new drugs that are effective and safe in the majority of patients fail because of the confounding effects of genetic variation in the minority. Conversely, many patients are prescribed drugs that produce adverse effects, or that lack efficacy, as a result of the presence of pharmacogenetic SNPs.
But as more and more SNPs are identified and characterised, this scenario will change. Far from being a confounding factor, genetic characteristics will become integral to clinical practice and pharmaceutical development as researchers develop the tools to harness the medical potential of genetic variability. With the SNP revolution, this goal is about to be realised.
The resulting changes in pharmaceutical development and marketing will include the following:
When new drugs are designed and developed, it will be possible to target specific SNP genotypes, leading to treatments that work particularly well for specific segments of the population.
Clinical trials should have a higher probability of success, since patient populations defined by SNP profiles would be expected to have stronger clinical responses and fewer adverse reactions than are seen in today's trials.
Clinical trials should also be smaller, faster and less expensive, since there will be fewer subjects to enrol and follow, and significant clinical endpoints should be more easily achieved.
Drugs that have failed clinical development in the past, either due to lack of overall efficacy or toxic responses, can be redeveloped once the poor responders have been characterised in terms of their SNP profiles and excluded from subsequent trials. This approach applies as well to existing drugs that have under-performed in the marketplace as a result of sub-optimal efficacy or adverse effects in significant subsets of patients.
As pharmacogenetic knowledge increases, it will become possible to use SNP profiles to help select the most appropriate treatment for patients when they become ill.
Some observers have expressed concern that SNPs and pharmacogenetics will fragment the overall pharmaceutical market and thereby threaten its financial structure. On the contrary, the incorporation of pharmacogenetics into drug research and development should enable the pharmaceutical industry to prosper, as SNP analyses allow companies to develop and market a greater number of drugs that are capable of achieving wide acceptance and use within their targeted segments. In any case, the health and economic advantages provided by more individualised medicine make its eventual adoption inevitable.
Table 1. Popular current methods of SNP scoring.
The life-cycle of a SNP
To illustrate how SNPs are actually used in drug discovery and development, the Life-Cycle of a
SNP provides a useful overview (Figure 2). New SNPs are first discovered, or identified, by comparing specific DNA sequences in anywhere from 5 to 100 individuals. Once new SNPs are identified, they must be confirmed - that is, each SNP is checked to ensure that it actually is a SNP and not a sequencing error, and the frequency with which it occurs (known as its allelic frequency) is roughly determined. Once a SNP has been confirmed, its potential utility is assessed in an initial clinical association study, involving 200-500 individuals. If the results of this study are promising, clinical trials involving several thousand subjects are needed to delineate the SNP's medical effects. Once validated, the SNP enters into routine diagnostic testing, for pharmacogenetic use in prescribing drugs or for assessing an individual's disease susceptibility.
After a SNP has been identified, an appropriate technology must be selected that will allow for SNP scoring - determining whether or not the SNP is present - in all subsequent steps in its life-cycle. SNP scoring is also referred to as genotyping. As shown in Figure 2, the number of patient samples that typically need to be scored varies at different parts of the cycle. As a result, SNP scoring throughput requirements vary over the course of the life-cycle, depending on how the SNP analysis will be used.
How are SNPs discovered?
If SNPs are going to have their anticipated impact, two requirements must be met. First, they will have to be discovered in large numbers - at the very least, hundreds of thousands. Secondly, they will have to be linked to clinical characteristics, such as disease susceptibility and response to drugs, in order ultimately to be used in medical practice and pharmaceutical development.
SNPs can be identified in a number of ways using various scanning technologies and discovery strategies.
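The discovery and confirmation steps described above, comparing sequences across a small panel and roughly estimating allelic frequency, can be sketched in a few lines. This is a toy illustration only: it assumes one already-aligned sequence per individual, and the panel data and the function name `discover_snps` are hypothetical. It makes no attempt to separate true SNPs from sequencing errors, which is what the confirmation step is for.

```python
from collections import Counter


def discover_snps(aligned_sequences):
    """Find positions where aligned sequences disagree, with allele frequencies.

    Returns {position: {base: frequency}} for every variable position
    (1-based). Assumes one sequence per individual, all aligned and of
    equal length.
    """
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("sequences must be aligned and of equal length")
    snps = {}
    for position, column in enumerate(zip(*aligned_sequences), start=1):
        counts = Counter(column)
        if len(counts) > 1:  # more than one base observed at this position
            snps[position] = {base: n / len(column) for base, n in counts.items()}
    return snps


# Hypothetical panel of five individuals; only position 4 varies.
panel = ["ACGCTA", "ACGCTA", "ACGTTA", "ACGCTA", "ACGTTA"]
print(discover_snps(panel))  # {4: {'C': 0.6, 'T': 0.4}}
```

With only five sequences the frequencies are very rough, which is one reason the later stages of the life-cycle score hundreds to thousands of individuals.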
A successful strategy for identifying SNPs is in silico identification. This involves computer-based approaches that identify genetic variation by comparing corresponding DNA sequences from many different nucleotide databases. The most common method for identifying SNPs is DNA sequencing of a small group of individuals, considered the gold standard for SNP identification because it is the most reliable way of discovering sequence variations. However, sequencing is a laborious process. While a number of alternative methods are emerging, there remains a need for more rapid, large-scale SNP detection strategies than currently exist.
How are SNPs scored?
Once a suspected SNP has been located, the work has only just begun. The steps that follow - confirmation, biological studies and clinical use in large numbers of patients - all require use of SNP scoring methods. Important attributes for SNP scoring approaches include accuracy, cost-effectiveness, flexibility and scalability - especially for high throughput analyses.
SNP scoring technologies now include a variety of methods for SNP detection, and these methods can be run on numerous platforms. SNP scoring technologies can be generally classified into one of two basic methodologies - those that measure the SNP directly and those that measure the presence of the SNP in a more indirect way. Table 1 lists popular current methods of SNP scoring.
Direct scoring technologies include DNA sequencing and variations that use an enzyme, DNA polymerase. Traditional sequencing is highly accurate but it is also costly and slow for high throughput SNP scoring. DNA polymerase methods use this enzyme to directly detect the variant nucleotide base that defines the SNP by incorporating the appropriate base. The incorporated base has a label which can be detected by various read-out methods. The advantages of direct scoring approaches include their high accuracy and robustness.
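The logic of polymerase-based direct scoring can be illustrated with a deliberately simplified model. It assumes the primer sits immediately next to the SNP site, so that the single labelled base added is the complement of the template base at that site. Primer design, strand orientation and signal noise are all ignored, and the templates and function names below are hypothetical rather than any vendor's actual method.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def incorporated_base(template, snp_index):
    """Base a polymerase would add opposite the template base at snp_index (0-based)."""
    return COMPLEMENT[template[snp_index]]


def score_genotype(template_alleles, snp_index):
    """Score one individual from the templates of their two chromosome copies.

    Returns the sorted labelled bases detected: one base for a homozygote,
    two for a heterozygote.
    """
    return sorted({incorporated_base(t, snp_index) for t in template_alleles})


# Hypothetical templates differing only at index 3 (C versus T).
print(score_genotype(["GATCAG", "GATCAG"], 3))  # ['G']
print(score_genotype(["GATCAG", "GATTAG"], 3))  # ['A', 'G']
```

In this model the read-out is simply which labels appear, so the base at the SNP site is observed directly rather than inferred from binding strength.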
Accuracy is particularly important for SNP scoring since even a low rate of inaccurate readings can make the data unusable.
Indirect scoring technologies use hybridisation, or a measurement of DNA binding, to detect SNPs. These methodologies indirectly measure the base composition of the SNP by the presence or absence of one of two specific DNA sequences that vary at only one base. Hybridisation approaches are harder to optimise and not as robust as direct detection technologies. Some methods are combinations of the two methodologies and involve hybridisation coupled to an enzymatic reaction. These combined approaches tend to have greater accuracy and robustness than hybridisation alone.
Many of the approaches listed above can be coupled to a variety of read-out methods, which measure the SNP score. These read-out methods include fluorescence, colour and mass detection. They can be used on a number of different analytic laboratory instrument platforms, including capillary electrophoresis, mass spectrometry, beads and microchips.
The method of SNP scoring that is transferable to the largest number of platforms is single base primer extension. This technique has the accuracy and robustness provided by direct detection, yet it can be used on almost any platform with virtually all methods of detection at any scale (Figure 3). The portability of approaches like Orchid's single base primer
extension technology makes it suitable for use in both the individual researcher's lab and in ultra-high throughput pharmacogenetic projects.
The future
We now have the capabilities to identify and document all the SNPs that exist in the human species. This also offers the possibility of creating a systematic, predictive, genetically based approach to medicine and pharmaceutical development. Although the technologies for accomplishing this task are still evolving, the tools already at our disposal will take us a long way towards this goal. What had previously been seen as a potentially endless journey has now become a finite task: identifying a defined set of information which, when characterised, has the potential to revolutionise the management of human disease and transform therapeutics.
In the next issue of Innovations in Pharmaceutical Technology, the authors will consider in detail the role of SNPs in pharmaceutical development.
Michael S Phillips, PhD, is a Senior Scientist at Orchid BioSciences Inc in Princeton (NJ, USA). Previously, he was a postdoctoral scientist at Merck Research Laboratories (West Point, PA, USA) with C Thomas Caskey in the Department of Human Genetics. While at Merck, his research concentrated on the identification of type 1 diabetes risk genes in both humans and the non-obese diabetic (NOD) mouse. In conjunction with the diabetes project, Dr Phillips assessed and utilised many different SNP technologies for use in pharmacogenomic applications. His PhD work focused on the understanding and analysis of the gene responsible for the anaesthetic-induced disease of malignant hyperthermia. At Orchid, he is the leader for a project that is determining the allelic frequencies for 60,000 SNPs identified by The SNP Consortium in three diverse populations. Dr Phillips received his PhD in biochemistry from the University of Toronto (Toronto, Canada) under the direction of Dr David H MacLennan; he also received his BSc from the University of Toronto in 1988.
Michael T Boyce-Jacino, PhD, is Chief Technology Officer and Vice President of Research at Orchid BioSciences Inc in Princeton (NJ, USA). Previously, he served in various capacities at Molecular Tool Inc from 1991 to 1998 as a scientist, General Manager and Director, Research and Development, and most recently President; he joined Orchid in 1998 when the company acquired Molecular Tool. Dr Boyce-Jacino received his BS in medical microbiology from the University of Wisconsin, and his PhD in microbiology from the University of Minnesota.
Websites
SNP consortium: snp.cshl.org
Orchid BioSciences: www.orchid.com