Gene Editing EDITION. An Introduction To Gene Editing



WELCOME

Since the 1970s, the idea of inserting new DNA into an organism's genome has been the focus of many different research studies. Over the last 40 years or so, our experiments have evolved from simple microinjections of viral DNA, to complex semi-synthetic nucleases, to highly accurate technologies such as CRISPR, with each new stage of development improving our understanding of genomics. These technologies have helped us to determine the location and function of countless genes that had previously been mysteries and taught us more about humanity as a species.

Beyond this, there is more to gene editing than just the pursuit of knowledge; the ability to alter, insert, or inactivate genes has incredible potential in precision medicine against a range of diseases like HIV or cancer. There are concerns too. Any conversation that involves gene editing will ultimately turn to the morality of changing the fundamental code of nature and what risks might be faced if we make a mistake that can be passed on to a nearly infinite number of descendants. Gene editing has always been a controversial topic but as our knowledge increases, the potential benefits of this technology are also growing.

Gene Editing 101 is designed to illuminate this often confusing and sometimes poorly explained topic, while showcasing the enormous potential of these technologies. With the help of our sponsors and external contributors, we've tried to put together an unbiased, clear guide that covers the basics of modern gene editing. We focus on the central topics and techniques currently being used around the world to make this guide as accessible and functional as possible. As with the rest of the 101 series, this guide is not intended to act as a practical instruction manual for performing gene editing; instead, we want to offer you an introduction to this ever-evolving field and to help you start to ask the best questions in your research.
We hope that you enjoy this book and, more than anything, you find it to be useful.

GENE EDITING 101 CONTENTS

INTRODUCTION

ABBREVIATIONS

GLOSSARY

CHAPTER 1: GENE EDITING TECHNOLOGY
The idea of changing our genetic code has been around for decades. Here we discuss the technologies of the past and what tools we're currently working with.

CHAPTER 2: USING CRISPR-CAS9
Spend any amount of time in gene editing circles and you will hear about CRISPR, but it's not always clear how to go about performing an experiment with the technology. This chapter will touch on the main questions you need to ask before you begin.

BIOTIX: Save Your Lab From Indifference!

NEW ENGLAND BIOLABS: Single guide, Simplified

CHAPTER 3: APPLICATIONS FOR GENE EDITING
Beyond simple curiosity and the pursuit of knowledge, there are many real world applications currently benefiting from our ability to alter the genetic sequence. Here we discuss the most prominent areas using gene editing today.

CHAPTER 4: ETHICAL, LEGAL, AND SOCIAL IMPLICATIONS
Any gene editing experiment will need to navigate a network of complicated legal precedents and ethical considerations. This chapter hopes to highlight the main areas you need to be aware of.

ABBREVIATIONS

ALS: Amyotrophic Lateral Sclerosis
Cas: CRISPR associated (proteins)
Cas9n: Nicking Cas9
Cpf1: CRISPR from Prevotella and Francisella 1
CRISPR: Clustered Regularly Interspaced Short Palindromic Repeats
CRISPRa: CRISPR Activation
CRISPR-Disp: CRISPR Display
CRISPRi: CRISPR Interference/Repression
crRNA: CRISPR spacer RNA
dCas9: Dead/Deactivated Cas9
DMD: Duchenne Muscular Dystrophy
DNA: Deoxyribonucleic Acid
FACS: Fluorescence-Activated Cell Sorting
GFP: Green Fluorescent Protein
gRNA: Guide RNA
HDR: Homology Directed Repair
hfCas9: High Fidelity Cas9
Indel: Insertion and Deletion mutations
mRNA: Messenger RNA
ncRNA: Non-Coding RNA
NGS: Next Generation Sequencing
NHEJ: Non-Homologous End Joining
PAM: Protospacer Adjacent Motif
PCR: Polymerase Chain Reaction
RNA: Ribonucleic Acid
RNAi: RNA Interference
RNP: Ribonucleoprotein complex
RVD: Repeat Variable Di-residues
scRNA: Scaffold RNA
sgRNA: Single Guide RNA
TALENs: Transcription Activator-Like Effector Nucleases
tracrRNA: Trans-Activating crRNA
ZFNs: Zinc Finger Nucleases

GLOSSARY

ACTIVATOR: A transcription factor that promotes gene expression.
AUTOLOGOUS: Cells and tissue obtained from the organism in question.
BLUNT CUT: A double strand break where both strands have been cleaved at the same point in the sequence so all DNA remains double stranded.
CRISPR-CAS9: A genetic editing technique that uses the Cas9 protein found in bacteria.
DNA MAPPING: Using investigative methods to establish the location of specific genes within the genome.
DOUBLE STRAND BREAK: A cut in the DNA that involves both strands of the helix suffering a break in the sequence.
ELECTROPORATION: A transfection method that uses short bursts of electricity to open the pores in cell membranes.
ENDOGENOUS: Native to the organism in question.
ENDONUCLEASE: A class of enzyme that is capable of cutting both DNA strands to leave either blunt or overhanging cuts.
EPIGENETIC FACTORS: Compounds that interact with DNA, commonly as tags or expression activators and repressors, without altering the underlying DNA base sequence.
EXON: A section of the DNA or RNA sequence that codes for a protein.
GENE EXPRESSION: The process that produces a functional protein as dictated by the gene sequence.
GENOME: The complete genetic data of an organism containing both coding and non-coding regions.
GERMLINE CELLS: Cells that will develop into sex cells (sperm and ova) and so can pass genetic data from parent to offspring, including mutations.
HELA CELLS: Cells making up an immortal cell line that is commonly used in scientific research (the cells originate from Henrietta Lacks).
KNOCK-OUT MUTATION: The removal of genetic data from a specific location in the genome.
MUTATION: A DNA sequence that has been altered from the reference sequence.
OPERON: A cluster of genes that rely on a single promoter for expression.
OVERHANG: A result of a double strand break in DNA where the two strands have not been cut at the same location, leaving short sections of single strand DNA on either side of the break.
POLYMERASE CHAIN REACTION: A technique that rapidly replicates a particular section of DNA.
PROMOTER: A region of DNA, located upstream of a gene, where the transcription machinery binds to initiate gene expression.
REPRESSOR: A transcription factor that inhibits gene expression.
RNA: A nucleic acid that is transcribed from DNA and that codes for protein production, is independently catalytically active, or can complex with another compound to achieve catalytic function.
RNAi: A biological process that inhibits gene expression using RNA molecules, commonly by the destruction of specific mRNA molecules.
STABLE EXPRESSION: The expression of a gene that has been incorporated into the genome of the cell and which will be carried through multiple generations after cell division.
T-CELLS: A major component of cell-mediated immunity that is largely responsible for recognition of non-self materials.
TRANSCRIPTION: The process that produces messenger RNA from the corresponding DNA sequence.
TRANSFECTION AND TRANSDUCTION: The transfer of genetic material into a cell. Transduction refers specifically to material transferred using a viral delivery system.
TRANSIENT EXPRESSION: The expression of a gene that has not been incorporated into the genome and which will be lost after cell division.
WILD-TYPE: The version of a gene that prevails in individuals subject to natural conditions.

"If you get very fine, accurate, and inexpensive control over your genome, you can fundamentally change the kind of organism you are. You are extending human capacity."
George M. Church

CHAPTER 1: GENE EDITING TECHNOLOGY

INTRODUCTION

Targeted genetic manipulation has existed in one form or another since the 1970s and 1980s, although selective breeding is a concept stretching back millennia. Throughout the years, our understanding of the genome has increased and correspondingly, the techniques available for genetic engineering have improved dramatically. In the last decade alone, gene editing has advanced from minor modifications to complete alteration of genomes in vivo with a speed that no one could have predicted. Now the concept sits at the forefront of modern genomic investigations.

It is easy to see how important gene editing has become by looking at research into CRISPR, a popular editing technology. Estimates suggest that five papers are published every day concerning the technique and the non-profit research aids provided by Addgene have been viewed over a million times in the space of a year. Even outside of scientific circles, gene therapy is drawing attention; notable CRISPR pioneers, such as Jennifer Doudna, Carl June, and Feng Zhang, were voted onto the short list for Time Magazine's Person of the Year.

While the advances in this field are very impressive and important, it is also necessary to remember where we came from and how much we have accomplished. Understanding how gene editing has evolved throughout the last few decades will help us to understand the scientific reasoning behind current techniques and perhaps even predict where the science may go next. It can also help to inform the ethical and legal considerations to take into account when performing gene editing in practice. To this end, this chapter will take you through the different stages of gene editing and how they work, all the way from naturally occurring nucleases to the brand new developments with CRISPR.

INITIAL INVESTIGATIONS

While selective breeding has been used to amplify desired characteristics for a very long time, the idea of intentional gene editing on a molecular level was first developed only in the late 1970s. The concept arose when microinjection studies discovered that foreign DNA (known as exogenous DNA) introduced artificially into mammalian cells showed better gene expression when it was flanked by viral DNA sequences than when introduced unaltered. Further investigations developed this idea and by the end of the 1980s, researchers were developing numerous constructs to insert foreign DNA into the genomes of various organisms in an attempt to achieve a better understanding of gene function. Over the course of many different studies, over 7,000 genes and regulatory elements had their function inferred during the 1980s using this technique.

While the process wasn't entirely understood at the time, these experiments were the first examples of using natural cellular processes to perform gene editing. The success of these investigations prompted other studies to advance the idea further, using a cell's DNA repair mechanism against it to achieve edits, and the findings of these early experiments continue to influence editing techniques today. Understanding how these techniques work first requires a basic understanding of the natural repair mechanisms being used and how they can be adapted for scientific use.

HOMOLOGY DIRECTED REPAIR AND NON-HOMOLOGOUS END JOINING

Throughout the lifetime of a cell, it is often necessary to repair breaks in the DNA strands. These can occur naturally as a result of everyday life processes within a cell or due to damage from external influences such as radiation or toxins. On average, a single human cell will sustain in excess of 10,000 instances of DNA damage per day, a number which rises to as much as 50,000 in creatures with faster metabolisms such as rats. The sheer number of damaged sites in a genome at any given moment means that without a natural repair mechanism, sustaining life would be all but impossible.

Repair mechanisms are, therefore, absolutely vital in living organisms. Not only do they need to exist in all cells, but they need to be adaptable to a variety of damage types and working conditions. To accommodate these needs, cells have developed two main processes for fixing damage: Homology Directed Repair (HDR, a type of homologous recombination) and Non-Homologous End Joining (NHEJ).

HDR is the process that was at work in the initial investigations of the 1970s and 1980s and it involves using a second, unbroken, identical DNA strand to replace a specific genetic locus. The unbroken strand can unravel and provide the template sequence for polymerase enzymes to replicate before the system dissociates once more, producing two unbroken strands with minimal errors (polymerase enzymes can be error-prone, but separate proof-reading enzymes can correct these mistakes). Using an unbroken strand as a template allows the polymerase enzymes to create an exact copy of the gene, making it a highly accurate system. The lack of errors displayed by HDR means that it is not only used for strand break repair; homologous recombination also underlies the exchange of genetic material during meiosis. In contrast, NHEJ is highly error prone.
Despite this, it is often the more commonly used method for DNA repair within a non-dividing cell as it doesn't require an identical DNA strand as a template, something which is rarely in proximity to the site of the damage (dividing cells will intentionally bring alleles together to ensure homologous recombination can occur during meiosis). Instead of using a template, NHEJ will bring both ends of the break together and enzymatically stitch them back into a contiguous strand, with a few bases potentially being added or removed by the enzymes that clean up the damaged ends, without consideration for the DNA base sequence. These insertion or deletion mutations, termed indels, may have no effect on the gene in question (if the protein code is maintained) but it is more likely that they will alter or inactivate the gene instead.

The error prone nature of NHEJ has made it very useful in genetic engineering, as it offers a very simple way of inducing mutations within selected genes provided that a site-specific DNA break can be created. These mutations can be used to interrogate gene function: many of the current genetic engineering technologies rely on inducing NHEJ within specific genes to deactivate them ("knock-out" mutations) and then observing what effect the change has on the phenotype of the cell. This data can then be used to infer gene function. Experiments like these, commonly known as loss-of-function screenings, have helped us to study poorly understood regions of the genome.

Similarly, HDR has proven to be very useful as a method of inserting new genes. Once a strand break has been created at a specific site, a DNA template with the desired gene can be provided and HDR will copy the sequence exactly into the genome. With the gene contained within the DNA sequence itself, the edit can be carried over into subsequent generations and thus a cell line can be created, with the entire population expressing the new gene.
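
The effect of an NHEJ indel on the protein code can be illustrated with a short Python sketch. It is a toy model only: the sequence `WILD_TYPE` and the helper names `translate` and `delete` are invented for illustration, and real indels vary in size and position. The sketch compares a one-base deletion, which shifts the reading frame, with a three-base deletion, which keeps it.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a coding sequence from its first base, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)

def delete(dna, start, length):
    """Model a deletion of `length` bases beginning at index `start`."""
    return dna[:start] + dna[start + length:]

# Hypothetical toy gene: ATG GCT GAA TTC AAA GGT TAA
WILD_TYPE = "ATGGCTGAATTCAAAGGTTAA"

print(translate(WILD_TYPE))                 # unedited protein
print(translate(delete(WILD_TYPE, 4, 1)))   # 1-base deletion: frameshift
print(translate(delete(WILD_TYPE, 3, 3)))   # 3-base deletion: frame preserved
```

In this sketch the one-base deletion changes every residue after the edit and removes the stop codon from the reading frame, whereas the three-base deletion only drops a single residue. This is why indels whose length is not a multiple of three are the ones most likely to inactivate a gene.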

In both cases, the primary mode of action requires a DNA strand break to be present at a selected site in the genome. Artificially creating these breaks has therefore been the primary target of gene editing since its initial conception, with a focus on specificity and adaptability to achieve targeted, accurate edits on demand. With this in mind, researchers started by looking into meganucleases.

MEGANUCLEASES

Meganucleases are a naturally occurring form of nuclease, a type of enzyme that can cleave DNA at specific loci. They are characterised by a very large DNA recognition site, usually between 12 and 40 nucleotides long, combined with a domain that has nuclease activity. The size of the recognition site makes them highly specific tools, usually only recognising and acting on a single site within the entire genome, and they therefore carry very low cytotoxicity due to the lack of off-target effects (changes to the genome at locations not being targeted). Off-target effects can be a very big problem in research as they can invalidate results or cause incorrect conclusions to be drawn when genes have been altered unknowingly. In the case of clinical applications, off-target effects can have potentially lethal health implications; a technique that minimises the possibility of these effects is therefore necessary for broad adoption of genetic engineering.

There is a wide range of meganucleases available from natural sources. Despite this range, however, the vast variations in DNA sequences throughout nature have meant that there are limits to meganuclease application: it is not always possible to find a meganuclease that will recognise your intended target. To counteract these limits, investigations were carried out to explore the possibility of altering the protein structure of the enzyme via minor changes to its source DNA sequence, thereby altering the amino acid sequence of the protein and changing its recognition domain.
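
The link between recognition site length and specificity can be made concrete with a back-of-the-envelope calculation. The Python sketch below is an idealised model and not a description of any real genome: it assumes all four bases are equally likely and independent, and it assumes a genome of roughly 3.2 billion base pairs (the constant `GENOME_BP` is an illustrative figure that does not come from the text). Under those assumptions, a site of length n is expected to appear about 2 × G / 4^n times when both strands are searched.

```python
# Assumed, approximate genome size in base pairs (illustrative value only).
GENOME_BP = 3.2e9

def expected_sites(site_length, genome_bp=GENOME_BP):
    """Expected exact matches of one site in a random genome, counting both strands.

    Assumes equal, independent base frequencies, which real genomes do not have.
    """
    return 2 * genome_bp / 4 ** site_length

for n in (6, 12, 18, 24, 40):
    print(f"{n:>2} nt site: about {expected_sites(n):.3g} expected matches")
```

The expected count falls by a factor of four with every extra nucleotide, so in this simplified model sites towards the longer end of the 12 to 40 nucleotide range are expected far less than once per genome, while sites at the short end may still recur by chance.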
While the method did achieve results, it was a very complex process that was difficult to carry out, partially due to meganucleases having DNA recognition and cleavage functions within a single domain. With both functionalities contained within a single domain, neither could be altered independently of the other, and alterations to the recognition functionality had the potential to change or destroy the cleavage function, thereby inactivating the nuclease. Modification was therefore a complicated process that frequently relied on time-consuming and expensive trial and error, meaning that meganucleases did not offer a very attractive prospect for widespread gene editing. Similarly, synthesising meganucleases from scratch was found to be too complicated and expensive to be viable on a large scale.

Meganucleases were therefore entirely impractical for gene editing beyond small-scale research purposes. They did, however, provide an interesting concept for further investigations: using a highly specific recognition site alongside a non-specific nuclease domain to achieve a site-specific break. This idea was maintained in future work, with studies moving towards a more synthetic approach to recognition domains instead of relying on naturally available resources.

ZINC FINGER NUCLEASES

Building on the concept provided by meganucleases, the idea of Zinc Finger Nucleases (ZFNs) was formed. ZFNs rely on the same basic principle as meganucleases in that they consist of a DNA recognition domain (the zinc fingers) covalently bound to a nuclease, in this case the non-specific Flavobacterium okeanokoites restriction endonuclease (Fok1). Unlike meganucleases, ZFNs have recognition and cleavage functionality in separate, distinct domains to allow for simpler manipulation; the recognition domain can be altered without adversely affecting the activity of the nuclease. As they do not occur naturally, ZFNs have to be built for purpose and this usually involves a developmental period which is very time consuming and expensive.

The recognition domain of a ZFN is made up of an array of 3-6 fingers, each of which consists of roughly 30 amino acids stabilised by a zinc ion and which will bind to a specific DNA triplet. A complication arises during this DNA binding due to Fok1's particular mode of action: it will only cleave DNA when acting in dimer form. This means that the nuclease will only cleave DNA when two enzymes are present at the site, thereby requiring two carefully positioned Fok1 nucleases to achieve the desired strand cleavage. In the context of ZFNs, this means that two complexes are required for each strand cleavage event, during which they have to bind to opposite strands the perfect distance apart to allow the Fok1 regions to come together in the correct orientation.
This can be beneficial to the reaction by increasing the specificity of the method, as the sequences on either side of the break need to be recognised, but it also requires a more complicated design process to ensure both ZFNs are binding at the correct locations. When orientated correctly, this system provides a method of causing double strand breaks in the DNA with high specificity, as well as allowing for adaptability through the zinc finger domain so that ZFNs can be targeted to any section of the DNA.

In principle, ZFNs are a very useful tool for genetic engineering but, unfortunately, investigations discovered that there are significant drawbacks to this approach. The selectivity of ZFNs is determined by the zinc finger domains, which have to be specifically designed for the exact gene you want to target. As a result, creating the recognition domain involves a time consuming and expensive design process that relies largely on trial and error.
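
The paired, triplet-based recognition described above can be sketched in a few lines of Python. This is a bookkeeping illustration only, not a design tool: the function names, the example `target` sequence and the spacer length are all hypothetical, and the sketch ignores the context-dependent binding affinities that make real zinc finger design difficult. It splits each half-site into the triplets that individual fingers would need to recognise, reading the right-hand half-site from the opposite strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def finger_triplets(half_site):
    """Split a half-site into one DNA triplet per zinc finger (3-6 fingers)."""
    if len(half_site) % 3 != 0 or not 9 <= len(half_site) <= 18:
        raise ValueError("a half-site must contain 3 to 6 complete triplets")
    return [half_site[i:i + 3] for i in range(0, len(half_site), 3)]

def zfn_pair(target, left_len, spacer_len, right_len):
    """Describe the two ZFNs needed to flank a cut site within `target`.

    `spacer_len` is the hypothetical gap between the half-sites where the
    two Fok1 domains would meet; its value here is purely illustrative.
    """
    left = target[:left_len]
    right_start = left_len + spacer_len
    right_top = target[right_start:right_start + right_len]
    return {
        "left_fingers": finger_triplets(left),
        # The second ZFN binds the opposite strand, so read it reversed.
        "right_fingers": finger_triplets(reverse_complement(right_top)),
        "recognised_bp": left_len + right_len,
    }

target = "GCTGAAGGT" + "ACGTAC" + "TTCAAAGCC"  # made-up example sequence
print(zfn_pair(target, left_len=9, spacer_len=6, right_len=9))
```

With three fingers on each side the pair reads 18 base pairs in total, which illustrates how requiring two ZFNs increases the amount of sequence that must match before a cut can occur.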

This is further complicated by the variable binding affinity of the zinc fingers, which changes depending on the position of the finger within the overall protein (a finger on the end of the hand will have a different binding affinity if it is moved towards the Fok1 "thumb"), meaning that the recognition domain has to be designed in the context of the whole structure instead of as individual subunits. Because the structure cannot be treated as independent parts in this way, each minor change in the target sequence requires a complete redesign of the ZFN structure, involving a series of experimental syntheses using expensive reagents: an entirely impractical solution for adaptable gene editing.

While ZFNs have proven financially problematic, they also proved that the idea of combining a naturally occurring nuclease domain with an engineered recognition domain was feasible and effective. This conclusion directed investigations towards similar systems with a focus on inexpensive designs and expanded DNA binding potential. It is also worth noting that ZFNs made a comeback in 2016 and 2017, resurfacing as an editing technique of particular interest in clinical trials hoping to cure HIV. Preliminary testing has shown that ZFNs hold great promise in that area and there may be a chance that they will become central to modern gene editing once more in the future.

TALENs

As the main drawbacks of ZFNs lay within the recognition domain, that was the primary target for redesign in other systems. Several groups turned to natural sources to try to identify a corresponding system that could be adapted for our use without requiring lengthy development processes. This approach was ultimately propelled by the discovery of Transcription Activator-Like Effectors (TALEs) and their conversion to DNA nucleases.
Transcription Activator-Like Effector Nucleases (TALENs) came to prominence in the late 2000s and utilise the same cleavage principles as the ZFN system: they use Fok1 as a cleavage domain which is bound covalently to the recognition domain and which works in pairs with other nucleases to create a double strand break in the DNA. This cleavage system had proven successful with ZFNs and had only been limited by the restrictions of the recognition functionality. As a result, the cleavage domain went largely untouched.

Though the mode of action may be the same, the recognition domain is vastly different. TAL Effectors are naturally occurring proteins in phytopathogenic Xanthomonas bacterial strains and are used to activate the expression of genes within plant hosts to benefit colonisation by the pathogens, thereby adapting the surrounding environment to suit their own needs. They are modular proteins that contain nearly identical repeat units that are amino acids in length. These repeats only differ from one another at the 12th and 13th amino acid positions, a segment referred to as the Repeat Variable Di-residues (RVD). Each repeat binds to a single nucleotide on the DNA strand (instead of a triplet as in ZFNs), with the RVD determining which base the repeat will bind to (A, T, C or G). It is therefore the RVDs within the repeat units that determine TALEN specificity, using only four possible variations for each unit. In contrast to the complex binding modalities displayed by ZFNs, the DNA binding mechanism of TAL Effectors is relatively simple.
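
Because each repeat reads one base, a TAL Effector array can be treated as a simple lookup, which the Python sketch below illustrates. The text does not name the individual RVDs; the four pairings used here (NI for A, HD for C, NN for G, NG for T) are the commonly cited ones and are included as an assumption for illustration. In practice some RVDs are less selective than this table suggests (NN, for example, is also reported to bind A), so the mapping should not be read as a design rule.

```python
# Commonly cited RVD-to-base pairings (an assumption; not given in the text).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}
BASE_FOR_RVD = {rvd: base for base, rvd in RVD_FOR_BASE.items()}

def rvds_for_target(dna):
    """List the RVD needed in each repeat to read `dna`, one repeat per base."""
    return [RVD_FOR_BASE[base] for base in dna.upper()]

def target_for_rvds(rvds):
    """Predict the DNA sequence that an ordered list of RVDs would read."""
    return "".join(BASE_FOR_RVD[rvd] for rvd in rvds)

repeats = rvds_for_target("GATC")
print(repeats)                   # one RVD per base of the target
print(target_for_rvds(repeats))  # round trip back to the target sequence
```

Retargeting in this model means swapping individual entries in the list, without touching the others, which is the independence between subunits that makes TALENs simpler to design than ZFNs.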
The basic and limited variations (only four variants possible, corresponding to the four DNA bases) in the RVD necessary for determining the target sequence in TALENs make the system much more adaptable and suitable for genetic engineering than ZFNs, without negatively impacting selectivity. By relying on such a simple recognition system, the design process can be much shorter and therefore potentially less expensive, helped by the fact that the different RVDs act independently of one another and thus TALENs can be considered as subunits instead of as a whole.

[Figure: A cartoon depicting how Zinc Finger Nucleases (ZFNs) bind to three specific nucleic acid bases.]

[Figure: A cartoon depicting how TAL Effector Nucleases (TALENs) bind to individual nucleic acid bases.]

TALENs present a very effective, accessible tool for gene editing. Researchers have been engineering selective TALE nucleases since 2010 and their novelty means that there is still a lot of unexplored potential in the area for improvements in selectivity or cost. As with ZFNs, TALENs have recently shown promise in human clinical trials, in this case primarily in cancer treatments, though again this is still early stage research.

CRISPR-CAS9

DISCOVERY AND PURPOSE IN NATURE

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a term that has become synonymous with gene editing in recent years, and yet the term wasn't coined until 2002, when a paper by Jansen et al. was published discussing an unexplained phenomenon within bacterial DNA that had been discovered in the 1980s. The phenomena in question were the carefully ordered, repeated motifs present in 40% of bacterial and 90% of archaeal genomes that, despite being so widespread, appeared to be non-coding. These repeats were present in many different cells but showed significant variations in sequence between different species without reasonable explanation. The paper also took note of the genes adjacent to these repeats, dubbing them CRISPR associated (Cas) genes. Although it was unknown at the time, this was the start of the CRISPR-Cas system.

Interest in CRISPR spread, but it wasn't until 2007 that a direct link was discovered between CRISPR and bacterial immunity to viruses.
It was shown that when bacteria were subjected to continuous viral attack until natural immunity was attained, some viral DNA became incorporated into the spacer regions of the CRISPR loci within the bacterial DNA, and that subsequent removal or mutation of these sequences led to an immediate loss of immunity. These findings confirmed the idea previously proposed by Eugene Koonin that CRISPR was linked to an adaptive immune system and that the process somehow involved one set of DNA being used to damage and disrupt another.

Further investigations probed the mechanism of CRISPR-Cas. Over the next few years it was established that two different types of RNA were vital in the process: one coded for in the CRISPR spacer regions (crRNA) and another coded for by the areas adjacent to CRISPR (tracrRNA). These two RNA types were shown to be complementary to one another and could combine into a single double strand called a guide RNA (gRNA), which interacted with the Cas protein as part of the cleavage process. It was also shown that Cas-mediated DNA cleavage always occurred in positions adjacent to a short DNA sequence (2-5 nucleotides long) that was specific to the Cas variant being used, subsequently dubbed the Protospacer Adjacent Motif (PAM). If no PAM was present then no cleavage would occur, leading to the belief that the presence of a PAM sequence allowed for discrimination between self and non-self DNA strands within bacterial cells.

The restriction and control that the PAM represented meant that it was necessary to find a variant of Cas9 (the most commonly used form of Cas protein) that allowed for widespread application throughout the genome. A solution was found with Streptococcus pyogenes Cas9, which uses a nucleotide sequence of NGG or NAG (where N can be any base) as a PAM (although S. pyogenes Cas9 displays lower recognition and cleavage efficiency for NAG than NGG). As NGG is a sequence that occurs on average every 8 nucleotides within the genome sequence, S. pyogenes Cas9 allows for near limitless application.

STRUCTURE AND ACTION OF CAS9

The clear biological significance of the Cas proteins inevitably led to several investigations probing their physical structures. Research primarily centred on Cas9 due to its potential in gene editing but similar proteins in the Cas family, such as Cas3, were also probed. In early 2014, the crystal structure of a complex of S. pyogenes Cas9, gRNA and the corresponding target DNA was reported, presenting a clearer picture of the structure of Cas9 and how the different components of the system interact with one another.

The Cas9 enzyme is split into two main lobes: the recognition lobe and the nuclease lobe, with a total of four active sites between them. Strangely, the recognition lobe is only responsible for the interaction between the DNA and the gRNA and not the recognition of the PAM site, which is the responsibility of the nuclease lobe. Alongside recognition of the PAM, the nuclease lobe is, as expected, also responsible for DNA cleavage, which is achieved through the joint action of two endonuclease domains: RuvC and HNH. Each domain can cleave one strand of the DNA structure, so that in conjunction the domains can create double strand breaks roughly 3 nucleotides upstream of the PAM. Unlike Fok1, which leaves defined nucleotide overhangs (useful for gene insertion), RuvC and HNH generally create blunt strand breaks, though they will occasionally leave an overhang 1-2 nucleotides long.

With both lobes present, Cas9 is a relatively large protein, with the most commonly used form, S. pyogenes Cas9, containing 1368 amino acids. Some Cas9 proteins with just over 1000 amino acids have been discovered as well, but the frequently occurring PAM of S. pyogenes Cas9 ensures that it remains the most popular choice for genetic engineering despite its significant size, something which can cause problems during cell delivery in gene editing experiments.
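
The role of the PAM and the position of the cut can be sketched in Python. The example below is a simplification with a made-up sequence and hypothetical function names: it looks only for NGG (not NAG), treats the cut as exactly 3 nucleotides upstream of the PAM, and assumes a clean blunt break. The figure of one NGG roughly every 8 nucleotides follows if all bases are assumed equally likely: the chance that a given position is followed by GG is 1/16 on one strand, so about 1/8 when both strands are considered.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def find_pams(seq):
    """Start index of every NGG on the given strand, overlaps included."""
    return [match.start() for match in re.finditer(r"(?=[ACGT]GG)", seq)]

def cut_index(pam_start):
    """Index at which the strand is split, 3 nucleotides upstream of the PAM.

    The break falls between positions cut_index - 1 and cut_index.
    """
    return pam_start - 3

seq = "ATGCTAGCTAGGATCCGGTACCA"  # made-up example sequence
forward = find_pams(seq)
reverse = find_pams(reverse_complement(seq))

print("PAMs on the given strand:   ", forward)
print("PAMs on the opposite strand:", reverse)
for pam in forward:
    cut = cut_index(pam)
    print(f"PAM at {pam}: {seq[:cut]} | {seq[cut:]}")
```

Searching the reverse complement is what doubles the number of usable sites: a CCN motif on one strand is an NGG PAM on the other.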

Bacteria can produce Cas proteins and gRNAs in vivo naturally, but for genetic engineering purposes, it is necessary to introduce them into cells artificially. Several ways of achieving this have been developed in recent years and these will be discussed in Chapter 2 but for now, we will assume that Cas9 and gRNA are readily available within the nucleus.

When the gRNA and Cas9 protein are present in a cell, they can be considered to use a three step process to recognise and cleave DNA: Destabilisation, Invasion and Cleavage.

Destabilisation:
1. The gRNA is incorporated into the recognition lobe of the Cas9 protein, creating a single, stable complex.
2. The formation of this complex triggers the nuclease lobe to scan the DNA for PAM sites.
3. When a PAM site is found, it is thought that the recognition site within the nuclease lobe destabilises the interactions between the two DNA strands at the PAM site.
4. The reduced interaction between the two DNA strands leads to unwinding upstream of the PAM site, a process which is stabilised by increased complex-DNA interactions.

It is at this point in the process that the specificity of the gRNA becomes significant. If the gRNA and the now-unwound DNA are a match, then the protein-DNA complex will have sufficient binding energy to maintain its current state. Alternatively, if there is no match between the DNA and the gRNA, the binding energy is too low to compensate for the decreased DNA interactions and the complex will dissociate before cleavage can occur. The protein will then move on to another PAM until it finds a sequence that matches the gRNA. However, should the binding energy be high enough, the process can progress into the Invasion stage.

Invasion:
5. The gRNA molecule will start to displace the homologous DNA sequence and take its place by binding complementarily to the target DNA.
6. The sequential interactions within the DNA-gRNA dimer further stabilise the complex and thus the invasion process.

Invasion will continue until the entire Cas9-gRNA-DNA complex has been formed. During the formation, the complex will have positioned itself so that the DNA is accessible to the RuvC and HNH domains, aided by the flexibility available in the HNH domain. At this point the system can continue into the Cleavage stage.

Cleavage:
7. The HNH domain will cleave the DNA strand within the DNA-gRNA dimer while the non-complementary strand is simultaneously positioned within the RuvC domain and cleaved.
8. With both DNA strands broken, the complex will dissociate and allow the DNA to rewind with a new, targeted double strand break.

As previously mentioned, these breaks will be preferentially repaired by NHEJ in mammalian cells, giving rise to indel mutations that will impact the action of the gene. Should a DNA template be supplied, HDR can occur at this stage to insert a new gene into the genome. In nature, bacterial cells can use an NHEJ event to introduce errors into attacking viral DNA, creating mutations that inactivate the virus and prevent an infection: an immune response. Outside of bacteria, the CRISPR-Cas9 system is the perfect opportunity for creating site-specific mutations with minimal redesign of the mechanism necessary when targeting new sites. The simplicity and adaptability of the system has attracted significant attention and many teams have been investigating how CRISPR-Cas9 can be utilised and improved.

ADAPTATION FOR SCIENTIFIC USE

One of the main advantages the CRISPR-Cas9 system has over its predecessors is that the target sequence can be completely changed by altering only the gRNA, a far simpler, more cost-effective undertaking than the complete redesign of the DNA binding domains necessary in ZFNs and TALENs. This simplicity means that the system can be applied to any section of the genome with far less time and financial investment than the continuous redesigns necessary with previous methods. Even more promising is that there is virtually no restriction on the sequence of the gRNA (it can be modified to any ACUG code with relative ease), making the presence of a PAM sequence the only limiting factor in the application of CRISPR-Cas9.
Even taking this restriction into account, the prevalence of the NGG sequence and the wide array of Cas9 proteins available from different bacterial species ensure that CRISPR-Cas9 can be applied almost anywhere in the genome.

The unique opportunities provided by CRISPR-Cas9 have led to great interest in the technology, and over the last few years significant research has been done to improve upon the natural action of the system. However, some teams have taken their research further than simple optimisation and are instead investigating the potential benefits available when the Cas9 protein itself is dramatically altered.

VARIATIONS
Since the discovery of the mechanism of action utilised by the CRISPR-Cas9 system, scientists have been looking for ways to improve it even further by increasing accuracy, minimising off-target effects and even altering the enzyme function to allow for DNA mapping instead of cleavage. To these ends, several different forms of the Cas9 protein have been produced. Three main variants stand out, with a fourth, similar enzyme also rising to prominence in recent months.

dCas9: Known as dead Cas9 or occasionally deactivated Cas9, dCas9 is a variant that has had both the HNH and RuvC domains inactivated, destroying the cleavage function of the complex. The impairment of the cleavage function opens the door to a great many applications that would otherwise be impossible, such as DNA mapping when dCas9 has been fused to a marker (e.g. GFP) or epigenetic manipulations by fusing transcription factors to the protein. These manipulations (known as CRISPRa and CRISPRi) will be discussed in detail in the following chapter.

UNDERSTANDING HOW GENE EDITING HAS EVOLVED THROUGHOUT THE LAST FEW DECADES WILL HELP US TO UNDERSTAND THE SCIENTIFIC REASONING BEHIND CURRENT TECHNIQUES AND PERHAPS EVEN PREDICT WHERE THE SCIENCE MAY GO NEXT.

Other work has also explored fusing dCas9 to FokI to create a double strand breaking system very similar to TALENs and ZFNs.

hfCas9: High-fidelity Cas9 (hfCas9) was designed with the aim of eliminating off-target effects. While CRISPR-Cas9 is a highly specific system, there is still a slight chance that strand breaks will occur in sections of the genome that aren't being targeted (off-target effects), which can lead to inaccurate data being generated during CRISPR experiments. Off-target effects also have negative implications for future clinical use, where unintended edits can cause potentially fatal mutations; any system that displays a noticeable number of off-target effects cannot be considered for clinical gene editing.

It was previously mentioned that the binding energy between the DNA and the gRNA-Cas9 complex depends on the level of compatibility between the gRNA and the target DNA strand, but it is also affected by a series of complicated interactions between the DNA and the protein. The cumulative energy of these two sources generally far exceeds the energy necessary for the complex to endure, further promoting strand cleavage. However, in some cases, the net effect of these DNA-protein interactions is strong enough for the complex to remain bound to the DNA even if the DNA and gRNA are not perfectly complementary. In nature, this allows bacteria to remain resistant to viruses that have undergone slight mutations and thereby retain immunity over several viral generations, but in a research context it can lead to significant complications with off-target effects. To counter this problem, a new version of Cas9 was developed by mutating the DNA-interacting domains to reduce the DNA binding energy of the whole system.
The concept was to lower the binding energy to such an extent that only a perfect match between the DNA and the gRNA would be sufficient to maintain the DNA-gRNA-Cas9 complex, thus maximising specificity. Any site that didn't display a perfect match with the gRNA would dissociate from the protein before cleavage occurred. Once the right balance was found, results showed no detectable off-target effects when hfCas9 was used, indicating very high specificity for the target sequence and opening a new avenue for genetic engineering with a much lower risk of unintended mutations.

TARGETED GENETIC MANIPULATION HAS EXISTED IN ONE FORM OR ANOTHER SINCE THE 70s AND 80s, ALTHOUGH THE IDEA OF SELECTIVE BREEDING IS A CONCEPT STRETCHING BACK MILLENNIA.

Cas9n: Cas9 nickase, or nicking Cas9, is similar to dCas9 in that it involves the deactivation of cleavage domains, though in this case only one of them is affected instead of both. With only one working cleavage domain (either RuvC or HNH), the enzyme cannot cause double strand breaks in the DNA and will only cleave a single strand to produce a nick.

Like hfCas9, Cas9n has the potential to reduce the risk of off-target effects in genetic engineering. While one Cas9n protein cannot achieve a double strand break, if two proteins work in conjunction with each other (one binding to each strand of the helix) then both DNA strands can be cleaved. As the two proteins need to recognise the DNA sequence both upstream and downstream of the break between them, there is almost no chance of a double strand break occurring outside of the target DNA. While unintended single strand cleavages will occur, incorrect nicks pose far less of a risk than a misplaced double strand break because mutations are less likely to arise during their repair. With a reduced number of double strand breaks, the experiment can be considered safer and more reliable.
Furthermore, recent investigations have shown that in the case of a single strand break in the DNA, homologous recombination can occur to correct the damage in place of the usual single-strand repair mechanisms, albeit with reduced efficiency. This means that through Cas9n, scientists have a reliable method of inducing homologous recombination without carrying the increased risk associated with a double strand break.

Cpf1: Cpf1 is not a variant of Cas9, but is a different Class 2 CRISPR effector (Class 2 systems rely on single-component effector proteins, as Cas9 does). Cpf1 works in roughly the same way as Cas9, with three important exceptions. The first difference is that the gRNA used by Cpf1 does not contain tracrRNA. Instead, Cpf1 uses mature crRNA alone as a single guide RNA (sgRNA), which further simplifies guide synthesis and manipulation during genetic engineering. Secondly, instead of the G-rich PAM sequence required for S. pyogenes Cas9, Cpf1 utilises a T-rich PAM. The different PAM sequence means that Cpf1 has the potential to act on sections of the genome that S. pyogenes Cas9 cannot. Finally, Cpf1 does not induce the blunt double strand breaks that Cas9 does but instead introduces a staggered DNA break with defined overhangs 4-5 nucleotides long. Overhangs increase the ease with which a gene can be artificially inserted into the genome, and so Cpf1 holds great potential within the field of genetic engineering. As a more recent discovery than Cas9, work on Cpf1 is still in its infancy but with more time, it's possible that Cpf1 could become as important to current genetic investigations as Cas9.
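To make the PAM requirement concrete, here is a minimal Python sketch that lists candidate protospacers on one strand of a short sequence for both enzymes. It assumes the NGG PAM of S. pyogenes Cas9 sits immediately 3′ of the protospacer. As a further assumption not stated in this guide, it treats the T-rich Cpf1 PAM as TTTV sitting immediately 5′ of the protospacer. The function names and the 20-nucleotide guide length are illustrative choices, not part of any published tool.

```python
import re

# Assumed PAM patterns. NGG is named in the text; TTTV is an assumption
# standing in for the "T-rich PAM" of Cpf1.
CAS9_PAM = "[ACGT]GG"   # lies 3' of the protospacer
CPF1_PAM = "TTT[ACG]"   # lies 5' of the protospacer (assumed)


def reverse_complement(seq):
    """Return the reverse complement so the opposite strand can be scanned too."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_pam_starts(seq, pam_regex):
    """0-based start positions of every PAM match, including overlapping ones."""
    return [m.start() for m in re.finditer(f"(?={pam_regex})", seq)]


def cas9_protospacers(seq, length=20):
    """Sequences of the given length that sit immediately 5' of an NGG PAM."""
    return [seq[p - length:p]
            for p in find_pam_starts(seq, CAS9_PAM)
            if p >= length]


def cpf1_protospacers(seq, length=20):
    """Sequences of the given length that sit immediately 3' of a TTTV PAM."""
    sites = []
    for p in find_pam_starts(seq, CPF1_PAM):
        start = p + 4  # skip the four PAM bases
        if start + length <= len(seq):
            sites.append(seq[start:start + length])
    return sites
```

A real search would call each function on both the sequence and its reverse complement, since a PAM on either strand can be used. Running both scanners over the same region shows directly where a T-rich PAM offers a site that NGG does not.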

COMPARISON
With multiple techniques available, it can be difficult to decide between them when planning a gene editing experiment. Techniques like meganucleases and ZFNs might have fallen out of fashion in recent years due to more recent developments, but TALENs and CRISPR-Cas are both still commonly used in a multitude of experiments. As with any experiment, it's important to look at the different options available and ascertain which technique is most suitable for what you are trying to achieve. To give you an overview of the advantages and disadvantages of each, we've drawn up a comparison between the four techniques discussed in this chapter; see Figure 1.

Figure 1: Gene Editing Technology

Meganucleases
Adaptability: Difficult. Recognition and cleavage domains are combined, making manipulation very complicated.
Synthesis: Naturally occurring or requiring time-consuming construction.
Specificity: Highly specific, though with limited scope due to the limitation of naturally occurring meganucleases and problems with adaptability.

ZFNs
Adaptability: Difficult. The complex binding modalities of the Zn fingers mean that the system has to be designed as a whole, without room for minor alterations. Large-scale screening is required to optimise the process.
Synthesis: Very time-consuming and expensive synthesis process necessary, even after initial design.
Specificity: Highly specific.

TALENs
Adaptability: Simple. Variations in the RVD allow for changes in recognition with minimal design and engineering required.
Synthesis: Moderate cost but time-consuming and repetitive.
Specificity: Highly specific.

CRISPR-Cas9
Adaptability: Very simple. Only the gRNA has to be altered to completely change the recognition site within the DNA and these changes are very easy to achieve. Also capable of editing multiple sites in the genome simultaneously in the presence of multiple gRNAs.
Synthesis: Simple and inexpensive to synthesise.
Specificity: Highly specific, with some variations giving rise to no detectable off-target effects.

SUMMARY
Over the last four decades or so, gene editing technology has changed dramatically. The techniques have become more precise and less expensive, and so have become more accessible, further increasing the rate at which they can advance. With each new development the potential of gene editing has grown, with more and more people starting to consider the implications of using these technologies in humans to treat disease, a feat which would have been considered impossible not so long ago. However, before we discuss the broader implications of gene editing, we should look at using these techniques. With a basic understanding of the underlying science, it is possible to look at how you might use them in a practical environment to achieve gene editing both in vitro and in vivo.

ESTIMATES SUGGEST THAT FIVE PAPERS ARE PUBLISHED EVERY DAY CONCERNING THE TECHNIQUE AND THE NON-PROFIT RESEARCH AIDS PROVIDED BY ADDGENE HAVE BEEN VIEWED OVER A MILLION TIMES IN THE SPACE OF A YEAR.

CHAPTER 2: USING THE CRISPR-CAS9 SYSTEM

USING CRISPR

INTRODUCTION
All gene editing techniques deploy similar DNA repair processes, but for simplicity this chapter will mainly focus on using CRISPR-Cas9. There are many different factors to take into account when planning your CRISPR experiment; this chapter is intended to highlight the main things you will need to consider when first approaching gene editing.

DECIDING ON THE DESIRED APPLICATION
One of the main attractions of the CRISPR-Cas9 system is that it is easily adaptable for a wide range of targets and applications. Currently there are four main uses of CRISPR-Cas:

1. Permanently deactivating gene function or expression through alterations in the genetic code (knock-out mutations)
To permanently disrupt a gene with a knock-out mutation, you need to use a Cas9 (or Cas9n) protein in conjunction with a single (or dual in the case of Cas9n) gRNA that can target the essential sequence of your target mutation site. As discussed in the previous chapter, using Cas9n will increase the specificity of your experiment by reducing off-target effects but it will also make the process less efficient.

2. Introducing a point mutation within a sequence to induce the expression of a mutated gene
Similarly, for point mutations, you will need to use Cas9 or Cas9n, alongside a single or dual gRNA, respectively, that targets your desired editing site.

3. Insertion of an entire gene through HDR
Likewise for gene insertion, you will need to use Cas9 or Cas9n, with a single or dual gRNA, respectively, targeting your desired editing site. A DNA template of the desired gene will also need to be supplied. It is important to note that despite using the same reagents, the efficiency of this process is lower than that of knock-out editing due to the natural limitations of HDR.

4. Impacting the expression of a gene either positively or negatively through manipulation of epigenetic factors
Activation and repression of selected genes (known as CRISPRa and CRISPRi respectively) both involve altering the level of expression of a gene without permanently affecting the genome sequence. To achieve this, you'll need to use a modified version of dCas9, where the cleavage action of the protein has been nullified and the protein has been bound to either a repressor (e.g. dCas9-KRAB) or an activator (e.g. dCas9-VP64). Repression of a gene can be achieved without a specific repressor by DNA blocking at promoter sites, but dCas9 alone is less effective than a dCas9-KRAB complex in mammalian cells and so, for better results, a repressor is advisable. This process also benefits from the possibility of using multiple gRNAs to target different promoter areas of the gene in question, potentially achieving synergistic gene inhibition or activation.

As the applications use different reagents, you should start planning your experiment by deciding which one you want to perform. Once you know what you want to achieve and what reagents you'll need to do it, you can move on to deciding on a target sequence.

TARGET SELECTION AND gRNA DESIGN

SELECTING A TARGET SEQUENCE
Deciding on which sequence to edit, commonly referred to as your target, is one of the most important stages in your study. The target you choose will inform what happens in the rest of your experiment and will help you to understand what your final results are telling you. Before you decide, you need to consider both the intended CRISPR application and the overall goal of the gene editing on the cell.

While it isn't absolutely necessary, it can be beneficial to sequence your target. Knowing the nucleotide sequence can tell you more about the gene you're working with, but it can also help you to design your gRNA and to identify the location of any PAM sequences in the region.
If there are no suitable PAMs nearby, you may want to consider exploring Cas9 proteins from other species or perhaps using a different protein altogether, such as Cpf1.

With a specific gene in mind, you'll need to consider which region of the gene to target, something usually dictated by your chosen application. As mentioned previously, CRISPRa and CRISPRi experiments will target promoter regions of a gene, but other applications are more complex.

Genetic knock-outs can be achieved using gRNAs that target the 5′ exons (coding sequences at the 5′ end of the strand) that are constitutively expressed, that is, expressed regardless of the cellular environment rather than on an as-needed basis. This reduces the chance that alternative splicing (the removal of certain exons from the mRNA produced by genes that code for multiple proteins) will remove the targeted region from the mRNA. It is commonly the exons near the N-terminus that are targeted for knock-out mutations, as changes there increase the likelihood of frameshift mutations causing the production of non-functional proteins: edits upstream will affect all coding triplets downstream of the mutation and will therefore affect the largest area possible, increasing the chances of inactivating the gene.

If the 5′ exons are an unsuitable target, it is also possible to target the exons that code for essential protein domains. This approach does require significant knowledge of the protein in question to be able to identify these essential domains but, where possible, indel mutations within them caused by NHEJ can have a great impact on protein function. If you want to carry out knock-out mutations, consider both options carefully and decide which is more suitable for your specific experiment, taking into account your knowledge of the proteins involved and the surrounding genetic sequence.
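The frameshift reasoning above comes down to simple arithmetic, sketched here in Python. This is a deliberately simplified model: it assumes a contiguous coding sequence with no splicing, and that every codon downstream of a frameshifting indel is altered. The function names are hypothetical.

```python
def causes_frameshift(indel_length):
    """True if an insertion (positive) or deletion (negative) shifts the frame.

    Codons are read in triplets, so only indels whose length is not a
    multiple of three change the reading frame.
    """
    return indel_length % 3 != 0


def fraction_downstream(edit_position, cds_length):
    """Fraction of the coding sequence at or after a 0-based edit position."""
    if not 0 <= edit_position < cds_length:
        raise ValueError("edit_position must lie inside the coding sequence")
    return (cds_length - edit_position) / cds_length


# Compare an early edit with a late one in a 300-base coding sequence.
early = fraction_downstream(30, 300)
late = fraction_downstream(270, 300)
```

Under these assumptions, an edit near the start of the coding sequence disrupts a much larger share of the protein than one near the end, which is why early exons are the usual first choice for knock-outs.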
Alternatively, if you're trying to induce HDR, then it is vital that the target sequence is as close to the location of your desired edit as possible, depending on PAM locations and other genetic factors. Due to differences between cell lines or particular strains of model organisms, the genetic background may deviate from the reference genome sequence.

Even modest mismatching between the homology donor segment and your sample genome can negatively impact HDR efficiency. As a result, when making site-specific edits with HDR, it's highly advisable that you sequence the target area to identify the exact location you need to edit.

Alongside identifying your target for HDR, you also need to consider the design of your donor DNA. The design and addition of the intended donor depend on the size of the sequence you want to insert, including both the gene being inserted and any tags you are using for later validation. Short sequences such as fusion tags or Single Nucleotide Polymorphisms (SNPs) can be added as single-stranded oligonucleotide donors (ssODNs). ssODNs are typically less than 200 bases long and consist of the intended insertion, flanked on both sides by bases homologous to the target locus of the double strand break. These homologous regions allow the cell's mechanisms to recognise the sequence as a match to the broken strand, promoting its acceptance as a donor. Larger sequences require the construction of plasmids for their addition, with the intended genetic mutation surrounded by as many as 2000 base pairs that are homologous to the sequence centred on the double strand break. In both cases, the donor DNA will need to be inserted as close as possible to the target site. It is also worth noting that Cas9 may recut the newly modified gene of the plasmid or ssODN if the recognition sequence and PAM sites have not been modified to prevent targeting.

With the target site known and sequenced, it's possible to start thinking about the gRNA you will use to guide the Cas protein to that site.

DESIGNING AND SYNTHESISING THE gRNA
With a target selected and the donor DNA designed, you need to focus on gRNA design and synthesis.
The gRNA has to be a match to the target sequence to ensure the maintenance of the Cas9-DNA-gRNA complex but, beyond that, it's just as important that it doesn't match any other sequence in the genome. If the gRNA is a match to multiple sites in the genome, there will be a much higher frequency of off-target effects, complicating the interpretation of your phenotype. A perfect gRNA would be an exact match to the target DNA with no homology to any other sequence in the genome, but sadly in practice this simply is not feasible: there will always be additional sites that exhibit at least partial homology to the gRNA and are therefore at risk of off-target effects.

Fortunately for your experiment, this is where a PAM comes in. Even sites with perfect homology to your gRNA will not be cleaved if a PAM is not present because the Cas protein will not bind to the site; even in cases where a PAM is in proximity, any mismatches between the DNA and gRNA near the PAM sequence will greatly reduce the efficiency of cleavage. While this still leaves the chance of off-target effects impacting your experiment, it should reduce them significantly and increase the specificity of your system, improving your experimental results and conclusions.

With off-target activity minimised, you now need to consider how efficiently your target DNA will be recognised and cleaved by your custom system (commonly referred to as on-target activity). It would be logical to assume that all gRNA sequences that have 100% homology to their target will have the same cleavage efficiency but, surprisingly, research has shown that this is not the case.
While the reasons behind this are not fully understood, it has been shown that the position of specific nucleotides in relation to the PAM will affect the efficiency of the cleavage system (for example, having a T nucleotide in one position might provide the strongest binding even if the corresponding base should be an A). As a result, finding the best gRNA for your experiment will require a comparison of the predicted on- and off-target effects of a range of gRNAs. To help speed this process along, a variety of gRNA design programs have been developed by different companies and organisations. These can locate PAMs and target sequences as well as providing you with a list of the most promising gRNAs. It is worth noting that these lists are based on the predicted activity of the gRNAs, so the results cannot be treated as definitive in a practical environment.

Another option is to use a validated gRNA. These are gRNAs that have been used successfully in genetic experiments previously and have been made commercially available. They can be bought in the form of plasmids from a handful of companies, which requires a financial investment but greatly reduces the time and resources necessary for you to design and synthesise your own.
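The off-target check described above can be illustrated with a small Python sketch. It is a toy model and not the algorithm used by any particular design program. It scans one strand only, requires an NGG PAM directly 3′ of each candidate site, and counts mismatches, reporting separately those in the PAM-proximal region. The 10-base "seed" length, the mismatch cut-off and all names are assumptions made for illustration.

```python
def mismatch_positions(guide, site):
    """0-based positions at which the guide and a same-length site differ."""
    return [i for i, (g, s) in enumerate(zip(guide, site)) if g != s]


def candidate_sites(guide, genome, max_mismatches=3, seed_length=10):
    """List (start, total_mismatches, seed_mismatches) for NGG-adjacent sites.

    A site is reported only if it is followed by an NGG PAM and differs
    from the guide at no more than max_mismatches positions. Seed
    mismatches are those in the seed_length bases closest to the PAM,
    where the text says mismatches reduce cleavage most strongly.
    """
    n = len(guide)
    hits = []
    for start in range(len(genome) - n - 2):
        pam = genome[start + n:start + n + 3]
        if pam[1:] != "GG":
            continue  # no PAM, so Cas9 is not expected to bind here
        mismatches = mismatch_positions(guide, genome[start:start + n])
        if len(mismatches) <= max_mismatches:
            seed = sum(1 for i in mismatches if i >= n - seed_length)
            hits.append((start, len(mismatches), seed))
    return hits
```

A guide whose only hit is its intended target, or whose other hits all carry mismatches close to the PAM, would be preferred under this simple model. Real design tools also weigh predicted on-target efficiency and search both strands.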

Once a gRNA sequence has been chosen, it needs to be introduced into a plasmid, synthesised, or transcribed in vitro. Introducing the sequence to a CRISPR plasmid is commonly done through standard restriction enzyme-based cloning methods, though newer, homology-based methods have become more popular in recent years. In many cases, the plasmid you've selected for the process will dictate the most practical way of introducing your target sequence. If your reaction involves in vitro transcribed RNA, a template can be assembled by Polymerase Chain Reaction (PCR) or plasmid linearisation. Alternatively, you can purchase a gene fragment and then perform in vitro transcription with a commercially available kit. Synthetic crRNA, tracrRNA, or sgRNA can also be purchased from oligonucleotide synthesis companies. The process involved in introducing your components into a cell will be discussed in detail later in this chapter.

EQUIPMENT REQUIREMENTS
As with all investigations, experiments such as these can require reagents and samples that are very expensive. Even if your equipment appears to be adequate at first glance, inaccurate equipment can lead to incorrect quantities of reagents being transferred, and contaminated or damaged equipment can introduce unknown elements to the reaction. Both have the potential to ruin the entire experiment by affecting the accuracy and reproducibility of your results.

CONSIDERING TARGET SELECTION
There are a lot of factors to take into account when you're deciding on your target gene and, in many ways, your experiment hinges on your decision. Here are three things that you should definitely be aware of before you start:
1. You need to know if your target gene can express multiple transcripts, i.e. if the cell can ignore your edit and proceed with business as usual using a different transcript that doesn't contain your target exon. Larger genomic databases should be able to tell you if there are known splice variants of your target and might help you find an exon that appears in all variants to target.
2. You should consider the ploidy of your cell line. The existence of multiple alleles will affect the impact of your mutations on the phenotype and you need to bear that in mind.
3. Consider the presence of pre-existing Single Nucleotide Polymorphisms. Small variants like SNPs can affect the strength of the gRNA-DNA interaction and alter the efficacy of cleavage. Again, genomic databases might be able to help you.
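The first point, finding an exon that appears in every known transcript, amounts to a set intersection. The Python sketch below assumes you have already exported a mapping from transcript identifiers to exon identifiers from a genomic database. The identifiers and the function name are made up for illustration.

```python
def exons_in_all_transcripts(transcripts):
    """Return the set of exon IDs shared by every transcript.

    transcripts maps a transcript ID to an iterable of exon IDs.
    An empty mapping yields an empty set.
    """
    exon_sets = [set(exons) for exons in transcripts.values()]
    if not exon_sets:
        return set()
    return set.intersection(*exon_sets)


# Hypothetical splice variants of a single gene.
variants = {
    "transcript_A": ["exon1", "exon2", "exon3"],
    "transcript_B": ["exon1", "exon3"],
    "transcript_C": ["exon1", "exon3", "exon4"],
}
shared = exons_in_all_transcripts(variants)
```

Exons in the returned set are candidates that no known splice variant skips. Which of them to target still depends on PAM availability and position within the coding sequence.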


Having to rerun failed experiments is not only financially draining through reagent costs but can also become very time-consuming. It is very important that any and all equipment involved in your experiment is chosen carefully with the intended use in mind: focus on getting the best equipment for what you are trying to do. This approach doesn't just extend to expensive sequencing machines or reaction vessels, but also to seemingly minor variables like pipette tips. While at first glance these may be overlooked, the tips used in your experiment can be a source of contamination, result in waste of samples or reagents and even be the cause of a repetitive strain injury if used frequently. They are, therefore, an area that needs to be addressed in any experiment of this nature.

One of the major factors you need to be aware of is the liquid retention of your tips. When using pipettes, it's common to see a small droplet of liquid remaining in the tip as a result of surface tension, and in experiments that require very precise quantities of reagents this small droplet can be a very large problem. Even if this droplet is accounted for in your measurements, it still means the loss of potentially expensive reagents. In response to this, some companies now offer tips that contain a hydrophobic plastic additive that reduces how strongly these droplets cling to the tip, allowing all the liquid to leave the pipette. Without these droplets, the pipettes can display higher accuracy and lead to less wasted material.

Another consideration is that the join of the tip to the pipette itself needs to be a good fit. If the tip is too loose then air can get into the sample and throw off the volume accuracy, as well as being a potential source of contamination. The lack of pressurisation within the pipette will also affect the force required to take up or eject samples, increasing the chances of developing a repetitive strain injury.
Most pipette models will have specific tips available to buy, but there is also a variety of universal tips available that could be more suitable for your particular experiment or offer superior features. Several companies have been developing technologies to improve the seal between universal tips and pipettes by increasing the flexibility of the tips at the proximal end, allowing adaptability not present in pipette-specific tips. Some companies such as Biotix, Alpha Laboratories and ART (Thermo Fisher) now offer low-retention tips that combine features such as blade tips for the prevention of hanging drops, flexibility at the proximal end and contamination detection with hydrophobic resins. Further variations on these tips can include measurement graduations on the side of the tips or more ergonomic designs with lower insertion and ejection forces, though these are not absolute necessities in all experiments.

Pipette tips are just one example of the many considerations you have to bear in mind when performing experiments like these. Any equipment that comes into contact with your sample or reagents has potential risks or complications, and for every experiment you should bear these in mind and decide what can be done to minimise them. While potentially undesirable, this is an area where spending more money can definitely be a benefit.

CELL DELIVERY
With the gRNA and donor DNA synthesised and practical implications taken into account, you need to start considering how you intend to introduce your reagents to your sample DNA. There are several methods of doing this, the ease and feasibility of which depend on the reagents being used, whether the reaction involves in vivo or in vitro conditions and the type of cells involved. Generally, delivery of new genetic components to cells can be divided into two very broad groups: viral and non-viral. This section explains the main options available and when it is possible to use them.
Non-Viral: One of the simplest non-viral delivery methods is transfection. Chemical transfection, a commonly used method sometimes known as lipid-mediated transfection, relies on inducing a positively charged carrier molecule to complex with the components intended for cell insertion (Cas9, gRNA and donor DNA). The positive charge of the carrier molecule allows for electrostatic binding to the negatively charged cell membrane, an interaction which then triggers a process called endocytosis to draw the complex into the cell. Within the cell, the complex dissociates and your experimental components are released into the cytoplasm, from which they can move into the nucleus and proceed with gene editing.

Chemical transfection offers a way of inducing transient expression of the CRISPR components which, due to the lack of stable expression, will show reduced activity over time as the gRNA is degraded (the gRNA source DNA has not been transferred into the cell's genome and so the cell cannot produce more over time). This transfection method requires the gRNA and Cas9 to be transcribed in vitro, which greatly restricts potential clinical applications (transient expression doesn't ensure that all edits will be made with a single "dose"), but the larger issue with this technique is the resistance of many cell types to transfection. In mammalian cells, chemical transfection works reliably only in easily transfected lines such as HeLa cells, with many other cell types remaining largely unaffected, making the technique unsuitable for many types of cell therapy; the restrictions this method presents have meant that it is not a popular choice for CRISPR experiments.

An alternative but similar method is to use in vitro transcription of gRNA- and Cas9-containing plasmids to generate the mature CRISPR components outside of the target cells (as with chemical transfection) and then, instead of utilising a carrier molecule, to move them through the cell membranes via microinjection or electroporation.
Like chemical transfection, this gives rise to transient expression, as the plasmids themselves do not enter the cells and so expression will fall over time. It is also limited to in vitro application, restricting the potential of the system. The primary advantage this method holds over chemical transfection is that it has been shown to work not only on HeLa cells but on stem cells too, opening a wider range of potential applications including some cell therapies.

One way of ensuring stable expression with non-viral delivery is to use a mammalian expression vector. As with bacterial transformation, this system relies on the uptake of plasmids that confer an advantage to the cell in a process called positive selection; if the Cas9 and gRNA genes and their promoters are contained within a plasmid that also provides immunity to a toxin, for example, then there is a pressure on cells to incorporate the plasmid into their genome when the toxin is present. By incorporating the toxin immunity gene, the cell has inadvertently also incorporated the Cas9 and gRNA genes and will proceed to express them like any other. With the source DNA present, when levels of gRNA and Cas9 start to fall, the cell is able to produce more to replace them: stable expression has been achieved. In some reactions, the plasmid can also be engineered to contain a reporter gene such as GFP to aid in later identification of positive cells, which can make validation a far simpler task (e.g. in the case of GFP, cells that have taken up the new DNA will fluoresce). Unfortunately, this method is restricted to in vitro application, just as the other transfection methods are.

While these are relatively risk-free and simple methods of genetic transfer, they are not applicable in all cases. Some cells, such as primary cells, are highly resistant to transfection in this manner, making these systems ineffective for them. The restriction of these methods to in vitro applications (or their reliance on partial in vitro synthesis) also limits the potential clinical applications, leading scientists to explore alternatives with more promise. In these cases, it is necessary to look at more complex processes to ensure genetic transfer.

To permanently disrupt a gene with a knock-out mutation, you need to use a Cas9 (or Cas9n) protein in conjunction with a single (or dual, in the case of Cas9n) gRNA that can target the essential sequence of your target mutation site.

Viral: For stable, long-term CRISPR expression, viral delivery needs to be considered.
The most commonly used method is lentiviral transduction, which involves having the CRISPR components as either a single vector or two separate vectors and then packaging them within lentiviral particles. Lentiviral particles are used in nature as carriers of viral genetic material to infect new cells in the proliferation of a virus; they are therefore naturally adapted to infect cells and so are very easily taken up by target cells without the need for careful engineering (though in most viral delivery systems, the replication genes within the viral genome have been removed to prevent the virus from spreading). This method also offers the option of including a reporter gene within the vectors, in the same way as mammalian expression vectors.

As lentiviral transduction gives stable expression of Cas9 and gRNA, there is no deterioration in expression over time: when the gRNA starts to be degraded, more will be produced to take its place and so the system can continue working indefinitely. This continual activity means that a single dose should theoretically cleave all potential sites within the genome and ensure that the editing process is completed without the need for further intervention. The nature of viral infection also ensures that difficult-to-transfect cells will be affected too, so a wide range of cell types can be utilised, increasing the number of potential applications.

While effective, lentiviral transduction carries some associated risks despite the inactivation of the replication genes, such as the potential for producing replication-competent lentivirus that can proliferate at will despite the previous modifications, or for the virus to undergo a mutation and become oncogenic, which can result in life-changing illness in the event of accidental exposure.

A somewhat safer alternative is AAV transduction, which has been found to be the least toxic method of in vivo viral delivery available at the present time. It is also able to produce transient or stable expression as needed and can infect dividing and non-dividing cells, making the technique very adaptable. The downside to this approach is that there is a packaging limit during the production of AAV particles, meaning that AAV transduction is generally restricted to smaller nucleases such as the Staphylococcus aureus form of Cas9 (1053 amino acids), which carries a relatively complex PAM (NNGRRT) and thus is active at fewer potential target sites. While this is otherwise a hugely useful method, this significant restriction has led to it being used relatively infrequently, resulting in a weaker understanding of the process.

Cas9/gRNA Ribonucleoprotein Complexes: Cas9/gRNA ribonucleoprotein complexes (Cas9 RNPs) have not been used as extensively as the previously discussed techniques due to their novelty, but they have recently been growing in popularity for several reasons. The complexes are introduced into cells using traditional electroporation or transfection methods, but in this case the Cas9 enzyme and the gRNA targeting the intended sequence have been combined in vitro to form a complex prior to their introduction to the cell system. Their functionality in terms of strand cleavage within the cell is identical to plasmid-based expression systems, but there are certain benefits to using Cas9 RNPs that have made them interesting prospects in certain experiments.

When using plasmid-based techniques, you will experience a delay between introduction of your vector and mutation events occurring; this is dictated by the time it takes the cell to transcribe the plasmid and produce the constituent proteins, and is usually upwards of 12 hours. By introducing the proteins as a complex, you can start to observe mutations occurring very shortly after transfection without any waiting period. If you are trying to keep your experiment time down, using Cas9 RNPs might be the solution. The other advantage is that, as with the other methods that rely on components prepared in vitro, the complex will not survive long in the cell and as such may show greater specificity: by spending less time in the cell, the complex has less opportunity to cause off-target effects. For experiments that carry a higher risk of off-target effects, you may achieve more accurate, reliable results using Cas9 RNPs. As with any of the methods discussed here, consider your own experiment and decide whether using RNPs will be of benefit to the results you are trying to obtain.

The delivery system is an integral part of any CRISPR-Cas9 experiment. Regardless of how well planned your experiment is, if you cannot introduce Cas9 and gRNA into your target cells then you won't be able to create any strand breaks and therefore won't obtain any results. Deciding on what delivery method to use will depend on what cell lines you want to work with and how resistant they are to transfection, whether you are going to be using in vivo or in vitro applications and whether or not your reagents are suitable for the system (i.e. whether they can be contained within viral vectors). A useful starting point is to examine whether any previous experiments have used conditions similar to your own and what delivery systems they employed, and then to decide whether or not their system is suitable for your experiment.

With the design, synthesis and delivery of your components decided, it is time to proceed with the experiment itself.

VALIDATION

Having introduced your chosen CRISPR components to your cell line and hopefully induced the desired edits, you need to identify which cells, if any, now carry your desired mutation.
These cells then need to be isolated and expanded to produce a cell line that carries your mutation throughout the population. The unpredictable nature of double-strand break repair and the theoretical availability of both HDR and NHEJ mean that after your experiment you will be left with a heterogeneous population of cells: some with no edits (where cuts have not been made or the repair mechanisms have not changed the sequence), some with a single allele edited and the remainder with both alleles edited. The first stage of your validation process should be a rapid assessment of whether or not a significant proportion of cells has undergone genetic editing; if it hasn't, this is an indication that the system doesn't work and needs modification. Validation is an important stage in any experiment as it is what proves your experiment has worked and gives your conclusions weight; without it, the experiment can't be considered valid. There are several methods of validation for gene editing, depending on what mutations you were trying to introduce.
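To get a feel for how mixed the edited population might be, the sketch below estimates the expected share of unedited, single-allele and double-allele cells at a diploid locus. It rests on a simplifying assumption that is not made in the text above: each allele is edited independently with the same probability. The function name and the example probability are hypothetical, and real populations can deviate substantially from this model.

```python
def genotype_fractions(p_edit):
    """Expected genotype fractions at a diploid locus.

    Assumes (for illustration only) that each of the two alleles is
    edited independently with probability p_edit.
    """
    if not 0.0 <= p_edit <= 1.0:
        raise ValueError("p_edit must be between 0 and 1")
    q = 1.0 - p_edit
    return {
        "no_edit": q * q,                 # neither allele edited
        "one_allele": 2.0 * p_edit * q,   # exactly one allele edited
        "both_alleles": p_edit * p_edit,  # both alleles edited
    }


if __name__ == "__main__":
    for name, fraction in genotype_fractions(0.3).items():
        print(f"{name}: {fraction:.2f}")
```

Under this model, a modest per-allele editing rate leaves cells with both alleles edited as a small minority, which is one reason screening many clones is often needed.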

APPLICATION NOTE

Single Guide, Simplified: EnGen sgRNA Synthesis Kit, S. pyogenes

Generation of microgram quantities of sgRNA in less than one hour with the EnGen sgRNA Synthesis Kit, S. pyogenes

Introduction

CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, which are genomic loci found in many bacteria and archaea. The CRISPR/Cas9 pathway naturally allows for the elimination of genomic material from invading sources in bacteria, and has recently been adapted as a molecular biology tool to edit genomes in a target-specific manner. Cas9 (CRISPR-associated protein 9) is a double-stranded DNA endonuclease that forms an active ribonucleoprotein (RNP) when complexed with guide RNAs (gRNAs) encoded at the CRISPR loci. gRNAs provide sequence specificity to the RNP, directing the Cas9 nuclease to DNA targets and resulting in double-strand breaks in the DNA.

In nature, S. pyogenes Cas9 is programmed with two separate RNAs, the CRISPR RNA (crRNA) and the transactivating crRNA (tracrRNA). The crRNA contains ~20 nucleotides of homology complementary to the strand of DNA opposite and upstream of the Protospacer Adjacent Motif (PAM) (NGG) sequence. The tracrRNA contains partial complementary sequence to the crRNA, as well as the sequence and secondary structure that is recognized by Cas9. These sequences have been adapted for use in the lab by combining the tracrRNA and crRNA into one long single guide RNA (sgRNA) species (1) capable of complexing with Cas9 to recognize the target DNA and induce double-strand DNA breaks (Figure 1). Activation of the cellular double-strand break machinery can lead to insertions and/or deletions (indels) through Non-Homologous End Joining (NHEJ), resulting in disruption of the gene at that specific locus. In the presence of a homologous repair template, the homology-directed repair (HDR) pathway can be activated, leading to the introduction of specific changes in the DNA at the targeted site.

This application note describes sgRNA synthesis using the new EnGen sgRNA Synthesis Kit, S. pyogenes, which simplifies the generation of custom sgRNA in an hour or less by combining template synthesis and transcription in a single-tube reaction.

FIGURE 1: Cas9 Nuclease, S. pyogenes. An sgRNA is complexed with Cas9, S. pyogenes. Cleavage occurs three nucleotides upstream of the PAM sequence (red). sgRNAs are complementary to the strand opposite of the PAM.
[Figure: target DNA 5′ GAGAACGGCGAAAACTA | ACT 3′ followed by the PAM TGG (NGG); sgRNA sequence GAGAACGGCGAAAACUAACU; the cleavage site is marked between CTA and ACT.]

Protocol

The EnGen sgRNA Synthesis Kit, S. pyogenes provides a simple and quick method for transcribing high yields of sgRNA in a single 30-minute reaction using the supplied reagents and target-specific DNA oligos designed by the user. Each target-specific oligo contains a T7 promoter sequence, ~20 nucleotides of target-specific sequence and a 14-nucleotide overlap region that anneals to a complementary region within the S. pyogenes Cas9-specific Scaffold Oligo (included in the EnGen 2X sgRNA Reaction Mix). The DNA polymerase within the Enzyme Mix extends both oligos from their 3′ ends, creating a double-stranded DNA (dsDNA) template for transcription by T7 RNA Polymerase, also provided within the Enzyme Mix. Synthesis of the dsDNA template and transcription of RNA occur in a single reaction, resulting in the generation of a functional sgRNA.

Target sequences for modification are selected using online guide selection tools such as ChopChop or Desktop Genetics. Target-specific oligos to be ordered are designed using the EnGen sgRNA Template Oligo Designer within the NEBioCalculator (NEBioCalculator.neb.com). Reactions are assembled at room temperature and incubated at 37 °C for 30 minutes, followed by DNase I treatment. sgRNAs are purified using RNA spin columns, quantified by UV light absorbance and analyzed by gel electrophoresis. In vitro digestion of dsDNA templates in the presence of Cas9 is performed to demonstrate the functionality of sgRNAs synthesized using the EnGen sgRNA Synthesis Kit, S. pyogenes.

Materials

EnGen sgRNA Synthesis Kit, S. pyogenes (NEB #E3322)
Cas9, S. pyogenes (NEB #M0386)
SYBR Gold (Thermo, Cat #S11495)
Novex TBE-urea gel (Thermo, Cat #EC6885BOX)
RNA Loading Dye, (2X) (NEB #B0363)
RNA Clean & Concentrator-25 (Zymo Research, Cat #R1017, #R1018)
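The target-selection and oligo-design steps described in the protocol can be sketched in a few lines of Python. This is only an illustration, not the logic of ChopChop, Desktop Genetics or the NEBioCalculator designer. The function names are hypothetical, only the given strand is scanned, and the two constants are assumptions that do not come from the application note: a commonly quoted T7 promoter stretch and the first 14 nucleotides of a widely used S. pyogenes sgRNA scaffold. The rule of adding a leading G is also an assumption. Always check the kit manual before ordering oligos.

```python
import re

# Assumed illustrative sequences, not taken from the application note:
T7_PROMOTER = "TTCTAATACGACTCACTATA"  # T7 promoter stretch, ends just before the initiating G
SCAFFOLD_OVERLAP = "GTTTTAGAGCTAGA"    # first 14 nt of a commonly used S. pyogenes sgRNA scaffold


def find_targets(seq, protospacer_len=20):
    """Find protospacers followed by an NGG PAM on the given strand.

    Returns a list of (start, protospacer, pam, cut_index) tuples, where
    cut_index is the 0-based position of the first base after the cut.
    """
    seq = seq.upper()
    # A lookahead keeps overlapping candidate sites from being skipped.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    hits = []
    for match in pattern.finditer(seq):
        start = match.start()
        # Cleavage occurs three nucleotides upstream of the PAM.
        cut_index = start + protospacer_len - 3
        hits.append((start, match.group(1), match.group(2), cut_index))
    return hits


def template_oligo(protospacer):
    """Assemble a hypothetical target-specific oligo: promoter + target + overlap."""
    # Assumption: a G is added when the target does not already start with one,
    # because T7 RNA polymerase initiates transcription most efficiently at G.
    target = protospacer.upper()
    lead_g = "" if target.startswith("G") else "G"
    return T7_PROMOTER + lead_g + target + SCAFFOLD_OVERLAP


if __name__ == "__main__":
    example = "GAGAACGGCGAAAACTAACTTGG"  # target and PAM shown in Figure 1
    for start, protospacer, pam, cut in find_targets(example):
        print(protospacer, pam, example[:cut] + "|" + example[cut:start + 20])
        print(template_oligo(protospacer))
```

With the Figure 1 sequence, the scan reports a single site and places the cut between CTA and ACT, matching the figure. With these assumed constants the assembled oligo is 54 or 55 nucleotides long, in line with the ~55 nucleotide oligo mentioned in the note's conclusion.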

Results

Using the guidelines above, 72 target-specific oligos were designed and used to generate 72 different sgRNAs. Yields ranged from 4 to 45 µg, with most falling between 4 and 25 µg. sgRNA quality and length were evaluated by denaturing gel electrophoresis and SYBR Gold staining (Figure 2). The control sgRNA synthesized with the EnGen sgRNA Synthesis Kit was incubated with Cas9, S. pyogenes to form an active RNP complex that cleaved a double-stranded DNA target in vitro (Figure 3).

FIGURE 2: Examples of sgRNA synthesized using the EnGen sgRNA Synthesis Kit, S. pyogenes. RNA was run under denaturing conditions on a 10% Novex TBE-Urea gel and post-stained with SYBR Gold.
[Gel lanes: Low Range ssRNA Ladder, sgRNA1, sgRNA2, sgRNA3, control sgRNA; the 150-base marker is indicated.]

FIGURE 3: Example of an in vitro Cas9 Nuclease Assay. A DNA target (PvuII-linearized pBR322) was cleaved by Cas9 complexed with the control sgRNA synthesized using the EnGen sgRNA Synthesis Kit, S. pyogenes. Reactions were set up following NEB protocols with a ratio of 20:20:1 (Cas9:sgRNA:target site). Cleavage products were resolved on a 1% TBE agarose gel stained with ethidium bromide.
[Gel lanes: 2-Log DNA Ladder, DNA only, DNA + Cas9 + sgRNA, DNA + Cas9, DNA + sgRNA. Bands: input DNA, 4,361 bp; fragment 1, 2,423 bp; fragment 2, 1,938 bp.]

Conclusion

The EnGen sgRNA Synthesis Kit, S. pyogenes can be used to generate microgram quantities of functional custom sgRNAs in less than an hour. This method reduces protocol time with the single-reaction format and only requires the design of a single ~55 nucleotide ssDNA target-specific oligo for each sgRNA, thus reducing the cost per reaction.

Reference

1. Jinek, M. et al. (2012) Science PubMed ID:

Indel Mutations: Indel mutations are usually the easiest edits to identify and can be validated very quickly using a mismatch cleavage assay, consisting of four main steps:

1. PCR is used to amplify the target sequence in both mutant and wild-type (un-edited) strands present within the genomes of a cell population. The PCR will require a high-fidelity proofreading polymerase to prevent the introduction of further errors in the sequence; errors introduced at this stage could produce a false positive when the intended mutation has not taken place, invalidating any conclusions drawn after this stage.

2. The DNA strands are thermally denatured and then rehybridised. This allows mutant strands to anneal with wild-type strands, leading to mismatches at the site of the mutation which produce single-stranded DNA bulges in the duplex that are susceptible to certain nucleases (wild-type/wild-type and mutant/mutant strands will also reanneal).

3. The annealed DNA is treated with an enzyme such as the Surveyor nuclease from Integrated DNA Technologies or T7 Endonuclease I from New England Biolabs, which will cleave DNA strands on the 3′ side of any mismatches (some mismatches are cleaved preferentially over others, but all will be identified). All heteroduplexes created by the annealing of wild-type and mutated strands will be cleaved, producing two fragments with sizes that correspond to the sequence on either side of the mutation.

4. The digested DNA products then need to be separated by size, something which is usually achieved with gel electrophoresis or occasionally high-resolution capillary electrophoresis.
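Once the products from the four steps above have been separated, the band pattern can be interpreted with some simple arithmetic. The sketch below is illustrative only and uses hypothetical function names. The first function gives the two fragment sizes expected when a linear product is cut at one position. The second applies a widely used approximation for converting band intensities into an estimated indel percentage; that formula is an assumption not stated in this guide, and it presumes random reannealing of strands and complete digestion of mismatched duplexes.

```python
import math


def fragment_sizes(product_len, cut_pos):
    """Sizes (bp) of the two fragments when a linear product is cut once."""
    if not 0 < cut_pos < product_len:
        raise ValueError("cut position must lie inside the product")
    return cut_pos, product_len - cut_pos


def estimate_indel_percent(uncut, cut_a, cut_b):
    """Estimate % indel from band intensities (arbitrary units).

    Uses the common approximation 100 * (1 - sqrt(1 - fraction_cleaved)),
    which assumes random reannealing and complete cleavage of mismatches.
    """
    total = uncut + cut_a + cut_b
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cut_a + cut_b) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


if __name__ == "__main__":
    # Hypothetical 500 bp amplicon with the edit site 200 bp from one end.
    print(fragment_sizes(500, 200))
    # Hypothetical intensities for the uncut band and the two cleaved bands.
    print(round(estimate_indel_percent(75, 15, 10), 1))
```

Because the estimate depends on gel quantification and on the assumptions above, it is best treated as a rough guide to editing efficiency rather than a precise measurement.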
Assuming that mutations have occurred within the sequence, the assay should display three bands: full-length PCR products where two identical strands have re-annealed in the second step and have remained uncleaved, and two bands corresponding to the DNA on either side of a mismatch where the nuclease has cleaved the sequence. If no mutation has occurred, a single band of full-length, wild-type product will be present. This method is semi-quantitative: the strength of the bands is an indication of how much product is present and thus how successful your reaction has been.

As this testing method requires no prior knowledge of the mutation, it can be a very useful tool in early experiments when the understanding of the gene and sequence is poor. It also carries the advantage of being rapid and simple to perform (PCR is an inexpensive platform that is very commonly used in regular biological labs). Because of these characteristics, mismatch cleavage assays have proven to be very useful for the high-throughput systems used in screening, which require fast turnaround on a large number of samples.

The main downsides of this method arise from PCR, but these are only minimal. While amplification runs the possibility of introducing new mutations to your sequence, using a high-fidelity polymerase will ensure that the risk of this is extremely low and can be disregarded in your experiment. When using the high-fidelity enzyme, you should also validate the compatibility of the PCR reagents with downstream digestion to verify that you do not need to complete a post-PCR clean-up, reducing the overall workflow of your investigation. It is also important to remember that the assay is only of use for identifying the presence of an indel mutation; once you have obtained a successful result from your assay, it's a good idea to sequence the mutated locus to confirm that the mutation has occurred in the desired location.
The sequencing will also inform you as to the exact nature of your new mutation, which may help you later in ascribing significance to your results. Despite the downsides, mismatch cleavage assays are very effective tools following genetic engineering experiments because of the speed of the technique and the availability of the equipment necessary, and they provide a convenient method of identifying the presence of indel mutations.

A newer alternative to mismatch cleavage assays is Tracking of Indels by Decomposition (TIDE). This involves using the quantitative sequence trace data from two standard capillary sequencing reactions (a mixed pool containing the sample and a control) with a specially developed decomposition algorithm to identify the predominant types of indels present in your sample and to quantify the editing efficiency. Software tools implementing the TIDE algorithm are available for analysing your data.

HDR: If you were trying to induce HDR, validation is slightly more complicated. It's advisable to attempt to create an indel mutation using your gRNA prior to the HDR modification so that you can confirm, through a mismatch assay, that the gRNA is active at your intended target. If this experiment produces a negative result, go back and redesign your gRNA until it achieves the desired activity. Once you are certain that your gRNA is active at the desired target, you can complete your HDR experiment as previously intended.

HDR edits are detected via pre-planned systems. These usually take the form of newly inserted reporter genes (such as GFP, which will fluoresce in cells with the desired mutation) or the addition or removal of restriction sites that will affect the products formed when the PCR product is digested, though there are other options in certain cases. Large insertions, for example, should be easily detectable due to their increased size relative to the native sequence, which can be displayed by electrophoresis.
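As an illustration of the restriction-site approach to HDR screening, the sketch below predicts the fragment sizes obtained when a PCR amplicon is digested, so that a wild-type amplicon can be compared with one in which HDR has introduced a new site. The sequences are invented for the example and the function name is hypothetical. EcoRI (recognition site GAATTC, cutting after the first G) is used purely as a familiar default; only the top strand is considered, which is enough for approximate band sizes.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Fragment lengths after cutting a linear amplicon at every site.

    cut_offset is the number of bases into the recognition site at which
    the top strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    cuts = []
    index = seq.find(site)
    while index != -1:
        cuts.append(index + cut_offset)
        index = seq.find(site, index + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical 100 bp amplicons differing by one base at the edit site.
    wild_type = "A" * 40 + "GATTTC" + "C" * 54
    hdr_edited = "A" * 40 + "GAATTC" + "C" * 54
    print("wild type:", digest(wild_type))
    print("HDR edited:", digest(hdr_edited))
```

An undigested full-length band alongside the two smaller fragments would suggest a mixed population, so the result should still be confirmed by sequencing.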
Single nucleotide changes can be shown with next generation sequencing.

A significant factor to keep in mind is that HDR mutations will occur less frequently than indel mutations and, as a result, most HDR experiments will rely on screening a large number of different colonies to find enough edited clones to establish a clonal cell line. In fact, there is a high likelihood that if one allele in your cell line has been modified by HDR, the remaining alleles will have been mutated by NHEJ. If the frequency of your desired mutations is very low, it is likely due to either a naturally low HDR frequency or a low transfection frequency, something which can be addressed by reviewing your transfection method. Some work has been done to improve the efficiency of HDR in CRISPR experiments, usually by repressing key molecules within the NHEJ pathway and thereby promoting HDR in its absence (with some success), though this introduces further complications to the process and can increase reagent costs and the time required for your experiment.

Deletions: Deletions are a specific way of inactivating a gene which work by removing a section of the DNA entirely and hence deleting the sequence necessary for gene function. This can be done using two gRNAs to direct Cas9 to cleave simultaneously at separate, defined points within the target gene, causing the intervening sequence to be excised from the genome in its entirety. While these mutations can be detected in the same way as indel mutations, a far simpler method is to use PCR on the sequence directly with primers that flank the intended deletion site. The significant loss of size in the event of a large deletion means that electrophoresis will display two distinct bands: larger, intact wild-type alleles and smaller alleles that have undergone a deletion mutation.

Next Generation Sequencing: While the other validation methods mentioned above involve observing the effects of a mutation to identify it, Next Generation Sequencing (NGS) instead looks directly at the DNA sequence (the A, T, C and G code). From there, if the wild-type sequence is known, it is a simple matter to compare the two and establish what changes, if any, have been made. Unlike the relatively simple testing methods above, NGS is a far more involved and lengthy process, but it offers some advantages over the alternatives. The major factor to be aware of is that NGS allows for an examination of what off-target mutations have taken place and therefore provides an analysis of how specific your system is, alongside the quantitative assessment of on-target effects. The problem with NGS usually lies with the cost.
While the expense of NGS has plummeted in recent years, there is still an initial financial investment necessary for the equipment which is not always possible for smaller labs. Likewise, the time requirements of NGS have been decreasing but it can still take longer than simpler validation techniques. One way of decreasing this cost is to focus your sequencing efforts on selected regions of the genome based on the desired target location and the predicted areas of off-target activity. Using targeted gene panels can greatly reduce both the cost and time requirements of sequencing by focusing on the genes of most interest, making the technique more accessible to researchers. This does, however, carry the risk of missing potentially significant off-target edits due to the narrowed focus. Despite the problems, NGS is still a very powerful technique and, where possible, it can be a significant advantage in genetic engineering experiments.

Epigenetic Manipulations: As previously discussed, introducing transcriptional activators or repressors does not involve altering the DNA sequence. As a result, success or failure of your experiment cannot be detected in the same way as mutations, as the genome will remain in its natural form; instead, assessing the effect of your experiment on the expression of a gene requires monitoring the levels of the corresponding protein. If the levels remain unaffected by your intervention, it means that the reaction has failed, possibly because your gRNA is not suitable for the target site or your activator/repressor is insufficiently active. If the reaction is a success, you should be able to detect increased or decreased levels of the protein being targeted. These manipulations, commonly referred to as CRISPRi and CRISPRa, will be discussed further in the next chapter.

Creating a Clonal Cell Line: If your tests are coming back negative, it is an indication that your edit has failed.
This is likely due to an insufficient presence of gRNA and/or Cas9 as a result of poor delivery or expression, or perhaps because of a lack of sufficient target cleavage. Re-examination of your cell delivery and gRNA design should reveal where the experiment failed and let you revise your plan. However, should these results show a significant population of edited DNA, then it is possible to move on to creating clonal cell lines that contain your desired mutation. To start with, cells possessing the mutation need to be isolated, usually via serial dilutions or with Fluorescence-Activated Cell Sorting (FACS) of cells that express a marker gene indicating that an edit has occurred. By isolating the mutated cells, you ensure the proliferation of your edited genome and exclude any wild-type cells. Once isolated, these cells can then undergo an expansion period to establish the cell line you want for testing or further experimentation. The new cell line will need re-validation and sequencing using the methods described above to ensure that the mutation has persisted through your isolation process; where possible, detection of protein expression using western blotting offers a secondary validation of your genetic edit, improving the strength of your conclusions. If these tests come back positive, then you have proof that your desired mutation has occurred and that your experiment has been a success.

SUMMARY

While the adaptable nature of CRISPR-Cas makes it suitable for a range of different applications, it does make it more difficult to decide how to approach your own experiment. Breaking the process down into the different stages described here might help you to build a clearer understanding of the different considerations you need to take into account and perhaps to identify problems in a reaction that has failed. This chapter loosely outlines how a CRISPR reaction might proceed in a lab environment, but this isn't the whole picture. Gene editing as a whole has applications across a range of different areas, such as medicine and genomic research, with more applications being developed all the time. In the next chapter we'll discuss a few of the more prominent real-world applications of gene editing and explain the scientific approaches behind them.

CHAPTER 3: APPLICATIONS OF GENE EDITING

INTRODUCTION

One of the most apparent medical uses of gene editing is in the treatment of diseases which have a root cause within the genome, but this is not the only avenue currently being pursued. As mentioned in the previous chapter, CRISPR has potential in the manipulation of epigenetic factors without affecting the genetic sequence. This is a technique which can be used as a treatment for various conditions, as well as offering a readily accessible method for large-scale genetic mapping. Editing techniques are also being utilised in many other areas of biology and physiology, but the main applications that we will focus on here relate to disease treatment due to its prevalence and potentially wide-reaching consequences. A more controversial position arises when considering gene editing as a method for embryonic modification, both in a medical sense and as a step towards designer babies, but as this is still an almost entirely theoretical concept, it won't be discussed here.

TREATMENT OF GENETIC DISORDERS

CRISPR-Cas9 has been shown to be a highly precise and efficient tool for gene disruption. One of the primary reasons this has attracted significant interest is the potential such a system has in the treatment of genetic disorders. Genetic diseases and disorders commonly arise from errors within the genetic sequence that affect gene expression or function. A tool that allows alteration of the sequence (such as CRISPR) has the potential to correct these errors to create the functioning gene once more or to remove the diseased alleles completely, curing the disease. The specificity of CRISPR-Cas9 minimises the possibility of off-target, potentially devastating effects in vivo, and its adaptability makes it suitable for targeting any system within the body. Similar work has been completed with older editing tools such as ZFNs and TALENs, providing a reference point for many of the more recent CRISPR-Cas9 experiments.
There are two main ways gene editing can be used to treat genetic disorders: inactivation of a mutated gene to silence it, or insertion of a functional gene to replace a damaged one. Both methods have been used by various groups to treat a variety of disorders, with some techniques displaying promising results in the initial investigations.

CANCER TREATMENT

Genetic research almost always becomes involved in suggestions for cancer treatment, an understandable reaction when there were an estimated 8.2 million cancer-related deaths worldwide in 2012 alone (according to the IARC). The immense range of cancer types and the genetic causes responsible has historically hindered the development of universal cancer treatments, but this might be set to change with the relatively recent development of personalised immunotherapy. Chimeric Antigen Receptor-T cell (CAR-T) therapy involves removing a sample of a patient's T-cells (a major component of the cell-mediated immune system, largely in charge of recognition of non-self material). These T-cells are then modified to recognise cancer cells as non-self before they are returned to the patient, thereby triggering the patient's own immune system to attack the cancer. The use of autologous T-cells (native to the patient) makes the treatment perfectly tailored to the specific case, greatly reducing the possibility of the patient's body rejecting them as foreign material. Transplant rejection is something which has hindered similar techniques in the past.

Even without the CRISPR-Cas9 system, CAR-T therapy has shown significant promise in human trials, but there is room for improvement with the potential provided by genetic editing techniques. The modified T-cells work by displaying chimeric antigen receptors that recognise proteins on the surface of the cancer cells, in the same way as they recognise surface proteins of pathogens during infections.
This recognition then aims the immune system at the cancer and the patient's body will begin to destroy the cells. This has proven to be a very effective method of cancer treatment in comparison to older techniques; however, there are some drawbacks that need to be overcome before immunotherapy can be considered the go-to treatment for cancer.

One of the problems found with this method of immunotherapy is that in some cases the cancer-targeting receptors displayed by the modified T-cells are expressed alongside the endogenous T-cell receptors (present in T-cells before their modification). This dual expression can cause problems with specificity and potency, making the treatment unpredictable in certain cases and posing potentially lethal health risks to patients. Genetic editing can be used to knock out the genes coding for the endogenous T-cell receptors, silencing their expression so that only the chimeric antigen receptors are displayed on the surface of the cells. This reduces unpredictable specificity and potentially offers a higher therapeutic potency, making the treatment more effective at targeting the cancer cells and therefore making the treatment cycles shorter.

Another limitation is the requirement for autologous T-cells to prevent a patient's body rejecting them in a natural immune response. Should the body identify any T-cells as foreign, they will be targeted and destroyed by the patient's immune system, causing the therapy to fail and worsening the condition of an already ill person. It is therefore very important that each patient is injected with T-cells that will be identified as self, so that they will pass by the immune system without triggering an attack. This is most easily achieved by extracting a patient's own T-cells and modifying them, instead of using a communal stock.
While effective, this requirement means that the treatment has to be custom made for each patient, something which is entirely impractical at current cancer rates and severely limits the potential for clinical application. A solution to this problem might be found by using gene editing to prevent the production of the human leukocyte antigen (HLA) within extracted T-cells. HLA is the natural recognition factor used in humans to distinguish self cells from non-self cells, preventing the immune system from accidentally targeting one's own cells while ensuring that all foreign material is destroyed. If HLA production is successfully silenced in modified T-cells, the patient's body will no longer be able to identify injected cells as foreign material and the immune system should remain dormant against them.

ZFNS IN HIV CLINICAL TRIALS

As we mentioned briefly in the first chapter, ZFNs have been making a comeback in gene editing experiments in the last year or so. California-based company Sangamo Therapeutics has been utilising ZFNs to artificially create the HIV-resistant T-cells some people naturally possess, in the hopes of producing an effective treatment, or even a cure, for the virus. Early successes in preclinical testing looked very promising and this success continued through Phase 1 trials. Currently the therapy is nearing the end of Phase 2 testing to ensure the efficacy and safety of the treatment. The same company is also using ZFNs to mutate hematopoietic stem cells to boost a patient's genetic resistance to the virus. As a newer concept, these experiments are still in Phase 1 trials, but initial results are as promising as those of the targeted T-cell therapy. With more than 36.9 million people in the world living with HIV and AIDS, success with these gene therapies could completely revolutionise the modern medical environment.

Inactivation of the HLA has applications throughout all forms of allogeneic (non-self) cell therapy, though currently its focus is primarily on cancer immunotherapy.

The final complication is that, despite the promising results from T-cell immunotherapy, tumour cells do display some resistance to the treatment. This resistance takes several forms, but it is commonly achieved via the inhibition of effector functions in T-cells or by silencing the target antigen. With their effector functions inhibited, the T-cells are effectively crippled: they are unable to identify the materials they come into contact with and so lose their ability to activate the immune system. In some cases, tumour cells can even induce apoptosis in T-cells to destroy them entirely, allowing the tumour to repress the natural immune system of the host and enable cancer cell proliferation.
Gene editing has been shown to decrease the chances of this by knocking out certain T-cell receptors that are targeted by the cancer cells. Without these receptors, the T-cells cannot be identified by the tumour cells, which prevents them from being targeted and destroyed and in turn increases T-cell effector function. With further research, a similar technique could be used to enable T-cells to pass a variety of cancer cell inhibitor pathways, allowing them to work against a wide range of cancer types. The numerous variations of cancer present in the general population mean that it is desperately important to develop a treatment technique that applies to different forms of the disease, and immunotherapy has the potential to perform this function. This work is still in the early stages, but initial clinical trials have shown very promising results in large population samples. Future developments might see immunotherapy involving CRISPR-Cas become a major part of cancer treatments.

HIV TREATMENT

Ever since the early 1980s there has been an international crisis surrounding HIV/AIDS. Since the first reported cases of the virus in 1981, several treatment plans for the immunodeficiency disease have been published that improve the quality of life of a patient, but a true cure has not yet been established. The CRISPR-Cas system has the potential to change that. Research using ex vivo somatic cells isolated from an HIV patient has shown that CRISPR-Cas9 can be used to knock out the CCR5 co-receptor, which is involved in primary HIV infection. Without this co-receptor the virus is unable to infect new cells, preventing the virus from spreading any further and reducing its risk.
The modified somatic cells can then be reintroduced to the patient without risking an immune response (as in cancer immunotherapy, cells taken from the patient are not treated as foreign by the immune system), something which is vital in a disease that weakens the immune system and puts the patient at risk of further infections. Early human trials investigating this method have shown success in some patients, providing an early proof-of-concept. However, the treatment still requires significant development to identify side effects or complications and to improve its efficacy.

Other experiments have displayed the possibility of removing the HIV genome completely from infected cells instead of inactivating a single gene. This is made possible by the long terminal repeats that flank the viral DNA and provide a suitable target area for recognition and cleavage in a deletion event.

Using a different approach, a paper published in October 2016 reported using CRISPR-Cas9 in an automated, high-throughput system to edit T-cells to incorporate mutations that have the potential to induce HIV resistance. Some people naturally possess HIV resistance but the nature of this immunity has not yet been discovered; a system of high-throughput screening could help to discover the required mutation. This information could potentially be adapted into a cell therapy for HIV that artificially gives cells the immunity gene to prevent infection.

While the primary focus of this research has been on the treatment of HIV, the results have an immediate applicability in the treatment of other viral diseases such as Hepatitis B and the Human Papillomavirus, which has been shown to have direct links to the development of cervical cancer.
If CRISPR-Cas9 can be targeted at genes involved in the maintenance and replication of the viral genome, then it can lead to degradation of the viral DNA in vivo and potentially provide a step towards curing these diseases.

NEUROMUSCULAR DISORDERS

Neuromuscular disorders such as Huntington's disease, amyotrophic lateral sclerosis (ALS), and multiple sclerosis (MS) are almost always caused by genetic factors and very rarely have a cure. Their root causes in the genome have made them all a target for genetic engineering experiments in the past, but in recent years the main focus has been on Duchenne Muscular Dystrophy (DMD), a severe form of muscular dystrophy that affects boys and can be carried by female relatives. DMD manifests in children, with most patients requiring wheelchairs before they are 13 years old. The disease is caused by a damaged gene that prevents the production of a protein called dystrophin, leading to a gradual weakening in a patient's musculature which affects movement and, in later years, breathing and circulation. In most cases, the gene has been inactivated by large deletions upstream that cause a frameshift mutation and lead to the production of a non-functioning protein.

A straightforward solution to this would be the introduction of a functional dystrophin gene through HDR using a viral vector as a delivery method, but this is unfortunately not an option. The sequence that codes for dystrophin is exceptionally large (roughly bases), which makes it unsuitable for size-restrictive viral vectors and makes cell delivery of the DNA impossible with current technologies. A way of bypassing this problem is to use minigenes that can fit in the viral vectors, but these have only produced partial functionality in comparison to the complete gene. They have not yet displayed an ability to reverse the disease in humans and so have limited potential for clinical application.

The inability to insert a functional gene has led to an interest in using gene editing as a possible pathway for repairing the endogenous gene directly. An investigation in 2011 (J. Rousseau et al.) managed to adapt ZFNs to target an exon in the human dystrophin gene to produce indel mutations. These indel mutations allowed for the correction of the frameshift that causes DMD (restoring production of functional dystrophin) and provided a proof-of-concept, even though several drawbacks were displayed. Because ZFNs were used to induce NHEJ, the outcome of each mutation was unpredictable and there was no guarantee that the frameshift would be corrected, with only some of the samples displaying positive results.
The experiment was further restricted by having a narrow patient population as the source of the cells, and it therefore has limited validity as a suggested clinical treatment without further work.

In an attempt to ensure gene editing would correct the frameshift, more recent experiments have attempted a specific deletion of one or more exons. By targeting an exact sequence, the experiment is less reliant on the chance that uncontrolled mutations will correct the frameshift and instead makes sure that the correct number of bases will be removed to restore gene function. These deletions have become even more accessible with the discovery of CRISPR-Cas9, which allows for highly site-specific mutations and removes some of the unpredictability previously present with ZFNs. Some very recent cases have involved deleting more than 300,000 bases of genomic data in a single mutation event that has been shown to restore dystrophin expression in over half of the patients involved.

Intramuscular injections have introduced CRISPR-Cas9-containing AAV vectors to skeletal and cardiac muscle in mice to provide in vivo research, with varying levels of success. Most notably, a paper published in 2015 (C. Long et al.) reported the recovery of dystrophin expression and a partial recovery of muscle function after using CRISPR-containing AAV on the mdx mouse model (a mouse model specifically designed to mimic DMD in humans). This research is still early and obviously requires more work, but it represents a big step towards potential clinical applications of in vivo modifications to cure DMD.
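The reading-frame reasoning behind the exon-deletion approach in this section can be illustrated with a little arithmetic. Codons are read three bases at a time, so a deletion only preserves the reading frame if the total number of deleted coding bases is a multiple of three. The Python sketch below is an illustration under stated assumptions, not a description of any published analysis: the function names and all of the lengths are hypothetical, chosen only to show the check.

```python
from typing import Dict, List


def is_in_frame(deleted_coding_bases: int) -> bool:
    """A deletion keeps the reading frame only if it removes whole codons."""
    return deleted_coding_bases % 3 == 0


def exons_restoring_frame(existing_deletion: int,
                          candidate_exons: Dict[str, int]) -> List[str]:
    """List candidate exons whose removal, added to the existing deletion,
    brings the total number of deleted coding bases to a multiple of three."""
    return [
        name
        for name, length in candidate_exons.items()
        if is_in_frame(existing_deletion + length)
    ]


# Hypothetical values for illustration only, not measured exon lengths.
existing_deletion = 109
candidate_exons = {"exon_A": 233, "exon_B": 150, "exon_C": 118}

print(is_in_frame(existing_deletion))
print(exons_restoring_frame(existing_deletion, candidate_exons))
```

With these made-up numbers the existing deletion is out of frame, and only removing exon_A as well (109 + 233 = 342 bases, or 114 codons) brings the total back into frame. An in-frame result is a necessary condition only; whether the shortened protein is functional still has to be established experimentally.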

Alongside research into DMD, a more high-profile advancement in neuromuscular disorder research was the discovery of a gene linked to the development of ALS. In July 2016, it was reported that the NEK1 gene had been identified as the most common gene to contribute to ALS, something which hit headlines as a result of the unusual funding stream for this research: the international Ice Bucket Challenge. While the challenge, which started in 2014, was seen as frivolous by many, it greatly raised awareness of ALS on a global scale and, for at least a short period, donations to the ALS Association skyrocketed, leading to a collaboration of over 80 researchers across 11 countries. This collaboration led to NEK1's discovery and has provided a potential target for future gene editing research.

LIVER-TARGETED TREATMENTS

The liver is a vital organ that interacts with a whole host of different biological systems. This widespread involvement in biological functions makes it a very attractive target for research across many scientific areas, such as blood disorders, cirrhosis, hepatitis, haemochromatosis, and many others. Genetic editing was first used successfully in the liver in 2011, when ZFNs were able to restore haemostasis in a mouse model with haemophilia (H. Li et al.), and the work has progressed ever since. Most research in this area has been held back by small sample sizes of patients and the high cost involved in clinical development, but the new tool provided by CRISPR-Cas9 has opened the area up once more. In 2014, a CRISPR system was used in a mouse model in vivo to successfully correct tyrosinemia type 1 (a potentially fatal genetic disorder in newborns caused by the uncontrolled breakdown of tyrosine) (H. Yin et al.). However, the experiment showed only very low efficiency and the technique was not directly translatable into human trials.
Despite the problems, these are only early experiments and, at the very least, the results show a promising start for the introduction of mutations in the liver using CRISPR-Cas9.

HAEMATOLOGICAL DISEASES

The term haematological disorders covers many different conditions, from sickle cell anaemia, to haemophilia, to leukaemia and lymphoma. The range of diseases means that there is no treatment approach applicable to all of them, but when the disorders are studied in isolation, some research groups have had success. Many of these investigations involve the treatment of sickle cell disease, which is caused by a point mutation in the β-globin gene, alongside a condition called β-thalassaemia, which arises from a separate mutation in the β-globin gene. The two mutations together cause red blood cells to be formed incorrectly, distorting the usual bi-concave disk into a thin sickle shape and impacting their ability to carry oxygen. Experimentation with gene editing technology has managed to correct both sets of mutations in human induced pluripotent stem cells, effectively curing the disease in those cells. This still doesn't represent a cure in humans, but it is a major step forwards and it has led to investigations using the same principles to treat other forms of anaemia.

The primary suggestion for clinical treatments involves the knockout of a gene called BCL11A. BCL11A is a transcriptional regulator involved in the suppression of γ-globin, a gene expressed during the development of a foetus that is silenced at birth and which can compensate for β-globin deficiencies when it is upregulated. Theoretically, by removing the suppression of γ-globin, both sickle cell disease and β-thalassaemia can be treated, as the deficiencies of the β-globin gene will be compensated for. Standing in the way of this as a potential treatment, however, is the detrimental effect that the absence of BCL11A has on nonerythroid cells (cells that are not red blood cells). Instead, an enhancer region responsible for coordinating BCL11A needs to be targeted by a highly specific gene editing tool like CRISPR-Cas9 to suppress the gene in specific cellular contexts, allowing for the upregulation of γ-globin without compromising nonerythroid cells. As with the other disease treatments, this work is still very recent and requires significant further work. Nonetheless, genetic research into haematological disorders has been at the forefront of the news recently as a result of work by a Chinese research group involving the attempted cure of β-thalassaemia in human embryos, something that will be discussed in depth in the next chapter.

ANTIMICROBIAL APPLICATION

Gene editing has the potential to revolutionise the way we approach medical treatments in future, but the technology doesn't have to be limited to curing only genetic diseases. Recent work has demonstrated that gene editing can be used to treat pathogenic bacterial infections by altering the bacterial DNA to disable vital proteins (thereby killing or disabling the bacterial cells) and by conferring antibiotic resistance to the host DNA. This concept has proven effective in a mouse skin colonisation model using the CRISPR-Cas9 system to perform the necessary edits, though the technology is still too primitive for human trials.

Another potential antibacterial application is to force bacterial cells to attack their own DNA with their CRISPR system through the introduction of self-targeted RNAs. The primary study that attempted this focused on the less commonly used Type I system, which facilitates the disruption and removal of DNA. As native CRISPR systems tend to be Type I, this has the potential to become a highly useful tool, especially as our supply of effective antibacterial drugs runs out.
OTHER APPLICATIONS

A less direct application of CRISPR-Cas9 in the study of genetic disorders is its use in the creation of disease models for investigations. These models are useful both in drug development, to observe how novel medications affect illnesses, and as a way of observing how a disease develops and affects cells. Previously, creating genetic defects in mice and other creatures has proven inefficient, as the mutations have to occur in both chromosomes to ensure expression and only at the correct locus, which can be hard to ensure with most gene editing methods. In some cases it can also require the time-consuming and expensive rearing of multiple generations to display the correct mutations in both chromosomes. This is especially a problem in larger animals like apes, which have long gestational periods that can greatly delay research.

CRISPR-Cas9 makes the process much easier and faster, as multiple generations are not necessary to alter both chromosomes, and its high specificity helps to ensure valid results. CRISPR-Cas9 also allows multiple genes to be edited in a single mutation event, meaning that there is no need to breed two mutated animals to produce offspring that carry both of the desired variants. As before, the removal of gestation times can greatly speed up genomic research and can help us to build more accurate investigative models. Better understanding of how gene mutations affect the body can lead to a much greater knowledge of disease development and progression, something which can aid in the advancement of treatments and drugs by opening new avenues to explore.

The conditions discussed above are just some of the applications of gene editing in the treatment of diseases. Research is currently being conducted in the treatment of diseases in all areas of the body, with many showing promising initial results when using both the CRISPR-Cas9 system and TALENs or ZFNs.
The novel nature of these techniques means that this research is still very much in the early stages, but with more time it's possible that they may hold the answer to curing a range of life-altering diseases.

FUNCTIONAL GENOMICS

Genetic mapping is a vital part of trying to understand the genome. While the high-profile sequencing of the human genome was a major step forwards, there are still many questions lingering about the function of specific genes as well as how different genes interact with each other. One way of answering some of these questions is systematic loss-of-function screening; as the name suggests, this involves the systematic inactivation of genes within the genome and observing what effect each change has on the cell. In mammals, this technique of screening can be difficult, as diploid cells require both copies of the gene to be inactivated for valid results, just as when creating biological models.

In the past, this type of screening has usually been achieved using RNA interference, but that carries a high risk of off-target effects and only provides partial gene suppression, making changes in the cell more difficult to detect and producing a level of uncertainty about the validity of the results. It also doesn't allow investigation into the non-coding sections of DNA, limiting investigations to the exome, which only constitutes around 2% of the overall genome. CRISPR-Cas9 can completely inactivate genes with high (though imperfect) specificity through the introduction of indel mutations and is capable of interrogating non-coding regions. These advantages make it highly suitable for loss-of-function screens, which can potentially help us understand how the genome works.

To aid in the interrogation of non-coding sequences, another method of genetic mapping has been developed. CRISPR-Display (CRISPR-Disp) uses dCas9 as a programmable DNA binder to either investigate or repurpose the non-coding regions of the genome.
Usually a gRNA molecule consists of 20 nucleotides necessary for target selection, followed by a loop of approximately 80 nucleotides that dCas9 uses to recognise the molecule before complexing with it. However, research has shown that the DNA-targeting sequence can be part of a much larger sequence without impacting dCas9 binding, provided the recognition loop remains present. This freedom allows for the incorporation of non-coding RNA (ncRNA) sequences into the gRNA, which can then be delivered to specific genomic loci through the specificity of the gRNA-dCas9 complex.
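The gRNA layout described above (a 20-base targeting sequence, a recognition loop of roughly 80 bases, and optional ncRNA cargo) can be sketched as a simple string-assembly check in Python. This is a toy illustration under stated assumptions: the function name is hypothetical, the scaffold is a placeholder of 80 `N` characters rather than a real sequence, and appending the cargo after the loop is an arbitrary choice made only for the example, not a statement about where insertions are placed in practice.

```python
RNA_ALPHABET = set("ACGUN")  # "N" is used here only as a placeholder base
TARGET_LENGTH = 20           # length of the targeting sequence given in the text


def build_display_grna(target: str, scaffold: str, cargo: str = "") -> str:
    """Assemble target + recognition loop + optional ncRNA cargo.

    Hypothetical helper for illustration; it only checks the constraints
    stated in the text and does not model folding or binding.
    """
    for name, seq in (("target", target), ("scaffold", scaffold), ("cargo", cargo)):
        if not set(seq) <= RNA_ALPHABET:
            raise ValueError(f"{name} contains non-RNA characters")
    if len(target) != TARGET_LENGTH:
        raise ValueError("targeting sequence must be 20 bases long")
    if not scaffold:
        raise ValueError("the recognition loop must remain present")
    return target + scaffold + cargo


# Placeholder sequences, invented for this example.
placeholder_scaffold = "N" * 80
target = "GACU" * 5      # 20 bases
cargo = "AUGC" * 25      # 100 bases of stand-in ncRNA

grna = build_display_grna(target, placeholder_scaffold, cargo)
print(len(grna))
```

The point of the sketch is the invariant rather than the sequences: the targeting region and the recognition loop are both kept intact, while the overall molecule is allowed to grow well beyond the usual length.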


Delivering ncRNA to specific genomic loci in this way can be used to elucidate the action of poorly understood ncRNA (through observation of the effects created by the ncRNA-DNA interactions), as well as having possibilities within synthetic biology applications.

REVERSIBLE GENE ACTIVATION AND REPRESSION

The ability of CRISPR to directly manipulate gene expression has been touched upon a few times already, but the purpose and the process involved haven't been directly explained. In these experiments, a version of dCas9 that has been modified with distinct fusion proteins is used as a transcriptional or epigenetic modifier; the dCas9 will not be able to generate double-stranded DNA breaks, so the modifications are transient and will not be passed through multiple generations. In a research context, this allows for short-term study of the action of specific genes without permanently altering your cell line's genome. It also carries interest for clinical application because of the reduced risk involved in transient manipulations (there is no chance of long-term damage or of the modifications persisting in subsequent generations). Epigenetic manipulation with CRISPR is divided into two main classes: CRISPR activation (CRISPRa) and CRISPR repression/interference (CRISPRi).

CRISPRA

CRISPRa primarily involves the fusion of a transcriptional activation domain to dCas9, as well as sgRNAs targeted to the transcription start sites of the gene you intend to activate. Further modification of CRISPRa can be used to target both the regulatory regions of the genome and the target gene itself, providing a wide scope of action that allows for different applications. CRISPRa can be useful in studying gene expression and interactions between different sections of the genome, but it still hasn't been as widely used as its counterpart, CRISPRi.
Current research is mostly limited to work with bacterial cells, and initial experiments with mammalian cells have shown that multiple sgRNAs are necessary to induce significant activation, with more work needed to establish a better system. Recent work using a Synergistic Activation Mediator (SAM) system has displayed potential for improvement, but this is still relatively novel and hasn't been expanded into widespread use yet.

CRISPRI

Like CRISPRa, CRISPRi uses dCas9 which, in this circumstance, has been fused to the KRAB transcriptional repressor domain, as well as sgRNAs targeted to the sequence downstream of the target transcription start site. Targeted gene suppression can also be achieved with dCas9 alone in bacterial contexts, although this has not been effective in mammalian cells. A similar system exists naturally in bacterial cells, which lack the required machinery to perform the mammalian interference system, RNAi. The mechanism of action requires a complex of sgRNA and Cas9 to impact transcription elongation through the blocking of RNA polymerase, something which is achieved with very high efficiency in bacterial cells. Bacteria also display an alternative pathway that prevents transcription initiation by disrupting the binding of transcription factors at the targeted gene.

CRISPRi has several advantages over the mammalian RNAi, including a lower cytotoxicity due to better specificity and less time-dependent genetic drift (changes in the frequency of a gene variant within a population). It also has the potential to affect the non-coding sections of DNA because it acts at the genetic level instead of through the mRNA targeting that RNAi relies on. CRISPRi also displays better suppression of the target genes than the partial suppression of RNAi, but the system still displays problems.
The main drawback of this interference system is that it can have knock-on effects downstream of the target when multiple genes exist in an operon (multiple genes using a single promoter). Because binding at the promoter is disrupted, all genes within the operon downstream of the target will be silenced as well, which can invalidate the results of an experiment involving CRISPRi or cause unintended changes to the cell's epigenome.

Despite being highly efficient in bacteria, CRISPRi has not managed to retain this success when translated into mammalian cells. Initial results from mammalian experiments without a repressor achieved only moderate repression, though more recent investigations have found that combining the Cas9 protein with the KRAB repressor, as mentioned earlier, can increase the efficiency of the reaction. Even with a repressor present, the system has still proven to be less efficient than its bacterial counterpart; more work is necessary to develop this method of gene repression. Despite the problems displayed, the specificity and adaptability of Cas9, as well as the reduced risk from the lack of DNA strand breaks, make it a highly attractive option for further investigations.

Other studies have shown that by using a combination of sgRNAs, activation and repression can be achieved simultaneously within the same cell, opening up a variety of potential applications and investigations. These combinations involve the use of scaffold RNA (scRNA), which encodes the information required for DNA recognition and for recruitment of a specific transcription factor. Again, this work is still in its early stages, but it has the potential to become a focal point in the manipulation of epigenetic factors.

SUMMARY

This chapter offers an overview of some of the potential uses of gene editing and the major areas of research currently ongoing, but it should not be considered a comprehensive list.
Gene editing, and CRISPR-Cas9 in particular, has great potential to revolutionise biotechnology and medicine, but there is still a long way to go before genetic engineering can be considered a clinical standard. With this technology still in its infancy, there are many questions regarding the safety of these treatments and techniques, as well as what the long-term effects of altering the genome might be. Beyond the immediate technical complications with efficacy and off-target effects, problems also arise when the science becomes mixed up with ethical and legal discussions. Public perception and legal restrictions will have a major impact on how this technology progresses. These considerations have long been involved in genetic engineering and it is important to remember them as part of any study or investigation.

CHAPTER 4: ETHICAL, LEGAL AND SOCIAL IMPLICATIONS

ELSI

INTRODUCTION

The idea of humans altering the nature of life has been a contentious issue ever since Mary Shelley's novel Frankenstein was published in 1818, and while the exact content of the argument has shifted over the last two centuries, the core principles remain the same. The potential benefits of using gene editing to advance treatment of fatal diseases have been generally uncontested, but there is substantially more opposition to the idea of permanently changing the fabric of our DNA or furthering humanity beyond our natural limits. Whether or not this is even possible is as yet undecided; German-born philosopher Hans Jonas, for example, argued that human nature is fixed at its present level. On the other side of the argument, cytologist and biochemist Christian de Duve pointed out that we have already risen from the mental limits of our genetic ancestors to not only understand evolution and natural selection, but to begin to manipulate the underlying theories. De Duve furthered his argument by reasoning that by ignoring our ability to manipulate evolution, we would be choosing to side with the random changes brought about by natural selection instead of relying on reasoned, planned adaptation through genetic engineering. With techniques like CRISPR and TALENs still so novel and only partly understood, society as a whole is unable to agree on what the future ramifications might be and whether or not the benefits outweigh the risks. Until such a time as a general agreement can be reached, the debate will continue.

As with any topic involving a complicated ethical position, gene editing has been under heavy scrutiny by ethicists and lawmakers alike throughout the world. With science advancing at its current rate, new technologies are often being developed before the legal systems are ready for them, and this has led to a complicated web of laws with little uniformity between different countries.
With so much about gene editing being uncertain, it is an understandably difficult area to approach for research. This chapter is meant to outline the main ethical concerns present in the modern environment and what the public perception of these issues has been, as well as to provide a rough guide to the international laws governing gene editing.

PUBLIC PERCEPTION AND ETHICAL CONCERNS

Genetic engineering has always been a highly controversial topic amongst both scientists and the general population. Opinions on the topic can vary by age, gender, religious preference, personal wealth, and a whole host of other factors, making it impossible to present a unified, decisive public opinion. These very mixed opinions can partially be due to an unclear distinction between what can be considered research and what could potentially lead into clinical applications of genetic editing, as well as the at-times overcomplicated legal system that confuses what research is considered acceptable by law. This distinction can be further confused by the media's focus on sensational news stories, which in some cases lack the necessary understanding to present the facts in a balanced, unbiased light. In the future, there may be cases in which ethicists and lawmakers need to weigh the relative merits of public opinion against clinical or laboratory research and decide which argument has greater standing.

There are a few ways of examining the public perception of gene editing, but one of the easiest is to look at public surveys. These offer a very black and white view of how a person considers genomic research and therefore don't allow for any misunderstanding of opinion. A more wide-reaching but potentially uncertain method is to examine the general reaction to recent news in the field, something which has more of a basis in real-world applications than surveys do and therefore presents a more realistic overview of public perception.
This chapter will discuss both methods to try to present a unified stance, as well as investigating the opinions of the science community in relation to gene editing for research and clinical applications.
PUBLIC OPINION SURVEYS
To try to build a broadly accurate assessment of public opinion towards the use of genetic engineering for future clinical use, several different surveys have been completed by a variety of companies. All of these surveys suffer significant drawbacks due to either a narrow sample group or a lack of information about the knowledge level of the participants (as an indication of how well the questions were understood), so it's not possible to draw any clear conclusions from them. However, they offer a suitable starting point when considering the general attitude of the public.
Gene Editing 101 / 39

Most surveys establish some of the same basic things: in all cases, lifesaving gene editing is significantly preferred to non-health-related editing (to improve characteristics such as appearance or intelligence), and in general, stronger religious feelings increase mistrust of gene editing. There is also an indication in several surveys that gene editing is less attractive when presented as a method of improving a human beyond current human limits instead of just restoring them to their natural potential (e.g. helping a disabled person with spinal problems to walk is more acceptable than using the technology to allow a human to run faster than current world records). Many surveys display a public wariness of as yet untested technology, and there appears to be a fear that these techniques will be used in clinics before how they work or what the ramifications might be are fully understood.
Several surveys, primarily American, also indicate a worry that genetic editing will exacerbate the divide between rich and poor (those who can afford to buy modifications and those who cannot) and between those with mutations and those without, leading to class conflict. This also bears on the stigmatisation of people living with genetic diseases, who could potentially be considered lesser for not having undergone gene editing. In a similar vein, there has been significant protest against the implication that some non-fatal disorders need to be corrected (thereby indicating that a person without these disorders is the default state) and that the value of a person with a genetic disorder is less than that of a healthy human being. It is worth noting that this problem is not new, nor is it limited solely to gene editing.
A recent example within public health initiatives is the rise in smoking bans in several countries, which has encouraged a societal view that smoking is shameful or to be avoided (this is, of course, not everyone's opinion, but merely that of select individuals). When only considering public healthcare spending, it is obviously a benefit to have a healthier population which requires less treatment, but this ideology doesn't take into account the variations and nuances in society and a person's right to autonomy.
Another observation was the disparity between men and women. The surveys indicated a more positive approach to gene editing among males, with female participants displaying more caution about the techniques. It is unclear whether this is related to the higher risk-aversion generally observed in women in statistical analyses or to the potential involvement a woman would have with a genetically altered embryo during pregnancy. The final point these surveys highlight is a noticeable divide between younger and older generations, with younger people more readily accepting of the new technology; this has generally been the case in all areas of technological development. However, this is not to say there aren't hesitations within the younger generation about potential clinical applications, with the general feeling remaining more hesitant than positive.
The surveys almost unanimously conclude that while there is some positivity in relation to gene editing, the overall wariness outweighs the potential acceptance of clinical applications. However, as mentioned before, these questions are asked in a very black and white environment, often lacking context that might affect how people answer. Alone, these surveys offer no firm input to the question of whether or not gene editing is something we should pursue as a society.
In some cases, monitoring the perception of real-world events can offer a more realistic view of the matter, especially when gene editing has so frequently been in the news.
RECENT DEVELOPMENTS
The last few years have been littered with significant advancements in genetic research, partly due to newer techniques like CRISPR but also thanks to a wider, growing understanding of the human genome. The advent of technologies like social media and smartphones has meant that news of gene editing has been delivered to a large percentage of the population very rapidly, and has provided a platform for anyone to voice their opinions. These advancements have brought many concepts into the public eye in a way that hasn't really been seen before and allowed scientists to assess the population's position on certain topics.
The most recent high-profile example of genetic engineering was the birth of a three-parent baby, a boy who possessed the usual DNA mix from his mother and father but also the mitochondria of a donor (providing 0.1% of the child's DNA). This donation was performed in an attempt to avoid the child inheriting Leigh syndrome from the mother. Leigh syndrome is a severe neurological disorder that presents in infants through the progressive loss of movement and mental abilities, commonly resulting in death by respiratory failure within three years. The family in question had previously tried to have children naturally but had suffered the death of two children (aged eight months and six years) along with four miscarriages, leading them to believe that they had no choice but to seek help.
There have been mixed responses to this news. The initial negative reactions were not helped by the assertion of the lead scientist that the technique was carried out in Mexico due to a lack of regulation there compared to the USA.
In truth, three-parent babies are only theoretically legal in the UK, and even then not by the novel method used in this case (the legalised method in the UK involves some embryos being discarded, which was not acceptable to the religious parents). Now that the procedure has been performed with apparent success, some people are starting to ask whether the regulations should be relaxed to allow this treatment for others in similar situations. One of the main objections has been that the two previous three-person babies born in the 1990s (using a different technique) both developed genetic disorders, the main catalyst for the banning of the practice in the USA.
Others have seen this development as a major step forwards and an example of what science can achieve in the future. The family's history has led some people, including the medical team responsible, to argue that preventing the boy from suffering Leigh syndrome was the ethical thing to do. This opens the question of whether or not, in this particular instance, the end justifies the means; preventing Leigh syndrome is unquestionably a good thing when the chances of the child's survival were so low. The real factor to consider is whether or not the procedure of creating a three-person baby is, in itself, unethical.
Beyond potential safety considerations, a primary concern that has been raised is the matter of consent. The current climate is filled with conflicts over what can be covered under a person's right to autonomy, all the way from the legality of abortion to rape convictions. Several people have pointed out that while the parents

agreed to this treatment, the child, now a healthy 11-month-old, did not. The response to this is largely to point to neonatal care. When infants require healthcare, it is up to the parents and the healthcare professionals to act in the best interests of the child, who is obviously unable to make decisions about their care. This is something that occurs in hospitals around the world every day, and many would argue that this particular example of gene editing is no different. The three-parent child is obviously much too young to express any opinion on the matter and it is likely that until he is old enough to do so, public opinion will remain divided.
While this event hasn't provided any decisive consensus, it certainly displays the public uncertainty when it comes to alterations to embryos, even when there isn't a possibility of the modification being passed on to future generations (the inserted DNA is housed solely within the mitochondria, which are not passed on from fathers). It may be an interesting area to observe, but this applies solely to the clinical application of gene editing and doesn't indicate the public opinion on research. For that, we can look towards the recently announced experiments in the UK, Sweden, and China, which involve using human embryos to research foetal development and the potential removal of a gene linked to blood disease.
Both the Karolinska Institutet in Sweden and the Francis Crick Institute in the UK intended to study early development in embryos to further our understanding of a process that is still relatively unknown. The Human Fertilisation and Embryology Authority (HFEA) approved the research in the UK with the stipulations that the embryos could not subsequently be inserted into a woman and that the study could only last for 14 days of embryo development (in accordance with the international agreement and assured by the natural limit an embryo can exist outside a host).
The study involved using CRISPR-Cas9 to systematically disable genes to observe how their absence affected foetal development. The work at the Karolinska Institutet involved observing embryo development on a molecular level for the first week of life as well as using CRISPR to knock out genes tied to early development. The data they obtained established that mouse models, the most commonly used model for foetal studies, were very poor analogues for human development, with considerable differences between the two systems. The main inconsistency was found to be the regulation of genes on the X chromosome to prevent over-expression when two copies of the chromosome were present.
In both cases the work was completed with no direct intent for clinical application. Early embryonic development has been significantly under-researched, primarily due to moral and legal restrictions, and yet it is a vital part of the development of a foetus. Over 80% of miscarriages occur in the first 12 weeks of pregnancy and in many cases it isn't clearly understood why the miscarriage occurred; further research could help lower the rate of miscarriages significantly.
Despite the seemingly beneficial studies and generally positive response, there was still some backlash. Several opponents of gene editing technology argued that editing in embryos like this was a slippery slope towards unnecessary (non-health-related) edits and the ever-present concerns about designer babies. There were also concerns that no matter how innocent the initial investigations might be, the knowledge gained in these studies could be adapted to future germline engineering by other groups regardless of the original intent. However, despite the negative feeling from some groups, there was a surprising amount of support, particularly in the UK.
Many saw the studies as important scientific research, and small concerns were assuaged by the fact that the research would be monitored by the HFEA to ensure that there was no compromise on morals. This does raise some question as to what exactly constitutes moral research, as there is no firm understanding of what we, as people, have unanimously agreed to be immoral or unethical. Nonetheless, this support implies a certain level of public acceptance towards genetic research using tools like CRISPR-Cas9, provided that the study has obtained the correct approvals and is monitored throughout. The openness of the Francis Crick Institute about the entire process was notably appreciated by many.

On the other end of the spectrum is the Chinese research group that only admitted to using CRISPR in human embryos after their research was completed, prior to the work done in the UK and Sweden. The study was intended to try to correct the mutation that causes β-thalassaemia (discussed in the previous chapter) but only reported relatively low success rates. The team managed to avoid significant disapproval through their assurances that all embryos used in the study were non-viable embryos from IVF treatments, but there were still several groups that questioned the ethics of their experiment. This disapproval was only strengthened when the team admitted that they had stopped their experiment on realising that the techniques were still too imperfect to be viable. Many saw this sudden, unexpected advancement as the sign needed to establish a temporary ban on gene editing until everyone had agreed on an overall goal and direction, to prevent morally and legally ambiguous practices. However, until different countries manage to establish a universal set of legal guidelines, it is unlikely that any such ban will have a noticeable effect.
Another high-profile case currently in international awareness is the matter of sex selection for embryos in Australia. Between 1999 and 2004, the Australian state of New South Wales allowed scientifically administered sex selection on potential children resulting from IVF. The practice was stopped after the National Health and Medical Research Council of Australia (NHMRC) strongly advised that it be discontinued due to ethical objections.
Since then, the technique has been universally banned in Australia, with the exception of cases involving the passing on of serious genetic conditions to children of a certain gender (such as DMD, which predominantly affects male offspring). Since the ban, fertility specialists who work in IVF have reported an increasing number of Australian couples travelling abroad to countries that allow sex selection, frequently nearby Thailand. This trend implies a significant desire for the practice to be available. Travelling abroad can in some cases cost up to five times as much as the usual price of a round of IVF treatment, making it unobtainable for many couples and raising concerns of an inequality divide, as well as calling into question the safety of foreign clinics. Several countries in Asia, such as Thailand, as well as the USA, allow sex selection during the IVF process, with some celebrities openly discussing their experiences in choosing the gender of their child.
The concerns raised and the international inequality in sex selection have led to proposals for legal reform in Australia. A high-profile proposed amendment would allow couples to choose the gender of their third child, leaving the first and second children up to chance in an attempt to avoid creating a gender imbalance. It's thought that it will be an option primarily used by parents with two children of the same sex who wish for a third child of the opposite sex. These reforms also include potential changes to financial incentives for egg donation and the time period during which frozen eggs can be stored.
The suggested changes have been met with generally positive responses from fertility clinicians, who have spoken about patients desperate for a boy or a girl. Responses from the general public have been more mixed but lean towards a positive outlook. Surveys have estimated that as many as 50% of Australian couples would like to choose the gender of their children, with a slightly higher percentage saying that they weren't against the practice even if they might not use it themselves. The overall positive response implies an openness towards the idea of selecting certain characteristics in someone's future children, and while this is a long way from specifically editing an embryo with gene editing tools, it might indicate that there is a potential for acceptance of the practice in future.
SCIENTIFIC CONSENSUS
It is not purely among the general population that opinions on gene editing are divided. International summits organised by the Hinxton Group and the National Academy of Sciences (NAS) in 2015 (September and December respectively) both discussed the ethical and scientific implications of gene editing in the context of research and as a potential clinical treatment.
The Hinxton Group report displayed a general consensus that gene editing as a form of research was not only acceptable, but held tremendous value in helping to draw conclusions relating to biology. They therefore believed that moral concerns about genetic editing as a whole should not be used as a hindrance to experiments purely intended for research. They also discussed the use of non-viable embryos for investigations of embryonic biology and development, concluding that while it solved the ethical objection of interfering with a potential human life, the validity of data collected from a non-viable embryo was questionable. Further, they found that a consensus could not be reached on what research could be considered acceptable when taking into account the small number of human embryos currently available for testing. Legal restrictions and social objections have meant that only a very small number of human embryos, commonly from rejected IVF pools (both viable and non-viable), are available to groups wishing to study embryonic development. Should studies of this nature become more widespread, it might be necessary to establish a priority system to determine which research is deemed the most appropriate or advantageous.
Overall, the Hinxton Group concluded that clinical applications for gene editing should not be completely ruled out in the future but that at present, the techniques were too primitive and too poorly understood to be considered. They also advised that governments should base their laws in this area on safety concerns, rather than merely on moral objection.
The National Academy of Sciences' report was slightly more mixed on certain topics. The report features several opinions on germline editing, with the majority indicating that it should be prohibited permanently in relation to clinical applications, and one scientist proposing a two-year ban on basic research until the legal system had formed an international blanket ban on germline editing. This is directly contrary to the Hinxton Group summit, which held that research should not be held back by concerns over germline engineering.
It can also be argued that until we have a more thorough understanding of gene editing, we cannot form a set legal precedent, as it will not be clear exactly what we are declaring to be unlawful, though many would prefer to side with caution in this regard nonetheless.
The NAS report went on to discuss the similarities between germline engineering for the sake of removing disabilities and the eugenics movement of the 20th century. There was a fear that removal of select disabilities from the germline would increase the stigma that accompanies the disability or disease in question, widening the social disparity that is already a contentious issue in the modern world. It was reasoned that many of the same pressures that led to eugenics still exist today and that increasing polarity among the population could allow a similar movement to appear.