1. Purification, amplification and sequence analysis of DNA (see Molecular Biology of the Gene, Watson et al.)
2. Recombinant DNA technology


Recombinant DNA technology refers to a collection of methods used to manipulate the sequence of genes and proteins. Manipulating the regulatory sequences of genes provides experimental control over where and when a transgene is expressed (see 3.7), while changing coding sequences alters the structure and function of protein molecules.
The coding sequence of genes is routinely altered for a variety of experimental purposes. Site-directed mutagenesis, which changes amino acids at specific positions, can be used to study their contribution to the protein's function. For example, mutating serine or threonine residues to a non-polar amino acid such as alanine can be used to test whether they are targets of specific protein kinases. Deletions and insertions of short (often randomly selected) amino acid sequences can be introduced to identify functional domains (and for other purposes). Finally, different proteins or functional domains can be fused into a single protein molecule (fusion protein) to study the function of specific domains of proteins of interest, to tag proteins for visualization (e.g., GFP) or to alter subcellular localization.
In addition, transgenes often include extra sequences that increase the expression of the gene or allow more than one protein molecule to be expressed under the control of the same promoter (e.g., a fluorescent protein together with another protein). To accomplish these goals, so-called fusion genes are created in which the appropriate DNA sequences are cloned into the same plasmid. In general, because of the large number of restriction enzymes (RE) available, restriction cloning can be used to create these fusion genes.
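The serine/threonine-to-alanine substitution mentioned above can be sketched at the codon level. The following is a minimal illustration, not a cloning protocol: the function name and the short example sequence are hypothetical, and the sketch only shows that a same-length codon swap leaves the downstream reading frame untouched.

```python
# Illustrative sketch only: the function name and example sequence are hypothetical.

SER_THR_CODONS = {
    "TCT", "TCC", "TCA", "TCG", "AGT", "AGC",  # serine
    "ACT", "ACC", "ACA", "ACG",                # threonine
}


def mutate_to_alanine(cds: str, residue: int) -> str:
    """Replace the Ser/Thr codon at 1-based position `residue` with an Ala codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if residue < 1:
        raise ValueError("residue numbering starts at 1")
    start = 3 * (residue - 1)
    codon = cds[start:start + 3]
    if codon not in SER_THR_CODONS:
        raise ValueError(f"residue {residue} is encoded by {codon!r}, not Ser/Thr")
    # Alanine is encoded by GCN for any third base, so the third base is kept
    # to limit the number of changed nucleotides.
    ala_codon = "GC" + codon[2]
    # Same-length substitution: all downstream codons keep their reading frame.
    return cds[:start] + ala_codon + cds[start + 3:]


wild_type = "ATGGCTTCTAAAACCTGA"  # Met-Ala-Ser-Lys-Thr-stop (made-up example)
print(mutate_to_alanine(wild_type, 3))  # Ser3 -> Ala
print(mutate_to_alanine(wild_type, 5))  # Thr5 -> Ala
```

In practice the mutated codon would be built into a PCR primer; the sketch covers only the sequence bookkeeping.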
In some cases, small synthetic oligonucleotides can be used to create appropriate RE sites or linkers that bring DNA sequences together when perfect sequence specificity and precise relative positions are required (e.g., to maintain correct reading frames). In addition, PCR with mutated primers is useful for site-directed mutagenesis, deletions and insertions.
It is far more difficult to create recombinant sequences with these methods when large DNA sequences (> ~10-50 kb) are involved, in part because of the increasing difficulty of finding appropriate RE sites. To overcome this problem, methods have been developed in which, instead of in vitro enzymatic reactions, recombinant DNA sequences are created by introducing the appropriate DNA fragments into recombination-competent bacterial strains, whose homologous recombination mechanisms carry out the required manipulation of the introduced DNA molecules. These techniques have been particularly important for manipulating bacterial artificial chromosomes (BACs) and creating BAC transgenics.
3. Delivery and regulation of expression of transgenes
Ideally, the goal is controlled and cell type specific expression of transgenes. Specifically, we want to target transgenes to any cell or group of cells of interest and to control the timing, quantity and conditions of expression experimentally, with methods that are sufficiently time efficient and inexpensive.
3.1 mRNA microinjection
RNA can be directly injected into Xenopus oocytes because of the large size of these cells. Expression is strong and whole-cell recordings can be obtained from the cells. The method has been used extensively for studying ion channels because of the absence of endogenous voltage-gated currents. RNA injection is usually not possible in neurons (but whole-cell recording may allow transfer of plasmids or RNA).
3.2 Gene gun
Gold microspheres or particles (1 µm or so) can be coated with one or more DNA plasmids. The method is currently limited to DNA. Each particle carries many plasmids, so that usually all the different types of plasmids in a mixture are represented on each particle. A high-velocity air stream blows the particles into cells in slice culture preparations.
3.4 In vivo electroporation
Plasmids are injected into the SVZ or the ventricles and a short electrical pulse ("shock") is applied between electrode plates. The electrical field causes transient dielectric breakdown of cell membranes, and the plasmids are moved into the cells as negative charge carriers. The method is mostly applicable to embryonic brains in utero and to some other models (e.g., the tadpole of Xenopus laevis).
3.5 In vivo juxtacellular labelling with plasmids
Using standard glass micropipettes pulled to a tip size of approximately 0.5 micrometer, action potentials from single neurons can be recorded extracellularly. After recording, a train of negative voltage pulses is applied, usually from a separate electrical source (not the amplifier), to electroporate the plasmid DNA dissolved in the recording solution. The pipette solution is usually an NaCl solution containing approximately 1 microgram / microliter endotoxin-free plasmid DNA. The pipette needs to be in close proximity to the neuron, which may be judged at the end of the recording by monitoring the pipette resistance while moving the pipette closer to the neuron until the resistance suddenly increases. Several days are required for gene expression after the recording and labelling experiment. Multiple plasmids can be transferred simultaneously.
3.5 Viral vectors
Viruses are by nature gene delivery vectors. Biologically very heterogeneous recombinant viruses have been used successfully as gene delivery vectors.
The most important properties to consider when selecting a viral vector include (i) the maximum transgene capacity (maximum length of the insert carried by the virus), (ii) the presence and severity of immune response and/or cytotoxicity, (iii) tropism, (iv) long-term stability of gene expression and incorporation into the genome, and (v) axonal transport and diffusion from the site of injection.
Viral vectors are rendered replication or propagation incompetent by deletion of the appropriate viral genes, and transgenes are incorporated into the viral genome using recombinant technology. Viral particles are grown in producer cells (such as human embryonic kidney cells) using trans-complementation. Trans-complementation involves transfecting the producer cells with multiple plasmids, so that select genes necessary for viral replication that have been deleted from the viral genome are supplied on separate plasmids and therefore do not get incorporated into the viral particles leaving the producer cells. With this strategy, infectious virions are assembled in the producer cells, but these viruses cannot propagate further after infecting the target neurons.
The tropism of viruses can be altered by replacing the envelope proteins involved in targeting cell surface receptors with proteins obtained from viruses exhibiting more desirable tropism. This is called pseudotyping.
Some properties of additional vectors:
Virus vector           Transgene   Retrograde   Integration   Production
AAV (1-11+)            4.7 kb      Yes (some)   (-)           commercial
Lentivirus             9 kb        No           (+)           in house
Recombinant rabies     3.7 kb      Yes          (-)           commercial
HSV amplicons          >100 kb     Primarily    (-)           difficult
Retroviruses (HTLV)
The most commonly used vector is AAV. A very large variety of AAV-packaged transgenes is available at low cost as commercial stocks from the U. North Carolina Gene Therapy Vector Core, the U. Pennsylvania Vector Core, Vector Biolabs and others.
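Two of the quantities discussed in this section lend themselves to quick arithmetic checks: whether an insert fits a vector's transgene capacity, and how many functional particles an injection delivers. The sketch below is a rough illustration under stated assumptions. The capacities are copied from the table above (the HSV amplicon value is only a lower bound), the 0.1% functional fraction and the 10^12 particles / ml titer are the typical AAV figures quoted in this section, and the 1 microliter volume and all function names are purely hypothetical.

```python
# Transgene capacities in kb, taken from the table in this section.
# The HSV amplicon entry is a lower bound (">100 kb"). Names are illustrative.
CAPACITY_KB = {
    "AAV": 4.7,
    "Lentivirus": 9.0,
    "Recombinant rabies": 3.7,
    "HSV amplicon": 100.0,
}


def vectors_that_fit(insert_kb: float) -> list[str]:
    """Return, alphabetically, the vectors whose capacity covers the insert."""
    return sorted(name for name, cap in CAPACITY_KB.items() if insert_kb <= cap)


def functional_particles(titer_per_ml: float, volume_ul: float,
                         functional_fraction: float = 0.001) -> float:
    """Estimate functional particles delivered by one injection.

    functional_fraction defaults to 0.1%, the figure quoted for AAV stocks.
    """
    volume_ml = volume_ul / 1000.0
    return titer_per_ml * volume_ml * functional_fraction


print(vectors_that_fit(5.0))             # a 5 kb insert is too large for AAV
print(functional_particles(1e12, 1.0))   # hypothetical 1 microliter injection
```

Because titers vary widely between stocks, the same injection volume can differ tenfold or more in delivered particles, which is why the titer of each stock should be checked.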
The promoters of choice include CamKII (targeting many spine-bearing neurons, including cortical and hippocampal pyramidal neurons, striatal medium spiny cells etc.), the human synapsin promoter (neurons, but variable expression level), EF1a (neurons and glia) and CAG (neurons and glia).
It is important to check the viral titer of the stock obtained, since this can vary widely between stocks, so that the same injection volume delivers different numbers of virus particles. Usually 10^12 to 10^13 viral particles / ml are provided for AAV. (Only about 0.1% of the particles are functional.) The amount of injected virus and the virus serotype need to be tested, or obtained from published methods, to match the experimental needs. Usually microliter volumes are injected over minutes. AAV1 and AAV9 provide large infection areas, but injection of too many virus particles can be toxic to neurons. A toxic level of ChR2 expression is indicated by bright intracellular puncta in neurons. The toxicity can have a very strong adverse effect on the neurons' firing properties (sometimes silencing the neurons without killing them)! Similar toxicity can occur with other serotypes too. The toxicity can be avoided by reducing the injection volume and the expression time. AAV5 provides an intermediate infection area, while AAV2 transduces small areas.
3.6 Transgenic animals
Transgenic technology refers to the modification of select genes within the genome of the organism. Transgenes can be inserted into the host genome randomly or at specific locations. Genes of interest can also be removed or rendered non-functional in the genome (see 4.1). In the simplest form, the recombinant DNA is injected into the pronucleus of the fertilized egg, where it is incorporated at one or more random locations in the genome with relatively high efficiency. The transgenic animal is then screened for germline transformation (i.e., the presence of the transgene in the genome of the gametes), and successfully transformed animals can be used to establish breeding colonies.
Transgenes can also be targeted to a selected locus using homologous recombination. In this case embryonic stem cells are microinjected with the DNA vector carrying the recombinant sequence. The vector is engineered to allow both positive and negative selection of the appropriately transformed cells. Positive selection is achieved by incorporating a neomycin resistance gene into the recombinant DNA within the region of homology in the transgene, so that in the presence of the antibiotic only the cells with successful DNA integration survive. Negative selection, which retains only those stem cells in which the integration occurred in a site-specific manner (and not randomly), is obtained by incorporating a gene encoding the enzyme thymidine kinase (TK) outside of the region of homology. Since at this location the TK gene can only be integrated into the host genome through non-homologous recombination, cells in which random integration occurred can be selected against by culturing the cells in the presence of ganciclovir, which is converted into a cellular toxin by TK. During site-specific integration the targeted sequence is usually a gene; this manipulation is therefore aimed at mutating the targeted gene or replacing it with a recombinant transgene. The most important application of this technology has been the deletion or knockout of target genes (see 4.1).
3.7 Temporal, cell-type specific and other conditional regulation of transgene expression
Most experiments require cell type specificity and experimentally controlled timing of the expression of the introduced transgenes.
Specifically, methods have been developed which allow the induction of transgene expression: (i) in specific cell types as defined by neurochemical or genetic profiles, anatomical properties (brain area, projection targets) or developmental history; (ii) during specific periods controlled by exogenously applied drugs acting as inducers (or suppressors); and (iii) in response to a variety of biological conditions, the occurrence of which is linked to the control of transgene expression through additional recombinant mechanisms introduced into the same organism.
(i) Cell type specificity of transgene expression is primarily achieved through transcriptional control using an appropriate promoter. Consequently, successful targeting to a specific cell type depends on whether the targeted cell population can be identified by the expression of a specific gene. For example, functionally specialized classes of GABAergic interneurons can be identified by their usually non-overlapping expression of calcium binding proteins and neuropeptides, and the promoters of these genes can therefore be used to control the expression of transgenes in transgenic animals. Similarly, neurons expressing a particular neurotransmitter can be targeted using the promoters of synthetic enzymes (e.g., TH, GAD67) or transporters (e.g., DAT, VGLUT) specific for such cell types. Promoters of region-specific genes (often developmentally expressed genes such as homeobox genes) can be used for targeting whole brain areas (such as the dentate gyrus, the cerebellum or the striatum).
In practice, the use of promoters is limited by several problems. First, an appropriate gene that would identify the targeted neuron population may not be known. Second, the promoter must drive sufficiently strong expression of the transgene, which depends both on the strength of the endogenous promoter at its normal locus and on positional factors that can significantly alter the efficacy of the same promoter at different chromosomal locations. Third, eukaryotic regulatory regions can be very large, extending ~100 kb or more upstream and including downstream and intronic sequences; precise reproduction of the expression of the cell type defining gene therefore requires the engineering and delivery of very large recombinant DNA molecules. In general, these cannot be incorporated into viral vectors or plasmids, which excludes the use of these delivery systems. This limitation can sometimes be overcome by identifying minimal promoters, which are short segments (a few kb long, usually 5') of the regulatory regions that still provide the required level of cell type specificity. Such promoters can, at least in theory, be used in viral vectors as well.
Alternatively, large DNA molecules containing select genes and extensive up- and downstream sequences are available as bacterial artificial chromosomes (BACs), which were created as genomic libraries for cloning the mouse, human and other mammalian genomes. Using recombinant technologies designed for the manipulation of large DNA molecules (e.g., homologous recombination in bacteria), a transgene can be inserted in place of the coding sequence of the original gene within the BAC with great precision, so that after the BAC is incorporated into the host genome the expression of the transgene closely matches the expression of the original gene. (This procedure does not replace the original gene in the host genome, which is left unaltered; the BAC integrates at a random site.) These methods have been used by the GENSAT (Gene Expression Nervous System Atlas) project to create transgenic mice in which enhanced green fluorescent protein (EGFP) is expressed as a reporter under the promoters of genes of interest in neurobiology, with the long-term goal of creating a gene expression atlas for the nervous system. Currently several hundred strains of BAC transgenic mice are available.
(ii) Temporal control of transgene expression has been most successfully achieved with the tetracycline transactivator and reverse transactivator systems. Regulated control of gene expression is based on a bacterial gene regulatory mechanism, the tetracycline repressor (TetR). TetR is a repressor protein which in bacterial cells binds strongly to a specific region called tetO in the promoter of the tetracycline resistance gene (which encodes the protein conferring resistance to the antibiotic tetracycline). When tetracycline is present it binds with high affinity to TetR, which in response is released from the tetO site, relieving repression of the gene and resulting in expression. The basic idea is to use the tetO site as a promoter to control introduced transgenes.
To allow direct activation of transcription in eukaryotic cells, TetR, which functions as a repressor in bacteria, had to be modified to make it an activator of transcription. This was done by fusing it with a transactivator domain of a herpes simplex virus protein (VP16), which is capable of inducing transcription in mammalian cells. The TetR/VP16 fusion protein is the tetracycline transactivator (tTA). The inclusion of VP16 also required the appropriate modification of the DNA promoter sequence (tetO) to include a sequence allowing DNA binding by VP16.
To control the expression of a transgene in a new transgenic animal, two separate transgenic lines are first created and subsequently crossed to obtain a double transgenic. The first line expresses the gene encoding tTA under a promoter that can be selected to match the requirements of the experiment (for example tissue or cell type specific promoters). The second mouse line carries the transgene under the modified tetO promoter. In the double transgenic the transgene of interest will not be expressed until tetracycline is withdrawn from the animal. This system is called the tetracycline transactivator or tet-off system, and it requires continuous administration of tetracycline analogues (including during gestation) to prevent expression. The more desirable form of control, in which tetracycline induces transcription, uses the reverse tetracycline transactivator (rtTA), which was created by modifying tTA so that it binds to DNA only in the presence of the antibiotic. Temporal control creates the possibility of inducing and reversing transgene expression, allowing direct comparison of the effect of the transgene within individual animals. A further important implication of temporal control is the ability to prevent transgene expression during development, since such expression can seriously confound the interpretation of the effects of the transgene.
(iii) Finally, instead of controlling the expression of transgenes directly by externally applied inducers, in many cases the expression of the transgene is to be controlled by some biologically occurring event in the organism (e.g., developmental events). This can be accomplished with systems based on one of two well characterized conservative site-specific tyrosine recombinases, Cre and FLP. Cre (cyclization recombination) is a bacteriophage P1 enzyme which carries out site-specific recombination at loxP sites (or at modified versions of the loxP site, generally called lox sites), while FLP ("Flippase") is a yeast enzyme that acts on so-called FRT sites.
These systems are also based on combining two transgenic lines: one carries the effector, in this case the recombinase gene under some desired promoter, while the other carries the transgene, engineered so that when the recombinase is expressed it acts on recombination sites built into the gene and thereby appropriately affects the structure, and therefore the expression, of the transgene. For example, using Cre, loxP sites can be inserted up- and downstream of a gene of interest, causing the excision of the gene when Cre is expressed and providing a regulated mechanism for knockouts. Alternatively, gene expression can be induced by Cre through the insertion of inverted loxP sites, creating a directional change of promoters, or by excising insulating elements. Since recombination events (except for inversions) are permanent, once the recombinase has been expressed in a given cell the cell's genome is permanently altered. This can be used to report transient gene expression events, which can then be detected at some time after their occurrence.
In addition to these transgenic strategies, specific cell groups can also be targeted on the basis of their projections using retrogradely transported viral vectors. Although the above limitations apply, in theory additional specificity can be obtained using cell type specific promoters (or the tTA or Cre systems described above). In these cases, the viral vector can carry either tTA (or Cre), or it can deliver a transgene under the tetO promoter or with appropriate lox sites. Moreover, cell type specificity of viral delivery can be engineered by pseudotyping viruses to exhibit exclusive tropism for cell surface receptors that are absent in the host organism and can therefore be selectively expressed in target neurons using promoter strategies.
Such an approach may potentially overcome the limitation of promoter strategies arising from the limited carrying capacity of viruses, and may be used for targeting cells based on the expression of more than one gene.
4. Suppressing genes
One of the most informative ways to learn about the functioning of a gene is to examine the effects of suppressing its activity. This can be done at different levels along the pathway leading to the synthesis of the gene product: (i) by removing the gene from the genome or preventing transcription from the gene, (ii) by interfering with translation from the specific mRNA, and (iii) by interfering with the functioning of the protein product.
4.1 Genomic level: knockouts and inducible knockouts
As discussed above, homologous recombination allows the targeting of transgenes to specific loci in the genome. A knockout is created when these transgenes are designed to yield nonfunctional products. Knock-ins are transgenics in which an alternative functional transgene replaces an endogenous gene. Temporal regulation of knockouts is even more important than with other transgenics, since the absence of a gene can seriously disrupt development and result in an adult phenotype that can be completely unrelated to the role played by the gene of interest in the adult. Temporal regulation can be achieved using the tTA system. Alternatively, site-specific recombination with the Cre/lox or FLP/FRT systems can be used, with the expression of the recombinase controlled by additional mechanisms such as tTA.
4.2 RNA level: antisense oligonucleotides and RNAi
Translation from an mRNA molecule can be prevented by introducing small antisense oligonucleotides which specifically hybridize to the targeted mRNA molecule and inactivate it by preventing ribosomal access along the molecule. The antisense oligos are sometimes chemically modified to prevent degradation in the cells, and are often taken up by neurons sufficiently effectively after in vivo intracerebral injection in vehicle solutions. The degree of translational suppression is variable, and often the oligos are taken up by only a fraction of the targeted cells. Immunological response can also be a problem. Finally, selective targeting of specific cell types is not possible and the lifetime of the oligos is severely limited, making them unusable as therapeutic tools.
Of great current interest is an alternative strategy called RNA interference (RNAi).
This takes advantage of the discovery that many cells, including mammalian neurons, express enzymatic mechanisms for the degradation of specific double-stranded RNA molecules. Cells use these degradative mechanisms (in part) to regulate mRNA levels by expressing so-called microRNA (miRNA) genes (i.e., genes which do not code for proteins), whose products hybridize to specific mRNA molecules and induce their degradation. Techniques have been (and are being) developed for piggybacking on this mechanism for experimental and therapeutic purposes. By mimicking the necessary properties of the regulatory RNA molecules, transgenes can be designed which, when expressed in target cells, induce the degradation of select mRNA molecules. Since these interfering RNAs can be encoded genetically, they can be targeted and regulated with the same efficacy and specificity as other transgenes, including viral delivery in humans and other animals where transgenic technology is not feasible. RNA interference can be far more effective than antisense technology, can be targeted and regulated like other transgenes, and can potentially be used for very long-term suppression of gene expression, including in human gene therapy.
4.3 Protein level: expression of dominant negative mutants
Often the function of proteins can be suppressed by expressing mutant variants of the same protein or of other interacting proteins. This is usually possible when the protein of interest participates in a multi-subunit complex in which the incorporation of just one mutated subunit is sufficient to block normal function. In this case overexpression of the mutant form can suppress almost entirely the formation of functioning protein complexes. For example, in multi-subunit voltage-gated potassium channels a mutation introduced into the pore-forming region can block ion permeability and result in the suppression of the functional expression of the particular voltage-gated channel in the cell.
The advantages of this technique include (i) the ability (at least theoretically) to limit suppression to specific subcellular sites by selectively targeting the dominant negative mutants to these locations, and (ii) the reverse use of the technique to study protein function by introducing mutations and screening for dominant negative phenotypes.
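The reason overexpression of a dominant negative subunit is so effective can be seen from a simple counting argument. The sketch below is an idealized model and not a measurement: it assumes that subunits assemble randomly, that the channel is a tetramer, and that a single mutant subunit abolishes function, so only all-wild-type channels conduct. The function name and the example fractions are hypothetical.

```python
def functional_fraction(mutant_fraction: float, subunits: int = 4) -> float:
    """Fraction of complexes containing no mutant subunit.

    Assumes random assembly and that one mutant subunit blocks function,
    so a complex works only if all `subunits` positions are wild type.
    """
    if not 0.0 <= mutant_fraction <= 1.0:
        raise ValueError("mutant_fraction must be between 0 and 1")
    return (1.0 - mutant_fraction) ** subunits


for f in (0.0, 0.5, 0.9):
    print(f"mutant fraction {f:.1f}: functional channels {functional_fraction(f):.4f}")
```

Under these assumptions, equal amounts of mutant and wild-type subunits already leave only about 6% of tetramers functional, and a ninefold excess of mutant leaves about 0.01%.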

5. Detection and quantification of the expression of genes
Ideally, methods would be available that allow fast and quantitative profiling of the expression of all known genes by individual neurons (or glial cells). Today this is not yet possible, and the experimenter must choose between the available techniques or their combinations.
5.1 cDNA libraries
cDNA libraries are clones of cDNA copies of cellular mRNA extracted from specific cell groups in the CNS, usually specific brain areas. They are prepared by reverse transcription of the extracted mRNA using oligo-dT primers and incorporation into plasmids, which are then introduced into and amplified in bacteria. These libraries carry nearly complete collections of the genes expressed in a brain area. The relative level of expression and the cell types contributing to the expression of each mRNA cannot be determined with this method; however, no a priori information is needed about the genes whose expression is to be examined.
5.2 In situ hybridization
Sequence-specific antisense oligonucleotide probes can be hybridized to mRNA in fixed tissue sections and visualized with autoradiography or fluorescent labeling. The great advantage of this technique is the relatively high specificity and simplicity with which the expression of specific genes can be tested across the entire brain. The specificity is limited only by the relatively high background labeling, which also limits the sensitivity of the method. When combined with immunocytochemistry or anatomical tracing techniques, the cell type specificity of the expression of the gene(s) of interest can be determined. Although the method provides some quantification of gene expression levels, this information is not very reliable.
Disadvantages include: (i) even with the much less sensitive fluorescent methods only a few genes can be examined simultaneously (co-expression), (ii) the sensitivity of the method is not high, and (iii) it cannot be used to test expression in single neurons (because of the unreliable labeling of individual cells).
5.3 GENSAT
As described earlier, the GENSAT project is generating an expression atlas for the CNS, ultimately aimed at examining the expression of all neuronal genes. Gene expression is reported by EGFP under gene-specific promoters in transgenic mice. In many ways the information is similar to in situ data, but the critical advantage is that identified neurons (or glia) can be targeted for electrophysiological recording (optical or electrical, in vivo and in vitro), RT-PCR and other live applications.
5.4 RT-PCR, multiplex RT-PCR and qtRT-PCR for tissue and single cells
Reverse transcription polymerase chain reaction (RT-PCR) can be used to detect the expression of individual genes or of a small number of genes (<15 or so). mRNA is reverse-transcribed into cDNA using reverse transcriptase and oligo-dT primers (which target the poly-A tail). The cDNA is amplified by PCR with a gene-specific primer pair and analyzed with gel electrophoresis. This can be performed on mRNA extracts from tissue homogenates or from single cells. For single cell applications, either cultured cells are used, in which case the entire cell is harvested into a recording pipette, or whole-cell recording is used to aspirate the cytoplasm of neurons in brain slices. An important extension of the technique is the targeting of several genes in a single sample (e.g., from a single recorded neuron). In this case, multiple specific primer sets are used, and up to about 15 sequences can be tested from single cells.
A fundamental problem with all RT-PCR techniques is the non-linearity of PCR, which makes the amount of the final amplicons extremely sensitive to minor variations in the amplification rate of each amplicon and to other nonspecific factors, making quantification unreliable. This can be overcome, and quantitative versions of PCR (qt-PCR) established, by using a number of normalization strategies to estimate the relative amount of starting material in the PCR step of the reaction. One quantification method, called real-time or kinetic PCR, is based on on-line fluorescent measurement of the rate of amplification for a number of different amplicons.
The great advantage of single cell RT-PCR methods is the ability to obtain a (limited) gene expression profile from single neurons after the electrophysiological characterization of these cells. This can provide correlative genomic and physiological information. The disadvantages in comparison with in situ hybridization are the severe limitation on the number of cells that can be sampled, and the technological difficulties, cost and skill requirements associated with the method. In particular, real-time PCR requires expensive specialized instrumentation for carrying out the PCR reaction itself.
5.6 Microarray technology
The closest to the ultimate desired gene expression profiling technology is the single cell microarray. Microarrays are custom-made 2D arrays consisting of evenly spaced spots, each containing a chemically immobilized hybridization probe with a sequence complementary to a segment of a specific mRNA transcript. At each location, hybridization to the probe is detected under a computer-controlled fluorescence microscope as a change in fluorescent emission caused by the displacement of a fluorescent standard (which becomes fluorescent only upon dissociating from the probe). Although microarrays require very small amounts of mRNA, the mRNA that can be obtained directly from a single neuron is not nearly enough for the microarray. The mRNA harvested from a single cell therefore has to be amplified without a priori information about the mRNA species present in the sample. This can be accomplished with mRNA amplification.
mRNA amplification involves a few repeated cycles in which the mRNA is first reverse-transcribed into cDNA using nonspecific oligo-dT primers (which target the poly-A tail of all mRNAs), followed by in vitro transcription to generate new RNA, which is subsequently converted back into cDNA, and so on. This method differs from RT-PCR in that (i) it does not require sequence-specific primers and therefore amplifies all mRNAs, and (ii) it is largely linear in the number of cycles (which is also small). The amplified mRNA (or the corresponding cDNA) is applied in equal amounts to the spots of the microarray, which allows one to determine whether the particular sequence represented at a specific spot is present in the mRNA mix obtained from the cell. There are a number of variants of this technology, including so-called cDNA, Affymetrix and Illumina microarrays, which differ in the selection of probes used; these provide somewhat different information and are prone to different errors. Usually several hundred genes can be tested simultaneously. Currently, microarray methods are quite expensive and the results generally need to be validated with RT-PCR or in situ hybridization, but they are unsurpassed by any other method in the number of mRNAs that can be tested simultaneously from single cells. Moreover, because of the relatively good linearity of mRNA amplification, with appropriate modifications and proper selection of the type of microarray this method can provide quantitative information about the relative expression of mRNAs in single cells.
5.7 mRNA sequencing
Sequencing provides quantitative information about the expression of all transcribed genes in the cell. This method can be applied to populations of neurons isolated using fluorescence cell sorting or other methods, as well as to isolated single neurons. Cytoplasmic/nuclear mRNA obtained by aspiration through a whole-cell recording pipette can also be sequenced.
The mRNA is reverse transcribed as described in 5.4. The resulting cDNA is sequenced at a commercial facility. This is an extremely powerful but expensive method. The main reason for its application is to identify cell types based on the complete gene expression profile of the cells. Because of the noise inherent in the method, complex clustering algorithms are used to classify the cells. Sequencing also provides information about splice and editing variants of mRNA species, but often the same information can be obtained more easily with RT-PCR.
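As a numerical aside on the PCR non-linearity discussed in 5.4, the sensitivity of end-point yield to small differences in amplification rate can be illustrated with the standard exponential model, in which the product after n cycles is N0 (1 + E)^n for a per-cycle efficiency E between 0 and 1. The sketch below is only an illustration of that model: the function name, the starting copy number, the two efficiencies and the cycle count are hypothetical values, not data from any experiment.

```python
def pcr_product(start_copies: float, efficiency: float, cycles: int) -> float:
    """Idealized PCR yield: each cycle multiplies the product by (1 + efficiency)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


# Two amplicons with identical starting amounts but slightly different efficiencies.
ratio = pcr_product(100, 0.95, 30) / pcr_product(100, 0.90, 30)
print(f"fold difference after 30 cycles: {ratio:.2f}")
```

In this model a difference of five percentage points in efficiency roughly doubles the apparent amount after 30 cycles even though the starting material is the same, which is why end-point RT-PCR needs the normalization strategies or real-time measurements described above.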