
Gene Structure: Introns, Exons and Pseudogenes

What is a gene?

A gene is the basic physical and functional unit of heredity. In molecular terms, a gene is a nucleotide sequence necessary for the synthesis of a functional polypeptide. Genes contain the information necessary for living cells to survive and reproduce. In humans, genes vary in size from a few hundred DNA bases to more than 2 million bases. The Human Genome Project has estimated that humans have between 20,000 and 25,000 genes.

Every person has two copies of each gene, one inherited from each parent. Most genes are the same in all people, but a small number of genes (less than 1 percent of the total) differ slightly between people. Alleles are forms of the same gene with small differences in their sequence of DNA bases. These small differences contribute to each person's unique physical features.

Much of gene structure is broadly similar between eukaryotes and prokaryotes. These common elements largely result from the shared ancestry of cellular life over 2 billion years ago.

Prokaryotes

The overall organisation of prokaryotic genes is markedly different from that of eukaryotes. The most obvious difference is that prokaryotic open reading frames (ORFs) are often grouped into a polycistronic operon under the control of a shared set of regulatory sequences. These ORFs are all transcribed onto the same mRNA, so they are co-regulated and often serve related functions.

Eukaryotes

The structure of eukaryotic genes includes features not found in prokaryotes. Most of these relate to post-transcriptional modification of pre-mRNAs to produce mature mRNA ready for translation into protein. Eukaryotic genes typically have more regulatory elements to control gene expression than prokaryotic genes do. This is particularly true in multicellular eukaryotes, such as humans, where gene expression varies widely among different tissues.
A key feature of the structure of eukaryotic genes is that their transcripts are typically subdivided into exon and intron regions. Exon regions are retained in the final mature mRNA molecule, while intron regions are spliced out (excised) during post-transcriptional processing. Indeed, the intron regions of a gene can be considerably longer than the exon regions.

What are introns and exons?

Introns and exons are parts of genes. Exons code for proteins, whereas introns do not. A useful way to remember this is to think of introns as intervening sequences and exons as expressed sequences. According to researchers, there are on average 8.8 exons and 7.8 introns per human gene.

What are exons?

Exons are the parts of a gene that are represented in mature messenger RNA (mRNA). The process by which DNA is used as a template to create mRNA is called transcription. The mRNA then undergoes a further process called translation, in which it is used to synthesize proteins with the help of another type of molecule called transfer RNA (tRNA).

Figure 1: Structure of a eukaryotic gene

What are introns?

Introns are parts of genes that do not directly code for proteins. Introns can range in size from tens of base pairs to thousands of base pairs.

Where are introns found?

Introns are commonly found in multicellular eukaryotes, such as humans. They are less common in unicellular eukaryotes, such as yeast, and even rarer in bacteria.
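
The exon and intron layout described above can be sketched in a few lines of Python. This is only a toy illustration: the sequence is made up, and the convention that uppercase letters mark exons and lowercase letters mark introns is an assumption of this sketch, not something a real transcript carries. The function names `split_regions` and `splice` are likewise hypothetical.

```python
from itertools import groupby

def split_regions(pre_mrna):
    """Split a toy pre-mRNA into (is_exon, sequence) runs.

    Assumption of this sketch: uppercase letters are exon, lowercase are intron.
    """
    return [(is_exon, "".join(run))
            for is_exon, run in groupby(pre_mrna, key=str.isupper)]

def splice(pre_mrna):
    """Keep the exon runs and drop the intron runs, giving a toy mature mRNA."""
    return "".join(seq for is_exon, seq in split_regions(pre_mrna) if is_exon)

# Made-up transcript: three exons separated by two introns.
pre_mrna = "AUGGCC" + "guaaguuucag" + "UGGAAA" + "guacguacag" + "CCCUAA"

regions = split_regions(pre_mrna)
n_exons = sum(1 for is_exon, _ in regions if is_exon)
n_introns = len(regions) - n_exons

print(splice(pre_mrna))      # AUGGCCUGGAAACCCUAA
print(n_exons, n_introns)    # 3 2
```

Because introns sit between exons, a gene with n exons has n - 1 introns, which is consistent with the quoted averages of 8.8 exons and 7.8 introns per human gene.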

It has been suggested that the number of introns in an organism's genes is positively related to its complexity: the more introns an organism's genes contain, the more complex the organism.

How are introns removed?

Introns are present in the initial RNA transcript, known as pre-mRNA. They must be removed before the mRNA can direct the production of proteins. Pre-mRNA therefore undergoes a process known as splicing to create mature mRNA.

It is vital that introns are removed precisely, as any leftover intron nucleotides, or any deleted exon nucleotides, may result in a faulty protein being produced. This is because the amino acids that make up proteins are joined together based on codons, each of which consists of three nucleotides. Imprecise intron removal may therefore cause a frameshift, which means that the genetic code is read incorrectly.

This can be illustrated by using the following phrase as a metaphor for an exon:

BOB THE BIG TAN CAT

If the intron before this exon were removed imprecisely, so that the first B was lost, the sequence would become unreadable:

OBT HEB IGT ANC AT
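
The BOB THE BIG TAN CAT metaphor can be reproduced with a short Python sketch that reads a sequence three letters at a time, the way codons are read. The helper name `read_codons` is hypothetical, and the letters stand in for nucleotides purely for illustration.

```python
def read_codons(sequence, size=3):
    """Split a sequence into consecutive, non-overlapping groups of `size` letters."""
    letters = sequence.replace(" ", "")
    return [letters[i:i + size] for i in range(0, len(letters), size)]

exon = "BOBTHEBIGTANCAT"

# Correct reading frame: every group of three letters is a readable word.
print(" ".join(read_codons(exon)))       # BOB THE BIG TAN CAT

# Losing the first letter shifts every later group by one position.
print(" ".join(read_codons(exon[1:])))   # OBT HEB IGT ANC AT
```

A single lost letter changes every group that follows it, which is why one misplaced nucleotide at a splice site can garble the whole downstream protein sequence.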

What are pseudogenes?

Pseudogenes are genomic DNA sequences that are similar to normal genes but nonfunctional; they are regarded as defunct relatives of functional genes. Relative to the complete gene, a pseudogene has lost at least some functionality in cellular gene expression or protein-coding ability. Since the first pseudogene was reported, a large number of pseudogenes have been described in humans and many other species.

What causes pseudogenes to arise?

There are two accepted processes by which pseudogenes may arise:

Duplication - Modifications (mutations, insertions, deletions, frameshifts) to the DNA sequence of a gene can occur during duplication. These disablements can result in loss of gene function at the transcription or translation level (or both), since the sequence no longer results in the production of a protein. Copies of genes that are disabled in this manner are termed non-processed or duplicated pseudogenes.

Retrotransposition - Reverse transcription of an mRNA transcript, with subsequent re-integration of the cDNA into the genome. Such copies of genes are termed processed pseudogenes. These pseudogenes can also accumulate random disablements over the course of evolution.
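
The kind of disablement described above can be sketched with a toy check for an intact open reading frame. This is a minimal illustration under stated assumptions, not how pseudogenes are annotated in practice: the sequences are made up, the function name `is_intact_orf` is hypothetical, and the only facts taken from biology are the ATG start codon and the three stop codons of the standard genetic code.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code

def is_intact_orf(dna):
    """Return True if `dna` starts with ATG, is a whole number of codons,
    and has exactly one stop codon, located at the very end."""
    if len(dna) % 3 != 0 or not dna.startswith("ATG"):
        return False
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    has_early_stop = any(codon in STOP_CODONS for codon in codons[:-1])
    return codons[-1] in STOP_CODONS and not has_early_stop

gene = "ATGGCCTGGAAATAA"          # ATG GCC TGG AAA TAA (made-up toy gene)
nonsense = "ATGGCCTGAAAATAA"      # TGG changed to TGA: premature stop codon
deletion = gene[:4] + gene[5:]    # one base deleted: no longer a whole number of codons

print(is_intact_orf(gene))        # True
print(is_intact_orf(nonsense))    # False
print(is_intact_orf(deletion))    # False
```

Either change is enough to stop the copy from encoding the original protein, which is the sense in which a duplicated gene copy becomes "disabled".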

Examples of pseudogene function

Pseudogenes have long been labeled as junk DNA, failed copies of genes that arise during the evolution of genomes. However, recent results are challenging this label; indeed, some pseudogenes appear to harbor the potential to regulate their protein-coding cousins. Far from being silent relics, many pseudogenes are transcribed into RNA, some exhibiting a tissue-specific pattern of activation. Pseudogene transcripts can be processed into short interfering RNAs that regulate coding genes through the RNAi pathway. It has also been shown that pseudogenes are capable of regulating tumor suppressors and oncogenes by acting as microRNA decoys. The finding that pseudogenes are often deregulated during cancer progression warrants further investigation into the true extent of pseudogene function.

Pseudogenes can, over evolutionary time scales, participate in gene conversion and other mutational events that may give rise to new or newly functional genes. This has led to the concept, used in a major review from 2003, that pseudogenes could be viewed as potogenes: potential genes for evolutionary diversification.

Bacterial pseudogenes

Pseudogenes can be found in bacteria. Most are in bacteria that are not free-living; that is, they are either symbionts or obligate intracellular parasites and thus do not require many of the genes needed by bacteria living in changeable environments. An extreme example is the genome of Mycobacterium leprae, the causative agent of leprosy. It has been reported to have 1,133 pseudogenes, which give rise to approximately 50% of its transcriptome.

References:

1. Pseudogenes: Pseudo-functional or key regulators in health and disease? Ryan Charles Pink, Kate Wicks, Daniel Paul Caley, Emma Kathleen Punch, Laura Jacobs, and David Raul Francisco Carter