Relationship of Gene Types and Introns

Chi To
BME 230 Final Project

Abstract:
The relationship between Gene Ontology classification and changes in intron length over the course of evolution is important information that can help determine the origin of introns. The statistics show that dog genes whose biological process is actin cytoskeleton organization and biogenesis or actin filament-based process tend to have many deletions in their introns.

Introduction
Most eukaryotes have multiple introns per gene. This implies that hundreds of thousands to millions of individual intron gains occurred throughout eukaryotic evolution. For example, in the human genome a very large percentage of bases lies in introns compared with the small amount in exons (about 33% of the genome versus less than 2%). Five different hypotheses for the origin of introns have been proposed:

(i) Intron transposition: An intron from one gene is spliced out of an mRNA transcript. That intronic RNA sequence then reinserts into a previously intronless site of a transcript of the same or a different gene. That structure is then retroposed to give a DNA copy of the gene containing an intron at the new site.
(ii) Transposon insertion: A transposon inserts into a contiguous coding region and is transformed into an intron.
(iii) Tandem genomic duplication: A region including part or all of an exon with an internal AGGT is duplicated. The two homologous AGGTs are then used as the 5' and 3' splicing boundaries for a new intron.
(iv) Intron transfer: A gene undergoes a gene conversion or simple double recombination with an intron-containing paralog.
(v) Self-splicing type II intron insertion: A type II intron, presumably from an organelle of the same organism, inserts into a contiguous region of coding sequence of a nuclear genome and is then converted to a spliceosomal intron. [1]

To determine the origin of introns, it is important to find out whether the introns of a species were gained or lost in comparison with its ancestors [2]. Intron early mean

Materials and Methods
The UCSC genome browser was used to obtain a BED file containing all exons in the Boreoeutherian (ancestral) genome [3]. The exon coordinates and gene IDs were extracted from this file, and a Perl program called getintron.pl used them to create a script containing the information for all introns in the Boreoeutherian genome. A program called chaintoaxt then used this script to produce two pairwise sequence alignments: one between the Boreoeutherian genome and the dog genome, and one between the Boreoeutherian genome and the human genome [4]. The alignment outputs were passed to another Perl program, getidalign.pl, which determined the number of deletions and insertions in the dog and human genomes. Deletions and insertions were counted as gaps in the alignments, and only gaps longer than 100 bp were kept; shorter gaps were treated as noise. Running getidalign.pl produced four output files, each listing gene coordinates only for introns that had an insertion or deletion of more than 100 bp. A further Perl program, getgene.pl, matched the coordinates in each output file to the gene IDs of the selected introns. Finally, the four gene ID files (insertion in dog, deletion in dog, insertion in human, and deletion in human) were uploaded to the GOstat website, which computed Gene Ontology statistics on the gene IDs in each file using the hypergeometric distribution [5].

Result
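The gap-counting step described in Materials and Methods can be illustrated with a short Python sketch. This is not the getidalign.pl program, whose code is not shown here; the function names and the toy sequences are hypothetical. It assumes each alignment is available as two equal-length rows in which "-" marks a gap, that a gap run in the descendant row is a deletion in the descendant, and that a gap run in the ancestral row is an insertion in the descendant.

```python
import re

# Threshold taken from the text: only gaps longer than 100 bp are kept.
MIN_GAP = 100


def long_gap_lengths(aligned_row, min_gap=MIN_GAP):
    """Return the lengths of all runs of '-' that are strictly longer than min_gap."""
    runs = (len(m.group()) for m in re.finditer(r"-+", aligned_row))
    return [length for length in runs if length > min_gap]


def count_indels(ancestor_row, descendant_row, min_gap=MIN_GAP):
    """Count long insertions and deletions in the descendant.

    Assumption (not stated in the original report): a gap run in the
    descendant row is a deletion, and a gap run in the ancestor row is
    an insertion in the descendant.
    """
    if len(ancestor_row) != len(descendant_row):
        raise ValueError("aligned rows must have the same length")
    return {
        "insertions": len(long_gap_lengths(ancestor_row, min_gap)),
        "deletions": len(long_gap_lengths(descendant_row, min_gap)),
    }


if __name__ == "__main__":
    # Hypothetical toy alignment: one 150 bp deletion, one 120 bp insertion,
    # and one 50 bp gap that falls below the threshold and is ignored.
    ancestor = "ACGT" + "A" * 150 + "GG" + "-" * 120 + "TT" + "C" * 50 + "AA"
    descendant = "ACGT" + "-" * 150 + "GG" + "C" * 120 + "TT" + "-" * 50 + "AA"
    print(count_indels(ancestor, descendant))
```

Introns whose counts are non-zero would then be the ones carried forward to the gene ID matching step.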

Table 1: Insertion in Human
GO ID        Category             Term                                                          p-value
GO:0006813   Biological process   potassium ion transport                                       0.28
GO:0005886   Cellular component   protein complex                                               0.28

Table 2: Deletion in Human
GO ID        Category             Term                                                          p-value
GO:0003779   Mol. function        actin binding                                                 0.23
GO:0008092   Mol. function        cytoskeletal protein binding                                  0.23
GO:0005096   Mol. function        GTPase activator activity                                     0.23

Table 3: Insertion in Dog
GO ID        Category             Term                                                          p-value
GO:0016772   Mol. function        transferase activity, transferring phosphorus-containing groups   0.65
GO:0005057   Mol. function        receptor signaling protein activity                           0.65
GO:0016301   Mol. function        kinase activity                                               0.65

Table 4: Deletion in Dog
GO ID        Category             Term                                                          p-value
GO:0030036   Biological process   actin cytoskeleton organization and biogenesis                0.0598
GO:0030029   Biological process   actin filament-based process                                  0.0598
GO:0007010   Biological process   cytoskeleton organization and biogenesis                      0.0598

Figure 1

Discussion:
The Gene Ontology (G.O.) statistics show that, in human, genes whose biological process is potassium ion transport tend to have many insertions in their introns (Table 1). However, the p-value for this result is fairly high (0.28), so there is a substantial chance (28%) that this G.O. classification arose at random. Table 2 shows that human genes whose molecular function is actin binding, cytoskeletal protein binding, or GTPase activator activity tend to have many deletions in their introns. However, the p-value of this outcome is also high (0.23), so our prediction is very shaky. Table 3 shows that dog genes whose molecular function is transferase activity (transferring phosphorus-containing groups), receptor signaling protein activity, or kinase activity tend to have many insertions in their introns. However, as with Tables 1 and 2, the p-value is high (0.65), so this predicted G.O. classification is also very shaky.

One reason for the high p-values in Tables 1, 2, and 3 may be the size of our data. The data set used in this experiment was small, and the result could be improved by increasing its size.

Table 4 shows a more interesting result. It shows that dog genes whose biological process is actin cytoskeleton organization and biogenesis or actin filament-based process tend to have many deletions in their introns. The p-value is much smaller here (0.0598), so there is only about a 5-6% chance that this G.O. classification arose at random.

Figure 1 was generated from the UCSC genome browser, and it shows a very interesting feature: more than ¾ of the dog intron and about ½ of the human intron was deleted during evolution, compared with the original intron of the ancestor.

Conclusion:
Information about the biological function, biological process, and cellular component of genes, related to how their introns gain insertions or deletions, may be meaningful in research on the origin of introns. By knowing the relationship between Gene Ontology classification and introns, researchers can narrow their experiments down to the specific types of genes that have insertions or deletions.

Reference:
[1] Roy, S. W., The origin of recent introns: transposons? Genome Biology 2004, page 2.
[2] Coghlan, A. and Wolfe, K., Origins of recently gained introns in Caenorhabditis. www.pnas.org/cgi/doi/10.1073/pnas.0308192101
[3] UCSC genome browser, http://genome-test.cse.ucsc.edu/
[4] Robert Baertsch, chaintoaxt for pairwise alignment.
[5] GOstat, http://gostat.wehi.edu.au/ - for the G.O. statistics
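
Note on the hypergeometric statistic: the p-values discussed above come from GOstat, which Materials and Methods describes as using the hypergeometric distribution. The Python sketch below shows how such a p-value can be computed for a single G.O. term. It is only an illustration under stated assumptions, not GOstat's implementation; the function name and all counts in the example are hypothetical and are not taken from this project's data.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, hits):
    """Upper-tail hypergeometric p-value P(X >= hits).

    population: total number of genes considered
    annotated:  genes in the population that carry the G.O. term
    selected:   genes in the submitted list (e.g. genes with long intron deletions)
    hits:       genes in the submitted list that carry the G.O. term
    """
    if not (0 <= annotated <= population and 0 <= selected <= population):
        raise ValueError("annotated and selected must lie between 0 and population")
    upper = min(annotated, selected)
    if not (0 <= hits <= upper):
        raise ValueError("hits must lie between 0 and min(annotated, selected)")
    total = comb(population, selected)
    # Sum the probabilities of seeing `hits` or more annotated genes by chance.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Hypothetical counts: 10 genes in total, 3 carry the term,
    # 4 genes were submitted, and 2 of those carry the term.
    print(hypergeom_enrichment_p(10, 3, 4, 2))
```

A small p-value means that so many annotated genes in the submitted list would rarely be seen by chance, which is the sense in which the Table 4 result is stronger than the results in Tables 1 to 3.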