Bioinformatics overview


1 Bioinformatics overview. Biomedical applications in high-performance computing platforms. Oswaldo Trelles, PhD, University of Malaga. In this section we survey the bioinformatics application domain and the typical sources of data in the field.

2 Definition. Bioinformatics: the application of computational techniques (computer science, statistics, physics, chemistry, ... and information technologies) to the management (acquisition, storage, retrieval, transmission, processing, ...) and analysis of biological data (molecular, clinical, population, environmental, ...).

3 The domain of the data

4 Data production. Huge data production at different levels: atoms, proteins, interactions, metabolic pathways, cells, organs, organisms, populations.

5 Diversity of types of data
> E bp DNA linear
gaattctaac ggtcccgaaa ctctgtgcgg tgctgaactg gttgacgctc tgcagtttgt
ttgcggtgac cgtggttttt attttaacaa acccactggt tatggttctt cttctcgtcg
tgctccccag actggtattg ttgacgaatg ctgctttcgt tcttgcgacc tgcgtcgtct
ggaaatgtat tgcgctcccc tgaaacccgc taaatctgct tagaagctt

6 Format heterogeneity
The DNA encoding human insulin-like growth factor I (IGF-I) available at GenBank: E (search for insulin in All databases), and the same insulin (E01306) sequence in EMBL format. In both text boxes some lines have been removed.

GenBank entry:
LOCUS       E bp DNA linear PAT 04-NOV-2005
DEFINITION  DNA encoding human insulin-like growth factor I(IGF-I).
ACCESSION   E01306
VERSION     E GI:
KEYWORDS    JP A/1.
SOURCE      synthetic construct
  ORGANISM  synthetic construct
            other sequences; artificial sequences.
REFERENCE   1 (bases 1 to 229)
  AUTHORS   Raasu,A., Toomasu,M., Berun,N. and Majiasu,U.
  TITLE     METHOD FOR TRANSPORTING GENE PRODUCT TO MEDIUM PROPAGATING GRAM
            NEGATIVE BACTERIA
  JOURNAL   Patent: JP A 1 20-AUG-1987;
            KABIGEN AB
COMMENT     OS Artificial gene
            OC Artificial sequence; Genes.
            OS Homo sapiens
            PN JP A/1
            PD 20-AUG-1987
            CC strandedness: Single;
            CC topology: Linear;
            CC hypothetical: No;
            CC anti-sense: No;
            FH Key Location/Qualifiers
            FT /product='human insuline-like growth factor I
            FT CDS >
FEATURES             Location/Qualifiers
     source
                     /organism="synthetic construct"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:32630"
ORIGIN
        1 gaattctaac ggtcccgaaa ctctgtgcgg tgctgaactg gttgacgctc tgcagtttgt
          ttgcggtgac cgtggttttt attttaacaa acccactggt tatggttctt cttctcgtcg
          tgctccccag actggtattg ttgacgaatg ctgctttcgt tcttgcgacc tgcgtcgtct
          ggaaatgtat tgcgctcccc tgaaacccgc taaatctgct tagaagctt
//

EMBL entry:
ID   E01306; SV 1; linear; unassigned DNA; PAT; SYN; 229 BP.
AC   E01306;
DT   07-OCT-1997 (Rel. 52, Created)
DT   09-NOV-2005 (Rel. 85, Last updated, Version 3)
DE   DNA encoding human insulin-like growth factor I(IGF-I).
KW   JP A/1.
OS   synthetic construct
OC   other sequences; artificial sequences.
RA   Raasu A., Toomasu M., Berun N., Majiasu U.;
RT   "METHOD FOR TRANSPORTING GENE PRODUCT TO MEDIUM PROPAGATING GRAM
RT   NEGATIVE BACTERIA";
RL   Patent number JP A/1, 20-AUG
RL   KABIGEN AB.
CC   OS Artificial gene
CC   OC Artificial sequence; Genes.
CC   OS Homo sapiens
CC   CC strandedness: Single;
CC   CC topology: Linear;
CC   CC hypothetical: No;
CC   CC anti-sense: No;
CC   FH Key Location/Qualifiers
CC   FT mat_peptide
CC   FT CDS >
CC   FT /product="human insulin-like growth factor I"
FH   Key             Location/Qualifiers
FT   source
FT                   /organism="synthetic construct"
FT                   /mol_type="unassigned DNA"
FT                   /db_xref="taxon:32630"
SQ   Sequence 229 BP; 40 A; 57 C; 55 G; 77 T; 0 other;
     gaattctaac ggtcccgaaa ctctgtgcgg tgctgaactg gttgacgctc tgcagtttgt
     ttgcggtgac cgtggttttt attttaacaa acccactggt tatggttctt cttctcgtcg
     tgctccccag actggtattg ttgacgaatg ctgctttcgt tcttgcgacc tgcgtcgtct
     ggaaatgtat tgcgctcccc tgaaacccgc taaatctgct tagaagctt               229
//
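The SQ line of the EMBL entry declares the length and base composition of the sequence, which GenBank leaves implicit. As a minimal sketch (not part of the original slides), the Python below recomputes those figures from the sequence text and compares them with the SQ line. The function names `base_composition` and `parse_sq_line` are hypothetical helpers introduced only for this illustration.

```python
import re
from collections import Counter

# Sequence text as it appears in both entries (position numbers left out).
SEQUENCE = """
gaattctaac ggtcccgaaa ctctgtgcgg tgctgaactg gttgacgctc tgcagtttgt
ttgcggtgac cgtggttttt attttaacaa acccactggt tatggttctt cttctcgtcg
tgctccccag actggtattg ttgacgaatg ctgctttcgt tcttgcgacc tgcgtcgtct
ggaaatgtat tgcgctcccc tgaaacccgc taaatctgct tagaagctt
"""

SQ_LINE = "SQ Sequence 229 BP; 40 A; 57 C; 55 G; 77 T; 0 other;"


def base_composition(seq_text):
    """Return (length, Counter of bases), ignoring blanks and digits."""
    bases = re.sub(r"[^a-z]", "", seq_text.lower())
    return len(bases), Counter(bases)


def parse_sq_line(line):
    """Extract the declared length and per-base counts from an EMBL SQ line."""
    length = int(re.search(r"(\d+)\s+BP", line).group(1))
    counts = {base.lower(): int(n)
              for n, base in re.findall(r"(\d+)\s+([ACGT])\b", line)}
    return length, counts


length, composition = base_composition(SEQUENCE)
declared_length, declared_counts = parse_sq_line(SQ_LINE)

# True when the sequence text agrees with what the SQ line declares.
consistent = (length == declared_length and
              all(composition[b] == declared_counts[b] for b in "acgt"))
print(length, dict(composition), consistent)
```

A check of this kind is one simple way to detect truncated or corrupted records when moving the same entry between formats.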

7 Dispersion of data sources. More than 1000 biological data collections. Bioinformatics workflows: the usual way to work. See: [1] Infobiogen: Catalog of Databases. Bioinformatics: a web-based domain.

8 Types of data and applications (overview)

9 Sequencing data
The long DNA chain is split into small fragments that are read using sequencing technology.
Read: a short sequence obtained during the sequencing process.
Software is used to obtain contigs, scaffolds and a consensus.
Reading in annotation data from a GFF file.
Assigning aligned reads to exons and genes.
Biologically intelligent interpretation of genomic data.
FASTA and FASTQ formats:
>000014_1863_0292 length=76 uaccno=fgsmdpn08etuie
AATACTCAGGAATCGAACGGACTCGGGTATAGTATATGATCGGCAGCCAGCCG
AACATAACAGCGGCATGAAAACC
>000016_1821_0619 length=120 uaccno=fgsmdpn08ep50t
GGCAAGTTTTCGGTGTCGCTAAGCCCGAGATATCGCAGCTCACCCGTGTCGGC
GATTGCTGCTGTGACCGTCCCCAGTCGGTCACCCTCCGGCTGATTCTATCCTTACATCGG
TCGTTTC
>000021_1845_1786 length=69 uaccno=fgsmdpn08esarw
ATCCGCGCGGCCGCATTGTCGACACTGCCTGCCGGCAGTGAAGGCGAGGCGCA
GGTGGCCGATGCGCTG
>000030_1849_0863 length=69 uaccno=fgsmdpn08esmpd
ATCCGCGCGGCCGCATTGTCGACACTGCCTGCCGGCAGTGAAGGCGAGGCGCA
GGTGGCCGATGCGCTG
>000035_1856_0283 length=148 uaccno=fgsmdpn08es8dp
GACGCCCTTTATGCACGTTTCGCTCACAGTATCCCTTAATAGCAAGATTAATA
CCCTCAGTGGCCCCACTAGTAAAAACGATCTCTCGAGAACGACAGTTCAGTTC
ATTGGCAATCAATTTTCGGGCCGTTTCTTACCGCCTCCTCAG
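The FASTA records on this slide follow a simple layout: a header line starting with ">" followed by one or more sequence lines. As an illustrative sketch (the function names `parse_fasta` and `_record` are assumptions made for this example, not taken from the slides), the Python below reads such records and checks the `length=` attribute in each header against the actual sequence length.

```python
# Two of the reads from the slide, with the sequence wrapped over several lines.
EXAMPLE = """\
>000014_1863_0292 length=76 uaccno=fgsmdpn08etuie
AATACTCAGGAATCGAACGGACTCGGGTATAGTATATGATCGGCAGCCAGCCG
AACATAACAGCGGCATGAAAACC
>000016_1821_0619 length=120 uaccno=fgsmdpn08ep50t
GGCAAGTTTTCGGTGTCGCTAAGCCCGAGATATCGCAGCTCACCCGTGTCGGC
GATTGCTGCTGTGACCGTCCCCAGTCGGTCACCCTCCGGCTGATTCTATCCTTACATCGG
TCGTTTC
"""


def _record(header, chunks):
    """Split a header into identifier and key=value attributes; join the sequence."""
    fields = header.split()
    identifier = fields[0]
    attrs = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
    return identifier, attrs, "".join(chunks)


def parse_fasta(text):
    """Yield (identifier, attributes, sequence) for each record in FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _record(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _record(header, chunks)


records = list(parse_fasta(EXAMPLE))
for identifier, attrs, sequence in records:
    # Compare the declared length with the number of bases actually read.
    print(identifier, attrs["length"], len(sequence))
```

FASTQ adds a quality string per read, so it would need a different parser; this sketch covers FASTA only.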

10 Assembling the puzzle. From spectrograms to a sequence of letters. An exhaustive, resource-consuming procedure is needed to assemble the fragments into longer contigs... the sequence is coming up.
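To make the idea of assembling fragments into contigs concrete, here is a toy sketch of greedy overlap merging. It is only one of several assembly strategies and is not claimed to be the method used in the slides. The three reads are hypothetical fragments cut from the start of the IGF-I sequence shown earlier, and the names `overlap` and `greedy_assemble` are introduced for this illustration. Real assemblers also handle sequencing errors, reverse complements and repeats, which this sketch ignores.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 when the best overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# Hypothetical overlapping reads taken from "gaattctaacggtcccgaaactctg".
READS = ["gaattctaacgg", "ctaacggtcccg", "gtcccgaaactctg"]
print(greedy_assemble(READS))
```

The pairwise comparison is quadratic in the number of reads at every merge step, which hints at why assembly of real data sets is described as resource consuming.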

11 Biological sequence data
>ref NT_ : Drosophila melanogaster chromosome 2L
CGACAATGCACGACAGAGGAAGCAGAACAGATATTTAGATTGCCTCTCATTTTCTCTCCCATATTATAGG
GAGAAATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTCTTTGATTTTTTGGCAACCCAAAA
TGGTGGCGGATGAACGAGATGATAATATATTCAAGTTGCCGCTAATCAGAAATAAATTCATTGCAACGTT
AAATACAGCACAATATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTAATGAGTGCCTCTCG
TTCTCTGTCTTATATTACCGCAAACCCAAAAAGACAATACACGACAGAGAGAGAGAGCAGCGGAGATATT
TAGATTGCCTATTAAATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTCTCTATATAATGAC
TGCCTCTCATTCTGTCTTATTTTACCGCAAACCCAAATCGACAATGCACGACAGAGGAAGCAGAACAGAT
ATTTAGATTGCCTCTCATTTTCTCTCCCATATTATAGGGAGAAATATGATCGCGTATGCGAGAGTAGTGC
CAACATATTGTGCTCTTTGATTTTTTGGCAACCCAAAATGGTGGCGGATGAACGAGATGATAATATATTC
AAGTTGCCGCTAATCAGAAATAAATTCATTGCAACGTTAAATACAGCACAATATATGATCGCGTATGCGA
GAGTAGTGCCAACATATTGTGCTAATGAGTGCCTCTCGTTCTCTGTCTTATATTACCGCAAACCCAAAAA
GACAATACACGACAGAGAGAGAGAGCAGCGGAGATATTTAGATTGCCTATTAAATATGATCGCGTATGCG
AGAGTAGTGCCAACATATTGTGCTCTCTATATAATGACTGCCTCTCATTCTGTCTTATTTTACCGCAAAC
CCAAATCGACAATGCACGACAGAGGAAGCAGAACAGATATTTAGATTGCCTCTCATTTTCTCTCCCATAT
TATAGGGAGAAATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTCTTTGATTTTTTGGCAAC
CCAAAATGGTGGCGGATGAACGAGATGATAATATATTCAAGTTGCCGCTAATCAGAAATAAATTCATTGC
AACGTTAAATACAGCACAATATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTAATGAGTGC
CTCTCGTTCTCTGTCTTATATTACCGCAAACCCAAAAAGACAATACACGACAGAGAGAGAGAGCAGCGGA
GATATTTAGATTGCCTATTAAATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTCTCTATAT
AATGACTGCCTCTCATTCTGTCTTATTTTACCGCAAACCCAAATCGACAATGCACGACAGAGGAAGCAGA
ACAGATATTTAGATTGCCTCTCATTTTCTCTCCCATATTATAGGGAGAAATATGATCGCGTATGCGAGAG
TAGTGCCAACATATTGTGCTCTTTGATTTTTTGGCAACCCAAAATGGTGGCGGATGAACGAGATGATAAT
ATATTCAAGTTGCCGCTAATCAGAAATAAATTCATTGCAACGTTAAATACAGCACAATATATGATCGCGT
ATGCGAGAGTAGTGCCAACATATTGTGCTAATGAGTGCCTCTCGTTCTCTGTCTTATATTACCGCAAACC
CAAAAAGACAATACACGACAGAGAGAGAGAGCAGCGGAGATATTTAGATTGCCTATTAAATATGATCGCG
TATGCGAGAGTAGTGCCAACATATTGTGCTCTCTATATAATGACTGCCTCTCATTCTGTCTTATTTTACC
GCAAACCCAAATCGACAATGCACGACAGAGGAAGCAGAACAGATATTTAGATTGCCTCTCATTTTCTCTC
CCATATTATAGGGAGAAATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTCTTTGATTTTTT
GGCAACCCAAAATGGTGGCGGATGAACGAGATGATAATATATTCAAGTTGCCGCTAATCAGAAATAAATT
CATTGCAACGTTAAATACAGCACAATATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTAAT
GAGTGCCTCTCGTTCTCTGTCTTATATTACCGCAAACCCAAAAAGACAATACACGACAGAGAGAGAGAGC
AGCGGAGATATTTAGATTGCCTATTAAATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTCT
CTATATAATGACTGCCTCTCATTCTGTCTTATTTTACCGCAAACCCAAATCGACAATGCACGACAGAGGA
AGCAGAACAGATATTTAGATTGCCTCTCATTTTCTCTCCCATATTATAGGGAGAAATATGATCGCGTATG
CGAGAGTAGTGCCAACATATTGTGCTCTTTGATTTTTTGGCAACCCAAAATGGTGGCGGATGAACGAGAT
GATAATATATTCAAGTTGCCGCTAATCAGAAATAAATTCATTGCAACGTTAAATACAGCACAATATATGA
TCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTAATGAGTGCCTCTCGTTCTCTGTCTTATATTACCG
CAAACCCAAAAAGACAATACACGACAGAGAGAGAGAGCAGCGGAGATATTTAGATTGCCTATTAAATATG
ATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTCTCTATATAATGACTGCCTCTCATTCTGTCTTAT
TTTACCGCAAACCCAAATCGACAATGCACGACAGAGGAAGCAGAACAGATATTTAGATTGCCTCTCATTT
TCTCTCCCATATTATAGGGAGAAATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGTGCTCTTTGA
TTTTTTGGCAACCCAAAATGGTGGCGGATGAACGAGATGATAATATATTCAAGTTGCCGCTAATCAGAAA
TAAATTCATTGCAACGTTAAATACAGCACAATATATGATCGCGTATGCGAGAGTAGTGCCAACATATTGT
GCTAATGAGTGCCTCTCGTTCTCTGTCTTATATTACCGCAAACCCAAAAAGACAATACACGACAGAGAGA
GAGAGCAGCGGAGATATTTAGATTGCCTATTAAATATGATCGCGTATGCGAGAGTAGTGCCAACATATTG
TGCTCTCTATATAATGACTGCCTCTCATTCTGTCTTATTTTACCGCAAACCCAAATCGACAATGCACGAC
AGAGGAAGCAGAACAGATATTTAGATTGCCTCTCATTTTCTCTCCCATATTATAGGGAGAAATATGATCG
From assembly to database entries. FASTA is the favorite format used for this type of data (without annotations). We know the text, but the meaning needs more processing.

12 Annotated Sequence databases
ID   100K_RAT STANDARD; PRT; 889 AA.
AC   Q62671;
DT   01-NOV-1997 (Rel. 35, Created)
DT   01-NOV-1997 (Rel. 35, Last sequence update)
DT   15-JUL-1999 (Rel. 38, Last annotation update)
DE   100 KD PROTEIN (EC ).
OS   Rattus norvegicus (Rat).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
OC   Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=WISTAR; TISSUE=TESTIS;
RX   MEDLINE;
RA   MUELLER D., REHBEIN M., BAUMEISTER H., RICHTER D.;
RT   "Molecular characterization of a novel rat protein structurally
RT   related to poly(a) binding proteins and the 70K protein of the U1
RT   small nuclear ribonucleoprotein particle (snrnp).";
RL   Nucleic Acids Res. 20: (1992).
RN   [2]
RP   ERRATUM.
RA   MUELLER D., REHBEIN M., BAUMEISTER H., RICHTER D.;
RL   Nucleic Acids Res. 20: (1992).
CC   -!- FUNCTION: E3 UBIQUITIN-PROTEIN LIGASE WHICH ACCEPTS UBIQUITIN FROM
CC       AN E2 UBIQUITIN-CONJUGATING ENZYME IN THE FORM OF A THIOESTER AND
CC       THEN DIRECTLY TRANSFERS THE UBIQUITIN TO TARGETED SUBSTRATES (BY
CC       SIMILARITY). THIS PROTEIN MAY BE INVOLVED IN MATURATION AND/OR
CC       POST-TRANSCRIPTIONAL REGULATION OF MRNA.
CC
CC   This SWISS-PROT entry is copyright. It is produced through...
CC
DR   EMBL; X64411; CAA ; -.
DR   PFAM; PF00632; HECT; 1.
DR   PFAM; PF00658; PABP; 1.
KW   Ubiquitin conjugation; Ligase.
FT   DOMAIN ASP/GLU-RICH (ACIDIC).
FT   DOMAIN PRO-RICH.
FT   DOMAIN ASP/GLU-RICH (ACIDIC).
FT   BINDING UBIQUITIN (BY SIMILARITY).
SQ   SEQUENCE 889 AA; MW; DD7E6C7A CRC32;
     MMSARGDFLN YALSLMRSHN DEHSDVLPVL DVCSLKHVAY VFQALIYWIK AMNQQTTLDT
     PQLERKRTRE LLELGIDNED SEHENDDDTS QSATLNDKDD ESLPAETGQN HPFFRRSDSM
     VYEYVRKYAE HRMLVVAEQP LHAMRKGLLD VLPKNSLEDL TAEDFRLLVN GCGEVNVQML
     ISFTSFNDES GENAEKLLQF KRWFWSIVER MSMTERQDLV YFWTSSPSLP ASEEGFQPMP
     SITIRPPDDQ HLPTANTCIS RLYVPLYSSK QILKQKLLLA IKTKNFGFV
//
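Annotated entries of this kind are line-oriented: each line starts with a two-letter code (ID, AC, DR, KW, ...) that says what the rest of the line means, and "//" closes the entry. The following is a minimal sketch, assuming that layout and using a shortened copy of the entry above. The function name `parse_entry` is hypothetical, and the sketch deliberately skips sequence lines, which start with blanks rather than a code.

```python
from collections import defaultdict

# Shortened copy of the 100K_RAT entry, keeping a few line types.
ENTRY = """\
ID   100K_RAT STANDARD; PRT; 889 AA.
AC   Q62671;
DT   01-NOV-1997 (Rel. 35, Created)
OS   Rattus norvegicus (Rat).
DR   PFAM; PF00632; HECT; 1.
DR   PFAM; PF00658; PABP; 1.
KW   Ubiquitin conjugation; Ligase.
//
"""


def parse_entry(text):
    """Group the lines of one flat-file entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):
            break  # end-of-entry marker
        code, _, value = line.partition(" ")
        # Sequence lines begin with blanks, so their code is empty and is skipped.
        if len(code) == 2 and value.strip():
            fields[code].append(value.strip())
    return dict(fields)


fields = parse_entry(ENTRY)
accession = fields["AC"][0].rstrip(";")
pfam_ids = [v.split(";")[1].strip() for v in fields["DR"] if v.startswith("PFAM")]
keywords = [k.strip() for k in fields["KW"][0].rstrip(".").split(";")]
print(accession, pfam_ids, keywords)
```

Once the annotation is in a dictionary like this, cross-references such as the PFAM identifiers can be followed to other databases, which is where the "meaning" attached to the raw sequence comes from.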


More information

Lecture 11. Initiation of RNA Pol II transcription. Transcription Initiation Complex

Lecture 11. Initiation of RNA Pol II transcription. Transcription Initiation Complex Lecture 11 *Eukaryotic Transcription Gene Organization RNA Processing 5 cap 3 polyadenylation splicing Translation Initiation of RNA Pol II transcription Consensus sequence of promoter TATA Transcription

More information

MODULE 1: INTRODUCTION TO THE GENOME BROWSER: WHAT IS A GENE?

MODULE 1: INTRODUCTION TO THE GENOME BROWSER: WHAT IS A GENE? MODULE 1: INTRODUCTION TO THE GENOME BROWSER: WHAT IS A GENE? Lesson Plan: Title Introduction to the Genome Browser: what is a gene? JOYCE STAMM Objectives Demonstrate basic skills in using the UCSC Genome

More information

CS313 Exercise 1 Cover Page Fall 2017

CS313 Exercise 1 Cover Page Fall 2017 CS313 Exercise 1 Cover Page Fall 2017 Due by the start of class on Monday, September 18, 2017. Name(s): In the TIME column, please estimate the time you spent on the parts of this exercise. Please try

More information

NGS Approaches to Epigenomics

NGS Approaches to Epigenomics I519 Introduction to Bioinformatics, 2013 NGS Approaches to Epigenomics Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Contents Background: chromatin structure & DNA methylation Epigenomic

More information

From assembled genome to annotated genome

From assembled genome to annotated genome From assembled genome to annotated genome Procaryotic genomes Eucaryotic genomes Genome annotation servers (web based) 1. RAST 2. NCBI Gene prediction pipeline: Maker Function annotation pipeline: Blast2GO

More information

Applied Biosystems SOLiD 3 Plus System. RNA Application Guide

Applied Biosystems SOLiD 3 Plus System. RNA Application Guide Applied Biosystems SOLiD 3 Plus System RNA Application Guide For Research Use Use Only. Not intended for any animal or human therapeutic or diagnostic use. TRADEMARKS: Trademarks of Life Technologies Corporation

More information

ENCODE DCC Antibody Validation Document

ENCODE DCC Antibody Validation Document ENCODE DCC Antibody Validation Document Date of Submission Name: Email: Lab Antibody Name: Target: Company/ Source: Catalog Number, database ID, laboratory Lot Number Antibody Description: Target Description:

More information

Ensembl workshop. Thomas Randall, PhD bioinformatics.unc.edu. handouts, papers, datasets

Ensembl workshop. Thomas Randall, PhD bioinformatics.unc.edu.   handouts, papers, datasets Ensembl workshop Thomas Randall, PhD tarandal@email.unc.edu bioinformatics.unc.edu www.unc.edu/~tarandal/ensembl handouts, papers, datasets Ensembl is a joint project between EMBL - EBI and the Sanger

More information

The use of bioinformatic analysis in support of HGT from plants to microorganisms. Meeting with applicants Parma, 26 November 2015

The use of bioinformatic analysis in support of HGT from plants to microorganisms. Meeting with applicants Parma, 26 November 2015 The use of bioinformatic analysis in support of HGT from plants to microorganisms Meeting with applicants Parma, 26 November 2015 WHY WE NEED TO CONSIDER HGT IN GM PLANT RA Directive 2001/18/EC As general

More information

MCB 102 University of California, Berkeley August 11 13, Problem Set 8

MCB 102 University of California, Berkeley August 11 13, Problem Set 8 MCB 102 University of California, Berkeley August 11 13, 2009 Isabelle Philipp Handout Problem Set 8 The answer key will be posted by Tuesday August 11. Try to solve the problem sets always first without

More information

Reference genomes and common file formats

Reference genomes and common file formats Reference genomes and common file formats Overview Reference genomes and GRC Fasta and FastQ (unaligned sequences) SAM/BAM (aligned sequences) Summarized genomic features BED (genomic intervals) GFF/GTF

More information

Introduction and Public Sequence Databases. BME 110/BIOL 181 CompBio Tools

Introduction and Public Sequence Databases. BME 110/BIOL 181 CompBio Tools Introduction and Public Sequence Databases BME 110/BIOL 181 CompBio Tools Todd Lowe March 29, 2011 Course Syllabus: Admin http://www.soe.ucsc.edu/classes/bme110/spring11 Reading: Chapters 1, 2 (pp.29-56),

More information

M1 - Biochemistry. Nucleic Acid Structure II/Transcription I

M1 - Biochemistry. Nucleic Acid Structure II/Transcription I M1 - Biochemistry Nucleic Acid Structure II/Transcription I PH Ratz, PhD (Resources: Lehninger et al., 5th ed., Chapters 8, 24 & 26) 1 Nucleic Acid Structure II/Transcription I Learning Objectives: 1.

More information

Computational gene finding

Computational gene finding Computational gene finding Devika Subramanian Comp 470 Outline (3 lectures) Lec 1 Lec 2 Lec 3 The biological context Markov models and Hidden Markov models Ab-initio methods for gene finding Comparative

More information

The Need for Scientific. Data Annotation. Alick K Law, Ph.D., M.B.A. Marketing Manager IBM Life Sciences.

The Need for Scientific. Data Annotation. Alick K Law, Ph.D., M.B.A. Marketing Manager IBM Life Sciences. The Need for Scientific Data Annotation Alick K Law, Ph.D., M.B.A. Marketing Manager IBM Life Sciences alaw@us.ibm.com Cross disciplinary research approach requires organizations to address diverse needs

More information

CRISPR GENOMIC SERVICES PRODUCT CATALOG

CRISPR GENOMIC SERVICES PRODUCT CATALOG CRISPR GENOMIC SERVICES PRODUCT CATALOG DESIGN BUILD ANALYZE The experts at Desktop Genetics can help you design, prepare and manufacture all of the components needed for your CRISPR screen. We provide

More information

Bacterial Genome Annotation

Bacterial Genome Annotation Bacterial Genome Annotation Bacterial Genome Annotation For an annotation you want to predict from the sequence, all of... protein-coding genes their stop-start the resulting protein the function the control

More information

An Introduction to the package geno2proteo

An Introduction to the package geno2proteo An Introduction to the package geno2proteo Yaoyong Li January 24, 2018 Contents 1 Introduction 1 2 The data files needed by the package geno2proteo 2 3 The main functions of the package 3 1 Introduction

More information

Quick reference guide

Quick reference guide Quick reference guide Our Invitrogen GeneArt CRISPR Search and Design Tool allows you to search our database of >600,000 predesigned CRISPR guide RNA (grna) sequences or analyze your sequence of interest

More information

MODULE TSS1: TRANSCRIPTION START SITES INTRODUCTION (BASIC)

MODULE TSS1: TRANSCRIPTION START SITES INTRODUCTION (BASIC) MODULE TSS1: TRANSCRIPTION START SITES INTRODUCTION (BASIC) Lesson Plan: Title JAMIE SIDERS, MEG LAAKSO & WILSON LEUNG Identifying transcription start sites for Peaked promoters using chromatin landscape,

More information

Retroelement-guided protein diversification abounds in vast lineages of Bacteria and Archaea

Retroelement-guided protein diversification abounds in vast lineages of Bacteria and Archaea In the format provided by the authors and unedited. SUPPLEMENTARY INFORMATION VOLUME: 2 ARTICLE NUMBER: 17045 Retroelement-guided protein diversification abounds in vast lineages of Bacteria and Archaea

More information

KDD Cup Task 1 Information Extraction from Biomedical Articles

KDD Cup Task 1 Information Extraction from Biomedical Articles Information Extraction from Biomedical Articles Sub title here Sub title here System Description June / July 2002 The Task: Curate or Not-Curate Build a system for automatic analysis of scientific papers

More information

Synthetic Biology. Sustainable Energy. Therapeutics Industrial Enzymes. Agriculture. Accelerating Discoveries, Expanding Possibilities. Design.

Synthetic Biology. Sustainable Energy. Therapeutics Industrial Enzymes. Agriculture. Accelerating Discoveries, Expanding Possibilities. Design. Synthetic Biology Accelerating Discoveries, Expanding Possibilities Sustainable Energy Therapeutics Industrial Enzymes Agriculture Design Build Generate Solutions to Advance Synthetic Biology Research

More information

Introduction to Bioinformatics for Medical Research. Gideon Greenspan TA: Oleg Rokhlenko. Lecture 1

Introduction to Bioinformatics for Medical Research. Gideon Greenspan TA: Oleg Rokhlenko. Lecture 1 Introduction to Bioinformatics for Medical Research Gideon Greenspan gdg@cs.technion.ac.il TA: Oleg Rokhlenko Lecture 1 Introduction to Bioinformatics Introduction to Bioinformatics What is Bioinformatics?

More information

Exercises (Multiple sequence alignment, profile search)

Exercises (Multiple sequence alignment, profile search) Exercises (Multiple sequence alignment, profile search) 8. Using Clustal Omega program, available among the tools at the EBI website (http://www.ebi.ac.uk/tools/msa/clustalo/), calculate a multiple alignment

More information

Will discuss proteins in view of Sequence (I,II) Structure (III) Function (IV) proteins in practice

Will discuss proteins in view of Sequence (I,II) Structure (III) Function (IV) proteins in practice Will discuss proteins in view of Sequence (I,II) Structure (III) Function (IV) proteins in practice integration - web system (V) 1 Touring the Protein Space (outline) 1. Protein Sequence - how rich? How

More information

MODULE 5: TRANSLATION

MODULE 5: TRANSLATION MODULE 5: TRANSLATION Lesson Plan: CARINA ENDRES HOWELL, LEOCADIA PALIULIS Title Translation Objectives Determine the codons for specific amino acids and identify reading frames by looking at the Base

More information

Digital information cycle. Database. Database. BINF 630: Bioinformatics Methods

Digital information cycle. Database. Database. BINF 630: Bioinformatics Methods Digital information cycle BINF 630: Bioinformatics Methods Iosif Vaisman Email: ivaisman@gmu.edu Creation and capture Storage and management Rights management Search and access Distribution Electronic

More information

Processing Very Large Genomic Files

Processing Very Large Genomic Files Processing Very Large Genomic Files Michael Robinson School of Computer Information Science Florida International University Miami, Florida, USA michael.robinson@cs.fiu.edu Abstract We have developed a

More information

Transcription in Eukaryotes

Transcription in Eukaryotes Transcription in Eukaryotes Biology I Hayder A Giha Transcription Transcription is a DNA-directed synthesis of RNA, which is the first step in gene expression. Gene expression, is transformation of the

More information

Theoretische Biologie

Theoretische Biologie Theoretische Biologie Prof. Computational EvoDevo, University of Leipzig SS 2017 Two Gene Concepts in Comparison Gerstein-Snyder gene definition Gerstein MB, Bruce C, Rozowsky JS, Zheng D, Du J, Korbel

More information

Novel methods for RNA and DNA- Seq analysis using SMART Technology. Andrew Farmer, D. Phil. Vice President, R&D Clontech Laboratories, Inc.

Novel methods for RNA and DNA- Seq analysis using SMART Technology. Andrew Farmer, D. Phil. Vice President, R&D Clontech Laboratories, Inc. Novel methods for RNA and DNA- Seq analysis using SMART Technology Andrew Farmer, D. Phil. Vice President, R&D Clontech Laboratories, Inc. Agenda Enabling Single Cell RNA-Seq using SMART Technology SMART

More information

Sequence Analysis Lab Protocol

Sequence Analysis Lab Protocol Sequence Analysis Lab Protocol You will need this handout of instructions The sequence of your plasmid from the ABI The Accession number for Lambda DNA J02459 The Accession number for puc 18 is L09136

More information

2014 Pearson Education, Inc. CH 8: Recombinant DNA Technology

2014 Pearson Education, Inc. CH 8: Recombinant DNA Technology CH 8: Recombinant DNA Technology Biotechnology the use of microorganisms to make practical products Recombinant DNA = DNA from 2 different sources What is Recombinant DNA Technology? modifying genomes

More information