BIOINFORMATICS IN AQUACULTURE. Aleksei Krasnov AKVAFORSK (Ås, Norway) Bergen, September 21, 2007
1 BIOINFORMATICS IN AQUACULTURE Aleksei Krasnov AKVAFORSK (Ås, Norway) Bergen, September 21, 2007
2 Research area
Functional genomics of salmonids
Main focus: diseases, stress and toxicity
Experience in:
- Sequence analysis and annotation
- Construction and use of microarrays
- Management of gene expression data
- Microarray statistics
3 Genome-wide level: sequence databases and servers, large microarray platforms and gene expression warehouses
- Automatic computer-driven analyses have (almost) reached their limit
- Collaboration between bioinformatics experts and biologists is poor
Domain-specific databases linked to tools (modules) for data analysis and annotation
Single-gene studies in fish health, welfare, nutrition, reproduction etc.
- Limited use of genomic information
[Diagram axes: accuracy and resolution; number of genes]
4 SEQUENCES (mRNA)
1. Processing and storage of primary data (GenBank): Salmo, Oncorhynchus (September 7, 2007)
2. Clustering (UniGene, TIGR, GRASP etc.): unique sequences (UniGene)
3. Construction of contigs (TIGR, GRASP)
Contigs are problematic (error-prone) due to:
- Duplicated genes
- Multi-gene families with conserved paralogs
- Splicing variants
UniGene deliberately refrains from building contigs
5 IDENTIFICATION AND ANNOTATION
Blastn / blastx search across nucleotide / protein databases
- Identification depends entirely on the reference set
- Searches across large databases (e.g. Swiss-Prot, UniProt) give best hits to fish proteins with obscure names and poor annotations
- Sequence banks change continuously
- Many genes have multiple names; a standard nomenclature is available only for human
- There are no rules for dealing with low similarities and uncertain homologies
- Manual, curated inspection is required
Functional annotation (Gene Ontology, GO)
- Transferred from putative homologs
- Problems in GO
- Phylogenetic conservation of function is not guaranteed
Structural annotation, assignment to multi-gene families, search for domains (InterPro, Ensembl)
- Has not been accomplished for salmonids at the genome level (?)
- Blast is insufficient; more sophisticated approaches are required (e.g. hidden Markov models, HMM)
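The need for inspection of weak matches can be illustrated with a minimal sketch. It assumes BLAST+ tabular output with the standard 12 columns (`-outfmt 6`); the function name `best_hits`, the e-value cutoff of 1e-10, and the example rows are illustrative assumptions, not part of the talk. The sketch keeps the top-scoring hit per query and flags low-similarity matches for manual inspection instead of accepting them automatically.

```python
import csv
import io

# Standard 12 columns of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_evalue=1e-10):
    """Return {query: (subject, evalue, bitscore, status)} for the best hit.

    max_evalue is an arbitrary illustrative cutoff: hits above it are
    marked for manual inspection rather than trusted.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        hit = dict(zip(COLUMNS, row))
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        current = best.get(hit["qseqid"])
        if current is None or bitscore > current[2]:
            status = "confident" if evalue <= max_evalue else "inspect manually"
            best[hit["qseqid"]] = (hit["sseqid"], evalue, bitscore, status)
    return best


# Made-up example rows (hypothetical EST and protein identifiers).
EXAMPLE = (
    "est_001\tprotA\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-80\t300\n"
    "est_001\tprotB\t60.0\t180\t72\t0\t1\t540\t5\t184\t1e-20\t95.5\n"
    "est_002\tprotC\t31.0\t90\t62\t0\t10\t279\t40\t129\t0.002\t38.1\n"
)
hits = best_hits(EXAMPLE)
for query, (subject, evalue, bitscore, status) in sorted(hits.items()):
    print(query, subject, evalue, bitscore, status)
```

Which reference database the hits come from still determines the result, so a cutoff like this only sorts matches into "probably fine" and "look at it yourself".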
6 GENE INDICES (TIGR, GRASP)
7 The Gene Index provides (+):
- Blastn across salmon or rainbow trout sequences
- Sequences and reading frames
- Links to GenBank
- Positions of ESTs in contigs
- GO annotations
- Links to pathways
However (-):
- Limited possibilities for searching and retrieving data, especially in a multi-gene format
- Many important annotations are missing
- Not adapted for comparative genomic studies etc.
Shall we wait until the developers introduce everything we need? No database / server is able to meet all requirements.
What to do with new sequences (ESTs)?
8 Our first experience in software development (written by Petri Pehkonen, a student of the computer science department, as a practical exercise)
- Stand-alone blast operates with user-specified sequence databases
- Parser, forms for selection of matches, export of data and iterative searches
- A database (MS Access) is used for the Gene Index
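A local gene index of this kind can be sketched in a few lines. This is not the software described on the slide: SQLite (via Python's standard `sqlite3` module) is swapped in for MS Access, and the table name `gene_index`, its columns, the e-value cutoff and the example records are all hypothetical.

```python
import sqlite3


def build_gene_index(records):
    """Create an in-memory gene index from (est_id, best_hit, evalue, family) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE gene_index (
               est_id      TEXT PRIMARY KEY,
               best_hit    TEXT,
               evalue      REAL,
               gene_family TEXT)"""
    )
    conn.executemany("INSERT INTO gene_index VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn


def family_members(conn, family, max_evalue=1e-10):
    """List ESTs assigned to a gene family, best matches first."""
    rows = conn.execute(
        "SELECT est_id, best_hit FROM gene_index "
        "WHERE gene_family = ? AND evalue <= ? ORDER BY evalue",
        (family, max_evalue),
    )
    return rows.fetchall()


# Illustrative records only; identifiers, e-values and family labels are made up.
records = [
    ("est_001", "Complement component C3", 1e-80, "complement"),
    ("est_002", "Complement factor H", 1e-35, "complement"),
    ("est_003", "Metallothionein", 1e-5, "metal binding"),
    ("est_004", "Properdin", 1e-4, "complement"),
]
conn = build_gene_index(records)
members = family_members(conn, "complement")
print(members)
```

Keeping parsed search results in a queryable table is what makes multi-gene questions ("all members of this family") cheap to ask.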
9 A small and simple software development effort helped to resolve many problems
- Analysis, structural and functional annotation of ESTs
- Search for members of multi-gene families and functional classes
- Design of microarrays
- Linking to web databases, e.g. the Harvester knowledge base
11 Much more can be done if several groups join efforts!
12 What do we need? (Personal opinion)
Diverse databases / pipelines (adapted to research areas, projects, labs, personal preferences)
- MUST be designed by biologists
- Only USERS know what data and analyses they need
- Design of a database is a time-consuming task
- The rest is done easily by computer people
Toolkit for common use
- Basic sequence analysis tools adapted for the database (blast, sequence alignment, translation, synonymous / non-synonymous substitutions etc.)
- Parsers
- Interfaces and forms to launch applications and to import, query, format and export data
The task requires a joint effort
13 AREA FOR COLLABORATION: IMMUNOGENOMICS
A simple question: what immune genes have (not) been identified in salmonid fish?
- Interferon-dependent genes?
- Immunoglobulins?
- Homologs of surface antigens (CD)?
A simple solution:
- Retrieve all immune proteins by linking GO to Swiss-Prot / UniProt (use SRS at EBI)
14 SRS is a database of databases, an extremely useful and powerful resource
15 IMMUNOGENOMICS
A task:
- Retrieve all immune proteins by linking GO to Swiss-Prot / UniProt (use SRS at EBI)
- Run blast
False positives:
- Uncertain homology
- Errors in annotations
False negatives:
- Many important genes with immune functions are not annotated in GO
- Many fish homologs are not recognized by blast due to low sequence conservation; more powerful methods are required (e.g. HMM)
To identify fish immune genes we must compile a set of reference genes and use advanced methods of sequence analysis
To use the results we need more precise and extensive annotations (Immune Ontology) and a description of each gene
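The retrieval step, and one source of false negatives, can be sketched with a toy ontology. This is a minimal illustration and does not use SRS or a real GO release: the term names, the `is_a` edges and the protein identifiers are invented for the example. All proteins annotated to a root term or any of its descendants are collected; a protein with no annotation is silently missed.

```python
from collections import deque

# Toy is_a edges (child -> parents); illustrative only, not a GO release.
IS_A = {
    "innate immune response": ["immune response"],
    "adaptive immune response": ["immune response"],
    "complement activation": ["innate immune response"],
    "metal ion homeostasis": ["homeostasis"],
}

# Hypothetical protein -> annotated terms.
ANNOTATIONS = {
    "prot_C3": ["complement activation"],
    "prot_IgK": ["adaptive immune response"],
    "prot_MT": ["metal ion homeostasis"],
    "prot_unannotated": [],  # an immune gene lacking annotation would be lost here
}


def descendants(root, is_a):
    """Return the root term plus every term reachable through is_a edges."""
    children = {}
    for child, parents in is_a.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)
    seen = {root}
    queue = deque([root])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def proteins_for(root, is_a, annotations):
    """Proteins annotated to the root term or any of its descendants."""
    terms = descendants(root, is_a)
    return sorted(p for p, ts in annotations.items() if terms.intersection(ts))


immune = proteins_for("immune response", IS_A, ANNOTATIONS)
print(immune)
```

The resulting protein set would then serve as the reference for the blast step, which is why gaps in the annotation propagate directly into the final gene list.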
16 From EST to microarrays
Genome-wide platforms (GRASP, TRAITS)
+ Good for screening and search for markers
- Lack of spot replicates means low accuracy
- High-quality annotation of large numbers of genes is VERY problematic
Medium-size, specialized platforms (e.g. 1.8 K immunochip)
- Many important genes are missing, especially those with unknown functions
+ Spot replicates ensure high sensitivity and accuracy
+ Coverage of functions that are most important for a research area
+ High-quality annotations are feasible
17 The most important and challenging task is to make sense of gene expression data
"JVI considers manuscripts that include microarrays and similar parallel profiling analyses of viral or cellular gene expression. However, such manuscripts will be published only if they provide novel insight into the biology of the virus or the infected cell or if they form the basis for additional experiments that provide such insights." (JVI, guide for authors, 2007)
18 STANDARD MICROARRAY MANUSCRIPT INCLUDES
- Lists of genes divided into clusters
- GO analyses (enriched / depleted classes in the list and / or clusters)
- Genes placed on pathway maps
- Such results are produced easily and have no value per se
- Papers without important biological findings should not be published
- Statistics / data mining is a useful subsidiary tool
- Researchers should not be confined to any particular method or software
- Results are always noisy and should be taken with great caution
19 To find the pros and cons of data mining procedures it is important to have the samples in a database with strong and flexible querying.
Solution: a relational database with utilities for data querying and analysis (Fish_Chips.exe)
Fish_Chips includes:
- A simple ontology of samples (~300)
- Easy and flexible querying by experiments, genes and parameters (expression ratio, log expression ratio, ranks etc.)
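Querying by experiment, gene and expression parameter maps naturally onto two relational tables. The sketch below is an assumption about how such a store could look, not the Fish_Chips schema: the tables `samples` and `expression`, their columns, and all values are hypothetical, and SQLite stands in for whatever engine the original tool used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE samples (
        sample_id  INTEGER PRIMARY KEY,
        experiment TEXT,
        treatment  TEXT);
    CREATE TABLE expression (
        sample_id  INTEGER REFERENCES samples(sample_id),
        gene       TEXT,
        log_ratio  REAL);
    """
)
# Made-up toy values, for illustration only.
conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", [
    (1, "PKD_estrogen", "infection"),
    (2, "PKD_estrogen", "estrogen"),
    (3, "handling_stress", "stress"),
])
conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", [
    (1, "Complement component C3", 1.4),
    (2, "Complement component C3", -0.2),
    (3, "Complement component C3", 0.1),
    (1, "Metallothionein", 0.3),
    (3, "Metallothionein", 1.1),
])


def responsive_genes(conn, experiment, min_abs_log_ratio):
    """Genes whose |log ratio| reaches the threshold in one experiment."""
    rows = conn.execute(
        """SELECT s.treatment, e.gene, e.log_ratio
           FROM expression AS e JOIN samples AS s USING (sample_id)
           WHERE s.experiment = ? AND ABS(e.log_ratio) >= ?
           ORDER BY ABS(e.log_ratio) DESC""",
        (experiment, min_abs_log_ratio),
    )
    return rows.fetchall()


result = responsive_genes(conn, "PKD_estrogen", 1.0)
print(result)
```

With samples described in their own table, the same expression data can be re-sliced by any sample attribute without touching the measurements.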
21 Fish_Chips.exe
Fish_Chips also includes:
- Sequences, annotations (GO, KEGG), links to web databases
- Built-in statistical analyses (comparison by GO classes, t-test)
- Direct link to Statistica (ANOVA, Fisher's exact test, mean expression profiles)
- Formatting of data for external applications (cluster analysis)
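A comparison by GO class is commonly framed as a one-sided Fisher's exact test: is a class over-represented among the selected genes relative to the whole platform? The sketch below computes that tail probability from the hypergeometric distribution using only the standard library. It is a stand-in for the Statistica analysis named on the slide, not a description of it; the function name and the counts in the example are invented.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    N: genes on the platform, K: of these, genes in the GO class,
    n: selected (e.g. differentially expressed) genes,
    k: selected genes that belong to the class.
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Made-up counts: 20 genes on the chip, 5 in the class, 4 selected, 3 of them in the class.
p = fisher_enrichment_p(k=3, n=4, K=5, N=20)
print(round(p, 4))  # 155 / 4845
```

A small p-value only says the overlap is unlikely by chance; as the earlier slides stress, it says nothing by itself about biological importance.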
22 Cluster analysis: why?
Lack of choice
- In microarray analyses the number of measurements is greater than the number of samples
- Cluster analysis is one of the few methods that can work with such data
- Extremely simple, entirely formal, no theory behind it
Many technical and biological problems
Genes that are members of a cluster are co-regulated. Is that true?!
- Clusters are found in any data set
- Different procedures produce different clusters; clusters must be checked for strength
- Similar expression profiles are often observed only in small data sets; most clusters are destroyed by the addition of samples
- Even strong clusters do not necessarily have biological significance
Example: genes that were up- or down-regulated in only one outlier sample form a strong and highly significant cluster
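The outlier example is easy to reproduce numerically. In the sketch below, two made-up expression profiles fluctuate independently around zero in seven samples and both jump in an eighth, outlier sample. The Pearson correlation is high with the outlier included and negative without it. All values are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: correlation undefined, treated as 0
    return cov / (sx * sy)


# Log expression ratios in 8 samples; the last sample is the outlier.
gene_a = [0.1, -0.1, 0.0, 0.1, -0.1, 0.0, 0.1, 3.0]
gene_b = [-0.1, 0.1, 0.1, -0.1, 0.0, 0.1, -0.1, 2.5]

r_with_outlier = pearson(gene_a, gene_b)
r_without_outlier = pearson(gene_a[:-1], gene_b[:-1])
print(round(r_with_outlier, 2), round(r_without_outlier, 2))
```

Any distance- or correlation-based clustering would group these two genes tightly, although a single sample carries the whole signal.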
23 Classification of samples
Clustering helps to see the structure of an experiment and the interaction of factors
Separate and combined effects of estrogen and parasitic kidney disease on hepatic gene expression in rainbow trout (Helmut Segner, Univ. Bern)
[Figure: clustering of study groups T, LE, ILE, I, IHE, HE; branches labeled "Infection and infection + estrogen" and "Estrogen"]
When two challenges are combined, the response to the pathogen is greater
Euclidean distance metric, Ward's method (available only in statistical packages)
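For readers without a statistical package, Ward's method with Euclidean distance can be sketched in plain Python. At each step the two clusters whose merger gives the smallest increase in within-cluster sum of squares are joined; for clusters A and B that increase is |A||B| / (|A| + |B|) times the squared distance between their centroids. This is a naive illustration suited to a handful of samples, and the sample names and values are made up.

```python
from itertools import combinations


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a), len(b)
    ca = [sum(col) / na for col in zip(*a)]  # centroid of a
    cb = [sum(col) / nb for col in zip(*b)]  # centroid of b
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * sq_dist


def ward_clusters(profiles, n_clusters):
    """Agglomerate {sample_name: [values]} into n_clusters sets of names."""
    clusters = [({name}, [vec]) for name, vec in profiles.items()]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters that is cheapest to merge.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        merged = (clusters[i][0] | clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [names for names, _ in clusters]


# Hypothetical samples described by two expression values each.
profiles = {
    "a1": [0.0, 0.1], "a2": [0.1, 0.0], "a3": [0.2, 0.1],
    "b1": [2.0, 2.1], "b2": [2.1, 1.9], "b3": [1.9, 2.0],
}
groups = ward_clusters(profiles, 2)
print(sorted(sorted(g) for g in groups))
```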
24 Finding transcription modules enhances the resolution of analyses
- Differentially expressed genes were clustered
- Cluster members were checked for correlation to the mean expression profile (r > 0.7)
- Multiple regression evaluated the effects of factors (p < 0.05) and their interaction
[Figure, panels C and D: log (expression ratio) across study groups T, LE, HE, I, ILE, IHE, with regression coefficients Beta E2 and Beta PKD; values not preserved in the transcription]
Induced with parasite, no response to estrogen: Complement component C3, Complement component C5, Complement component C9, Complement factor Bf-1, Complement factor H, Properdin, C type lectin receptor B, Toll-like receptor 20a, Ceruloplasmin, Endothelial leukocyte adhesion molecu, Ig kappa chain V-IV region B17-2, Profilin, CC chemokine SCYA110-2, Adenosine kinase 2, Acute phase protein, G1/S-specific cyclin D2, Histone deacetylase 4, Fibroblast growth factor-20, Bone morphogenetic protein 8-like, Metallothionein-IL, Stress 70 protein chaperone, Thioredoxin-like protein 4A
Induced with parasite, suppressed with estrogen: Bcl2-associated X protein, C3a anaphylatoxin chemotactic recepto, CC chemokine SCYA110-1, Cytokine inducible SH2-containing prote, Egl nine homolog 2, Hemopexin, Ig kappa chain V-I region WEA, Interferon-related regulator 2-1, Liver-expressed antimicrobial peptide 2, Membrane-type mosaic serine protease, Myelin basic protein-1, NAD-dependent deacetylase sirtuin 5, Peptidyl-prolyl cis-trans isomerase 2-2, Semaphorin 7A
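The second step, checking cluster members against the mean expression profile, can be sketched as follows. The threshold r > 0.7 is the one quoted on the slide; the function names, gene identifiers and expression values are hypothetical, and the sketch computes the mean profile once from all candidate members, which is an assumption about the procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: correlation undefined, treated as 0
    return cov / (sx * sy)


def mean_profile(profiles):
    """Per-sample mean over all genes in the cluster."""
    return [sum(col) / len(col) for col in zip(*profiles.values())]


def core_members(profiles, min_r=0.7):
    """Genes whose profile correlates with the cluster mean above min_r."""
    centroid = mean_profile(profiles)
    return sorted(g for g, p in profiles.items() if pearson(p, centroid) > min_r)


# Made-up log expression ratios across six study groups.
cluster = {
    "gene_1": [0.0, 0.1, 0.2, 1.0, 1.1, 0.9],
    "gene_2": [0.1, 0.0, 0.1, 0.8, 1.2, 1.0],
    "gene_3": [0.5, 0.6, 0.4, 0.5, 0.4, 0.6],  # flat profile, poor fit
}
kept = core_members(cluster)
print(kept)
```

Genes that pass the filter share one expression profile, which is what makes the subsequent regression on the mean profile interpretable.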
Representing Errors and Uncertainty in Plasma Proteomics David J. States, M.D., Ph.D. University of Michigan Bioinformatics Program Proteomics Alliance for Cancer Genomics vs. Proteomics Genome sequence
More informationIntroduction to Microarray Data Analysis and Gene Networks. Alvis Brazma European Bioinformatics Institute
Introduction to Microarray Data Analysis and Gene Networks Alvis Brazma European Bioinformatics Institute A brief outline of this course What is gene expression, why it s important Microarrays and how
More informationJust the Facts: A Basic Introduction to the Science Underlying NCBI Resources
National Center for Biotechnology Information About NCBI NCBI at a Glance A Science Primer Human Genome Resources Model Organisms Guide Outreach and Education Databases and Tools News About NCBI Site Map
More informationTranslating Biological Data Sets Into Linked Data
Translating Biological Data Sets Into Linked Data Mark Tomko Simmons College, Boston MA The Broad Institute of MIT and Harvard, Cambridge MA September 28, 2011 Overview Why study biological data? UniProt
More informationElixir: European Bioinformatics Research Infrastructure. Rolf Apweiler
Elixir: European Bioinformatics Research Infrastructure Rolf Apweiler EMBL-EBI Service Mission To enable life science research and its translation to medicine, agriculture, the bioindustries and society
More informationFACULTY OF BIOCHEMISTRY AND MOLECULAR MEDICINE
FACULTY OF BIOCHEMISTRY AND MOLECULAR MEDICINE BIOMOLECULES COURSE: COMPUTER PRACTICAL 1 Author of the exercise: Prof. Lloyd Ruddock Edited by Dr. Leila Tajedin 2017-2018 Assistant: Leila Tajedin (leila.tajedin@oulu.fi)
More informationBiological databases an introduction
Biological databases an introduction By Dr. Erik Bongcam-Rudloff SGBC-SLU 2016 VALIDATION Experimental Literature Manual or semi-automatic computational analysis EXPERIMENTAL Costs Needs skilled manpower
More informationEra with Computational Biology/Toxicology
USM Seminar 1/22/2010 Embracing the Post-Omics Era with Computational Biology/Toxicology Ping Gong Environmental Genomics and Genetics (EGG) Team @ Environmental Laboratory Outline Introduction Bioinformatics
More informationStandard Data Analysis Report Agilent Gene Expression Service
Standard Data Analysis Report Agilent Gene Expression Service Experiment: S534662 Date: 2011-01-01 Prepared for: Dr. Researcher Genomic Sciences Lab Prepared by S534662 Standard Data Analysis Report 2011-01-01
More informationChimp Sequence Annotation: Region 2_3
Chimp Sequence Annotation: Region 2_3 Jeff Howenstein March 30, 2007 BIO434W Genomics 1 Introduction We received region 2_3 of the ChimpChunk sequence, and the first step we performed was to run RepeatMasker
More informationBasic Bioinformatics: Homology, Sequence Alignment,
Basic Bioinformatics: Homology, Sequence Alignment, and BLAST William S. Sanders Institute for Genomics, Biocomputing, and Biotechnology (IGBB) High Performance Computing Collaboratory (HPC 2 ) Mississippi
More informationThe University of California, Santa Cruz (UCSC) Genome Browser
The University of California, Santa Cruz (UCSC) Genome Browser There are hundreds of available userselected tracks in categories such as mapping and sequencing, phenotype and disease associations, genes,
More informationSupplementary Figure 1. Design of the control microarray. a, Genomic DNA from the
Supplementary Information Supplementary Figures Supplementary Figure 1. Design of the control microarray. a, Genomic DNA from the strain M8 of S. ruber and a fosmid containing the S. ruber M8 virus M8CR4
More informationThe hidden transcriptome: discovery of novel, stress-responsive transcription in Daphnia pulex
University of Iowa Iowa Research Online Theses and Dissertations Spring 2011 The hidden transcriptome: discovery of novel, stress-responsive transcription in Daphnia pulex Stephen Butcher University of