Compiled by Mr. Nitin Swamy Asst. Prof. Department of Biotechnology

Bioinformatics Model Answers

Previous years' questions asked:
1. Describe the software used in bioinformatics.
2. Name four nucleic acid databases.
3. What are the applications of bioinformatics?

Bioinformatics

Biological data are being produced at a phenomenal rate. For example, as of August 2000, the GenBank repository of nucleic acid sequences contained 8,214,000 entries and the SWISS-PROT database of protein sequences contained 88,166. On average, these databases are doubling in size every 15 months. In addition, since the publication of the H. influenzae genome, complete sequences for over 40 organisms have been released, ranging from 450 genes to over 100,000. Add to this the data from the myriad of related projects that study gene expression, determine the protein structures encoded by the genes, and detail how these products interact with one another, and we can begin to imagine the enormous quantity and variety of information that is being produced.

As a result of this surge in data, computers have become indispensable to biological research. A computational approach is ideal because computers can easily handle large quantities of data and probe the complex dynamics observed in nature. Bioinformatics is often defined as the application of computational techniques to understand and organize the information associated with biological macromolecules. This unexpected union between the two subjects is largely attributed to the fact that life itself is an information technology: an organism's physiology is largely determined by its genes, which at their most basic can be viewed as digital information.
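The doubling time quoted above can be turned into a small calculation. The sketch below simply assumes that the 15-month doubling continues unchanged, which is an assumption made for illustration and not a statement about the actual later size of GenBank. The function name projected_entries is hypothetical.

```python
def projected_entries(initial_entries: float, months_elapsed: float,
                      doubling_time_months: float = 15.0) -> float:
    """Project database size assuming a constant doubling time.

    size(t) = initial_entries * 2 ** (t / doubling_time)
    """
    return initial_entries * 2 ** (months_elapsed / doubling_time_months)


if __name__ == "__main__":
    genbank_aug_2000 = 8_214_000  # entry count quoted in the text
    for months in (0, 15, 30, 60):
        print(months, "months:", round(projected_entries(genbank_aug_2000, months)))
```

Under this assumption the entry count doubles after 15 months and grows sixteen-fold after 60 months, which illustrates why manual handling of such data quickly becomes impractical.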

Model Answer B. Sc. 6th Semester 2017

Bioinformatics - a definition

(Molecular) bio informatics: bioinformatics is conceptualising biology in terms of molecules (in the sense of physical chemistry) and applying "informatics techniques" (derived from disciplines such as applied maths, computer science and statistics) to understand and organise the information associated with these molecules, on a large scale. In short, bioinformatics is a management information system for molecular biology and has many practical applications.
- As submitted to the Oxford English Dictionary

Aims of bioinformatics

The aims of bioinformatics are threefold:
1. First, at its simplest, bioinformatics organizes data in a way that allows researchers to access existing information and to submit new entries as they are produced, e.g., the Protein Data Bank for 3D macromolecular structures. While data management is an essential task, the information stored in these databases is essentially useless until analyzed. Thus the purpose of bioinformatics extends much further.
2. The second aim is to develop tools and resources that aid in the analysis of data. For example, having sequenced a particular protein, it is of interest to compare it with previously characterized sequences. This needs more than a simple text-based search: programs such as FASTA and PSI-BLAST must consider what constitutes a biologically significant match. Developing such resources requires expertise in computational theory as well as a thorough understanding of biology.
3. The third aim is to use these tools to analyze the data and interpret the results in a biologically meaningful manner. Traditionally, biological studies examined individual systems in detail and frequently compared them with a few related ones. In bioinformatics, we can now conduct global analyses of all the available data with the aim of uncovering common principles that apply across many systems and of highlighting novel features.
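The point in the second aim, that sequence comparison needs more than a text search, can be illustrated with a toy example. This is not how FASTA or PSI-BLAST score matches (they rely on substitution matrices and statistical significance); it only contrasts exact matching with a comparison that tolerates chemically similar residues. The grouping of amino acids below is a simplified, hypothetical table chosen for illustration, and the two peptides are invented.

```python
# Simplified, hypothetical groups of chemically similar amino acids.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions with exactly the same residue."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def is_conservative(x: str, y: str) -> bool:
    """True if the residues are identical or fall in the same toy group."""
    return x == y or any(x in group and y in group for group in SIMILAR_GROUPS)


def percent_similarity(a: str, b: str) -> float:
    """Percentage of positions that are identical or conservatively replaced."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    similar = sum(is_conservative(x, y) for x, y in zip(a, b))
    return 100.0 * similar / len(a)


if __name__ == "__main__":
    query, subject = "KLVDES", "RIVEDT"  # invented peptides
    print("text match:", query in subject)
    print("identity:", round(percent_identity(query, subject), 1))
    print("similarity:", percent_similarity(query, subject))
```

A plain substring search finds nothing here and the identity is low, yet every position is either identical or a conservative replacement, which is the kind of relationship a biologically aware tool is designed to detect.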

BIOLOGICAL DATABASES

As biology has increasingly turned into a data-rich science, the need for storing and communicating large datasets has grown tremendously. The obvious examples are the nucleotide sequences, the protein sequences, and the 3D structural data. Bioinformatics is the application of information technology to store, organize and analyze the vast amount of biological data available in the form of sequences and structures of proteins (the building blocks of organisms) and nucleic acids (the information carriers). The biological information of nucleic acids is available as sequences, while the data of proteins is available as sequences and structures. Sequences are represented in a single dimension, whereas structures contain the three-dimensional data of the sequences.

Sequences and structures are only two among the several different types of data required in the practice of modern molecular biology. Other important data types include metabolic pathways and molecular interactions, mutations and polymorphisms in molecular sequences and structures, organelle structures and tissue types, genetic maps, physicochemical data, gene expression profiles, two-dimensional DNA chip images of mRNA expression, and two-dimensional gel electrophoresis images of protein expression, among others.

A biological database is a collection of data that is organized so that its contents can easily be accessed, managed, and updated. There are two main functions of biological databases:
- To make biological data available to scientists.
  o As much as possible of a particular type of information should be available in one single place (book, site, or database). Published data may be difficult to find or access, and collecting it from the literature is very time-consuming. Moreover, not all data is actually published explicitly in an article (genome sequences, for example).
- To make biological data available in computer-readable form.
  o Since analysis of biological data almost always involves computers, having the data in computer-readable form (rather than printed on paper) is a necessary first step.
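The definition above (a collection of data organized so that it can be accessed, managed and updated) can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration and not how GenBank or any real biological database is implemented; the table layout, accession numbers and sequences are all invented.

```python
import sqlite3

# Toy records: accessions and sequences are invented for illustration only.
DEMO_RECORDS = [
    ("DEMO0001", "Homo sapiens", "protein", "MKTAYIAKQR"),
    ("DEMO0002", "Mus musculus", "DNA", "ATGGCCATTG"),
    ("DEMO0003", "Homo sapiens", "DNA", "ATGAAATTTTAG"),
]


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory database holding the toy sequence records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences ("
        "accession TEXT PRIMARY KEY, organism TEXT, molecule TEXT, residues TEXT)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?, ?, ?)", DEMO_RECORDS)
    conn.commit()
    return conn


def accessions_for(conn: sqlite3.Connection, organism: str) -> list:
    """Access: list accession numbers for one organism, sorted."""
    rows = conn.execute(
        "SELECT accession FROM sequences WHERE organism = ? ORDER BY accession",
        (organism,),
    )
    return [row[0] for row in rows]


def update_residues(conn: sqlite3.Connection, accession: str, residues: str) -> None:
    """Update: replace the stored sequence of an existing entry."""
    conn.execute(
        "UPDATE sequences SET residues = ? WHERE accession = ?",
        (residues, accession),
    )
    conn.commit()


if __name__ == "__main__":
    db = build_demo_database()
    print(accessions_for(db, "Homo sapiens"))
```

Keeping each entry under a unique accession is what makes both retrieval and later correction of a record straightforward.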

Data Domains

Types of data generated by molecular biology research:
- Nucleotide sequences (DNA and mRNA)
- Protein sequences
- 3-D protein structures
- Complete genomes and maps

When Sanger first discovered the method to sequence proteins, there was a lot of excitement in the field of molecular biology. Initial interest in bioinformatics was propelled by the necessity to create databases of biological sequences. Biological databases can be broadly classified into sequence and structure databases. Sequence databases apply to both nucleic acid sequences and protein sequences, whereas structure databases apply mainly to proteins.

The first database was created within a short period after the insulin protein sequence was made available in [...]. Incidentally, insulin was the first protein to be sequenced. The sequence of insulin consists of just 51 residues (analogous to letters in a sentence) which characterize the sequence. While the initial databases of protein sequences were maintained at individual laboratories, the development of a consolidated formal database, the SWISS-PROT protein sequence database, was initiated in 1986. It now has about 70,000 protein sequences from more than 5000 model organisms, a small fraction of all known organisms.

This huge variety of divergent data resources is now available for study and research by both academic institutions and industry. They are made available as public domain information, in the larger interest of the research community, through the Internet.

Databases in general can be classified into primary, secondary and composite databases.

1. A primary database contains information on the sequence or structure alone. Examples include Swiss-Prot and PIR for protein sequences, GenBank and DDBJ for nucleotide sequences, and the Protein Data Bank for protein structures.
2. A secondary database contains information derived from a primary database. A secondary sequence database contains information such as the conserved sequence, signature sequence and active site residues of protein families, arrived at by multiple sequence alignment of a set of related proteins. A secondary structure database contains entries of the PDB in an organized way: entries are classified according to their structure (all-alpha proteins, all-beta proteins, etc.) and include information on conserved secondary structure motifs of a particular protein. Secondary databases created and hosted by various researchers at their individual laboratories include SCOP, developed at Cambridge University; CATH, developed at University College London; PROSITE, of the Swiss Institute of Bioinformatics; and eMOTIF, at Stanford.
3. A composite database amalgamates a variety of different primary database sources, which obviates the need to search multiple resources. Different composite databases use different primary databases and different criteria in their search algorithms. Various search options have also been incorporated into composite databases.

The National Center for Biotechnology Information (NCBI), which hosts these nucleotide and protein databases on a large, highly available, redundant array of computer servers, provides free access to everyone involved in research. It also links to OMIM (Online Mendelian Inheritance in Man), which contains information about the proteins involved in genetic diseases.
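To make the idea of a secondary sequence database more concrete, the sketch below derives a conserved-position pattern from a set of already aligned sequences, which is the kind of derived information described in point 2 above. It is a simplified illustration, not the procedure used by PROSITE or eMOTIF; the function name, the "x" placeholder for variable positions and the three aligned peptides are all assumptions made for this example.

```python
from collections import Counter


def consensus(aligned: list, min_fraction: float = 1.0) -> str:
    """Return a consensus string for equal-length aligned sequences.

    A column contributes its most common residue if that residue is not a
    gap ('-') and occurs in at least min_fraction of the sequences;
    otherwise the column is written as 'x' (variable position).
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    pattern = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        if residue != "-" and count / len(aligned) >= min_fraction:
            pattern.append(residue)
        else:
            pattern.append("x")
    return "".join(pattern)


if __name__ == "__main__":
    family = ["GKSTLA", "GKTTLV", "GKSSLA"]  # invented aligned peptides
    print(consensus(family))        # strictly conserved positions only
    print(consensus(family, 0.6))   # majority residue at each position
```

With the strict threshold only the columns shared by every sequence survive, which is roughly what a signature or motif entry records; lowering the threshold gives a majority consensus instead.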

Organizations related to Bioinformatics

The National Center for Biotechnology Information (NCBI) is part of the United States National Library of Medicine (NLM), a branch of the National Institutes of Health. The NCBI is located in Bethesda, Maryland and was founded in [...]. The NCBI houses a series of databases relevant to biotechnology and biomedicine and is an important resource for bioinformatics tools and services. Major databases include GenBank for DNA sequences and PubMed, a bibliographic database for the biomedical literature. Other databases include the NCBI Epigenomics database. All these databases are available online through the Entrez search engine.

Nucleotide database

GenBank

NCBI has had responsibility for making available the GenBank DNA sequence database since [...]. GenBank coordinates with individual laboratories and with other sequence databases such as those of the European Molecular Biology Laboratory (EMBL) and the DNA Data Bank of Japan (DDBJ).

Since 1992, NCBI has grown to provide other databases in addition to GenBank. NCBI provides Gene, Online Mendelian Inheritance in Man, the Molecular Modeling Database (3D protein structures), dbSNP (a database of single-nucleotide polymorphisms), the Reference Sequence Collection, a map of the human genome, and a taxonomy browser, and it coordinates with the National Cancer Institute to provide the Cancer Genome Anatomy Project. The NCBI assigns a unique identifier (taxonomy ID number) to each species of organism.

The NCBI has software tools that are available by WWW browsing or by FTP. For example, BLAST is a sequence similarity searching program. BLAST can do sequence comparisons against the GenBank DNA database in less than 15 seconds.

European Molecular Biology Laboratory

The European Molecular Biology Laboratory (EMBL) is a molecular biology research institution supported by 22 member states, four prospect and two associate member states. EMBL was created in 1974 and is an intergovernmental organisation funded by public research money from its member states. Research at EMBL is conducted by approximately 85 independent groups covering the spectrum of molecular biology. Each of the different EMBL sites has a specific research field. The EMBL-EBI is a hub for bioinformatics research and services, developing and maintaining a large number of scientific databases, which are free of charge. At Grenoble and Hamburg, research is focused on structural biology. EMBL's dedicated Mouse Biology Unit is located in Monterotondo.
At the headquarters in Heidelberg, there are units in Cell Biology and Biophysics, Developmental Biology, Genome Biology, and Structural and Computational Biology, as well as service groups complementing these research fields.

The DNA Data Bank of Japan

The DNA Data Bank of Japan (DDBJ) is a biological database that collects DNA sequences. It is located at the National Institute of Genetics (NIG) in the Shizuoka prefecture of Japan. It is also a member of the International Nucleotide Sequence Database Collaboration (INSDC). It exchanges its data with the European Molecular Biology Laboratory at the European Bioinformatics Institute and with GenBank at the National Center for Biotechnology Information on a daily basis. Thus these three databanks contain the same data at any given time. DDBJ began data bank activities in 1986 at NIG and remains the only nucleotide sequence data bank in Asia. Although DDBJ mainly receives its data from Japanese researchers, it can accept data from contributors from any other country.

Protein database

Swiss-Prot

Swiss-Prot was created in 1986 by Amos Bairoch during his PhD, developed by the Swiss Institute of Bioinformatics, and subsequently developed by Rolf Apweiler at the European Bioinformatics Institute. Swiss-Prot aimed to provide reliable protein sequences associated with a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and a high level of integration with other databases. Recognizing that sequence data were being generated at a pace exceeding Swiss-Prot's ability to keep up, TrEMBL (Translated EMBL Nucleotide Sequence Data Library) was created to provide automated annotations for those proteins not in Swiss-Prot.

Protein Information Resource

The Protein Information Resource (PIR) is an integrated public bioinformatics resource to support genomic, proteomic and systems biology research and scientific studies. PIR was established in 1984 by the National Biomedical Research Foundation (NBRF) as a resource to assist researchers in the identification and interpretation of protein sequence information. Prior to that, the NBRF compiled the first comprehensive collection of macromolecular sequences in the Atlas of Protein Sequence and Structure, published from [...] under the editorship of Margaret O. Dayhoff. Dr. Dayhoff and her research group pioneered the development of computer methods for the comparison of protein sequences, for the detection of distantly related sequences and duplications within sequences, and for the inference of evolutionary histories from alignments of protein sequences.

SOFTWARE USED IN BIOINFORMATICS

BLAST

In bioinformatics, BLAST (Basic Local Alignment Search Tool) is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of proteins or the nucleotides of DNA sequences. A BLAST search enables a researcher to compare a query sequence with a library or database of sequences, and to identify library sequences that resemble the query sequence above a certain threshold. Different types of BLAST are available according to the query sequences. For example, following the discovery of a previously unknown gene in the mouse, a scientist will typically perform a BLAST search of the human genome to see if humans carry a similar gene; BLAST will identify sequences in the human genome that resemble the mouse gene based on similarity of sequence. The BLAST algorithm and program were designed by Stephen Altschul, Warren Gish, Webb Miller, Eugene Myers, and David J. Lipman at the National Institutes of Health.

BLAST is actually a family of programs (in the legacy NCBI BLAST distribution, all included in the blastall executable). These include:
1. Nucleotide-nucleotide BLAST (blastn): given a DNA query, this program returns the most similar DNA sequences from the DNA database that the user specifies.
2. Protein-protein BLAST (blastp): given a protein query, this program returns the most similar protein sequences from the protein database that the user specifies.
3. Nucleotide 6-frame translation-protein (blastx): this program compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database.
4. Nucleotide 6-frame translation-nucleotide 6-frame translation (tblastx): this program is the slowest of the BLAST family. It translates the query nucleotide sequence in all six possible frames and compares it against the six-frame translations of a nucleotide sequence database. The purpose of tblastx is to find very distant relationships between nucleotide sequences.
5. Protein-nucleotide 6-frame translation (tblastn): this program compares a protein query against all six reading frames of a nucleotide sequence database.
6. Large numbers of query sequences (megablast): when comparing large numbers of input sequences via the command-line BLAST, "megablast" is much faster than running BLAST multiple times. It concatenates many input sequences together to form a large sequence before searching the BLAST database, then post-analyzes the search results to glean individual alignments and statistical values.

BLAST can be used to infer functional and evolutionary relationships between sequences as well as to help identify members of gene families.

FASTA

FASTA is a DNA and protein sequence alignment software package first described (as FASTP) by David J. Lipman and William R. Pearson in [...]. Its legacy is the FASTA format, which is now ubiquitous in bioinformatics. The original FASTA program was designed for protein sequence similarity searching. Because of the exponentially expanding genetic information and the limited speed and memory of computers in the 1980s, heuristic methods were introduced for aligning a query sequence against entire databases. FASTA (developed in 1988) added the ability to do DNA:DNA searches and translated protein:DNA searches, and also provided a more sophisticated shuffling program for evaluating statistical significance. There are several programs in this package that allow the alignment of protein sequences and DNA sequences. The FASTA programs find regions of local or global similarity between protein or DNA sequences, either by searching protein or DNA databases or by identifying local duplications within a sequence.

Data retrieval tools

ENTREZ

The Entrez Global Query Cross-Database Search System is a federated search engine, or web portal, that allows users to search many discrete health sciences databases at the National Center for Biotechnology Information (NCBI) website. The name "Entrez" (a greeting meaning "Come in!" in French) was chosen to reflect the spirit of welcoming the public to search the content.
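Sequence records, including those retrieved through tools such as Entrez, are commonly exchanged in the FASTA format mentioned above: a header line starting with ">" followed by one or more lines of sequence. The following is a minimal sketch of reading that format; it is not the parser of any particular package, and the record names and sequences in EXAMPLE are invented.

```python
from typing import List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str, str]]:
    """Parse FASTA text into (identifier, description, sequence) tuples.

    The identifier is the first word after '>', the description is the
    rest of the header line, and wrapped sequence lines are joined.
    """
    records = []
    identifier, description, chunks = None, "", []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if identifier is not None:
                records.append((identifier, description, "".join(chunks)))
            parts = line[1:].split(None, 1)
            identifier = parts[0] if parts else ""
            description = parts[1] if len(parts) > 1 else ""
            chunks = []
        elif identifier is not None:
            chunks.append(line)
    if identifier is not None:
        records.append((identifier, description, "".join(chunks)))
    return records


# Invented example records.
EXAMPLE = """>demo_protein toy peptide
MKTAYIAK
QRQISFVK
>demo_dna
ATGGCCATTG
"""

if __name__ == "__main__":
    for ident, desc, seq in parse_fasta(EXAMPLE):
        print(ident, desc, len(seq))
```

Because the format is this simple, almost every sequence analysis tool can read and write it, which helps explain its ubiquity.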
Entrez Global Query is an integrated search and retrieval system that provides access to all databases simultaneously with a single query string and user interface. Entrez can efficiently retrieve related sequences, structures, and references. The Entrez system can provide views of gene and protein sequences and chromosome maps. Some textbooks are also available online through the Entrez system. Entrez also provides a similar interface for searching each particular database and for refining search results. The Limits feature allows the user to narrow a search using a web forms interface.

Entrez searches the following databases:
- PubMed
- PubMed Central
- Site Search
- Books
- Online Mendelian Inheritance in Man (OMIM)
- Nucleotide: sequence database (GenBank)
- Protein: sequence database
- Genome: whole genome sequences and mapping
- Structure: three-dimensional macromolecular structures
- Taxonomy: organisms in GenBank Taxonomy
- SNP: single nucleotide polymorphism

PubMed

PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. The United States National Library of Medicine (NLM) at the National Institutes of Health maintains the database as part of the Entrez system of information retrieval. From 1971 to 1997, online access to the MEDLARS Online (MEDLINE) computerized database had been primarily through institutional facilities, such as university libraries. PubMed, first released in January 1996, ushered in the era of private, free, home- and office-based MEDLINE searching.

PubMed comprises more than 27 million citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full-text content from PubMed Central and publisher web sites. PubMed provides access to:
1. older references from the print version of Index Medicus, back to 1951 and earlier
2. references to some journals before they were indexed in Index Medicus and MEDLINE, for instance Science, BMJ, and Annals of Surgery
3. very recent entries to records for an article before it is indexed with Medical Subject Headings (MeSH) and added to MEDLINE
4. a collection of books available full-text and other subsets of NLM records
5. PMC citations
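Besides the web forms, Entrez databases such as PubMed can also be queried from programs through NCBI's E-utilities web service. The sketch below only builds the URL for an ESearch request and does not contact the server; the helper name build_esearch_url and the example query are assumptions made for illustration, and current parameter details and usage policies should be checked against NCBI's own documentation before real use.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(database: str, term: str, max_results: int = 20) -> str:
    """Build an ESearch URL for one Entrez database.

    database    -- Entrez database name, e.g. "pubmed" or "protein"
    term        -- query string, optionally with field tags such as [Title]
    max_results -- upper limit on the number of identifiers returned
    """
    params = {"db": database, "term": term, "retmax": max_results}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url("pubmed", "bioinformatics[Title]", max_results=5))
```

Building the URL separately from sending the request keeps the query logic easy to inspect, since the printed address can be checked by eye or pasted into a browser.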

Taxonomy Browsers

The Taxonomy Database is a curated classification and nomenclature for all of the organisms in the public sequence databases. This currently represents about 10% of the described species of life on the planet.

Applications of bioinformatics in Biotechnology

A number of bioinformatics tools, software packages and databases are available for better understanding biological complexity and for analyzing and storing biological data. Bioinformatics thus accelerates research by saving time, cost and wet-lab effort. Its applications in biotechnology include automatic genome sequencing, gene identification, prediction of gene function, prediction of protein three-dimensional structure, and phylogeny, to name a few. As shown in Figure 1, bioinformatics plays a role in several important areas of biotechnology. It is one of the fastest growing fields and can also help in the identification of organisms, drug discovery, vaccine design, and the understanding of gene and genome complexity and of protein structure, functionality and folding.

Figure 1: Role of bioinformatics in different important areas of biotechnology.

1. Automatic Genome Sequencing: Bioinformatics is used in genome sequencing in the following ways:
a. In the development of automated sequencing techniques that use PCR- or BAC-based gene amplification, two-dimensional electrophoresis and automated reading of nucleotides.
b. In joining the sequences of smaller DNA fragments to form a complete genome sequence.
c. In the prediction of promoter regions of the genome.
d. In the prediction of protein-coding regions of the genome.
Small fragments of a genome can be amplified using the polymerase chain reaction or bacterial artificial chromosomes. Amplified fragments may suffer from nucleotide reading errors and repeats, which in turn generate multiple copies of the same genome fragments. These repeats are removed, using mathematical models, just before the final assembly of all genome fragments.

2. Identification of Genes: Gene-finding programs such as GLIMMER, together with reference sequences from databases such as GenBank, are used to identify the coding regions or open reading frames in the genome.

3. Identifying Gene Function: After identifying the open reading frames present in the genome, the next step is to annotate the structure and function of the genes. Bioinformatics techniques such as sequence search and pair-wise alignment are used to identify gene function. Four major tools and resources used for the functional annotation of genes are BLAST, the BLOSUM substitution matrices, ClustalX and SMART.

4. Three-dimensional (3D) Structure Modelling: A single protein may be present in different conformational states depending upon its interaction with other proteins in its vicinity. The three-dimensional structure of a protein has some regions exposed for protein-protein or protein-DNA interactions, and the function of the protein also depends upon these interactions. Bioinformatics techniques are used to predict the possible conformation of the protein coded by a gene; this in turn helps in identifying the function of that protein.

5. Pair-wise Genome Comparison: After the function of a particular gene has been identified, bioinformatics is also used for pair-wise genome comparisons. Pair-wise comparison of a genome with itself reveals paralogous genes, that is, duplicated genes that have similar base sequences with some functional variation. Pair-wise comparison against another genome reveals orthologous genes, which are equivalent genes present in two different genomes as a result of speciation. This technique is also used to identify different types of gene groups, adjacent genes that occur in close proximity because they are involved in a common higher-level function, and lateral gene transfer, that is, the transfer of genes from an evolutionarily distant microorganism. Bioinformatics techniques are also used to analyse gene fusion or gene fission, gene-group duplication and much more.

6. Drug Discovery: Bioinformatics techniques are also used in the process of drug discovery, which they speed up.

7. Vaccine Design: Bioinformatics techniques are used in the effective and efficient design of vaccines and of antimicrobial agents.

8. Microbial Genome Sequencing: Bioinformatics is also used in automating microbial genome sequencing.

9. Genome Comparison: Bioinformatics techniques are also used to compare genomes, as follows:
a. Identifying conserved functions within a genome family
b. Identifying specific genes in a group of genomes
c. Modelling the 3D structure of proteins
d. Docking of biochemical compounds and receptors
This helps in developing integrated databases over the Internet. It also helps in understanding the function of genes and genomes.

Conclusion: Bioinformatics has had a major impact on biotechnology and its applications. The vast amount of data generated by the Human Genome Project and by other genome sequencing projects would be unmanageable without bioinformatics techniques; without them, handling and interpreting these data would be unthinkable. Thanks to bioinformatics, the cost of all these major projects has come down drastically. Bioinformatics has also quickened drug discovery, vaccine design and the design of antimicrobial agents. Bioinformatics is also used to understand genes and genomes. Bioinformatics programmes are also used for pair-wise gene alignment, which helps in identifying the functions of genes and the functionality of genomes.
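The identification of open reading frames mentioned under "Identification of Genes" can be illustrated with a deliberately naive scan. This is not how GLIMMER works (GLIMMER relies on statistical models of coding sequence); the sketch simply looks for an ATG start codon followed in the same frame by a stop codon, on both strands, assuming the standard stop codons. The function names, the min_codons threshold and the example sequence are invented for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna: str, min_codons: int = 3) -> list:
    """Find simple ORFs in all six reading frames.

    Returns (strand, frame, sequence) tuples, where the sequence runs from
    the first ATG in a frame through the next in-frame stop codon and has
    at least min_codons codons (stop codon included).
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # open a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (pos + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, seq[start:pos + 3]))
                    start = None  # look for the next start codon
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGGC"):  # invented sequence
        print(orf)
```

Real gene finders go well beyond this, since many such ORFs arise by chance and true genes may have alternative start codons, but the six-frame scan is the common starting point.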


More information

B I O I N F O R M A T I C S

B I O I N F O R M A T I C S B I O I N F O R M A T I C S Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be SUPPLEMENTARY CHAPTER: DATA BASES AND MINING 1 What

More information

Sequence Based Function Annotation. Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University

Sequence Based Function Annotation. Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University Sequence Based Function Annotation Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University Usage scenarios for sequence based function annotation Function prediction of newly cloned

More information

Sequence Databases. Chapter 2. caister.com/bioinformaticsbooks. Paul Rangel. Sequence Databases

Sequence Databases. Chapter 2. caister.com/bioinformaticsbooks. Paul Rangel. Sequence Databases Chapter 2 Paul Rangel Abstract DNA and Protein sequence databases are the cornerstone of bioinformatics research. DNA databases such as GenBank and EMBL accept genome data from sequencing projects around

More information

CSC 121 Computers and Scientific Thinking

CSC 121 Computers and Scientific Thinking CSC 121 Computers and Scientific Thinking Fall 2005 Computers in Biology and Bioinformatics 1 Biology biology is roughly defined as "the study of life" it is concerned with the characteristics and behaviors

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics Alla L Lapidus, Ph.D. SPbSU St. Petersburg Term Bioinformatics Term Bioinformatics was invented by Paulien Hogeweg (Полина Хогевег) and Ben Hesper in 1970 as "the study of

More information

COMPUTER RESOURCES II:

COMPUTER RESOURCES II: COMPUTER RESOURCES II: Using the computer to analyze data, using the internet, and accessing online databases Bio 210, Fall 2006 Linda S. Huang, Ph.D. University of Massachusetts Boston In the first computer

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics Changhui (Charles) Yan Old Main 401 F http://www.cs.usu.edu www.cs.usu.edu/~cyan 1 How Old Is The Discipline? "The term bioinformatics is a relatively recent invention, not

More information

10. BIOTECHNOLOGY (Code No. 045)

10. BIOTECHNOLOGY (Code No. 045) 10. BIOTECHNOLOGY (Code No. 045) An unprecedented growth of human knowledge in the field of Biological Sciences coupled with equally significant developments in the field of technology have brought significant

More information

Identification of Single Nucleotide Polymorphisms and associated Disease Genes using NCBI resources

Identification of Single Nucleotide Polymorphisms and associated Disease Genes using NCBI resources Identification of Single Nucleotide Polymorphisms and associated Disease Genes using NCBI resources Navreet Kaur M.Tech Student Department of Computer Engineering. University College of Engineering, Punjabi

More information

What is Bioinformatics? Bioinformatics is the application of computational techniques to the discovery of knowledge from biological databases.

What is Bioinformatics? Bioinformatics is the application of computational techniques to the discovery of knowledge from biological databases. What is Bioinformatics? Bioinformatics is the application of computational techniques to the discovery of knowledge from biological databases. Bioinformatics is the marriage of molecular biology with computer

More information

SAMPLE LITERATURE Please refer to included weblink for correct version.

SAMPLE LITERATURE Please refer to included weblink for correct version. Edvo-Kit #340 DNA Informatics Experiment Objective: In this experiment, students will explore the popular bioninformatics tool BLAST. First they will read sequences from autoradiographs of automated gel

More information

Computers in Biology and Bioinformatics

Computers in Biology and Bioinformatics Computers in Biology and Bioinformatics 1 Biology biology is roughly defined as "the study of life" it is concerned with the characteristics and behaviors of organisms, how species and individuals come

More information

Engineering Genetic Circuits

Engineering Genetic Circuits Engineering Genetic Circuits I use the book and slides of Chris J. Myers Lecture 0: Preface Chris J. Myers (Lecture 0: Preface) Engineering Genetic Circuits 1 / 19 Samuel Florman Engineering is the art

More information

BIO 152 Principles of Biology III: Molecules & Cells Acquiring information from NCBI (PubMed/Bookshelf/OMIM)

BIO 152 Principles of Biology III: Molecules & Cells Acquiring information from NCBI (PubMed/Bookshelf/OMIM) BIO 152 Principles of Biology III: Molecules & Cells Acquiring information from NCBI (PubMed/Bookshelf/OMIM) Note: This material is adapted from Web-based Bioinformatics Tutorials: Exploring Genomes by

More information

Gene Identification in silico

Gene Identification in silico Gene Identification in silico Nita Parekh, IIIT Hyderabad Presented at National Seminar on Bioinformatics and Functional Genomics, at Bioinformatics centre, Pondicherry University, Feb 15 17, 2006. Introduction

More information

BIOINFORMATICS Introduction

BIOINFORMATICS Introduction BIOINFORMATICS Introduction Mark Gerstein, Yale University bioinfo.mbb.yale.edu/mbb452a 1 (c) Mark Gerstein, 1999, Yale, bioinfo.mbb.yale.edu What is Bioinformatics? (Molecular) Bio -informatics One idea

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics 260.602.01 September 1, 2006 Jonathan Pevsner, Ph.D. pevsner@kennedykrieger.org Teaching assistants Hugh Cahill (hugh@jhu.edu) Jennifer Turney (jturney@jhsph.edu) Meg Zupancic

More information

Product Applications for the Sequence Analysis Collection

Product Applications for the Sequence Analysis Collection Product Applications for the Sequence Analysis Collection Pipeline Pilot Contents Introduction... 1 Pipeline Pilot and Bioinformatics... 2 Sequence Searching with Profile HMM...2 Integrating Data in a

More information

Concepts of Bioinformatics

Concepts of Bioinformatics 1. Introduction Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline. It is the emerging field that deals with the application

More information

Applied Bioinformatics

Applied Bioinformatics Applied Bioinformatics Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu Course overview What is bioinformatics Data driven science: the creation and advancement

More information

AGRO/ANSC/BIO/GENE/HORT 305 Fall, 2016 Overview of Genetics Lecture outline (Chpt 1, Genetics by Brooker) #1

AGRO/ANSC/BIO/GENE/HORT 305 Fall, 2016 Overview of Genetics Lecture outline (Chpt 1, Genetics by Brooker) #1 AGRO/ANSC/BIO/GENE/HORT 305 Fall, 2016 Overview of Genetics Lecture outline (Chpt 1, Genetics by Brooker) #1 - Genetics: Progress from Mendel to DNA: Gregor Mendel, in the mid 19 th century provided the

More information

The String Alignment Problem. Comparative Sequence Sizes. The String Alignment Problem. The String Alignment Problem.

The String Alignment Problem. Comparative Sequence Sizes. The String Alignment Problem. The String Alignment Problem. Dec-82 Oct-84 Aug-86 Jun-88 Apr-90 Feb-92 Nov-93 Sep-95 Jul-97 May-99 Mar-01 Jan-03 Nov-04 Sep-06 Jul-08 May-10 Mar-12 Growth of GenBank 160,000,000,000 180,000,000 Introduction to Bioinformatics Iosif

More information

The Integrated Biomedical Sciences Graduate Program

The Integrated Biomedical Sciences Graduate Program The Integrated Biomedical Sciences Graduate Program at the university of notre dame Cutting-edge biomedical research and training that transcends traditional departmental and disciplinary boundaries to

More information

Evolutionary Genetics. LV Lecture with exercises 6KP

Evolutionary Genetics. LV Lecture with exercises 6KP Evolutionary Genetics LV 25600-01 Lecture with exercises 6KP HS2017 >What_is_it? AATGATACGGCGACCACCGAGATCTACACNNNTC GTCGGCAGCGTC 2 NCBI MegaBlast search (09/14) 3 NCBI MegaBlast search (09/14) 4 Submitted

More information

Application for Automating Database Storage of EST to Blast Results. Vikas Sharma Shrividya Shivkumar Nathan Helmick

Application for Automating Database Storage of EST to Blast Results. Vikas Sharma Shrividya Shivkumar Nathan Helmick Application for Automating Database Storage of EST to Blast Results Vikas Sharma Shrividya Shivkumar Nathan Helmick Outline Biology Primer Vikas Sharma System Overview Nathan Helmick Creating ESTs Nathan

More information

Introduction to EMBL-EBI.

Introduction to EMBL-EBI. Introduction to EMBL-EBI www.ebi.ac.uk What is EMBL-EBI? Part of EMBL Austria, Belgium, Croatia, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Luxembourg, the Netherlands,

More information

Introduction to Molecular Biology Databases

Introduction to Molecular Biology Databases Introduction to Molecular Biology Databases Laboratorio de Bioinformática Centro de Astrobiología INTA-CSIC Centro de Astrobiología PRESENT BIOLOGY RESEARCH Data sources Genome sequencing projects: genome

More information

Entrez Gene: gene-centered information at NCBI

Entrez Gene: gene-centered information at NCBI D54 D58 Nucleic Acids Research, 2005, Vol. 33, Database issue doi:10.1093/nar/gki031 Entrez Gene: gene-centered information at NCBI Donna Maglott*, Jim Ostell, Kim D. Pruitt and Tatiana Tatusova National

More information

Access to Information from Molecular Biology and Genome Research

Access to Information from Molecular Biology and Genome Research Future Needs for Research Infrastructures in Biomedical Sciences Access to Information from Molecular Biology and Genome Research DG Research: Brussels March 2005 User Community for this information is

More information

Translating Biological Data Sets Into Linked Data

Translating Biological Data Sets Into Linked Data Translating Biological Data Sets Into Linked Data Mark Tomko Simmons College, Boston MA The Broad Institute of MIT and Harvard, Cambridge MA September 28, 2011 Overview Why study biological data? UniProt

More information

Bioinformatics. Ingo Ruczinski. Some selected examples... and a bit of an overview

Bioinformatics. Ingo Ruczinski. Some selected examples... and a bit of an overview Bioinformatics Some selected examples... and a bit of an overview Department of Biostatistics Johns Hopkins Bloomberg School of Public Health July 19, 2007 @ EnviroHealth Connections Bioinformatics and

More information

Genetics Lecture 21 Recombinant DNA

Genetics Lecture 21 Recombinant DNA Genetics Lecture 21 Recombinant DNA Recombinant DNA In 1971, a paper published by Kathleen Danna and Daniel Nathans marked the beginning of the recombinant DNA era. The paper described the isolation of

More information

Chapter 20: Biotechnology

Chapter 20: Biotechnology Name Period The AP Biology exam has reached into this chapter for essay questions on a regular basis over the past 15 years. Student responses show that biotechnology is a difficult topic. This chapter

More information

Exploring Similarities of Conserved Domains/Motifs

Exploring Similarities of Conserved Domains/Motifs Exploring Similarities of Conserved Domains/Motifs Sotiria Palioura Abstract Traditionally, proteins are represented as amino acid sequences. There are, though, other (potentially more exciting) representations;

More information

Sequence Analysis Lab Protocol

Sequence Analysis Lab Protocol Sequence Analysis Lab Protocol You will need this handout of instructions The sequence of your plasmid from the ABI The Accession number for Lambda DNA J02459 The Accession number for puc 18 is L09136

More information

Algorithms in Bioinformatics

Algorithms in Bioinformatics Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Outline Central Dogma of Molecular

More information

FACULTY OF BIOCHEMISTRY AND MOLECULAR MEDICINE

FACULTY OF BIOCHEMISTRY AND MOLECULAR MEDICINE FACULTY OF BIOCHEMISTRY AND MOLECULAR MEDICINE BIOMOLECULES COURSE: COMPUTER PRACTICAL 1 Author of the exercise: Prof. Lloyd Ruddock Edited by Dr. Leila Tajedin 2017-2018 Assistant: Leila Tajedin (leila.tajedin@oulu.fi)

More information

Genome Sequence Assembly

Genome Sequence Assembly Genome Sequence Assembly Learning Goals: Introduce the field of bioinformatics Familiarize the student with performing sequence alignments Understand the assembly process in genome sequencing Introduction:

More information

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014 Single Nucleotide Variant Analysis H3ABioNet May 14, 2014 Outline What are SNPs and SNVs? How do we identify them? How do we call them? SAMTools GATK VCF File Format Let s call variants! Single Nucleotide

More information

Sequencing the Human Genome

Sequencing the Human Genome The Biotechnology 339 EDVO-Kit # Sequencing the Human Genome Experiment Objective: In this experiment, DNA sequences obtained from automated sequencers will be submitted to Data bank searches using the

More information

NATIONAL OPEN UNIVERSITY OF NIGERIA SCHOOL OF ARTS AND SOCIAL SCIENCES COURSE CODE: BIO 316 COURSE TITLE: INTRODUCTION TO BIOINFORMATICS

NATIONAL OPEN UNIVERSITY OF NIGERIA SCHOOL OF ARTS AND SOCIAL SCIENCES COURSE CODE: BIO 316 COURSE TITLE: INTRODUCTION TO BIOINFORMATICS NATIONAL OPEN UNIVERSITY OF NIGERIA SCHOOL OF ARTS AND SOCIAL SCIENCES COURSE CODE: BIO 316 COURSE TITLE: INTRODUCTION TO BIOINFORMATICS 1 Course Code : BIO 316 Course Title : Introduction to Bioinformatics

More information

1.1 What is bioinformatics? What is computational biology?

1.1 What is bioinformatics? What is computational biology? Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, October 16, 2006 3 1 Introduction 1.1 What is bioinformatics? What is computational biology? Bioinformatics and computational biology are multidisciplinary

More information

Discover the Microbes Within: The Wolbachia Project. Bioinformatics Lab

Discover the Microbes Within: The Wolbachia Project. Bioinformatics Lab Bioinformatics Lab ACTIVITY AT A GLANCE "Understanding nature's mute but elegant language of living cells is the quest of modern molecular biology. From an alphabet of only four letters representing the

More information

BLAST. compared with database sequences Sequences with many matches to high- scoring words are used for final alignments

BLAST. compared with database sequences Sequences with many matches to high- scoring words are used for final alignments BLAST 100 times faster than dynamic programming. Good for database searches. Derive a list of words of length w from query (e.g., 3 for protein, 11 for DNA) High-scoring words are compared with database

More information

Bioinformatics with basic local alignment search tool (BLAST) and fast alignment (FASTA)

Bioinformatics with basic local alignment search tool (BLAST) and fast alignment (FASTA) Vol. 6(1), pp. 1-6, April 2014 DOI: 10.5897/IJBC2013.0086 Article Number: 093849744377 ISSN 2141-2464 Copyright 2014 Author(s) retain the copyright of this article http://www.academicjournals.org/jbsa

More information

Bioinformatics to chemistry to therapy: Some case studies deriving information from the literature

Bioinformatics to chemistry to therapy: Some case studies deriving information from the literature Bioinformatics to chemistry to therapy: Some case studies deriving information from the literature. Donald Walter August 22, 2007 The Typical Drug Development Paradigm Gary Thomas, Medicinal Chemistry:

More information

Agenda. Web Databases for Drosophila. Gene annotation workflow. GEP Drosophila annotation projects 01/01/2018. Annotation adding labels to a sequence

Agenda. Web Databases for Drosophila. Gene annotation workflow. GEP Drosophila annotation projects 01/01/2018. Annotation adding labels to a sequence Agenda GEP annotation project overview Web Databases for Drosophila An introduction to web tools, databases and NCBI BLAST Web databases for Drosophila annotation UCSC Genome Browser NCBI / BLAST FlyBase

More information

Microbial Metabolism Systems Microbiology

Microbial Metabolism Systems Microbiology 1 Microbial Metabolism Systems Microbiology Ching-Tsan Huang ( 黃慶璨 ) Office: Agronomy Hall, Room 111 Tel: (02) 33664454 E-mail: cthuang@ntu.edu.tw MIT OCW Systems Microbiology aims to integrate basic biological

More information

Manipulating DNA. Nucleic acids are chemically different from other macromolecules such as proteins and carbohydrates.

Manipulating DNA. Nucleic acids are chemically different from other macromolecules such as proteins and carbohydrates. Lesson Overview 14.3 Studying the Human Genome Nucleic acids are chemically different from other macromolecules such as proteins and carbohydrates. Nucleic acids are chemically different from other macromolecules

More information

Genome and DNA Sequence Databases. BME 110: CompBio Tools Todd Lowe April 5, 2007

Genome and DNA Sequence Databases. BME 110: CompBio Tools Todd Lowe April 5, 2007 Genome and DNA Sequence Databases BME 110: CompBio Tools Todd Lowe April 5, 2007 Admin Reading: Chapters 2 & 3 Notes available in PDF format on-line (see class calendar page): http://www.soe.ucsc.edu/classes/bme110/spring07/bme110-calendar.html

More information

CHAPTER 21 LECTURE SLIDES

CHAPTER 21 LECTURE SLIDES CHAPTER 21 LECTURE SLIDES Prepared by Brenda Leady University of Toledo To run the animations you must be in Slideshow View. Use the buttons on the animation to play, pause, and turn audio/text on or off.

More information

Péter Antal Ádám Arany Bence Bolgár András Gézsi Gergely Hajós Gábor Hullám Péter Marx András Millinghoffer László Poppe Péter Sárközy BIOINFORMATICS

Péter Antal Ádám Arany Bence Bolgár András Gézsi Gergely Hajós Gábor Hullám Péter Marx András Millinghoffer László Poppe Péter Sárközy BIOINFORMATICS Péter Antal Ádám Arany Bence Bolgár András Gézsi Gergely Hajós Gábor Hullám Péter Marx András Millinghoffer László Poppe Péter Sárközy BIOINFORMATICS The Bioinformatics book covers new topics in the rapidly

More information

Biology 252 Nucleic Acid Methods

Biology 252 Nucleic Acid Methods Fall 2015 Biology 252 Nucleic Acid Methods COURSE OUTLINE Prerequisites: One semester of college biology (BIO 101 or BIO 173) and one semester of college English (ENG 111); completion of CHM 111is recommended.

More information

FINDING GENES AND EXPLORING THE GENE PAGE AND RUNNING A BLAST (Exercise 1)

FINDING GENES AND EXPLORING THE GENE PAGE AND RUNNING A BLAST (Exercise 1) FINDING GENES AND EXPLORING THE GENE PAGE AND RUNNING A BLAST (Exercise 1) 1.1 Finding a gene using text search. Note: For this exercise use http://www.plasmodb.org a. Find all possible kinases in Plasmodium.

More information

BIOINFORMATICS AN OVERVIEW

BIOINFORMATICS AN OVERVIEW BIOINFORMATICS AN OVERVIEW T.R. Sharma Genoinformatics Lab, National Research Centre on Plant Biotechnology I.A.R.I, New Delhi 110012 trsharma@nrcpb.org Introduction Bioinformatics is the computational

More information

Bioinformatics, in general, deals with the following important biological data:

Bioinformatics, in general, deals with the following important biological data: Pocket K No. 23 Bioinformatics for Plant Biotechnology Introduction As of July 30, 2006, scientists around the world are pursuing a total of 2,126 genome projects. There are 405 published complete genomes,

More information

ALGORITHMS IN BIO INFORMATICS. Chapman & Hall/CRC Mathematical and Computational Biology Series A PRACTICAL INTRODUCTION. CRC Press WING-KIN SUNG

ALGORITHMS IN BIO INFORMATICS. Chapman & Hall/CRC Mathematical and Computational Biology Series A PRACTICAL INTRODUCTION. CRC Press WING-KIN SUNG Chapman & Hall/CRC Mathematical and Computational Biology Series ALGORITHMS IN BIO INFORMATICS A PRACTICAL INTRODUCTION WING-KIN SUNG CRC Press Taylor & Francis Group Boca Raton London New York CRC Press

More information

Comparative Bioinformatics. BSCI348S Fall 2003 Midterm 1

Comparative Bioinformatics. BSCI348S Fall 2003 Midterm 1 BSCI348S Fall 2003 Midterm 1 Multiple Choice: select the single best answer to the question or completion of the phrase. (5 points each) 1. The field of bioinformatics a. uses biomimetic algorithms to

More information

user s guide Question 3

user s guide Question 3 Question 3 During a positional cloning project aimed at finding a human disease gene, linkage data have been obtained suggesting that the gene of interest lies between two sequence-tagged site markers.

More information

New Programs in Quantitative Biology: Hunter College.

New Programs in Quantitative Biology: Hunter College. New Programs in Quantitative Biology: QuBi @ Hunter College What is QuBi? Quantitative Biology An initiative to join computational and quantitative disciplines to the analysis of biological data. Bioinformatics,

More information

What is Bioinformatics? A Proposed Definition and Overview of the Field

What is Bioinformatics? A Proposed Definition and Overview of the Field 346 2001 Schattauer GmbH A Proposed Definition and Overview of the Field N. M. Luscombe, D. Greenbaum, M. Gerstein Department of Molecular Biophysics and Biochemistry Yale University, New Haven, USA Summary

More information

Fundamentals of Bioinformatics: computation, biology, computational biology

Fundamentals of Bioinformatics: computation, biology, computational biology Fundamentals of Bioinformatics: computation, biology, computational biology Vasilis J. Promponas Bioinformatics Research Laboratory Department of Biological Sciences University of Cyprus A short self-introduction

More information

GenBank. Dennis A. Benson*, Mark S. Boguski, David J. Lipman, James Ostell and B. F. Francis Ouellette

GenBank. Dennis A. Benson*, Mark S. Boguski, David J. Lipman, James Ostell and B. F. Francis Ouellette 1998 Oxford University Press Nucleic Acids Research, 1998, Vol. 26, No. 1 1 7 GenBank Dennis A. Benson*, Mark S. Boguski, David J. Lipman, James Ostell and B. F. Francis Ouellette National Center for Biotechnology

More information

Ab Initio SERVER PROTOTYPE FOR PREDICTION OF PHOSPHORYLATION SITES IN PROTEINS*

Ab Initio SERVER PROTOTYPE FOR PREDICTION OF PHOSPHORYLATION SITES IN PROTEINS* COMPUTATIONAL METHODS IN SCIENCE AND TECHNOLOGY 9(1-2) 93-100 (2003/2004) Ab Initio SERVER PROTOTYPE FOR PREDICTION OF PHOSPHORYLATION SITES IN PROTEINS* DARIUSZ PLEWCZYNSKI AND LESZEK RYCHLEWSKI BiolnfoBank

More information

BLAST. Basic Local Alignment Search Tool. Optimized for finding local alignments between two sequences.

BLAST. Basic Local Alignment Search Tool. Optimized for finding local alignments between two sequences. BLAST Basic Local Alignment Search Tool. Optimized for finding local alignments between two sequences. An example could be aligning an mrna sequence to genomic DNA. Proteins are frequently composed of

More information

2. Materials and Methods

2. Materials and Methods Identification of cancer-relevant Variations in a Novel Human Genome Sequence Robert Bruggner, Amir Ghazvinian 1, & Lekan Wang 1 CS229 Final Report, Fall 2009 1. Introduction Cancer affects people of all

More information

EXECUTIVE SUMMARY GLOBAL GENOMICS AND BIOINFORMATICS RESEARCH INSTITUTE. May 24, 2017

EXECUTIVE SUMMARY GLOBAL GENOMICS AND BIOINFORMATICS RESEARCH INSTITUTE. May 24, 2017 EXECUTIVE SUMMARY GLOBAL GENOMICS AND BIOINFORMATICS RESEARCH INSTITUTE May 24, 2017 Introduction to the Institute Inova Health System Foundation ( Inova ), The Rector and Visitors of the University of

More information

DNA is normally found in pairs, held together by hydrogen bonds between the bases

DNA is normally found in pairs, held together by hydrogen bonds between the bases Bioinformatics Biology Review The genetic code is stored in DNA Deoxyribonucleic acid. DNA molecules are chains of four nucleotide bases Guanine, Thymine, Cytosine, Adenine DNA is normally found in pairs,

More information

An Introduction to Bioinformatics for Biological Sciences Students

An Introduction to Bioinformatics for Biological Sciences Students An Introduction to Bioinformatics for Biological Sciences Students Department of Microbiology and Immunology, McGill University Version 2.5 (For the BIOC-300 lab), March 2006 2 AN INTRODUCTION TO BIOINFORMATICS

More information

Introduction to Bioinformatics and Gene Expression Technologies

Introduction to Bioinformatics and Gene Expression Technologies Introduction to Bioinformatics and Gene Expression Technologies Utah State University Fall 2017 Statistical Bioinformatics (Biomedical Big Data) Notes 1 1 Vocabulary Gene: hereditary DNA sequence at a

More information

BIOINFORMATICS AND SYSTEM BIOLOGY (INTERNATIONAL PROGRAM)

BIOINFORMATICS AND SYSTEM BIOLOGY (INTERNATIONAL PROGRAM) BIOINFORMATICS AND SYSTEM BIOLOGY (INTERNATIONAL PROGRAM) PROGRAM TITLE DEGREE TITLE Master of Science Program in Bioinformatics and System Biology (International Program) Master of Science (Bioinformatics

More information

Small Genome Annotation and Data Management at TIGR

Small Genome Annotation and Data Management at TIGR Small Genome Annotation and Data Management at TIGR Michelle Gwinn, William Nelson, Robert Dodson, Steven Salzberg, Owen White Abstract TIGR has developed, and continues to refine, a comprehensive, efficient

More information

Introduction to Bioinformatics Accompaniment to Discovering Genomics

Introduction to Bioinformatics Accompaniment to Discovering Genomics Introduction to Bioinformatics Accompaniment to Discovering Genomics Reading in Discovering Genomics, Proteomics, and Bioinformatics by Campbell & Heyer (henceforth referred to as DGPB by C&H) The following

More information

Introduction to BioMEMS & Medical Microdevices DNA Microarrays and Lab-on-a-Chip Methods

Introduction to BioMEMS & Medical Microdevices DNA Microarrays and Lab-on-a-Chip Methods Introduction to BioMEMS & Medical Microdevices DNA Microarrays and Lab-on-a-Chip Methods Companion lecture to the textbook: Fundamentals of BioMEMS and Medical Microdevices, by Prof., http://saliterman.umn.edu/

More information

CHAPTER 2A HOW DO YOU BEGIN TO CLONE A GENE? CHAPTER 2A STUDENT GUIDE 2013 Amgen Foundation. All rights reserved.

CHAPTER 2A HOW DO YOU BEGIN TO CLONE A GENE? CHAPTER 2A STUDENT GUIDE 2013 Amgen Foundation. All rights reserved. CHAPTER 2A HOW DO YOU BEGIN TO CLONE A GENE? 35 INTRODUCTION In the Program Introduction, you learned that the increase in diabetes in the United States has resulted in a great demand for its treatment,

More information

QIAGEN s NGS Solutions for Biomarkers NGS & Bioinformatics team QIAGEN (Suzhou) Translational Medicine Co.,Ltd

QIAGEN s NGS Solutions for Biomarkers NGS & Bioinformatics team QIAGEN (Suzhou) Translational Medicine Co.,Ltd QIAGEN s NGS Solutions for Biomarkers NGS & Bioinformatics team QIAGEN (Suzhou) Translational Medicine Co.,Ltd 1 Our current NGS & Bioinformatics Platform 2 Our NGS workflow and applications 3 QIAGEN s

More information

A Survey On Biological Databases And Applications Of Datamining

A Survey On Biological Databases And Applications Of Datamining Australian Journal of Basic and Applied Sciences, 6(13): 175-180, 2012 ISSN 1991-8178 A Survey On Biological Databases And Applications Of Datamining 1 N Saravanan, 2 T Devi 1 PhD Research Scholar Department

More information

Web based Bioinformatics Applications in Proteomics. Genbank

Web based Bioinformatics Applications in Proteomics. Genbank Web based Bioinformatics Applications in Proteomics Chiquito Crasto ccrasto@genetics.uab.edu February 9, 2010 Genbank Primary nucleic acid sequence database Maintained by NCBI National Center for Biotechnology

More information