REVIEW OF LITERATURE

The members of the Solanaceae family that are important in agriculture are tomato (Solanum lycopersicum and Solanum pimpinellifolium), pepper (Capsicum spp.), eggplant (Solanum melongena), and potato (Solanum tuberosum); pharmacologically significant members are tobacco (Nicotiana tabacum) and mandrake (Mandragora officinarum). Potato is economically the most important species within the Solanaceae and is a rich source of starch, protein, antioxidants, and vitamins. Tomato is the second most consumed vegetable and is an important dietary source of lycopene, beta-carotene, vitamin C and fiber. Solanum lycopersicum and S. pimpinellifolium (the only two species with red fruits, owing to the presence of lycopene) are also used as model species for fundamental research on disease resistance, pathogen response, stress tolerance, and fruit quality and development. S. pimpinellifolium is the closest wild relative of domesticated tomato, and the two species are widely used together as source material in tomato breeding (Tanksley et al., 1992; Livingstone et al., 1999).

The nuclear genomes of potato and tomato each consist of twelve chromosomes and measure approximately 840 Mb and 950 Mb, respectively. Potato and tomato chromosomes display a similar morphology, although several large-scale rearrangements have been identified between the two genomes. Cost-effective next-generation sequencing technologies have enabled the rapid sequencing of a large number of plant genomes, and developing tools for structural and functional annotation is essential for comparative and evolutionary studies (Tanksley et al., 1992; Doganlar et al., 2002).

2.1 Tomato and Potato genomics

Solanum lycopersicum genomics is in an exciting phase of development following the recent sequencing of the potato and tomato genomes. Tomato originated in South America and spread to the rest of the world to become one of the most extensively used vegetable crops.
Tomato is used for both basic and applied plant research. Several genetic and genomic resources were available for tomato before the inception of the tomato genome sequencing project. Large germplasm collections comprising numerous accessions of tomato landraces are a source of valuable disease resistance and other genes that breeders have exploited to develop modern cultivated tomato varieties (Majoros and Salzberg, 2004).

Chapter II 5 Review of Literature

2.2 Genome analysis

The Sol Genomics Network (SGN) is a comparative genomics platform with genetic, genomic and phenotypic information on the Solanaceae family and its closely related species, and it incorporates a community-based gene and phenotype curation system. Well-annotated multigene families are useful for further exploration of genome organization and gene evolution across species. For example, the multigene transcription factor families WRKY and Small Auxin Up-regulated RNA (SAUR) both play important roles in responses to abiotic stresses in plants (Gascuel et al., 2001; Gremme et al., 2005).

2.3 Genome Annotation

The linear strings of nucleotides that together comprise a genome sequence are of limited interest by themselves. Genome annotation is the process of assigning a plausible biological interpretation to a genome sequence through identification and characterization of the elements it contains. Ideally, each element in a genome, such as a gene or a transposon, is annotated using experimentally obtained evidence. However, genomes can be sequenced and assembled far faster than experiments can determine the precise structure and function of a single element, and this disparity has driven the development of algorithms that predict these features in a genome (Majoros and Salzberg, 2004; Korf, 2004; Lomsadze et al., 2005; Stanke et al., 2008).

Genome annotation includes structural annotation and functional annotation. Structural annotation is the determination of the precise position, boundaries and composition of different sequence elements, whereas functional annotation attempts to assign a probable biological function to each of these elements.

The quality of the potato whole-genome shotgun (WGS) assembly was assessed through alignment to Sanger-derived phase 2 BAC sequences. In an alignment length of ~1 Mb (99.4% coverage), no gross assembly errors were detected.
Alignment of cosmid and BAC paired-end sequences to the WGS scaffolds revealed limited (≤0.12%) potential misassemblies. Extensive coverage of the potato genome in this assembly was confirmed using available expressed sequence tag (EST) data; 97.1% of 181,558 available Sanger-sequenced S. tuberosum ESTs (>200 bp) were detected. Repetitive sequences account for at least 62.2% of the assembled genome (452.5 Mb), with long terminal repeat retrotransposons comprising the majority of the transposable element classes and representing 29.4% of the genome. In addition, subtelomeric repeats were identified at or near chromosomal ends (Figure 2.1) (The Potato Genome Sequencing Consortium, 2011).

The genome of the inbred tomato cultivar Heinz 1706 was sequenced and assembled using a combination of Sanger and next-generation technologies. The predicted genome size is approximately 900 megabases (Mb), consistent with previous estimates, of which 760 Mb were assembled in 91 scaffolds aligned to the 12 tomato chromosomes, with most gaps restricted to pericentromeric regions (Figure 2.2). Base accuracy is approximately one substitution error per 29.4 kilobases (kb) and one indel error per 6.4 kb. The scaffolds were linked with two bacterial artificial chromosome (BAC)-based physical maps and anchored and oriented using a high-density genetic map, introgression line mapping and BAC fluorescence in situ hybridization (FISH) (The Tomato Genome Consortium, 2012).

Annotation of a genome sequence involves running a number of different tools, in a particular combination and order, on a collection of sequences that ranges from a small number of complete chromosomes to hundreds or thousands of sequence contigs and scaffolds. Genome sequences from evolutionarily related species can readily be annotated using the same software tools, albeit sometimes with modified parameters and reference data. These properties make genome annotation a repetitive, modular task with many inter-task dependencies that can be described (Burge and Karlin, 1997; Allen and Salzberg, 2005; Picardi and Pesole, 2010).

Figure 2.1: The potato genome organization (Source: The Potato Genome Sequencing Consortium, 2011).
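The assembly figures quoted above can be related to one another with simple arithmetic. The following is a minimal sketch, not part of either consortium's analysis; the function and constant names are illustrative, and it assumes that the 452.5 Mb figure refers to the repetitive fraction of the potato assembly itself. Because the text says repeats are "at least" 62.2%, the implied assembly size is an upper estimate.

```python
# Figures quoted in the text above; all names here are illustrative only.
POTATO_REPEAT_MB = 452.5         # repetitive sequence in the potato assembly (assumed meaning)
POTATO_REPEAT_FRACTION = 0.622   # "at least 62.2%" of the assembled genome
TOMATO_GENOME_MB = 900.0         # predicted tomato genome size
TOMATO_ASSEMBLED_MB = 760.0      # sequence assembled into 91 scaffolds
SUBSTITUTION_INTERVAL_KB = 29.4  # one substitution error per 29.4 kb
INDEL_INTERVAL_KB = 6.4          # one indel error per 6.4 kb


def implied_assembly_size_mb(repeat_mb, repeat_fraction):
    """Assembly size implied by a repeat amount and its share of the assembly."""
    return repeat_mb / repeat_fraction


def assembled_fraction(assembled_mb, genome_mb):
    """Share of the predicted genome that is present in the assembly."""
    return assembled_mb / genome_mb


def errors_per_base(interval_kb):
    """Convert 'one error per N kb' into an error rate per base."""
    return 1.0 / (interval_kb * 1000.0)


if __name__ == "__main__":
    size = implied_assembly_size_mb(POTATO_REPEAT_MB, POTATO_REPEAT_FRACTION)
    print(f"Implied potato assembly size: {size:.1f} Mb")
    share = assembled_fraction(TOMATO_ASSEMBLED_MB, TOMATO_GENOME_MB)
    print(f"Tomato genome assembled: {share:.1%}")
    print(f"Substitution rate per base: {errors_per_base(SUBSTITUTION_INTERVAL_KB):.2e}")
    print(f"Indel rate per base: {errors_per_base(INDEL_INTERVAL_KB):.2e}")
```

Expressing the error intervals as per-base rates makes the two error types directly comparable: under these figures, indels are the more frequent error class in the tomato assembly.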

Figure 2.2: Tomato genome topography and synteny (Source: The Tomato Genome Consortium, 2012).

2.4 Structural and functional genome annotation

Of all the features in a genome, protein-coding genes are the most extensively studied. In eukaryotes, the coding region may be interrupted by introns, which can be identified computationally through conserved signals at the borders between the coding exons and the non-coding introns (splice sites) (Warburton et al., 2004; Kofler et al., 2007).

Once the elements in a genome sequence have been identified, the next step is to assign them a plausible biological function. Computational inference of the function of a particular sequence can be achieved either directly, through sequence similarity searches, or indirectly, through the identification of common motifs or domains in groups of functionally related sequences. Both methodologies exploit the wealth of sequence annotations that have been generated and deposited in public databases over the past decades to annotate a newly generated sequence. The accuracy and reliability of annotations derived from database searches depend strongly on both the availability of evolutionarily related sequences in the databases and the quality of their annotations. BLAST searches can be run against many public as well as private sequence databases (Altschul et al., 1997; Benson et al., 2011; Magrane and Consortium, 2011).

Motif and domain searches provide a more coarse-grained alternative to sequence similarity searches. While the latter focus on the similarity between two sequences over their whole length, the former rely on the conservation of small subsequences within a group of functionally related sequences. Prime examples of such conserved subsequences are protein domains, the modular functional sub-parts of proteins. Domains can be identified and extracted from a multiple sequence alignment of functionally related proteins and represented as hidden Markov models (HMMs) or weight matrix models (WMMs), which in turn can be used to query novel protein sequences (Hunter et al., 2009; Gene Ontology Consortium, 2010).

2.5 Comparative genomic analysis of Solanum lycopersicum

The integration and advancement of molecular biology, evolution, and computer science over the past few decades have led to the development of several new fields of study. Comparative genomics, the study of the similarities and differences between two or more genomes, continues to be fueled by the rapidly growing number of fully sequenced genomes in databases. It has also been aided by the rapid advances in sequencing technologies over the last few decades. Next-generation (NG) sequencing technologies have become a great resource for the genomics community because of the extremely low per-base cost of sequencing (Fickett and Tung, 1992; Guigo, 1998; Rice et al., 2000).

In earlier studies on sequence analysis of Solanaceae gene families, P450 mono-oxygenases and serine/threonine protein kinases were found to be overrepresented in potato compared with tomato, and in both plants the P450 genes were expanded much more than in Arabidopsis thaliana.
Confirmation of computationally identified gene families with experimental data adds the next layer of annotation, and these genes can serve as significant anchors in the genome sequence for researchers looking for unknown genes located near the experimentally validated genes. Micro-synteny and the conserved order of genes and gene families across organisms are important data types for genome comparisons. Compared to the potato genome, the tomato and S. pimpinellifolium genomes showed more than 8% nucleotide divergence.
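The text does not state how the divergence figure above was computed. As a minimal sketch of one common definition, the proportion of mismatched sites between two aligned sequences (often called the p-distance) can be calculated as follows. The function name, the gap symbol and the decision to skip gapped and ambiguous (N) columns are assumptions made for illustration only.

```python
def nucleotide_divergence(aligned_a, aligned_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence has a gap or an ambiguous base (N)
    are skipped, so only directly comparable positions are counted.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    mismatches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a in (gap, "N") or base_b in (gap, "N"):
            continue  # not a comparable column
        compared += 1
        if base_a != base_b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable positions in the alignment")
    return mismatches / compared


if __name__ == "__main__":
    # Toy alignment: one mismatch in ten comparable sites.
    print(nucleotide_divergence("ACGTACGTAC", "ACGTTCGTAC"))
```

Under this definition, a divergence of more than 8% would correspond to more than eight mismatched sites per hundred aligned, ungapped positions.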

Moreover, the tomato genome is highly syntenic with the genomes of other economically important members of the family Solanaceae, such as eggplant and pepper. Comparative genome analysis identified two consecutive triplication events in the Solanum lineage. Interestingly, these genome triplications added new gene family members, such as transcription factors and enzymes necessary for ethylene biosynthesis and perception, which mediate important fruit-specific functions (Eddy, 2001).

The most common repeat families in the tomato libraries were the Gypsy and Copia classes of retrotransposons. Another prominent class of repeats comprised the ribosomal RNA genes. The tomato Eco (EcoRI) library had the lowest repeat density at 13.0%, which can be attributed to a lower amount of Gypsy retrotransposons (5.0%). The highest repeat content was found in the tomato Mbo (MboI) library (22.9%), more than a third of which (8.6%) consisted of ribosomal RNA genes (Datema, 2011).

An initial effort was made to compare the gene and repeat content of the tomato and potato genomes, based on the available BAC-end sequences for both species. The BAC-end sequence comparison is of particular interest because it provides a picture of the complete genome, including both euchromatic and heterochromatic sequence, whereas a comparison using only sequenced tomato BACs mainly compares the euchromatin of tomato and potato. In total, 310,580 BAC-end sequences representing ~19% of the 950-Mb tomato genome were compared to 128,819 potato BAC-end sequences representing ~10% of the 840-Mb potato genome (Song et al., 2000; Chen et al., 2004). It is important to note that while most potato varieties used in agriculture are tetraploid, the potato line being sequenced is diploid. The tomato genome has a higher overall dispersed repeat content than the potato genome, with the majority of dispersed repeats in both species belonging to the Gypsy and Copia retrotransposon families.
On the other hand, simple sequence repeat (SSR) motifs are more abundant in potato than in tomato. In both genomes, penta-nucleotide repeats are the most common form of SSRs, and AAAAT is the predominant repeat motif. This is in contrast to previously studied plant species, in which di- and penta-nucleotide repeats generally occur least frequently. Taking into account the difference in genome size and assuming that tomato has ~40,000 genes, potato appears to contain up to 6400 more putative coding regions than tomato. Moreover, the P450 superfamily appears to have expanded dramatically in both species compared with Arabidopsis thaliana, suggesting an expanded network of specialized metabolic pathways in the Solanaceae (Kanyuka et al., 1999; Song et al., 2000; Chen et al., 2004). The recently published genome sequences of tomato and potato have shed new light on chromosomal organization in the Solanaceae family and provide insights into the evolution of plant genomes.

2.6 Genome sequencing and annotation

Potato is the first sequenced genome of an Asterid, a clade within the Eudicots that encompasses nearly 70,000 species characterised by unique morphological, developmental and compositional features. A total of 39,031 protein-coding genes were predicted in the potato genome (The Potato Genome Sequencing Consortium, 2011), whereas 34,727 protein-coding genes were predicted in the tomato genome (The Tomato Genome Consortium, 2012); 18,320 orthologous tomato-potato gene pairs were identified.

2.7 Improvement of plant resistance against abiotic stresses

Plant response to abiotic stress is the result of complex, synchronized actions of gene networks. Drought is a worldwide problem, and understanding the processes underlying drought stress tolerance in plants is a high priority in many research projects. Drought resistance is a complex process, and little is known about the molecular mechanisms underlying the plant response and tolerance. The responses involve biochemical, physiological, molecular, cellular and whole-plant changes. Genetic, molecular and genomic analyses of drought response and tolerance in Arabidopsis, rice, maize, tomato and other plants have revealed several drought-inducible genes that appear to play different roles in managing drought stress (Duque et al., 2013).

2.8 Basic facts about the Potato Genome

Cultivated potato has a chromosome number of 2n = 4x = 48 and a haploid genome size of 850 Mb, roughly six times that of Arabidopsis thaliana and twice the size of the rice genome.
The potato genome is very similar in size to that of its close relative tomato, and genetic maps of the two species show high levels of macrocolinearity. Evidence on whether the two genomes are also conserved at the microsyntenic level should start to become available as outputs from the respective genome projects accumulate. The tomato genome mainly comprises low-copy-number sequences, which diverged rapidly in evolutionary time (Wang et al., 2005; Zhu et al., 2008).
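Section 2.5 noted that penta-nucleotide SSRs, with AAAAT as the predominant motif, are the most common SSR class in both genomes. As a minimal sketch of how such perfect tandem repeats can be located in a sequence, the following uses a regular expression with a backreference. It is not the method used in the cited studies; the function name and the default thresholds (motif length 5, at least 3 copies) are assumptions chosen for illustration.

```python
import re


def find_ssrs(sequence, motif_length=5, min_repeats=3):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    A hit is a motif of `motif_length` bases that is immediately repeated
    so that at least `min_repeats` copies occur in a row. Runs of a single
    base (for example AAAAA) are skipped, because they are homopolymers
    rather than repeats of the requested motif length.
    """
    pattern = re.compile(
        r"([ACGT]{%d})\1{%d,}" % (motif_length, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run, not a genuine motif repeat
        copies = len(match.group(0)) // motif_length
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    toy = "GGC" + "AAAAT" * 4 + "CC"
    print(find_ssrs(toy))
```

The motif is reported in the phase in which the scan first encounters it, so rotations of the same repeat (AAAAT, AAATA and so on) would need to be merged before motif frequencies are compared between genomes.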

It is also known that the majority of tomato heterochromatin is found in centromeric regions, with almost all of the euchromatic DNA located distally in long uninterrupted tracts, a structural feature likely to be true of potato as well. Gene isolation and recent BAC-end sequencing efforts are providing the first detailed glimpses of genome structure in potato. Using BAC-end sequence and full BAC sequence data, it has also been shown that potato (34%) contains considerably less repetitive DNA than tomato (46%), a difference consistent with the relative genome sizes of the two crops (850 versus 1000 Mb, respectively) (Tanksley et al., 1992; Gavrilenko, 2007). Schweizer et al. (2009), who characterised the potato genome in terms of the amounts of different classes of repetitive DNA, reported that the more highly repeated sequences comprise only 4-7% of the potato genome, which suggests that it is relatively devoid of highly repetitive DNA sequences and supports the earlier tomato study.

The generation of large expressed sequence tag (EST) collections is a primary route for large-scale gene discovery, and there have been several efforts to generate EST resources for potato. The potato gene index (compbio.dfci.harvard.edu/tgi/cgi-bin/tgi/gimain.pl?) contains ESTs assembled into contigs and singletons. These efforts, while not exhaustive, constitute a major genomics resource for potato researchers, perhaps representing between 50 and 70% of the total potato gene repertoire. These ESTs will form an important source not only for the discovery of candidate genes and genetic markers but also for the development of microarrays, until the whole genome sequence becomes available in potato. The tomato and potato sequencing projects will have huge implications for those working in the Solanaceae and will further sharpen the requirement for functional genomics tools (Van der Hoeven et al., 2002).
Potato, a highly heterozygous tetraploid, is undergoing an exciting phase of genomics resource development. The potato research community has established extensive genomic resources, such as large expressed sequence tag (EST) data collections, microarrays and other expression profiling platforms, and large-insert genomic libraries. Moreover, potato will now benefit from a global potato physical mapping effort, which is serving as the underlying resource for a full potato genome sequencing project, now well underway. These tools and resources are having a major impact on potato breeding and genetics. The genome sequence will provide an invaluable comparative genomics resource for cross-referencing to the other Solanaceae, notably tomato, whose sequence is also being determined. Most importantly perhaps, a potato genome sequence will pave the way for the functional analysis of the large numbers of potato genes that await discovery. Potato, being easily transformable, is highly amenable to the investigation of gene function by biotechnological approaches (Zamir and Tanksley, 1988; Schweizer et al., 1993).

The available potato EST resources comprise an unknown but significant fraction of the gene complement of potato and are derived from several genotypes, tissues, and environmental conditions. A nonredundant set of these ESTs was used by the Institute for Genomic Research (TIGR) to develop a cDNA potato microarray that was made available to the research community at minimal cost. Moreover, the same organisation offered a transcription profiling service to allow the evaluation of these arrays by a wide range of users working on different Solanaceous plant species and asking different biological questions. This generated a large body of microarray data that is publicly available (service2.shtml#aprocedure). These studies, while informative, highlight the dilemma faced by plant molecular biologists in prioritizing genes for further study from a large number of candidate genes in the absence of genetic information and mutations in target trait genes (Arumuganathan and Earle, 1991).

2.9 Functional Studies in Potato

Potato geneticists and breeders have generated a great deal of information about the location of genes and QTLs for important potato traits, including pest and disease resistance and tuber traits. Developments in genetics and structural genomics are beginning to be matched by the concomitant development of functional genomics tools. Potato has a strong need for a high-density gene map or a genome sequence in order to place gene sequences in their genetic and genomic context. Relatively high-throughput methods are also needed for testing and assessing gene function.
The availability of mutant populations of potato will also be of tremendous value in this regard (Simko et al., 2006; Muth et al., 2008). Potato cultivars are highly heterozygous and carry a very high genetic load. It has been estimated that there is approximately one SNP every 25 bp. There have been some recent, promising developments in functional genetics and genomics tools and resources for potato. Gene expression profiling or microarray studies have a role to play in the identification of a pool of candidate genes potentially involved in any given biological process. These methods, in combination with other functional genomics tools such as RNA interference (RNAi), virus-induced gene silencing (VIGS), and activation-tagged lines, have the potential to facilitate the identification of the role of thousands of potato genes over the next several years (Simko et al., 2006; Muth et al., 2008).

Potato has entered an exciting new era in which the development of extensive genetic and genomic resources has opened up many new possibilities for studying traits relevant to potato agronomy. The concomitant development of similar resources for other Solanaceous species, notably tomato, and the growing cohesiveness of the Solanaceae research community, as demonstrated by the SOL vision, bode well for future genomic research on potato and its close relatives. Development of biotechnological tools for assaying potato gene function is likely to progress rapidly in the coming years (Simko et al., 2006; Muth et al., 2008).

Computational approaches for plant breeding and genome research

The greatest challenge facing the molecular biology community today is to make sense of the wealth of data produced by the genome sequencing projects. Traditionally, molecular biology research was carried out entirely at the experimental laboratory bench, but the huge increase in the scale of data being produced in this genomic era has created a need to incorporate computers into the research process. With the advent of new tools and databases in molecular biology, we are now able to carry out research not only at the genome level but also at the proteome, transcriptome and metabolome levels. Bioinformatics is an interdisciplinary area of science that combines biology, mathematics and computer science. It applies information technology to the management of biological data and so helps in decoding plant genomes.
During the last decades, an enormous amount of data has been generated in biological science, first through the sequencing of the genomes of model organisms and second through the rapid adoption of high-throughput experimental techniques in laboratory research. Biological research that used to start in laboratories, fields and plant clinics now starts at the computational level, using computers for data analysis, experiment planning and hypothesis development. The application of bioinformatics tools in biological research enables storage, retrieval, analysis, annotation and visualization of results and promotes a fuller understanding of biological systems. This will support disease diagnosis in plant health care and so improve the quality of plant life.

The sequencing of the genomes of plants and animals will provide enormous benefits for the agricultural community. Bioinformatics tools can be used to search those genomes for genes that are useful for agriculture and to elucidate their functions. This specific genetic knowledge could then be used to produce stronger crops that are more resistant to drought, disease and insects, and to improve their quality and productivity (Martienssen, 2004).

Bioinformatics is the key to realizing the full potential of the post-genomic revolution and to moving plant science toward crop systems biology. It will help in exploring the benefits of applying bioinformatics to plant research and, particularly, to crop science. Plant biologists and information technology specialists can contribute equally to this task by organizing their work in a collaborative and interdisciplinary manner, thus applying their different technical skills in the most effective way to solve agricultural problems. Data from genomics, proteomics and metabolomics can be integrated to describe the phenome, which comprises all the layers of the phenotype, is studied as a whole, and arises from the interaction of the genome with the environment. As with the other omics, the phenome of a plant thus represents the sum total of its phenotypic traits (Faccioli et al., 2009).

InterProScan employs a large collection of protein domain databases to identify conserved protein signatures in sequences of interest. Software such as MUMmer and BLASTZ/LASTZ has been developed to align complete genome sequences and extract the variation between them.
Tools to predict gene sequences in a genome range from naïve Open Reading Frame (ORF) predictors such as getorf from the EMBOSS suite to complex eukaryotic gene finders such as geneid, genscan and GlimmerHMM.

Comparisons of gene content and gene order

Gene content comparisons were performed with MultiPipMaker (Schwartz et al., 2003). GenomeThreader, EuGene and JIGSAW are further tools used in gene structure prediction. Repetitive elements in a genome sequence can be identified and masked by comparison against a database of known repeat sequences, for example using the RepeatMasker software with the RepBase database. The RECON software identifies repeats through their multiplicity in the genome sequence, without taking into account any prior knowledge about the structure of the elements. Annotation of the potato and tomato chloroplast genomes was performed using DOGMA (Dual Organellar GenoMe Annotator; jgipsf.org/dogma). An aligned data set of all of the shared genes among the four Solanaceae chloroplast genomes was constructed by extracting these sequences from the annotated genomes using either DOGMA or the Chloroplast Genome Database. The sequences were aligned using ClustalX, followed by manual adjustments using Seq Ap.

Comparison of intergenic spacer regions

Intergenic regions from four Solanaceae chloroplast genomes were compared using MultiPipMaker (tools.html).

The modern biologist works in an unprecedented time in the history of scientific knowledge. Several factors have converged over the last few decades to increase scientific productivity in a profound way. First, molecular biology has continued to advance through tools and techniques that help researchers to study genetics at the molecular level. Finally, the large volume of sequence data, combined with continued advances in desktop computing, has led to the integration of molecular biology, evolution, and computer science. This integration has led to the development of several new fields of study, including bioinformatics, genomics, and comparative genomics. Bioinformatics is the use of computer science, mathematics, and information theory to model and analyze biological systems. While genomics is the study of an organism's entire genome, comparative genomics is the study of similarities and differences between two or more genomes.

PlantTribes offers a unique and powerful view of gene families, plant genomes and their evolution that facilitates comparative analyses. Additional information is connected to each sequence, including domain presence from NCBI's Conserved Domain Database (CDD).
The automated annotation of genomes provides an important source of information for gene analysis; however, automated methods do not yet produce perfect results and sometimes require intervention through manual curation. Tools used for this purpose include GenomeView and Apollo (tools.html).
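The naïve ORF prediction mentioned above, at the simple end of the range of gene-finding tools, can be sketched in a few lines. This is an illustration of the general idea only and does not reproduce getorf from EMBOSS, which has its own options and ORF definitions. Here an ORF is assumed to run from the first ATG in a reading frame to the next in-frame stop codon; the function names and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=3):
    """Return (frame, start, end) for ATG-to-stop ORFs on the given strand.

    `start` is inclusive and `end` is exclusive, so seq[start:end] is the
    ORF including its stop codon. `min_codons` counts the start and stop
    codons as part of the ORF length.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == START_CODON:
                start = pos  # open an ORF at the first in-frame ATG
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((frame, start, pos + 3))
                start = None  # close the ORF and look for the next ATG
    return orfs


if __name__ == "__main__":
    toy = "CCATGAAATTTTAGGG"
    for frame, start, end in find_orfs(toy):
        print(frame, start, end, toy[start:end])
    # The opposite strand can be scanned by passing the reverse complement.
    print(find_orfs(reverse_complement(toy)))
```

A predictor of this kind ignores introns and splice sites entirely, which is why the eukaryotic gene finders listed above rely on statistical models of gene structure rather than on reading frames alone.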

2.13 Plant genomes: current status

Plant breeding and genetics are powerful tools for increasing plant productivity through the development of improved varieties. The rapid progress of plant genomics in recent years has opened new possibilities for targeted breeding of specific traits and provides a powerful approach to sustainable crop production. Plant genomics, in combination with genetics and breeding, has a particularly crucial role to play in ensuring food security for the rapidly growing world population. Many plant genomes are large and complex owing to an abundance of transposable elements and a long history of repeated genome duplication, which makes genome sequencing a major challenge.

The era of plant genomics began with the release of the Arabidopsis genome sequence. It was a milestone in plant biology and made Arabidopsis one of the most popular species for basic plant research. Rice, a staple food in most of the world, was the second plant genome to become available. However, members of the Euasterids, a group that includes many plants of economic importance, were not represented among the known plant genome sequences until the release of the genome of potato (Solanum tuberosum), which belongs to the family Solanaceae. The recently decoded genome sequences of tomato (Solanum lycopersicum) and its close wild relative (Solanum pimpinellifolium) are significant additions to the published Euasterid genomes.
These genomes will not only promote plant genomics and breeding studies for crop improvement programs in the Euasterids, particularly in the family Solanaceae, but also provide an unprecedented opportunity for basic plant biological research in the area of development and evolution.

Future directions of tomato genetics and genomics

Modern tomato genetics had already used molecular markers and functional analysis to identify a handful of genes underlying developmental or yield traits, but the availability of the tomato genome sequence will further revolutionize tomato genetics and breeding. It will help to identify not only useful SNPs from the wild accessions but also rare SNPs within domesticated varieties. Tomato breeders can then target gene variants (SNPs) in the wild species that are associated with desirable traits, such as disease or pest resistance or growth in extreme environmental conditions, and introduce them into cultivars in order to exploit the rich tomato germplasm for breeding purposes. More genome sequences will facilitate QTL identification, mapping and cloning of underlying genes, and will provide new SNP markers for marker-assisted breeding. Additionally, millions of informative markers (SNPs and InDels) and structural variations, such as duplications, inversions and transpositions, identified through comparison of the genome sequences of domesticated and wild tomatoes will promote investigations into the genetic and molecular basis of domestication and crop improvement.

Integrating additional functional genomics approaches such as metabolomics and proteomics can significantly reduce the number of candidate genes for a given QTL. One of the major thrusts of functional genomics in the future will be RNA-seq-enabled transcriptome profiling. For example, comparison of transcriptome profiles from domesticated and wild tomato species will give insights into the gene expression differences associated with domestication and trait diversity. The tomato functional genomics database (TFGD), which includes microarray, metabolite and small RNA data, had already been established as a comprehensive resource before the complete tomato genome sequence was released. Availability of the tomato genome sequence will speed up the understanding of gene function in developmental and metabolic pathways and help to identify key steps in co-regulation mechanisms by mapping relevant tomato mutants. Additionally, multiple TILLING (Targeting Induced Local Lesions IN Genomes) resources in different backgrounds have already been developed for tomato functional genomics. These TILLING resources, in combination with the tomato genome sequence, should be useful for both forward and reverse genetics in tomato, for basic science as well as crop improvement (Aashish et al., 2012).

Genome annotation. Erwin Datema (2011) Sandra Smit (2012, 2013)

Genome annotation. Erwin Datema (2011) Sandra Smit (2012, 2013) Genome annotation Erwin Datema (2011) Sandra Smit (2012, 2013) Genome annotation AGACAAAGATCCGCTAAATTAAATCTGGACTTCACATATTGAAGTGATATCACACGTTTCTCTAAT AATCTCCTCACAATATTATGTTTGGGATGAACTTGTCGTGATTTGCCATTGTAGCAATCACTTGAA

More information

Genome annotation & EST

Genome annotation & EST Genome annotation & EST What is genome annotation? The process of taking the raw DNA sequence produced by the genome sequence projects and adding the layers of analysis and interpretation necessary

More information

I.1 The Principle: Identification and Application of Molecular Markers

I.1 The Principle: Identification and Application of Molecular Markers I.1 The Principle: Identification and Application of Molecular Markers P. Langridge and K. Chalmers 1 1 Introduction Plant breeding is based around the identification and utilisation of genetic variation.

More information

Introduction to Plant Genomics and Online Resources. Manish Raizada University of Guelph

Introduction to Plant Genomics and Online Resources. Manish Raizada University of Guelph Introduction to Plant Genomics and Online Resources Manish Raizada University of Guelph Genomics Glossary http://www.genomenewsnetwork.org/articles/06_00/sequence_primer.shtml Annotation Adding pertinent

More information

GREG GIBSON SPENCER V. MUSE

GREG GIBSON SPENCER V. MUSE A Primer of Genome Science ience THIRD EDITION TAGCACCTAGAATCATGGAGAGATAATTCGGTGAGAATTAAATGGAGAGTTGCATAGAGAACTGCGAACTG GREG GIBSON SPENCER V. MUSE North Carolina State University Sinauer Associates, Inc.

More information

Supplementary Table 1. Summary of whole genome shotgun sequence used for genome assembly

Supplementary Table 1. Summary of whole genome shotgun sequence used for genome assembly Supplementary Tables Supplementary Table 1. Summary of whole genome shotgun sequence used for genome assembly Library Read length Raw data Filtered data insert size (bp) * Total Sequence depth Total Sequence

More information

Genome research in eukaryotes

Genome research in eukaryotes Functional Genomics Genome and EST sequencing can tell us how many POTENTIAL genes are present in the genome Proteomics can tell us about proteins and their interactions The goal of functional genomics

More information

3. human genomics clone genes associated with genetic disorders. 4. many projects generate ordered clones that cover genome

3. human genomics clone genes associated with genetic disorders. 4. many projects generate ordered clones that cover genome Lectures 30 and 31 Genome analysis I. Genome analysis A. two general areas 1. structural 2. functional B. genome projects a status report 1. 1 st sequenced: several viral genomes 2. mitochondria and chloroplasts

More information

TIGR THE INSTITUTE FOR GENOMIC RESEARCH

TIGR THE INSTITUTE FOR GENOMIC RESEARCH Introduction to Genome Annotation: Overview of What You Will Learn This Week C. Robin Buell May 21, 2007 Types of Annotation Structural Annotation: Defining genes, boundaries, sequence motifs e.g. ORF,

More information

GENETICS - CLUTCH CH.15 GENOMES AND GENOMICS.

GENETICS - CLUTCH CH.15 GENOMES AND GENOMICS. !! www.clutchprep.com CONCEPT: OVERVIEW OF GENOMICS Genomics is the study of genomes in their entirety Bioinformatics is the analysis of the information content of genomes - Genes, regulatory sequences,

More information

Sequencing the genomes of Nicotiana sylvestris and Nicotiana tomentosiformis Nicolas Sierro

Sequencing the genomes of Nicotiana sylvestris and Nicotiana tomentosiformis Nicolas Sierro Sequencing the genomes of Nicotiana sylvestris and Nicotiana tomentosiformis Nicolas Sierro Philip Morris International R&D, Philip Morris Products S.A., Neuchatel, Switzerland Introduction Nicotiana sylvestris

More information

Wheat Genome Structural Annotation Using a Modular and Evidence-combined Annotation Pipeline

Wheat Genome Structural Annotation Using a Modular and Evidence-combined Annotation Pipeline Wheat Genome Structural Annotation Using a Modular and Evidence-combined Annotation Pipeline Xi Wang Bioinformatics Scientist Computational Life Science Page 1 Bayer 4:3 Template 2010 March 2016 17/01/2017

More information

3I03 - Eukaryotic Genetics Repetitive DNA

3I03 - Eukaryotic Genetics Repetitive DNA Repetitive DNA Satellite DNA Minisatellite DNA Microsatellite DNA Transposable elements LINES, SINES and other retrosequences High copy number genes (e.g. ribosomal genes, histone genes) Multifamily member

More information

Genome Biology and Biotechnology

Genome Biology and Biotechnology Genome Biology and Biotechnology Functional Genomics Prof. M. Zabeau Department of Plant Systems Biology Flanders Interuniversity Institute for Biotechnology (VIB) University of Gent International course

More information

Annotation of contig27 in the Muller F Element of D. elegans. Contig27 is a 60,000 bp region located in the Muller F element of the D. elegans.

Annotation of contig27 in the Muller F Element of D. elegans. Contig27 is a 60,000 bp region located in the Muller F element of the D. elegans. David Wang Bio 434W 4/27/15 Annotation of contig27 in the Muller F Element of D. elegans Abstract Contig27 is a 60,000 bp region located in the Muller F element of the D. elegans. Genscan predicted six

More information

Pharmacogenetics: A SNPshot of the Future. Ani Khondkaryan Genomics, Bioinformatics, and Medicine Spring 2001

Pharmacogenetics: A SNPshot of the Future. Ani Khondkaryan Genomics, Bioinformatics, and Medicine Spring 2001 Pharmacogenetics: A SNPshot of the Future Ani Khondkaryan Genomics, Bioinformatics, and Medicine Spring 2001 1 I. What is pharmacogenetics? It is the study of how genetic variation affects drug response

More information

The Ensembl Database. Dott.ssa Inga Prokopenko. Corso di Genomica

The Ensembl Database. Dott.ssa Inga Prokopenko. Corso di Genomica The Ensembl Database Dott.ssa Inga Prokopenko Corso di Genomica 1 www.ensembl.org Lecture 7.1 2 What is Ensembl? Public annotation of mammalian and other genomes Open source software Relational database

More information

Bioinformatics, in general, deals with the following important biological data:

Bioinformatics, in general, deals with the following important biological data: Pocket K No. 23 Bioinformatics for Plant Biotechnology Introduction As of July 30, 2006, scientists around the world are pursuing a total of 2,126 genome projects. There are 405 published complete genomes,

More information

Using semantic web technology to accelerate plant breeding.

Using semantic web technology to accelerate plant breeding. Using semantic web technology to accelerate plant breeding. Pierre-Yves Chibon 1,2,3, Benoît Carrères 1, Heleena de Weerd 1, Richard G. F. Visser 1,2,3, and Richard Finkers 1,3 1 Wageningen UR Plant Breeding,

More information

CHAPTER 21 LECTURE SLIDES

CHAPTER 21 LECTURE SLIDES CHAPTER 21 LECTURE SLIDES Prepared by Brenda Leady University of Toledo To run the animations you must be in Slideshow View. Use the buttons on the animation to play, pause, and turn audio/text on or off.

More information

Genomic resources and gene/qtl discovery in cereals

Genomic resources and gene/qtl discovery in cereals Genomic resources and gene/qtl discovery in cereals Roberto Tuberosa Dept. of Agroenvironmental Sciences & Technology University of Bologna, Italy The ABDC Congress 1-4 March 2010 Gudalajara, Mexico Outline

More information

CHAPTER 21 GENOMES AND THEIR EVOLUTION

CHAPTER 21 GENOMES AND THEIR EVOLUTION GENETICS DATE CHAPTER 21 GENOMES AND THEIR EVOLUTION COURSE 213 AP BIOLOGY 1 Comparisons of genomes provide information about the evolutionary history of genes and taxonomic groups Genomics - study of

More information

Identifying Genes Underlying QTLs

Identifying Genes Underlying QTLs Identifying Genes Underlying QTLs Reading: Frary, A. et al. 2000. fw2.2: A quantitative trait locus key to the evolution of tomato fruit size. Science 289:85-87. Paran, I. and D. Zamir. 2003. Quantitative

More information

Introduction to BIOINFORMATICS

Introduction to BIOINFORMATICS COURSE OF BIOINFORMATICS a.a. 2016-2017 Introduction to BIOINFORMATICS What is Bioinformatics? (I) The sinergy between biology and informatics What is Bioinformatics? (II) From: http://www.bioteach.ubc.ca/bioinfo2010/

More information

Authors: Vivek Sharma and Ram Kunwar

Authors: Vivek Sharma and Ram Kunwar Molecular markers types and applications A genetic marker is a gene or known DNA sequence on a chromosome that can be used to identify individuals or species. Why we need Molecular Markers There will be

More information

The Diploid Genome Sequence of an Individual Human

The Diploid Genome Sequence of an Individual Human The Diploid Genome Sequence of an Individual Human Maido Remm Journal Club 12.02.2008 Outline Background (history, assembling strategies) Who was sequenced in previous projects Genome variations in J.

More information

Genetics and Bioinformatics

Genetics and Bioinformatics Genetics and Bioinformatics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be Lecture 1: Setting the pace 1 Bioinformatics what s

More information

Genomic resources. for non-model systems

Genomic resources. for non-model systems Genomic resources for non-model systems 1 Genomic resources Whole genome sequencing reference genome sequence comparisons across species identify signatures of natural selection population-level resequencing

More information

Chapter 1 Molecular Genetic Approaches to Maize Improvement an Introduction

Chapter 1 Molecular Genetic Approaches to Maize Improvement an Introduction Chapter 1 Molecular Genetic Approaches to Maize Improvement an Introduction Robert T. Fraley In the following chapters prominent scientists will discuss the recent genetic improvements in maize that have

More information

Chapter 5. Structural Genomics

Chapter 5. Structural Genomics Chapter 5. Structural Genomics Contents 5. Structural Genomics 5.1. DNA Sequencing Strategies 5.1.1. Map-based Strategies 5.1.2. Whole Genome Shotgun Sequencing 5.2. Genome Annotation 5.2.1. Using Bioinformatic

More information

Transcriptomics. Marta Puig Institut de Biotecnologia i Biomedicina Universitat Autònoma de Barcelona

Transcriptomics. Marta Puig Institut de Biotecnologia i Biomedicina Universitat Autònoma de Barcelona Transcriptomics Marta Puig Institut de Biotecnologia i Biomedicina Universitat Autònoma de Barcelona Central dogma of molecular biology Central dogma of molecular biology Genome Complete DNA content of

More information

Following text taken from Suresh Kumar. Bioinformatics Web - Comprehensive educational resource on Bioinformatics. 6th May.2005

Following text taken from Suresh Kumar. Bioinformatics Web - Comprehensive educational resource on Bioinformatics. 6th May.2005 Bioinformatics is the recording, annotation, storage, analysis, and searching/retrieval of nucleic acid sequence (genes and RNAs), protein sequence and structural information. This includes databases of

More information

SolCAP. Executive Commitee : David Douches Walter De Jong Robin Buell David Francis Alexandra Stone Lukas Mueller AllenVan Deynze

SolCAP. Executive Commitee : David Douches Walter De Jong Robin Buell David Francis Alexandra Stone Lukas Mueller AllenVan Deynze SolCAP Solanaceae Coordinated Agricultural Project Supported by the National Research Initiative Plant Genome Program of USDA CSREES for the Improvement of Potato and Tomato Executive Commitee : David

More information

Drosophila White Paper 2003 August 13, 2003

Drosophila White Paper 2003 August 13, 2003 Drosophila White Paper 2003 August 13, 2003 Explanatory Note: The first Drosophila White Paper was written in 1999. Revisions to this document were made in 2000 and the final version was published as the

More information

Genome Projects. Part III. Assembly and sequencing of human genomes

Genome Projects. Part III. Assembly and sequencing of human genomes Genome Projects Part III Assembly and sequencing of human genomes All current genome sequencing strategies are clone-based. 1. ordered clone sequencing e.g., C. elegans well suited for repetitive sequences

More information

Molecular and Applied Genetics

Molecular and Applied Genetics Molecular and Applied Genetics Ian King, Iain Donnison, Helen Ougham, Julie King and Sid Thomas Developing links between rice and the grasses 6 Gene isolation 7 Informatics 8 Statistics and multivariate

More information

Tooling up for Functional Genomics

Tooling up for Functional Genomics Tooling up for Functional Genomics Michael Abberton, Iain Donnison, Phil Morris, Helen Ougham, Mark Robbins, Howard Thomas From model to crop species 6 Genomes and genome mapping 7 The transcriptome 7

More information

Map-Based Cloning of Qualitative Plant Genes

Map-Based Cloning of Qualitative Plant Genes Map-Based Cloning of Qualitative Plant Genes Map-based cloning using the genetic relationship between a gene and a marker as the basis for beginning a search for a gene Chromosome walking moving toward

More information

Biol 478/595 Intro to Bioinformatics

Biol 478/595 Intro to Bioinformatics Biol 478/595 Intro to Bioinformatics September M 1 Labor Day 4 W 3 MG Database Searching Ch. 6 5 F 5 MG Database Searching Hw1 6 M 8 MG Scoring Matrices Ch 3 and Ch 4 7 W 10 MG Pairwise Alignment 8 F 12

More information

Marker types. Potato Association of America Frederiction August 9, Allen Van Deynze

Marker types. Potato Association of America Frederiction August 9, Allen Van Deynze Marker types Potato Association of America Frederiction August 9, 2009 Allen Van Deynze Use of DNA Markers in Breeding Germplasm Analysis Fingerprinting of germplasm Arrangement of diversity (clustering,

More information

AP Biology. The BIG Questions. Chapter 19. Prokaryote vs. eukaryote genome. Prokaryote vs. eukaryote genome. Why turn genes on & off?

AP Biology. The BIG Questions. Chapter 19. Prokaryote vs. eukaryote genome. Prokaryote vs. eukaryote genome. Why turn genes on & off? The BIG Questions Chapter 19. Control of Eukaryotic Genome How are genes turned on & off in eukaryotes? How do cells with the same genes differentiate to perform completely different, specialized functions?

More information

Genetics and Biotechnology. Section 1. Applied Genetics

Genetics and Biotechnology. Section 1. Applied Genetics Section 1 Applied Genetics Selective Breeding! The process by which desired traits of certain plants and animals are selected and passed on to their future generations is called selective breeding. Section

More information

Genomes: What we know and what we don t know

Genomes: What we know and what we don t know Genomes: What we know and what we don t know Complete draft sequence 2001 October 15, 2007 Dr. Stefan Maas, BioS Lehigh U. What we know Raw genome data The range of genome sizes in the animal & plant kingdoms!

More information

Chapter 1. from genomics to proteomics Ⅱ

Chapter 1. from genomics to proteomics Ⅱ Proteomics Chapter 1. from genomics to proteomics Ⅱ 1 Functional genomics Functional genomics: study of relations of genomics to biological functions at systems level However, it cannot explain any more

More information

Identifying the functional bases of trait variation in Brassica napus using Associative Transcriptomics

Identifying the functional bases of trait variation in Brassica napus using Associative Transcriptomics Brassica genome structure and evolution Genome framework for association genetics Establishing marker-trait associations 31 st March 2014 GENOME RELATIONSHIPS BETWEEN SPECIES U s TRIANGLE 31 st March 2014

More information

Il trascrittoma dei mammiferi

Il trascrittoma dei mammiferi 29 Novembre 2005 Il trascrittoma dei mammiferi dott. Manuela Gariboldi Gruppo di ricerca IFOM: Genetica molecolare dei tumori (responsabile dott. Paolo Radice) Copyright 2005 IFOM Fondazione Istituto FIRC

More information

REDUCING THE LEVEL OF ANTI-NUTRITIONAL NUTRITIONAL FACTORS IN CANOLA MEAL

REDUCING THE LEVEL OF ANTI-NUTRITIONAL NUTRITIONAL FACTORS IN CANOLA MEAL REDUCING THE LEVEL OF ANTI-NUTRITIONAL NUTRITIONAL FACTORS IN CANOLA MEAL Randall Weselake University of Alberta Jeff Parker Genome Alberta Canola Meal Research Meeting September 28, 2007 DESIGNING OILSEEDS

More information

Complete draft sequence 2001

Complete draft sequence 2001 Genomes: What we know and what we don t know Complete draft sequence 2001 November11, 2009 Dr. Stefan Maas, BioS Lehigh U. What we know Raw genome data The range of genome sizes in the animal & plant kingdoms

More information

The 150+ Tomato Genome (re-)sequence Project; Lessons Learned and Potential

The 150+ Tomato Genome (re-)sequence Project; Lessons Learned and Potential The 150+ Tomato Genome (re-)sequence Project; Lessons Learned and Potential Applications Richard Finkers Researcher Plant Breeding, Wageningen UR Plant Breeding, P.O. Box 16, 6700 AA, Wageningen, The Netherlands,

More information

Pathway approach for candidate gene identification and introduction to metabolic pathway databases.

Pathway approach for candidate gene identification and introduction to metabolic pathway databases. Marker Assisted Selection in Tomato Pathway approach for candidate gene identification and introduction to metabolic pathway databases. Identification of polymorphisms in data-based sequences MAS forward

More information

9/19/13. cdna libraries, EST clusters, gene prediction and functional annotation. Biosciences 741: Genomics Fall, 2013 Week 3

9/19/13. cdna libraries, EST clusters, gene prediction and functional annotation. Biosciences 741: Genomics Fall, 2013 Week 3 cdna libraries, EST clusters, gene prediction and functional annotation Biosciences 741: Genomics Fall, 2013 Week 3 1 2 3 4 5 6 Figure 2.14 Relationship between gene structure, cdna, and EST sequences

More information

Genomics-based approaches to improve drought tolerance of crops

Genomics-based approaches to improve drought tolerance of crops Review TRENDS in Plant Science Vol.11 No.8 Full text provided by www.sciencedirect.com Genomics-based approaches to improve drought tolerance of crops Roberto Tuberosa and Silvio Salvi Department of Agroenvironmental

More information

Functional Genomics in Plants

Functional Genomics in Plants Functional Genomics in Plants Jeffrey L Bennetzen, Purdue University, West Lafayette, Indiana, USA Functional genomics refers to a suite of genetic technologies that will contribute to a comprehensive

More information

Gene Expression Technology

Gene Expression Technology Gene Expression Technology Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu Gene expression Gene expression is the process by which information from a gene

More information

CHAPTERS 16 & 17: DNA Technology

CHAPTERS 16 & 17: DNA Technology CHAPTERS 16 & 17: DNA Technology 1. What is the function of restriction enzymes in bacteria? 2. How do bacteria protect their DNA from the effects of the restriction enzymes? 3. How do biologists make

More information

ab initio and Evidence-Based Gene Finding

ab initio and Evidence-Based Gene Finding ab initio and Evidence-Based Gene Finding A basic introduction to annotation Outline What is annotation? ab initio gene finding Genome databases on the web Basics of the UCSC browser Evidence-based gene

More information

Microbially Mediated Plant Salt Tolerance and Microbiome based Solutions for Saline Agriculture

Microbially Mediated Plant Salt Tolerance and Microbiome based Solutions for Saline Agriculture Microbially Mediated Plant Salt Tolerance and Microbiome based Solutions for Saline Agriculture Contents Introduction Abiotic Tolerance Approaches Reasons for failure Roots, microorganisms and soil-interaction

More information

Using molecular marker technology in studies on plant genetic diversity Final considerations

Using molecular marker technology in studies on plant genetic diversity Final considerations Using molecular marker technology in studies on plant genetic diversity Final considerations Copyright: IPGRI and Cornell University, 2003 Final considerations 1 Contents! When choosing a technique...!

More information

Introduction to BIOINFORMATICS

Introduction to BIOINFORMATICS Introduction to BIOINFORMATICS Antonella Lisa CABGen Centro di Analisi Bioinformatica per la Genomica Tel. 0382-546361 E-mail: lisa@igm.cnr.it http://www.igm.cnr.it/pagine-personali/lisa-antonella/ What

More information

Capabilities & Services

Capabilities & Services Capabilities & Services Accelerating Research & Development Table of Contents Introduction to DHMRI 3 Services and Capabilites: Genomics 4 Proteomics & Protein Characterization 5 Metabolomics 6 In Vitro

More information

Grundlagen der Bioinformatik Summer Lecturer: Prof. Daniel Huson

Grundlagen der Bioinformatik Summer Lecturer: Prof. Daniel Huson Grundlagen der Bioinformatik, SoSe 11, D. Huson, April 11, 2011 1 1 Introduction Grundlagen der Bioinformatik Summer 2011 Lecturer: Prof. Daniel Huson Office hours: Thursdays 17-18h (Sand 14, C310a) 1.1

More information

Multiple choice questions (numbers in brackets indicate the number of correct answers)

Multiple choice questions (numbers in brackets indicate the number of correct answers) 1 February 15, 2013 Multiple choice questions (numbers in brackets indicate the number of correct answers) 1. Which of the following statements are not true Transcriptomes consist of mrnas Proteomes consist

More information

Refresher on gene expression - DNA: The stuff of life

Refresher on gene expression - DNA: The stuff of life Plant Pathology 602 Plant-Microbe Interactions Lecture 2 Molecular methods for studying hostpathogen interactions I Sophien Kamoun kamoun.1@osu.edu The Ohio State University Ohio Agricultural Research

More information

This place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology.

This place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology. G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY Methods or systems for genetic

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics Alla L Lapidus, Ph.D. SPbSU St. Petersburg Term Bioinformatics Term Bioinformatics was invented by Paulien Hogeweg (Полина Хогевег) and Ben Hesper in 1970 as "the study of

More information

Two Mark question and Answers

Two Mark question and Answers 1. Define Bioinformatics Two Mark question and Answers Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline. There are three

More information

Outline. Gene Finding Questions. Recap: Prokaryotic gene finding Eukaryotic gene finding The human gene complement Regulation

Outline. Gene Finding Questions. Recap: Prokaryotic gene finding Eukaryotic gene finding The human gene complement Regulation Tues, Nov 29: Gene Finding 1 Online FCE s: Thru Dec 12 Thurs, Dec 1: Gene Finding 2 Tues, Dec 6: PS5 due Project presentations 1 (see course web site for schedule) Thurs, Dec 8 Final papers due Project

More information

Background Wikipedia Lee and Mahadavan, JCB, 2009 History (Platform Comparison) P Park, Nature Review Genetics, 2009 P Park, Nature Reviews Genetics, 2009 Rozowsky et al., Nature Biotechnology, 2009

More information

LECTURE 20. Repeated DNA Sequences. Prokaryotes:

LECTURE 20. Repeated DNA Sequences. Prokaryotes: LECTURE 20 Repeated DNA Sequences Prokaryotes: 1) Most DNA is in the form of unique sequences. Exceptions are the genes encoding ribosomal RNA (rdna, 10-20 copies) and various recognition sequences (e.g.,

More information

Aaditya Khatri. Abstract

Aaditya Khatri. Abstract Abstract In this project, Chimp-chunk 2-7 was annotated. Chimp-chunk 2-7 is an 80 kb region on chromosome 5 of the chimpanzee genome. Analysis with the Mapviewer function using the NCBI non-redundant database

More information

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources National Center for Biotechnology Information About NCBI NCBI at a Glance A Science Primer Human Genome Resources Model Organisms Guide Outreach and Education Databases and Tools News About NCBI Site Map

More information

Finding Genes with Genomics Technologies

Finding Genes with Genomics Technologies PLNT2530 Plant Biotechnology (2018) Unit 7 Finding Genes with Genomics Technologies Unless otherwise cited or referenced, all content of this presenataion is licensed under the Creative Commons License

More information

The tomato genome re-seq project

The tomato genome re-seq project The tomato genome re-seq project http://www.tomatogenome.net 5 February 2013, Richard Finkers & Sjaak van Heusden Rationale Genetic diversity in commercial tomato germplasm relatively narrow Unexploited

More information

Fruit and Nut Trees Genomics and Quantitative Genetics

Fruit and Nut Trees Genomics and Quantitative Genetics Fruit and Nut Trees Genomics and Quantitative Genetics Jasper Rees Department of Biotechnology University of the Western Cape South Africa jrees@uwc.ac.za The Challenges of Tree Breeding Long breeding

More information

Workshop on. Genome analysis tools applied to forest tree breeding

Workshop on. Genome analysis tools applied to forest tree breeding Workshop on Genome analysis tools applied to forest tree breeding Vantaa (Finland), the 18 th October 2012 BOOK OF ABSTRACTS Introduction Giusi Zaina University of Udine, Udine, Italy Contact: giusi.zaina@uniud.it

More information

UCSC Genome Browser. Introduction to ab initio and evidence-based gene finding

UCSC Genome Browser. Introduction to ab initio and evidence-based gene finding UCSC Genome Browser Introduction to ab initio and evidence-based gene finding Wilson Leung 06/2006 Outline Introduction to annotation ab initio gene finding Basics of the UCSC Browser Evidence-based gene

More information

Biology 644: Bioinformatics

Biology 644: Bioinformatics Processes Activation Repression Initiation Elongation.... Processes Splicing Editing Degradation Translation.... Transcription Translation DNA Regulators DNA-Binding Transcription Factors Chromatin Remodelers....

More information

BIOINFORMATICS TO ANALYZE AND COMPARE GENOMES

BIOINFORMATICS TO ANALYZE AND COMPARE GENOMES BIOINFORMATICS TO ANALYZE AND COMPARE GENOMES We sequenced and assembled a genome, but this is only a long stretch of ATCG What should we do now? 1. find genes What are the starting and end points for

More information

Patterns and mechanisms of recombination at the barley VRN- H1 locus. James Cockram

Patterns and mechanisms of recombination at the barley VRN- H1 locus. James Cockram Patterns and mechanisms of recombination at the barley VRN- H1 locus James Cockram Talk Outline: Project background Genetic markers used Homologous and non-homologous recombination within BM5A Putative

More information

Era with Computational Biology/Toxicology

Era with Computational Biology/Toxicology USM Seminar 1/22/2010 Embracing the Post-Omics Era with Computational Biology/Toxicology Ping Gong Environmental Genomics and Genetics (EGG) Team @ Environmental Laboratory Outline Introduction Bioinformatics

More information

From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow

From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow Technical Overview Import VCF Introduction Next-generation sequencing (NGS) studies have created unanticipated challenges with

More information

7 Gene Isolation and Analysis of Multiple

7 Gene Isolation and Analysis of Multiple Genetic Techniques for Biological Research Corinne A. Michels Copyright q 2002 John Wiley & Sons, Ltd ISBNs: 0-471-89921-6 (Hardback); 0-470-84662-3 (Electronic) 7 Gene Isolation and Analysis of Multiple

More information

Meeting Report: Soybean Genomics Assessment and Strategy Workshop July 2005 St. Louis, Missouri

Meeting Report: Soybean Genomics Assessment and Strategy Workshop July 2005 St. Louis, Missouri Meeting Report: Soybean Genomics Assessment and Strategy Workshop 19-20 July 2005 St. Louis, Missouri Writing Team: Randy Shoemaker, USDA-ARS, Ames, Iowa Wayne Parrott, University of Georgia, Athens, AG

More information

Transcriptomics analysis with RNA seq: an overview Frederik Coppens

Transcriptomics analysis with RNA seq: an overview Frederik Coppens Transcriptomics analysis with RNA seq: an overview Frederik Coppens Platforms Applications Analysis Quantification RNA content Platforms Platforms Short (few hundred bases) Long reads (multiple kilobases)

More information

DNA Cloning with Cloning Vectors

DNA Cloning with Cloning Vectors Cloning Vectors A M I R A A. T. A L - H O S A R Y L E C T U R E R O F I N F E C T I O U S D I S E A S E S F A C U L T Y O F V E T. M E D I C I N E A S S I U T U N I V E R S I T Y - E G Y P T DNA Cloning

More information

HUMAN GENOME BIOINFORMATICS. Tore Samuelsson, Dec 2009

HUMAN GENOME BIOINFORMATICS. Tore Samuelsson, Dec 2009 HUMAN GENOME BIOINFORMATICS Tore Samuelsson, Dec 2009 The sequenced (gray filled) and unsequenced (white) portions of the human genome. Peter F.R. Little Genome Res. 2005; 15: 1759-1766 Human genome organisation

More information

Motivation From Protein to Gene

Motivation From Protein to Gene MOLECULAR BIOLOGY 2003-4 Topic B Recombinant DNA -principles and tools Construct a library - what for, how Major techniques +principles Bioinformatics - in brief Chapter 7 (MCB) 1 Motivation From Protein

More information

Introduction to Bioinformatics and Gene Expression Technology

Introduction to Bioinformatics and Gene Expression Technology Vocabulary Introduction to Bioinformatics and Gene Expression Technology Utah State University Spring 2014 STAT 5570: Statistical Bioinformatics Notes 1.1 Gene: Genetics: Genome: Genomics: hereditary DNA

More information

NGS developments in tomato genome sequencing

NGS developments in tomato genome sequencing NGS developments in tomato genome sequencing 16-02-2012, Sandra Smit TATGTTTTGGAAAACATTGCATGCGGAATTGGGTACTAGGTTGGACCTTAGTACC GCGTTCCATCCTCAGACCGATGGTCAGTCTGAGAGAACGATTCAAGTGTTGGAAG ATATGCTTCGTGCATGTGTGATAGAGTTTGGTGGCCATTGGGATAGCTTCTTACC

More information

Course Information. Introduction to Algorithms in Computational Biology Lecture 1. Relations to Some Other Courses

Course Information. Introduction to Algorithms in Computational Biology Lecture 1. Relations to Some Other Courses Course Information Introduction to Algorithms in Computational Biology Lecture 1 Meetings: Lecture, by Dan Geiger: Mondays 16:30 18:30, Taub 4. Tutorial, by Ydo Wexler: Tuesdays 10:30 11:30, Taub 2. Grade:

More information

Introduction to Bioinformatics

Introduction to Bioinformatics Introduction to Bioinformatics If the 19 th century was the century of chemistry and 20 th century was the century of physic, the 21 st century promises to be the century of biology...professor Dr. Satoru

More information

Introduction to RNA-Seq. David Wood Winter School in Mathematics and Computational Biology July 1, 2013

Introduction to RNA-Seq. David Wood Winter School in Mathematics and Computational Biology July 1, 2013 Introduction to RNA-Seq David Wood Winter School in Mathematics and Computational Biology July 1, 2013 Abundance RNA is... Diverse Dynamic Central DNA rrna Epigenetics trna RNA mrna Time Protein Abundance

More information
