Bioinformatics and Computational Biology. KFC/BIN VIII. Systems Biology
1 Bioinformatics and Computational Biology. KFC/BIN VIII. Systems Biology. RNDr. Karel Berka, Ph.D., Univerzita Palackého v Olomouci
2 Syllabus: Pathway Analysis; KEGG; Whole cell simulation
3 Pathway analysis
4
5
6
7
8
9
10
11
12
13
14 Caveats
15
16
17
18
19
20
21
22 Detailed view on KEGG database
23 What Is KEGG? KEGG stands for Kyoto Encyclopedia of Genes and Genomes. KEGG is a repository of pathways, genes, orthologs, drugs, functional hierarchies, ligands and diseases. The KEGG databases allow the various biological components to be linked (and displayed). Mike Sweredoski
24 KEGG Pathway Maps
25 How to get started. KEGG is located at: You can start by searching for a single protein ID at Note: you must enter IPI IDs like IPI:IPI , UniProt IDs like Uniprot:Q08170, and SGD ORFs like sce:ygl111w. But you probably have a long list of proteins you would like to analyze.
26 Dealing with many proteins. I developed a quick tool to retrieve all of the KEGG IDs associated with your MaxQuant analysis and to identify the pathways for which you have identified most of the genes. This tool is currently available at Note: This script is a work in progress. There are several issues regarding protein groups and missing KEGG annotations that I'm still working on.
27 An Example: SILAC SGD. Step One: Upload proteingroups.txt to KEGGER.
28 An Example: SILAC SGD. KEGG Pathways are sorted by percentage of genes found in the sample. This link will show all KEGG Gene IDs. These links will take you to the KEGG Pathway. These Page links will show all the KEGG IDs found for the individual pathway.
29 An Example: SILAC SGD. Suppose we are interested in looking further at the Proteasome, for which we have found 34 out of 35 yeast genes. We can click on the link taking us to the KEGG page for the Proteasome, or we can view the list of yeast genes we've identified in the proteasome.
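The ranking described on these slides (pathways sorted by the percentage of their genes found in the sample) can be sketched in a few lines of Python. This is only an illustration, not the KEGGER tool itself: the function name `pathway_coverage` and the gene IDs are hypothetical placeholders, and only the 34-of-35 Proteasome count comes from the slides.

```python
def pathway_coverage(pathway_genes, identified):
    """Return (pathway, found, total, percent) rows, highest percent first."""
    rows = []
    for pathway, genes in pathway_genes.items():
        found = len(genes & identified)   # genes of this pathway seen in the sample
        total = len(genes)
        percent = 100.0 * found / total if total else 0.0
        rows.append((pathway, found, total, percent))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Hypothetical toy data: placeholder IDs, not real KEGG annotations.
pathway_genes = {
    "Ribosome": {f"gene{i}" for i in range(100, 180)},   # 80 genes
    "Proteasome": {f"gene{i}" for i in range(35)},       # 35 genes
}
identified = {f"gene{i}" for i in range(34)} | {f"gene{i}" for i in range(100, 140)}

rows = pathway_coverage(pathway_genes, identified)
for name, found, total, percent in rows:
    print(f"{name}: {found}/{total} ({percent:.1f}%)")
```

With this toy input the Proteasome (34 of 35 genes) is listed before the Ribosome (40 of 80 genes).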
30 An Example: SILAC SGD. We can combine this list of KEGG genes with the KEGG maps. This will allow us to visualize which components we have identified. To do this, we copy the KEGG gene list and navigate to r_pathway.html. Once there, we need to change Search Against to our species of interest, in this example Saccharomyces cerevisiae. We then paste the list in the text box. I like to change the default color to red so they stand out.
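Preparing the list to paste can be automated. The sketch below assumes, without confirmation from the slides, that the text box accepts one gene ID per line optionally followed by a color name. The helper `build_color_list` is hypothetical, and `sce:ygl111w` is simply the example ID from the earlier slide.

```python
def build_color_list(gene_ids, color="red"):
    """Format gene IDs as 'ID color' lines, skipping blanks and duplicates."""
    lines = []
    seen = set()
    for gene_id in gene_ids:
        gene_id = gene_id.strip()
        if not gene_id or gene_id in seen:
            continue
        seen.add(gene_id)
        # Assumed input format: "<id> <color>", one entry per line.
        lines.append(f"{gene_id} {color}")
    return "\n".join(lines)


# The same ID given twice (once with stray whitespace) plus an empty entry.
text = build_color_list(["sce:ygl111w", " sce:ygl111w ", ""])
print(text)
```

Setting the color per line rather than through the form keeps the choice of red explicit in the pasted list.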
31 An Example: SILAC SGD. Several pathways may be identified, but we will select the pathway of interest.
32 An Example: SILAC SGD. Identified genes are in our default color (red). Unidentified known yeast genes are in pale green.
33 An Example: SILAC SGD
34 and back to pathway analysis
35 Example of stages
36 Example: microRNA network in REH/MSC cells
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51 A Whole-Cell Computational Model Predicts Phenotype from Genotype Literature review Roman Laskowski 19th September 2011
52 Available at: skripta/bin/cell-wholesimulation.pdf - the article itself; skripta/bin/wholecellsim.mp4 - video
53 Model organism: M. genitalium
54 Whole-cell simulation: M. genitalium, 525 genes
55 Whole-cell simulation. Previous methods: ODEs (ordinary differential equations): difficulty of obtaining model parameters. Boolean network modelling: fewer parameters required. Constraint-based modelling: not practical for whole-cell models.
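To illustrate why Boolean network modelling needs fewer parameters than ODEs, here is a minimal sketch of a synchronous Boolean network. The three nodes and their rules are invented for illustration and do not come from the lecture or the paper. Each rule is pure logic, with no rate constants to fit.

```python
def step(state, rules):
    """Synchronous update: every node's next value is computed from the current state."""
    return {node: rule(state) for node, rule in rules.items()}


# Hypothetical three-gene circuit.
rules = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B together activate C
}

state = {"A": True, "B": False, "C": False}
trajectory = [state]
for _ in range(5):
    state = step(state, rules)
    trajectory.append(state)

for t, s in enumerate(trajectory):
    print(t, s)
```

An ODE version of the same circuit would need at least a production and a degradation rate for each node, which is the parameter-estimation difficulty the slide refers to.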
56 Cellular function models. Each timestep is 1 sec. Modules use current cell variables to calculate their effect on them. Loop until cell divides. Poisson processes. Flux-balance analysis.
57 End of simulation - when pinched diameter is zero
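The loop described on these two slides (1-second time steps, modules that read the current cell variables and report their effect, termination when the pinched diameter reaches zero) can be sketched as follows. This is a toy under stated assumptions, not the published model: the three modules, their rates, and the state variables `mass`, `mrna` and `pinched_diameter` are hypothetical, and the Poisson sampler stands in for the stochastic sub-models.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson-distributed count (Knuth's method, suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def transcription(state, rng):
    # Hypothetical rate: on average 0.5 new mRNA molecules per second.
    return {"mrna": state["mrna"] + poisson(rng, 0.5)}


def growth(state, rng):
    # Hypothetical growth law: mass increases by 0.1% per second.
    return {"mass": state["mass"] * 1.001}


def cytokinesis(state, rng):
    # Hypothetical rule: once mass has doubled, the septum narrows by 1 unit per second.
    if state["mass"] >= 2.0:
        return {"pinched_diameter": max(0.0, state["pinched_diameter"] - 1.0)}
    return {}


def simulate(modules, state, seed=0, max_steps=100_000):
    """Advance in 1-second steps until the pinched diameter reaches zero."""
    rng = random.Random(seed)
    t = 0
    while state["pinched_diameter"] > 0 and t < max_steps:
        snapshot = dict(state)  # every module sees the same current variables
        updates = {}
        for module in modules:
            updates.update(module(snapshot, rng))
        state = {**state, **updates}
        t += 1
    return t, state


initial = {"mass": 1.0, "mrna": 0, "pinched_diameter": 10.0}
t_end, final = simulate([transcription, growth, cytokinesis], initial)
print(f"cell divided after {t_end} s, {final['mrna']} mRNA molecules made")
```

Giving every module the same snapshot of the state mirrors the idea that the modules are independent within a time step and are only coupled through the shared cell variables.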
58 Overview
59 Validation. Simulated 128 cells in a typical Mycoplasma culture environment. Predictions: cell properties (cell mass, growth rate); molecular properties (count, localization, activity).
60 Training: observed doubling time; cellular chemical composition; major cell mass fractions
61 Independent validation 1. Metabolic fluxes
62 Independent validation 2. Metabolite concentrations
63 Independent validation 3. Bursts of protein synthesis. Caused by: intermittent mRNA expression; availability of amino acids following protein degradation.
64 Independent validation 4. Copy number distribution
65 Protein-DNA interactions. Model has 30 DNA-binding proteins. Chromosome explored very quickly: 50% of the chromosome is bound by 1 or more proteins within the first 6 min, and 90% within 20 min. RNA polymerase binds 90% of the chromosome within 49 min. 90% of genes are expressed within the first 143 min.
66 DNA replication
67 Protein-protein collisions on chromosome. Over 30,000 collisions occur per cell cycle. Nearly 1 protein is displaced from the chromosome per second. Most collisions are caused by RNA polymerase (84%) and DNA polymerase (8%). The most commonly displaced proteins are structural maintenance of chromosome (SMC) proteins (70%) and single-stranded binding proteins (6%).
68 Rate of DNA replication. Initial rapid DNA replication. Acts as a control on cell cycle duration. Rate limited by available dNTP (deoxyribonucleotide triphosphate).
69 Synthesis of energy storers. Mainly used in production of protein and mRNA.
70 Waste of energy: 44% discrepancy between synthesis and use of ATP and GTP.
71 Knock-out simulations. Knocked out each of the 525 genes in turn. Found 284 genes to be essential for growth and division and 117 nonessential. Knock-outs of essential genes were unable either to produce one of the crucial biomass components or to divide.
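The screening procedure (remove one gene at a time, re-run the model, and check whether the cell can still grow and divide) can be sketched as below. The viability rule and the gene names are a hypothetical toy, standing in for a full simulation run. Only the overall loop reflects what the slide describes.

```python
def classify_knockouts(genes, is_viable):
    """Knock out each gene in turn and sort it by whether the cell stays viable."""
    essential, nonessential = [], []
    for gene in genes:
        remaining = set(genes) - {gene}   # genome with this one gene disrupted
        if is_viable(remaining):
            nonessential.append(gene)
        else:
            essential.append(gene)
    return essential, nonessential


# Hypothetical toy rule: each required function needs at least one of its genes.
REQUIRED_FUNCTIONS = [
    {"g1"},        # single gene, no backup
    {"g2", "g3"},  # two redundant genes
    {"g4"},        # single gene, no backup
]


def toy_viable(active_genes):
    return all(active_genes & alternatives for alternatives in REQUIRED_FUNCTIONS)


genes = ["g1", "g2", "g3", "g4", "g5"]
essential, nonessential = classify_knockouts(genes, toy_viable)
print("essential:", essential)
print("nonessential:", nonessential)
```

In this toy, genes with a redundant partner and genes with no required function come out as nonessential, which is one simple way a single-gene knock-out can leave growth and division intact.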
72 Knock-out studies
73 Use of model for biological discovery. Experimentally measured growth rates of 12 single-gene-disruption strains. 2/3 of the growth rates matched the predicted rates. Investigation of the discrepancies led to new insights into the organism's biology. However, the model should be considered just a first draft. Plus, M. genitalium is not as experimentally tractable as E. coli.