SGN-6106 Computational Systems Biology I

1 SGN-6106 Computational Systems Biology I
A View of Modern Measurement Systems in Cell Biology
Kaisa-Leena Taattola

2 The cell: a complex system (Source: Lehninger Principles of Biochemistry, 4th Ed.)

3 The cell: a system of molecules and ions
Four groups of macromolecules are present in all living organisms: nucleic acids (nuclear and organelle DNA, transcriptome); proteins (e.g. cytoskeleton, enzymes, receptors, signalling pathways); carbohydrates (e.g. starch as an energy reserve, cellulose in the cell wall); and lipids (e.g. cell membrane, intracellular membranes, energy reserves).
Some macromolecules are polymers built from monomers: nucleic acids consist of nucleotides, proteins of amino acids, and carbohydrates of sugars.
Water (H2O) is the solvent that forms the basis of cellular organisation: for example, lipids organise into membranes and proteins gain their three-dimensional structures through interactions with water.
Ions (Na+, K+, Ca2+, Mg2+, H+, OH- etc.) are important, for example, in creating the membrane potential across cell membranes and in the active sites of many enzymes.

4 Metabolism
Metabolism refers to the overall network of enzyme-catalyzed reactions in cells. It consists of two parts.
Anabolism = synthetic reactions, i.e. constructing macromolecules from their constituent monomers or other smaller molecules. Anabolism requires energy and thus creates energy stores in the form of macromolecules.
Catabolism = degradative reactions, i.e. breaking down macromolecules into smaller molecules. Catabolism releases energy from macromolecules and converts it into a form useful for the cell.
(Source: Lehninger Principles of Biochemistry, 4th Ed.)

5 The cell: a system with many omes
The omes are a useful way to encapsulate a particular class of cellular components or processes, which makes them suitable for systems biology. Remember that this is only a simplification: many biological processes and reactions are interconnected and thus involved in many omes.
Most used: genome = totality of genes; transcriptome = totality of gene transcripts; proteome = totality of proteins translated; metabolome = totality of metabolites.
Less used: interactome = totality of molecular interactions; signalome = totality of signal transduction components.

6 Increasing knowledge of omes
Efficient methods exist for acquiring large-scale information from the different omes. Genome: sequencing. Transcriptome: DNA microarrays. Metabolome and proteome: e.g. liquid and gas chromatography, mass spectrometry, protein arrays.
These methods are generally rapid, automated, highly sensitive, economical owing to small sample consumption, and produce large amounts of data in one experiment.

7 Milestones in genetics
1859 Charles Darwin publishes On the Origin of Species
1865 The age of genetics begins when Gregor Mendel introduces the notion of heritable factors (genes)
1941 One gene, one enzyme
1953 Unraveling the structure of the DNA double helix
1967 Cracking the genetic code (how DNA is translated into proteins)
1970 Restriction enzymes discovered
1972 Recombinant DNA technology: modification of DNA
1975 DNA sequencing developed
1986 The Polymerase Chain Reaction (PCR)
1989 The Human Genome Project (HGP) begins
1996 Development of the GeneChip by Affymetrix
2001 Human genome is published
2003 Human Genome Project is finished

8 DNA microarrays
DNA microarrays are the most recent of a number of techniques created for measuring gene expression levels. They exploit hybridisation, the ability of a single-stranded nucleic acid molecule to pair sequence-specifically with another single-stranded nucleic acid molecule (base pairing A-T/U, C-G).
A DNA microarray is a microscope slide carrying DNA probes (strands complementary to their targets), organised as separate spots, for the transcription products of different genes. The presence of a particular gene's expression product produces a fluorescent signal, which can be detected with a special scanner.
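
The base-pairing rule behind hybridisation can be sketched in a few lines of Python. This is only an idealised illustration: it assumes DNA-only sequences and perfect complementarity, and the function names and the example sequence are hypothetical, not taken from the lecture.

```python
# Watson-Crick pairing rules for DNA (A-T, C-G).
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def probe_for_target(target: str) -> str:
    """Return the probe (written 5'->3') complementary to a target (written 5'->3')."""
    # The two strands pair antiparallel, so complement each base
    # and reverse the order.
    return "".join(DNA_COMPLEMENT[base] for base in reversed(target.upper()))


def hybridises(probe: str, target: str) -> bool:
    """True if probe and target are perfectly complementary (no mismatches tolerated)."""
    return probe.upper() == probe_for_target(target)


target = "ATGGCA"  # hypothetical target sequence
probe = probe_for_target(target)
print(probe, hybridises(probe, target))
```

Real hybridisation tolerates some mismatches depending on temperature and salt conditions, which this sketch deliberately ignores.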

9 DNA microarrays
DNA microarrays allow the expression of thousands of genes in a biological sample to be measured simultaneously. DNA microarray experiments can be used for genome-wide screening of genes that have different expression levels under different conditions, e.g. in different tissue types or in disease-state cells as opposed to normal healthy cells.
DNA microarrays have applications in several fields, including clinical diagnostics, toxicity studies, gene function studies, and identification of co-regulated genes and cellular signalling pathways.

10 DNA microarrays
DNA microarrays can be divided into two main categories according to the technology of slide preparation.
Spotted microarrays are produced by immobilising the probes, either short synthesised DNA oligonucleotides (typically nucleotides in length) or longer cDNA molecules (typically nucleotides), as separate spots on a solid surface by an automated printing process.
Oligonucleotide arrays (e.g. Affymetrix) are produced by synthesising short oligonucleotides on the slide, nucleotide by nucleotide, using a special printing technique.

11 cDNA microarrays
cDNA microarrays are the type of spotted microarray in which the probes are cDNA molecules. The cDNA molecules are usually derived by reverse-transcribing mRNA, or taken from cDNA libraries or other clone collections. (Source: Amersham Microarray Handbook)

12 cDNA microarrays
cDNA microarrays measure the relative abundance of a specific gene's mRNA product in two samples. RNA is first extracted from both samples, e.g. from cell cultures or tissue samples. The mRNA fraction of the total RNA is then reverse-transcribed to produce cDNA. The cDNA samples are labelled with different fluorescent dyes, usually green Cyanine 3 (Cy3) and red Cyanine 5 (Cy5). The two labelled samples are mixed in equal proportions and allowed to hybridise with the probes on a microarray slide.

13 cDNA microarrays
After this competitive hybridisation, the microarray slide is scanned with a scanner that detects the two fluorescent signals separately and creates an image of each. From the scanned images, both fluorescent signals of each spot are quantified, and the relative abundance of a transcription product in the two samples is obtained as their ratio, referred to as the intensity ratio:
intensity ratio = R/G
where R is the red (Cy5) and G the green (Cy3) intensity of the spot. When the reference sample is used as the denominator, an intensity ratio higher than 1 corresponds to an upregulated (induced) gene, a ratio less than 1 to a downregulated (repressed) gene, and a ratio equal to 1 to unchanged expression.
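
As a rough sketch of the calculation above, the following Python snippet computes the intensity ratio for a few spots and labels each one. The spot intensities and gene names are made up for illustration, and the green channel is assumed to carry the reference sample. The log2 transform is not in the slide; it is a widely used convenience that makes induction and repression symmetric around zero.

```python
import math


def intensity_ratio(red: float, green: float) -> float:
    """Intensity ratio R/G, with the green (Cy3) channel assumed to be the reference."""
    if red <= 0 or green <= 0:
        raise ValueError("both intensities must be positive")
    return red / green


def classify(ratio: float) -> str:
    """Label a spot following the rule stated in the text."""
    if ratio > 1:
        return "upregulated"
    if ratio < 1:
        return "downregulated"
    return "unchanged"


# Hypothetical quantified spot intensities: gene -> (R, G).
spots = {
    "gene_a": (1200.0, 300.0),
    "gene_b": (250.0, 1000.0),
    "gene_c": (500.0, 500.0),
}

for gene, (red, green) in spots.items():
    ratio = intensity_ratio(red, green)
    # log2(4) = 2 and log2(0.25) = -2: equal fold changes, opposite signs.
    print(gene, ratio, math.log2(ratio), classify(ratio))
```

In practice a ratio is rarely exactly 1 because of measurement noise, so analyses usually apply a fold-change threshold or a statistical test rather than the strict comparison used here.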

14 cDNA microarray data

15 Oligonucleotide arrays
There are two central differences compared to cDNA microarrays.
1. A single RNA sample is hybridised to the slide instead of two cDNA samples. Comparisons between samples are done computationally.
2. Each gene is represented by a probe set of oligonucleotide pairs instead of one full-length cDNA clone. The intensity value for each probe set representing a gene is calculated using all the intensities within the probe set.
Oligonucleotide arrays are more precise and often more efficient. A single slide can even include probes for all known human genes: the Human Genome U133 Plus 2.0 Array is described as the first and most comprehensive whole human genome expression array. Oligonucleotide arrays are more expensive than cDNA arrays.
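
To make the idea of a probe-set intensity concrete, here is a deliberately simple Python sketch. It assumes that each oligonucleotide pair consists of a perfect-match and a mismatch probe, and it summarises the set as the mean of the pairwise differences. This is an illustration only, not the algorithm of any particular vendor software; production tools use more robust statistics. The function name and the data are hypothetical.

```python
from statistics import mean


def probe_set_signal(pairs):
    """Summarise one probe set.

    pairs: list of (perfect_match, mismatch) intensities, one tuple per
    oligonucleotide pair. Returns the mean of the differences.
    """
    if not pairs:
        raise ValueError("a probe set needs at least one probe pair")
    # The mismatch intensity is treated as an estimate of non-specific binding.
    return mean(pm - mm for pm, mm in pairs)


# Hypothetical intensities for one gene's probe set.
example_probe_set = [(100.0, 40.0), (120.0, 60.0), (90.0, 30.0)]
print(probe_set_signal(example_probe_set))
```

Using several probes per gene means that one badly behaving probe has limited influence on the final value, which is one reason the probe-set design can be more precise than a single spot.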

16 Mass spectrometry
Mass spectrometry is a technique that has long been used in chemistry. It is also usable in biochemical studies: metabolites and even macromolecules, such as proteins and nucleic acids, can be studied. It is used to identify and quantify unknown compounds in samples, to quantify known compounds, and to study the structure and chemical composition of molecules. It is also used in sequencing proteins and other biopolymers. Detection is possible with small compound quantities (even 10 pg to 0.001 pg).
Subtypes of mass spectrometry that enable studying macromolecules: matrix-assisted laser desorption mass spectrometry (MALDI MS) and electrospray ionization mass spectrometry (ESI MS).

17 Steps in mass spectrometry
Molecules to be analysed are ionized in a vacuum (formation of gas-phase ions from a sample that can be solid, liquid or vapor). The ionization typically breaks the molecules into smaller ionized units.
The charged components are then introduced into an electric and/or magnetic field (ion sorting). The speed of the charged components in the field depends on their mass-to-charge ratio (m/z). The components leave the field in the order defined by their m/z ratio, as separate fractions, which are detected by the detector.
The detector is connected to a data system which produces a mass spectrum. The peaks represent the amount of each fraction in the sample.

18 Functional units of a mass spectrometer (Source:

19 Mass spectrum
A mass spectrum plots the ion intensity (quantity) as a function of the m/z ratio. Example: the mass spectrum of CO2. The entire ionized molecule, CO2+, forms one peak and its fragments form others. (Source:
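
Where the peaks of the CO2 example fall on the m/z axis can be worked out from nominal atomic masses. The Python sketch below assumes singly charged ions and the usual fragments CO+, O+ and C+; the fragment list and the names in the code are assumptions made for illustration, not read off the slide's figure.

```python
# Nominal (integer) atomic masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "O": 16}


def mass_to_charge(composition: dict, charge: int = 1) -> float:
    """m/z of an ion given its elemental composition and charge number."""
    mass = sum(NOMINAL_MASS[element] * count for element, count in composition.items())
    return mass / charge


# Molecular ion and assumed fragments, all singly charged.
ions = {
    "CO2+": {"C": 1, "O": 2},
    "CO+": {"C": 1, "O": 1},
    "O+": {"O": 1},
    "C+": {"C": 1},
}

# List the ions in the order they appear along the m/z axis.
for name, composition in sorted(ions.items(), key=lambda item: mass_to_charge(item[1])):
    print(name, mass_to_charge(composition))
```

The relative heights of the peaks depend on the ionization conditions and cannot be derived from the masses alone.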

20 Mass spectrometry in molecule identification and characterization
The m/z ratios of various ionized molecules, their functional groups and constituent elements are known, so they can be identified from the spectrum. The proportions of elements or functional groups can reveal the chemical composition and molecular structure of the individual molecules passed through the mass spectrometer. Data systems associated with mass spectrometers usually include software that quantifies and identifies the compounds using mass spectrum libraries.

21 Mass spectrometry in molecule identification and characterization
Only one type of molecule should be introduced to the mass spectrometer at a time for the identification to be unambiguous. Molecules can be separated from samples by gas chromatography (GC), liquid chromatography (LC) or capillary electrophoresis, each of which can be coupled to MS so that one molecule type at a time enters the mass spectrometer.

22 Mass spectrometry in molecule quantitation
Quantitation applies when the compounds in a sample are known and only their quantity is of interest. It can be performed at low compound concentrations. The spectrometer is set to monitor only the m/z value of the ion that corresponds to the molecule of interest, even though the sample includes a variety of others. The form of ionization can also be chosen to favor production of a single type of ion instead of fragmenting the molecules. This method is called selected ion monitoring (SIM).
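
In selected ion monitoring the instrument itself restricts the measurement to one m/z value. The Python sketch below only mimics that idea in software on already recorded data: from each scan it keeps the signal that falls within a small window around the target m/z. The scan data, the tolerance value and the function name are hypothetical.

```python
def selected_ion_trace(scans, target_mz, tolerance=0.5):
    """Return the summed intensity near target_mz for each scan.

    scans: list of scans, each a list of (mz, intensity) tuples.
    tolerance: half-width of the accepted m/z window (an assumed value).
    """
    trace = []
    for scan in scans:
        # Ignore every ion outside the monitored window.
        signal = sum(intensity for mz, intensity in scan if abs(mz - target_mz) <= tolerance)
        trace.append(signal)
    return trace


# Hypothetical scans recorded at three time points.
scans = [
    [(28.0, 5.0), (44.0, 100.0)],
    [(44.1, 80.0), (16.0, 3.0)],
    [(28.0, 7.0)],
]
print(selected_ion_trace(scans, target_mz=44.0))
```

Converting such a trace into an amount of compound additionally requires calibration against known quantities, which is outside this sketch.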

23 MS/MS
MS/MS is a method in which two stages of mass analysis are coupled, also referred to as tandem mass spectrometry.
Identifying compounds in complex mixtures: the sample introduced to the first MS can be a complex mixture of molecules, which are transformed there into ionized molecules, each with a unique m/z ratio. These are introduced to the second MS one type at a time and broken further into their fragment ions to obtain an identifying mass spectrum for each molecule in the original mixture.
Determining structures of unknown substances: the sample introduced to the first MS can also be a pure compound, which is broken there into its fragment ions. These are introduced to the second MS one type at a time and broken further into their constituents. The product spectra obtained from the second MS can support structural analysis, because they represent the constituents of the fragment ions.

24 Chromatography
Chromatography is a process that has also long been used in chemistry. Molecules in complex mixtures are separated by partitioning between a mobile phase that flows through a column and a stationary phase that is packed inside the column. In liquid chromatography the mobile phase is a liquid; in gas chromatography it is a gas. Because the molecules differ in mobility, the sample components become separated from each other as they travel through the stationary phase.
Chromatography is used to fractionate molecules with particular chemical characteristics (e.g. charge, size, affinity) from cell lysates (broken cells), cell culture media or in vitro tests for further study.

25 Chromatography
Subtypes of liquid chromatography:
Ion-exchange chromatography separates molecules according to the sign and magnitude of their net electric charge.
Size-exclusion chromatography separates molecules according to size.
Affinity chromatography separates molecules by their binding specificities.

26 Ion-exchange chromatography (Source: Lehninger Principles of Biochemistry, 4th Ed.)

27 Size-exclusion chromatography (Source: Lehninger Principles of Biochemistry, 4th Ed.)

28 Affinity chromatography (Source: Lehninger Principles of Biochemistry, 4th Ed.)

29 HPLC
High-performance liquid chromatography (HPLC) is a modern refinement of the liquid chromatographic methods presented above. It makes use of high-pressure pumps that speed the movement of the molecules down the column, and of column materials that withstand the pressure. The shorter transit time limits diffusional spreading and improves the resolution of the molecule fractions.

30 Gas chromatography
In gas chromatography the sample is vapourised and injected onto the head of the chromatographic column. The sample is transported through the column by the flow of an inert, gaseous mobile phase and fractionated according to how the sample components partition between the liquid stationary phase and the mobile gas.
Like MS, HPLC and gas chromatography (GC) can be used for identifying and quantifying molecules or ions within a sample, because they produce a chromatogram of the separated fractions. This requires standard reference samples, whose chromatograms are compared with those produced from the sample.
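
The comparison against reference standards can be illustrated with a short Python sketch. It assumes that each fraction is characterised by its retention time (the time at which it leaves the column) and that a sample peak is assigned to the standard with the closest retention time within a tolerance. The compound names, retention times and tolerance are invented for the example.

```python
def identify_peaks(sample_peaks, standards, tolerance=0.1):
    """Match sample peaks to reference standards by retention time.

    sample_peaks: retention times (minutes) observed for the sample.
    standards: dict mapping compound name -> reference retention time (minutes).
    Returns a dict mapping each sample retention time to a name, or None if
    no standard lies within the tolerance.
    """
    assignments = {}
    for observed in sample_peaks:
        candidates = [
            (abs(observed - reference), name)
            for name, reference in standards.items()
            if abs(observed - reference) <= tolerance
        ]
        # Pick the closest standard; leave the peak unidentified otherwise.
        assignments[observed] = min(candidates)[1] if candidates else None
    return assignments


# Hypothetical reference standards and sample peaks.
standards = {"compound A": 2.0, "compound B": 5.5}
print(identify_peaks([2.05, 5.45, 8.0], standards))
```

A peak without a matching standard stays unidentified, which is why the method depends on having reference samples for the compounds of interest.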

31 Protein arrays
Protein arrays are, like DNA arrays, solid-phase binding assays which use proteins and other molecules immobilised on a surface to bind target molecules. Immobilised capture reagents include antibodies, protein domains, peptides and nucleic acids that bind specifically to target molecules. (Source: Protein Arrays Resource Page)

32 (Source: Protein Arrays Resource Page)

33 Detection of captured molecules
Fluorescence labelling methods are widely used. The same instrumentation as used for reading DNA microarrays is applicable to protein arrays (e.g. Cy3 and Cy5 fluorescent labels and a scanner). (Source: Protein Arrays Resource Page)

34 Protein arrays: application areas
Protein studies: protein expression profiling (is a given protein expressed, and how much?).
Protein functional analysis: protein-protein interactions (what proteins does a given protein, e.g. an enzyme or a receptor, bind to?) and enzyme activities (is a given enzyme active?).
Diagnostics: disease markers, to screen whether a patient has one of the suspected diseases.
Isolation: individual proteins from molecule libraries for further expression or manipulation.

35 Example bottlenecks in measuring cellular processes
Measurement methods often study cell populations instead of single cells. The cells may be in different phases of e.g. the cell cycle and thus running different cellular processes at a given time, which introduces error into the measurements. Imaging of single cells is nevertheless possible (including visualization of intracellular structures with the help of fluorescent and radioactive labels).

36 Example bottlenecks in measuring cellular processes
Cells are often cultured to obtain a standardised cell line to study, which is not the same environment as the cell normally has in an organism. Information on individual cell compartments or regions is difficult to obtain. Cellular processes are dynamic and some occur fast, which makes them difficult to observe. Reaction kinetics are often studied in vitro, without the interactions with other pathways that occur in cells.