
ARTICLE

SYSTEMS BIOLOGY: EVOLUTION OF A NEW DISCIPLINE

NOOR JAILKHANI*, SAMRAT CHATTERJEE* AND KANURY VS RAO*

Introduction and History

Reductionism has been described as the attempt to explain complex phenomena by defining the functional properties of the individual components that compose multi-component systems [1]. Most biological and biomedical research has traditionally followed a reductionist approach. While this has undoubtedly yielded a great depth of knowledge about biological systems, such approaches are limited in their ability to deliver a more global understanding of the complexity of biological phenomena. In contrast to reductionist strategies, systems approaches adopt a more holistic perspective on biological phenomena and processes. The core precept of systems biology is that complex biological processes cannot be attributed to individual molecular entities. Rather, information on the properties of the individual components needs to be integrated in order to arrive at a comprehensive, or systems, view. Recent successes achieved through the application of systems-level analysis have emphasized that biomedical research needs to move away from naive reductionism, that is, the belief that a complete understanding of living organisms and systems can be achieved by this approach alone [1, 2].

Systems biology is an integrative and interdisciplinary approach in biology that aims to understand and predict the behaviour, or emergent properties, of complex, multicomponent biological systems and processes [1]. It is a relatively new approach with the potential to completely transform how we understand biological problems and tackle disease.
The systems approach attempts to integrate information regarding the components of a system (depending on the system under study, these could be genes, proteins, organs, etc.), the component dynamics (how they function and regulate each other), and finally, the manner in which this regulates biological processes. This discipline integrates concepts from mathematics, computer science, engineering and physics to understand biological complexity (Figure 1). Systems approaches are considered holistic rather than atomistic. They employ hypothesis-driven as well as discovery-driven approaches. They are designed to address the complexity of biological systems, both normal and perturbed, at multiple levels of organization (from molecules, cells and organs to organisms and ecosystems) [3-6]. A hallmark of biological systems is their non-linear behaviour, and an understanding of the basis and consequences of such behaviour can only be approached by using the tools of systems biology.

The general systems approach can be explained with the example of any machine, for example a car. Traditional reductionists would limit their study to deciphering the functioning of each part, without attempting to decipher how the parts all come together. Though such approaches can help disentangle the individual components of a complex machine, they fail to provide any insight into how these parts must be assembled so as to recreate the car with the properties that we know it to have. Systems-based approaches, on the other hand, would attempt to integrate the individual properties of each part within the context of the appropriate structural organization.

* Immunology Group, International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi, India
Although systems biology is a recently emerging discipline [4,7], its origins date back to as early as the 1930s, when Von Bertalanffy first applied concepts of systems theory to various fields including biology, psychology, social science and economics. He defined wholeness as "problems of organization, phenomena not resolvable into local events, dynamic interactions manifest in the difference of behaviour of parts when isolated or in higher configuration, etc.; in short, systems of various orders not understandable by investigation of their respective parts in isolation." Systems biology is essentially based on this definition.

100 SCIENCE AND CULTURE, MARCH-APRIL, 2012

Figure 1. Systems biology is holistic and integrative. A complete understanding of the system needs integration of biological data with concepts and tools from different fields of science, especially computational and mathematical approaches.

There has been renewed interest in applying systems approaches in biology in the post-genomic era. This has largely been spurred by the explosion in new technologies that can generate high-throughput quantitative data on each of the molecular constituents of the cell. While understanding the role of individual molecular components is still important, the new challenge that these technologies created for biomedical researchers was the integration and interpretation of such large-scale data sets. The advent of high-throughput experimental approaches and other related advances in molecular biology has led to a flood of biological data with little or no interpretation of it. Therefore, the current focus is to develop systematic approaches to mine, integrate, process, represent and, most importantly, analyze such data. This can only be achieved using computational and mathematical tools.

In principle, the goals of systems biology can broadly be divided into two related components. The first is knowledge discovery. This involves data mining to extract hidden patterns from large quantities of experimental data, which then enable the formation of a hypothesis. The second, which follows from this, is simulation-based analysis, in which the hypothesis is tested by performing in silico experiments. Such experiments yield predictions that are subsequently validated by wet-lab experiments involving either in vitro or in vivo procedures.
One of the limitations that biologists face in this exercise is that, hitherto, they have largely dealt with data that are qualitative rather than quantitative. Systems biology, however, demands precise quantitative data on the molecular components being studied. A change in mindset is therefore called for, in which biologists develop and implement analytical tools for multivariate data analysis. Such tools have long been used by mathematicians, theoretical physicists and engineers to study the behaviour of complex systems. In essence, therefore, systems biology constitutes a framework for exploiting high-throughput experimental data to perform predictive, hypothesis-driven science [8].

Development of this field requires comprehensive and quantitative data sets. The models derived need to be validated against the literature and tested experimentally. To increase the predictive power and precision of the models, multiple experimental data sets are integrated, each providing additional information about the system under study. These could include genomic, proteomic, metabolomic or transcriptomic data. Such data sets provide the parts list whose organizational assembly into functional networks needs to be deduced in order to understand the structure and dynamics of the biological system under study. Although the application of systems approaches to biological systems is relatively recent, considerable progress has been made in high-throughput experimental and data-analytical methods for modelling cellular functions [3,4,9]. Such approaches have already found a large number of applications, which include identification of pathway-based biomarkers, deciphering global genetic interaction maps, identification of disease networks and disease genes, molecular diagnostics, and understanding host-pathogen interactions, among others.
The combination of computational, experimental and observational enquiry that systems biology encapsulates has also yielded exciting new promise in the field of drug discovery. It is also finding use in optimizing medical treatment regimes for individual patients. While systems biology is now applied to every branch of biological and biomedical research, the tools that it employs are, in turn, taken from different scientific disciplines such as physics and economics. For example, game theory, which is mainly used in economics, is becoming increasingly relevant to understanding host-pathogen interactions and the strategies of each side. Thus the interdependence of mathematics and biology, together with computer simulation, is the trend of the day, and biological research can no longer be separated from the mathematical and computational sciences.

VOL. 78, NOS

General Approach in Computational Systems Biology

Systems biology involves the application of various concepts and tools of mathematics, computational science and statistics. Data obtained from high-throughput experimentation are first normalized with respect to some control. In statistical terms, normalization is the process of isolating statistical error in repeatedly measured data. The first challenge here is to filter the data by reducing the noise in it to the extent possible. Subsequently, a preliminary analysis is performed to capture the variations in the data between the control and experimental groups. Such variations would reflect the underlying causality of the biological phenomenon being studied, and various clustering algorithms are available for this purpose.

The next step aims to determine whether any correlation exists between the variables and the biological response under investigation, and a variety of methods such as Principal Component Analysis (PCA) can be exploited to this end [10]. PCA, invented by Karl Pearson in 1901, uses an orthogonal transformation to convert a set of observations of potentially correlated variables into a set of values of uncorrelated variables, the principal components. Such a procedure helps to extract the subset of variables that best correlates with the response being investigated. These findings can then be further refined with tools such as Partial Least Squares Regression (PLSR) and the Support Vector Machine (SVM).
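The normalization and PCA steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the measurements are invented toy values, log2 fold change against a control is only one of many possible normalization schemes, and the first principal component is approximated by power iteration rather than by the full eigendecomposition that a numerical library would normally provide. The function names are hypothetical.

```python
import math

def log2_fold_change(samples, control):
    """Normalize each sample to the control: log2(sample / control) per variable."""
    return [[math.log2(x / c) for x, c in zip(row, control)] for row in samples]

def center_columns(rows):
    """Subtract each variable's mean so that every column has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance_matrix(centered):
    """Sample covariance matrix (variables x variables) of column-centred data."""
    n = len(centered)
    p = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_principal_component(cov, iterations=200):
    """Power iteration: approximates the eigenvector with the largest eigenvalue."""
    p = len(cov)
    # Deterministic start vector; it must not be orthogonal to the true answer.
    v = [float(i + 1) for i in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero covariance matrix: no direction of variation
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the matching eigenvalue.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue

# Hypothetical data: 4 samples (rows) x 3 molecular species (columns).
control = [100.0, 200.0, 50.0]
samples = [
    [100.0, 200.0, 50.0],
    [200.0, 400.0, 55.0],
    [400.0, 800.0, 45.0],
    [800.0, 1600.0, 50.0],
]

normalized = log2_fold_change(samples, control)
centered = center_columns(normalized)
cov = covariance_matrix(centered)
loadings, eigenvalue = first_principal_component(cov)

# Fraction of total variance captured by the first component.
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
# Coordinates of each sample along the first component.
scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]

print("loadings:", [round(w, 3) for w in loadings])
print("explained variance fraction:", round(explained, 3))
print("sample scores:", [round(s, 3) for s in scores])
```

In this toy data set the first two species rise together while the third barely changes, so the first component loads almost entirely, and equally, on the first two. Variables with large loadings on the leading components are the candidates one would carry forward into the regression or classification steps.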
PLSR is a statistical method used to find a linear regression model between the predictor variables and the response variables (i.e., the observed responses). Support vector machines (SVMs) are a set of related supervised learning methods that analyze data and recognize patterns, and are used for classification and regression analysis. The original SVM algorithm was invented by Vladimir Vapnik and later refined by Corinna Cortes and Vladimir Vapnik.

Another key aspect of systems biology is the analysis of molecular networks and their properties. The approaches used here are similar to those employed for the analysis of social networks, although in biology the focus is on studying local patterns within the larger network. The development of newer molecular tools has now made it possible to characterize protein-protein interaction networks and gene regulatory networks with ever-increasing accuracy [4,9]. These molecular networks have structures that are similar to one another and even bear similarity to non-biological networks. For example, metabolic, gene regulatory and protein-protein interaction networks have a power-law degree distribution, in which most nodes (or molecular components) interact with only one or a few other nodes, while a few interact with tens or hundreds of others. Such a network analysis provides a global perspective on a given biological process, while at the same time describing the contribution of each individual biomolecule to that process. The application of mathematical modelling to such networks, by employing various kinds of differential equations [5,6,11], can then reveal the underlying mechanisms that drive or regulate this process. Importantly, such mathematical models capture dynamical features of the network, an aspect that is especially relevant since all biological processes occur over defined time spans.

Areas of Applications

Systems biology approaches have been successfully used for molecular diagnostics of diseases such as cancer.
Research efforts in cancer have largely been in the direction of identifying biomarkers (genes or proteins), which are then further investigated as potential drug targets. Efforts to identify cancer biomarkers by traditional molecular approaches have yielded some insight into genes or proteins that may be able to predict disease onset and/or severity. Most techniques are based on the expression of biomarkers at the gene or protein level. However, such approaches have their limitations due to cellular and genetic heterogeneity: biomarkers differ among cell types and also across the same cell type in different patients. Such limitations can be addressed by integrating multiple types of omics data sets from multiple patients with pathway data or protein-protein interaction (PPI) networks [8]. Pathway and PPI data are obtained from literature curation and PPI databases. Overlaying gene expression and proteomics data on PPI networks has emerged as a powerful tool for understanding the molecular basis of such complex diseases. These integrative pathway- and network-based models allow a strong biological interpretation of the relationship between biomarker expression levels and the disease, and they can also be used to predict biomarkers [8]. Some groups have recently identified cancer biomarkers directly from protein-protein interaction networks.
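As a rough illustration of what overlaying expression data on a PPI network can mean in practice, the sketch below scores each protein's immediate network neighbourhood by the aggregated expression change of its members, and also tabulates how many proteins have each number of interaction partners. It is a toy under stated assumptions, not a published method: the network, the protein names and the per-protein z-scores are invented, and the aggregate score (sum of z-scores divided by the square root of the group size) is just one simple choice among many.

```python
import math

# Hypothetical undirected PPI network, given as a list of interacting pairs.
ppi_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E"), ("E", "F")]

# Hypothetical differential-expression z-scores (disease vs. control) per protein.
z_scores = {"A": 2.0, "B": 1.8, "C": 1.7, "D": 1.5, "E": -0.2, "F": 0.1}

def build_adjacency(edges):
    """Turn an edge list into a dict mapping each node to its set of neighbours."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def degree_distribution(adjacency):
    """Count how many nodes have each degree (number of interaction partners)."""
    counts = {}
    for neighbours in adjacency.values():
        degree = len(neighbours)
        counts[degree] = counts.get(degree, 0) + 1
    return counts

def neighbourhood_score(adjacency, z, seed):
    """Aggregate z-score of a seed protein together with its direct interactors."""
    members = {seed} | adjacency[seed]
    values = [z[m] for m in members if m in z]
    return sum(values) / math.sqrt(len(values))

adjacency = build_adjacency(ppi_edges)
scores = {node: neighbourhood_score(adjacency, z_scores, node) for node in adjacency}
best_seed = max(scores, key=scores.get)

print("degree distribution:", degree_distribution(adjacency))
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
print("highest-scoring neighbourhood is centred on:", best_seed)
```

Scoring a group of interacting proteins rather than a single gene is what lets a network-based marker tolerate the patient-to-patient heterogeneity mentioned above: different members of the same neighbourhood may be altered in different patients while the aggregate signal remains.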

Systems approaches have greatly accelerated the search for disease-causing genes in complex genetic disorders. It is now well established that cellular functions cannot be attributed to single molecules; they are carried out by groups of molecules in a modular manner. In general, modularity is a hallmark of biology, and a module refers to a group of physically or functionally linked molecules (nodes) that work together to achieve a (relatively) distinct function [12]. For example, a module can be structural, such as the proteins that form the ribosome, or temporal, i.e., molecules working together at a specific stage of a temporal process such as signal transduction or the cell cycle [12]. Research has shown that the same disease is often caused by different genes in different individuals; these genes are, however, functionally related by being part of the same functional module. Such approaches also rely on integrating data from multiple types of genome-wide analysis, and the disease genes are inferred from interaction networks. The networks and modules obtained shed light on the dynamics of the biological process or disease (Figure 2).

Figure 2. A hypothetical protein-protein interaction (PPI) network consisting of proteins, depicted by the green and pink nodes, and the interactions between them, depicted as links. The pink nodes may represent proteins involved in a specific biological process or disease.

In addition, stem cell biology is another field where the tools of systems biology are being aggressively applied, particularly for the computation of cell fate and cellular programming. Such studies have led to the identification of embryonic stem cell transcription factors, their downstream targets, and the core embryonic regulatory networks that control cell fate decisions.

Systems biology has also greatly impacted the field of drug discovery, and its application has greatly increased the probability of success in these ventures.
Systems biology has not only helped in understanding biological systems, but also reveals how perturbations, such as those caused by drugs, affect such systems. Further, the delineation of the core network of interactions or modules regulating cellular processes has also helped in the identification of probable drug targets for therapy [13]. Drug discovery efforts, especially in the case of complex genetic disorders such as cancer, have often been hampered by the complexity of the cellular signalling network. Traditionally, drug discovery involves the screening of random chemical libraries against a biological target, such as a pharmacologically important protein.

Systems approaches are applied in multiple ways in drug discovery. They can aid in the identification of novel drug targets. Many disease-causing genes and proteins have been identified and are also targeted for therapy; COX-2 inhibitors for general inflammation and arthritis are one example [14]. However, success using conventional approaches has only been observed in unifactorial diseases, with little or no success in complex disorders. The bigger challenge for drug discovery is multifactorial disorders such as cancer, diabetes and infectious disease. Biological processes, including disease progression, are dynamic. A complete systems view of such processes can emerge only when there is better resolution of the networks and regulators that control their different stages. Therefore, another application in drug discovery is the elucidation of signalling relationships, which may enable targeting of the most appropriate node in the signalling network. Here again, the use of orthogonal experimental approaches integrated with computational and network tools has been useful in identifying sets of dynamic functional modules and their constituent proteins. The least redundant nodes from these functional modules may prove to be important drug targets.
Systems approaches are also being used to identify a smaller set of chemical compounds that may be tested experimentally in drug screening, using methods of structural systems biology. Such methods employ existing protein structure and interaction data, and protein interactions are modelled using computational tools such as molecular docking. As a result, the design of structurally defined libraries that target specific groups of molecules, such as receptors, can greatly reduce the number of compounds to be screened.

Systems approaches are also being applied to medicine. Conventional medical practices are also largely reductionist, such that the disease or the diseased part of the body is identified and then treated in isolation from the rest of the body. However, no system can function in isolation within the body, and diseases emerge from a complex system that comprises the whole body. A systems approach that helps to understand disease complexity can yield novel insights into combating disease states. Systems biology has the potential to transform healthcare and how we understand and deal with disease. The use of PPI networks in medicine has helped in the identification of disease genes and disease modules. A disease module represents a group of network components that together contribute to a cellular function, and whose disruption results in a particular disease phenotype [13]. Systems approaches have already been applied to understand a wide range of diseases and pathophenotypes, including several different types of cancer, neurological diseases, cardiovascular diseases, systemic inflammation, obesity, atherosclerosis and type-2 diabetes, among others. Advancements in existing technologies and the development of new technologies such as next-generation sequencing, high-throughput proteomics and single-cell analyses, along with developments in mathematical and computational tools, are likely to transform medicine to be more predictive, personalized, preventive and participatory (P4) [15].

Traditional Medicine and Systems Biology

Though the concepts of systems biology and systems medicine are relatively new in modern science and medicine, such holistic approaches have in fact been practiced in many parts of the world as alternative or traditional medicine.
Some of the prominent forms still practiced in Asia include Unani medicine, Ayurveda and traditional Chinese medicine, among others. Such approaches are referred to as complementary or integrative medicine when combined with conventional medical practices. Many of these are based on cultural or historical traditions, in contrast to modern practices, which have a conventional scientific basis. The scientific basis for the efficacy of such practices is either not understood or has not been examined by modern scientific methods. Though often criticised by conventional medical practitioners, such approaches are successfully used by practitioners of alternative medicine, and some have also been tested with promising results. Therefore, the inability of conventional scientific techniques to explain their mode of action and efficacy cannot alone be used to underplay their potential as therapeutic agents. It is important to recognize that many conventional medical practices are also empirical and may not have sound evidence-based support.

The approaches used by traditional medicine are not just medical practices but healing systems, which focus on disease management rather than just a cure. For example, in the case of Unani medicine, tackling a disease involves understanding the cause, the aggravating factors, disease pathogenesis, its pathology and its clinical manifestations. In Chinese medicine, the approach is again holistic: the body is viewed as a set of inter-related organ systems carrying out a set of functions rather than in terms of individual organs. Similarly, Ayurveda has a holistic approach based on wellness. Health management is both preventive and curative, and clearly calls for a systems perspective. Alternative medicines are often based on plants and herbs, but may also contain substances from meats and other animal products. Addition of minerals is also prevalent.
As opposed to conventional medicines, they represent complex mixtures (often undefined), possibly containing multiple active components. A parallel can be drawn to the new-age multi-component drugs arising from the concepts of multi-targeted therapy. Such drugs are gaining popularity as effective treatments for complex disorders such as cancer. Such medicines not only contain multiple components; they probably also affect multiple targets to bring about the desired effects.

Conclusion

As described above, systems approaches that combine systematic high-throughput data generation with computational and network tools have proven to be powerful discovery tools. The way forward is the improvement of existing experimental methods and the integration of multiple disciplines to solve the questions and problems of biological complexity. Though in its nascent stage, systems biology holds great promise and potential to deliver novel insights into the complexity of biological systems, and to help understand and tackle disease.

References

1. K. Strange, Am. J. Physiol. Cell. Physiol. 288, C (2005).
2. F.E. Bloom, J. Neurosci. 21, (2001).
3. C. Auffray, Z. Chen and L. Hood, Genome Med. 1, 2 (2009).
4. H. Kitano, Nature. 420, (2002).
5. M. Lakshmanan and S. Rajasekar, Nonlinear dynamics: integrability, chaos and patterns (Springer-Verlag, Berlin, 2003).
6. J.D. Murray, Mathematical biology: I. An introduction (Interdisciplinary applied mathematics) (Springer, New York, third edition, 2001).
7. T. Ideker, T. Galitski and L. Hood, Annu. Rev. Genomics Hum. Genet. 2, (2001).
8. H.Y. Chuang, M. Hofree and T. Ideker, Annu. Rev. Cell. Dev. Biol. 26, (2010).
9. T.G. Buchman, Nature. 420, (2002).
10. P. Pevzner and N.C. Jones, An introduction to bioinformatics algorithms (MIT Press, 2004).
11. Y. Kuang, Delay differential equations with application in population dynamics (Academic Press, New York, 1993).
12. A.L. Barabasi and Z.N. Oltvai, Nat. Rev. Genet. 5, (2004).
13. A.L. Barabasi, N. Gulbahce and J. Loscalzo, Nat. Rev. Genet. 12, (2011).
14. E. Davidov, J. Holland, E. Marple and S. Naylor, Drug Discov Today. 8, (2003).
15. L. Hood and S.H. Friend, Nat. Rev. Clin. Oncol. 8, (2011).