Comparative Study of Intelligent Soft-Sensors for Bioprocess State Estimation


Rimvydas Simutis, Vytautas Galvanauskas, Donatas Levisauskas, Jolanta Repsyte, and Vygandas Vaitkus
Process Control Department, Kaunas University of Technology, Kaunas, 38, Lithuania

Abstract: In this article, the application of soft-sensors for indirect determination of biomass and product concentration in a complex fed-batch biotechnological process is discussed. Three advanced techniques for soft-sensor design were investigated: feed-forward artificial neural networks, a support vector regression model, and a relevance vector regression model. Glucose/lactose feed rates and the oxygen uptake rate along with its integrated quantity were used as direct reference measurements for estimation. The estimation quality of the analyzed soft-sensors was tested using data generated by a mechanistic process model. All three analyzed estimation techniques provided very similar estimation results from a statistical point of view; nevertheless, the regression models have some advantage because of their simplicity. Based on that, recommendations for application of the elaborated soft-sensors are given.

Index Terms: state estimation, biomass and product concentrations, artificial neural networks, support vector regression, relevance vector regression.

I. INTRODUCTION

Biotechnology and bioengineering play an increasing role in modern industry and the health sector. Many of the important new active pharmaceutical ingredients are recombinant therapeutic proteins. They are formed in genetically modified microorganisms or animal cells. These biotechnological processes are very complicated and require precise real-time monitoring and state estimation of process variables [], []. In state estimation, one is faced with the problem of determining in real time the actual state of the process, represented by a set of important process variables. Without information about these variables, it is impossible to supervise and control biotechnological processes efficiently. Preferably, various measurement devices are employed to determine the values of the quantities characterizing the state of the bioprocess []. Nevertheless, the key quantities of the process state (e.g., biomass and product concentrations, biomass specific growth rate, substrate specific consumption rate, etc.) cannot be measured on-line, at least not with justifiable expenditure and/or sufficiently short response times with respect to changes in the process. And even if the key quantities can be determined, it is often not with the desired accuracy or at the desired sampling frequency. Hence the quantities must be determined indirectly using soft-sensors [3]-[].

The central idea behind a soft-sensor is to use (relatively) easily accessible on-line data for the estimation of other important process variables. Soft-sensors determine important process variables from easily measurable quantities by means of expressions relating the corresponding quantities to each other. Easy-to-use black-box estimators can be taken to represent the complex relationships between the on-line measured variables and the bioprocess state. The diversity of black-box soft-sensors for process monitoring is very broad []. However, more and more attention has recently been drawn to so-called intelligent soft-sensors [7]. In particular, flexible artificial neural networks, support vector machines, and relevance vector machines are often the focus of researchers. Unfortunately, comparative analysis of these techniques for monitoring of biotechnological processes is not well explored. This paper investigates the quality of the above-mentioned intelligent soft-sensors for estimation of biomass and product concentration during a recombinant protein production process.

The paper is structured as follows. First, the biotechnological process and the data generation procedure are presented. Section 3 provides information about the employed soft-sensor techniques. The quality of the employed soft-sensors and recommendations for users are discussed in the final section.

[Figure. Typical trajectories of the process variables for the analyzed biotechnological process. Plotted variables include S, X, L, A, P, O, F, OUR, and qp.]

Manuscript received June, 3; revised August 7, 3.
3 Engineering and Technology Publishing doi:.7/jolst

II. BIOTECHNOLOGICAL PROCESS AND INPUT DATA FOR SOFT-SENSORS

To make a comparative analysis of the intelligent soft-sensors, the authors applied them for state estimation in a complex recombinant protein production process. The data for this analysis was generated by a first-principle mathematical model developed for the recombinant protein production process [8], [9]. In this fed-batch process, Escherichia coli B carrying the plasmid pubs was used as the host. A second plasmid p3, which is a derivative of the pkk3-3, was introduced into the strain. The bacterium was able to express the Fv fragment of the antibody MAK33 under the control of the tac-promoter. The structure and the parameters of the first-principle mathematical model were identified using data from a series of fed-batch processes on a defined medium. Initially, the culture was grown on a glucose-based medium until the glucose was depleted and the biomass concentration reached a value of about - [g kg-]. Then, the production of the recombinant protein was induced by addition of lactose, which was utilized by the cells simultaneously as inducer and as carbon source. The biomass growth and product formation were controlled by means of the glucose and lactose feed rates (FS and FL) and dissolved oxygen (O) concentration control. Additional details on the experimental set-up for the analyzed process, the model development steps, and the process optimization procedure are given in [8], [9].

The mathematical model of the fed-batch cultivation process consists of differential equations for the main state variables: biomass (X), glucose (S), acetate (A), lactose (L), target product (P), dissolved oxygen (O), and broth weight (W). Additionally, the specific reaction rates for biomass growth (µ), glucose and lactose consumption (qs, ql), acetate production/consumption (qa), and target product formation (qp) were built using the well-known Monod, Haldane, Moser, and Pirt-type kinetics.
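As a brief illustration of the kinetic building blocks named above, the following sketch evaluates the textbook Monod and Haldane rate expressions. It is not the authors' model: the function names and all parameter values (`MU_MAX`, `K_S`, `K_I`) are hypothetical and chosen only to show the shape of the curves.

```python
def monod(s, mu_max, k_s):
    """Monod kinetics: the rate saturates as substrate concentration s grows."""
    return mu_max * s / (k_s + s)


def haldane(s, mu_max, k_s, k_i):
    """Haldane kinetics: Monod form with substrate inhibition at high s."""
    return mu_max * s / (k_s + s + s * s / k_i)


# Hypothetical parameter values, for illustration only (not from the paper).
MU_MAX, K_S, K_I = 0.5, 0.1, 10.0

for s in (0.0, 0.1, 1.0, 10.0):
    # Print substrate level with the two specific rates side by side.
    print(s, monod(s, MU_MAX, K_S), haldane(s, MU_MAX, K_S, K_I))
```

With these forms, the Haldane rate never exceeds the Monod rate for the same parameters, because the inhibition term only enlarges the denominator.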
This first-principle mathematical model was used as a virtual bioreactor to generate various data sets for the soft-sensors. The process can be divided into two distinct phases: biomass (X) growth on glucose (S), and biomass growth on lactose (L) with simultaneous product (P) formation. In the first phase, during the biomass growth on glucose, acetate (A) is formed. Its production rate was described by means of bottleneck kinetics. In the second part of the first phase (3.-. cultivation hours), when the glucose consumption rate drops below the critical value due to substrate limitation, acetate is gradually consumed. In the first phase there is no product generation, as the inducer (lactose) is not present in the medium. In the second phase, which starts immediately after the glucose and acetate are consumed and lactose is fed, the production of the recombinant protein (P) is induced. The protein specific production rate depends on many factors, but the most important ones are a relatively high specific growth rate (µ. [h-]) and a sufficient inducer (lactose) concentration (L. [g kg-]) in the medium. The oxygen uptake rate (OUR) and the carbon dioxide production rate (CPR) are related to the biomass growth rate and biomass concentration. This relationship was modelled using the well-known Luedeking-Piret equation with different yield coefficients in the growth and product production phases. Typical trajectories of the analyzed biotechnological process are presented in Fig..

To design the soft-sensors for biomass and product concentration estimation, data sets with various glucose and lactose feed profiles were generated. Randomly selected data sets were used to identify the parameters of the estimators, and further data sets were used for testing the quality of the estimators. The following on-line measurable variables were chosen as input variables for the estimators: glucose feed rate, lactose feed rate (adjusted for reactor weight, W), oxygen uptake rate (OUR), and the integrated value of OUR.
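The four estimator inputs listed above can be assembled from sampled signals roughly as follows. This is a minimal sketch under stated assumptions: "adjusted for reactor weight" is read here as dividing both feed rates by W, the integral of OUR is taken with the trapezoidal rule, and the function names are hypothetical rather than taken from the paper.

```python
def cumulative_trapezoid(values, dt):
    """Running integral of equally spaced samples using the trapezoidal rule."""
    integral = [0.0]
    for prev, curr in zip(values, values[1:]):
        integral.append(integral[-1] + 0.5 * (prev + curr) * dt)
    return integral


def build_inputs(f_glucose, f_lactose, weight, our, dt):
    """Return one row per sample: [F_S / W, F_L / W, OUR, integral of OUR].

    Assumption: weight adjustment means division by the broth weight W.
    """
    our_int = cumulative_trapezoid(our, dt)
    return [
        [fs / w, fl / w, o, oi]
        for fs, fl, w, o, oi in zip(f_glucose, f_lactose, weight, our, our_int)
    ]


# Hypothetical three-sample example with a sampling interval of 0.5 h.
rows = build_inputs(
    f_glucose=[0.2, 0.2, 0.0],
    f_lactose=[0.0, 0.0, 0.1],
    weight=[10.0, 10.1, 10.2],
    our=[1.0, 1.0, 1.0],
    dt=0.5,
)
for row in rows:
    print(row)
```

The integrated OUR carries information about the cumulative amount of oxygen consumed, which is why it is a natural companion input to the instantaneous rate when estimating biomass.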
Output variables for the soft-sensors were biomass concentration and product concentration with a measurement interval of. hour. To secure more realistic working conditions, all measurement data were disturbed by adding white noise (3% of the real value) to the simulation data. The quality of the soft-sensors was evaluated by comparing the root mean square errors (RMSE) between true and estimated values of biomass (RMSE X) and product (RMSE P) concentrations.

III. ESTIMATORS

A. Model Based on Artificial Neural Network

Artificial neural networks (ANN) are universal and highly flexible function approximators, first used in pattern recognition, classification, and time series forecasting tasks [], [] and recently also employed in bioprocess monitoring tasks [], [3]. In this paper, the authors use feed-forward artificial neural networks for estimation of biomass and product concentrations in a bioprocess. The general idea behind the ANN-based soft-sensor is to let the ANN map the nonlinear relationships between various measurable process variables and the biomass and product concentrations. The most important components in the design of neural network models are the structure of the chosen ANN and the data necessary to train the network. In this study, the authors used the data from the virtual reactor described above to train and evaluate a three-layer feed-forward neural network. The neural network was trained using the Levenberg-Marquardt optimization method, with the root mean square error between the predicted and real values as the optimization criterion. The input variables for the ANN were coded values of the glucose feed rate, the lactose feed rate (adjusted for reactor weight, W), the oxygen uptake rate (OUR), and the integrated value of OUR. The output variables of the ANN were the biomass and product concentrations. The number of neurons in the hidden layer was chosen to be relatively high ( hyperbolic tangent neurons in the hidden layer).
Such a neural network can approximate very complex relationships between input and output variables, but the generalization properties of such a complex network can be very poor. Consequently, the authors used a special cross-validation procedure for ANN training. For that purpose, a special regularization term D was included in the training criterion of the ANN, according to the following equation:

$$E = \frac{1}{2N}\sum_{i=1}^{N}\left(y_{d,i} - y_i\right)^2 + \frac{1}{2N}\, w^{T} D\, w, \qquad D = \alpha I \tag{1}$$

where y_d and y are the desired and real outputs, w the ANN weights, I the unit matrix, α the regularization term, and N the number of data. The regularization term makes it possible to control the complexity of the ANN and to improve the ANN's generalization quality []. The optimal regularization term α was determined using a cross-validation procedure. The software package NNSYSID [] was used for realization of the ANN-based soft-sensor.

B. Model Based on Support Vector Regression Method

The support vector machine (SVM) procedure is a new and promising technique for data classification and regression []. The basic idea of SVM is to map linearly non-separable training data from the input space into a higher dimensional feature space by means of a special function Φ and then to construct a separating hyperplane with maximum margin in the feature space. Consequently, although one cannot determine a linear function in the input space to decide the class of the given data, one can easily determine a hyperplane that discriminates between two classes of data. Support vector regression (SVR) is a special modification of the support vector machine technique dedicated to solving regression problems. Given training data (x_1, y_1), (x_2, y_2), ..., (x_l, y_l), where x_i are input vectors and y_i are the associated output values, support vector regression solves the following optimization problem [], []:

$$\min_{w,\,b,\,\xi,\,\xi^{*}} \ \frac{1}{2}\, w^{T} w + C \sum_{i=1}^{l} \left(\xi_i + \xi_i^{*}\right)$$
$$\text{subject to} \quad y_i - \left(w^{T}\Phi(x_i) + b\right) \le \varepsilon + \xi_i, \quad \left(w^{T}\Phi(x_i) + b\right) - y_i \le \varepsilon + \xi_i^{*}, \quad \xi_i,\ \xi_i^{*} \ge 0, \quad i = 1, \dots, l \tag{2}$$

where x_i is an input variable mapped to a higher dimensional space by the nonlinear function Φ; ξ_i and ξ_i^* are the upper and lower training errors subject to the ε-insensitive tube |y − (w^T Φ(x) + b)| ≤ ε; and w is the model parameter vector. The parameters that control the regression quality are the cost of error C, the width of the insensitive tube ε, and the mapping function Φ. These parameters must be set by the user. The constraints in equation (2) imply that most data x_i should lie in the tube |y − (w^T Φ(x) + b)| ≤ ε. If x_i is not in the tube, there is an error ξ_i or ξ_i^*, which one would like to minimize in the objective function. A graphical illustration of the ε-insensitive loss function is shown in Fig..

Figure. Graphical illustration of the ε-insensitive loss function [], [7].

The SVR technique avoids underfitting and overfitting the training data by minimizing the training error C Σ (ξ_i + ξ_i^*) as well as the regularization term ½ w^T w. In the case of the classical least-squares regression technique, ε is always zero and the original data is not mapped into higher dimensional spaces. Equation (2) can be minimized by solving a quadratic programming problem, which is uniquely solvable. This is a very important characteristic of support vector regression techniques, because the training of an SVM involves only the solution of a quadratic optimisation task, which has one unique solution and does not involve the random initialisation of weights, as the training of an ANN does. There are many publicly available software libraries for realization of SVR methods. The authors used the well-known public SVM software library LIBSVM [8] in their experimental investigations.

When training an SVR model, the user must choose some important parameters, which influence the performance of the model. In order to get a satisfactory model, these parameters need to be selected properly. As already mentioned, the most important parameters are the mapping function, the cost of error C, and the width of the ε-insensitive tube. The radial basis function (RBF) is one of the most commonly used kernel functions in the SVR technique, and the authors used it in their experiments.

C. Model Based on Relevance Vector Regression Method

The relevance vector regression technique is a Bayesian sparse kernel technique for regression that is very similar to the support vector regression technique but uses Bayesian inference to obtain the parameters of the regression model [9]. When designing soft-sensors, the objective is to find an underlying functional model y(x) that estimates output values well, given the input vector x, and is not compromised by noisy, real-world data. Such a model y(x) could be of the following form:

$$y(x; w) = \sum_{i=1}^{M} w_i\, \phi_i(x) = w^{T} \phi(x) \tag{3}$$

This model is linear in the parameters and has a number of analytic advantages. Nevertheless, by choosing the basis functions φ_i(x) to be nonlinear, y(x; w) will be nonlinear too. In particular, the basis functions are often given by kernels, with one kernel associated with each of the data points from the training set. This type of model is very flexible, and if the statistical complexity of the model is appropriately managed, it can be applied very effectively to build efficient soft-sensors for various processes. In the presented experiments the authors used radial basis function kernels for the RVR model. Recently, the sparse Bayesian methodology has been designed to estimate the regression parameters w_i very efficiently [9]. As a result, the relevance vector regression models derived by this methodology comprise only a few non-zero parameters w_i, and the model incorporates only a compact set of basis functions. In this application the authors used the SparseBayes software package (MATLAB environment) for implementation of the RVR algorithms [8].
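Two building blocks from this section can be sketched in a few lines: the ε-insensitive loss used by SVR and the kernel expansion of Eq. (3) with radial basis functions. This is an illustrative sketch only, not the LIBSVM or SparseBayes implementation; the function names, the kernel width parameter `gamma`, and all numeric values are assumptions introduced here.

```python
import math


def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the tube |y_true - y_pred| <= eps, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def rbf_kernel(x, center, gamma):
    """Radial basis function kernel exp(-gamma * ||x - center||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-gamma * sq_dist)


def kernel_model(x, centers, weights, gamma, bias=0.0):
    """Kernel expansion y(x) = bias + sum_i w_i * k(x, x_i).

    Terms with zero weight are skipped, which mirrors how a sparse
    model keeps only a compact set of basis functions.
    """
    return bias + sum(
        w * rbf_kernel(x, c, gamma)
        for c, w in zip(centers, weights)
        if w != 0.0
    )


# Hypothetical example: three candidate centers, only one non-zero weight.
centers = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = [2.0, 0.0, 0.0]
prediction = kernel_model([0.0, 0.0], centers, weights, gamma=1.0)
print(prediction, eps_insensitive_loss(2.05, prediction, eps=0.1))
```

In this toy case the prediction error of 0.05 lies inside the tube of width 0.1, so it contributes no loss, and only one of the three kernels takes part in the prediction.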

IV. RESULTS AND CONCLUSIONS

The quality of the analyzed soft-sensors was evaluated by calculating the root mean square error and the maximum error for biomass and product concentration estimates in test cultivation runs. All three analyzed estimation techniques provided very similar estimation results from a statistical point of view. The testing results are presented in Table I.

TABLE I. TESTING RESULTS OF THE ANALYZED SOFT-SENSORS FOR ESTIMATION OF BIOMASS AND PRODUCT CONCENTRATION

Columns: Soft-sensor (technique); Biomass concentration (RMSE X, Max-error); Product concentration (RMSE P, Max-error)
Rows:
Feed-forward ANN (estimation results depend on the initial weights; consequently, the average of estimation trials is presented here)
Support Vector Regression Model
Relevance Vector Regression Model
Values as extracted: .37 ±..8 ± ±.3. ±

The biomass estimation RMSE X for all three soft-sensors was around .38 and RMSE P for product concentration was around .. The maximum estimation error was around 3 for biomass estimation and around . for product estimation. The obtained estimation quality, especially for biomass concentration, is good enough to consider the analyzed soft-sensors for practical implementation in real bioreactors. Typical estimation patterns for biomass and product concentration in two cultivation runs are shown in Fig. 3.

Figure 3. Typical patterns of biomass and product concentration estimation for two cultivation runs. [Panels show X and P versus t, h; legend: true value, measurements, fANN, SVR, RVR.]

The obtained results are somewhat at odds with the current opinion about the capabilities of SVR and RVR techniques. It is assumed that SVR and RVR should provide significantly better results than traditional feed-forward neural networks. Despite extensive attempts by the authors to find the best parameters for the SVR and RVR models, their experimental results could not confirm this opinion. Nevertheless, SVR and RVR techniques have some important features that are very useful in practical applications and could give some advantage over ANN-based soft-sensors:
- SVR and RVR estimators require less time and expertise to design and train than feed-forward artificial neural networks;
- SVR is trained with a structured algorithm (quadratic optimisation), which has one unique solution and so consistently produces the same results when trained with identical data and parameters;
- the RVR algorithm fixes the complexity of the model automatically, using Bayesian inference to obtain a compact model;
- SVR and RVR techniques are more robust for soft-sensor models with multidimensional inputs.

In addition, the ANN-based sensor has one significant drawback: the estimation results depend on the initial weights of the neural network. Consequently, when designing such sensors, different initial weights should be tested and only the best network should be selected for the real application. In the future, the authors are planning to carry out more extensive studies to widen the application area of SVR and RVR estimators in various industrial bioprocesses.

ACKNOWLEDGEMENTS

This research was funded by a grant (No.MIP- /3) from the Research Council of Lithuania.

REFERENCES

[] B. Sonnleitner, Instrumentation of biotechnological processes, in Advances in Biochemical Engineering and Biotechnology, B. Sonnleitner, Ed., Springer, 999, pp
[] C. F. Mandenius, Recent developments in the monitoring, modeling and control of biological production systems, Bioproc. Biosyst. Eng, vol., pp. 37 3,.
[3] A. Chéruy, Software sensors in bioprocess engineering, J. Biotechnol., vol., pp , 997.
[] R. Luttmann, D. G. Bracewell, Cornelissen, G. Gernaey, et al., Soft sensors in bioprocessing: A status report and recommendations, Biotechnology Journal, vol. 7, no. 8, pp. -8,.
[] K. Kiviharju, K. Salonen, U. Moilanen, and T. Eerikäinen, Biomass measurement online: the performance of in situ measurements and software sensors, J. Ind. Microbiol. Biotechnol, vol. 3, pp. 7, 8.
[] P. Kadlec, B. Gabrys, S. Strandt, Data-driven soft sensors in the process industry, Comput. Chem. Eng, vol. 33, pp.79 8, 9.
[7] M. M. Zhang and X. G. Liu, A soft sensor based on adaptive fuzzy neural network and support vector regression for industrial melt index prediction, Chemometrics and Intelligent Laboratory Systems, vol., pp. 83 9, July 3.
[8] V. Galvanauskas, R. Simutis, N. Volk, and A. Lübbert, Model based design of a biochemical cultivation process, Bioprocess and Biosystems Engineering, vol. 8, no. 3, pp. 7-3,

[9] V. Galvanauskas, N. Volk, R. Simutis, and A. Lübbert, Design of recombinant protein production processes, Chemical Engineering Communications, vol. 9, no., pp.73-78,.
[] S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice-Hall, Inc., 999.
[] C. M. Bishop, Pattern Recognition and Machine Learning, Springer,.
[] M. Jenzsch, R. Simutis, G. Eisbrenner, I. Stückrath, and A. Lübbert, Estimation of biomass concentrations in fermentation processes for recombinant protein production, Bioprocess and Biosystems Engineering, vol. 9, no., pp. 9 7,.
[3] J. Glassey, M. Ignova, A. C. Ward, G. A. Montague, and A. J. Morris, Bioprocess supervision: neural networks and knowledge based systems, Journal of Biotechnology, vol., pp. -, 997.
[] M. Nørgaard and P. M. Norgaard, Neural Networks for Modelling and Control of Dynamic Systems: A Practitioner's Handbook (Advanced Textbooks in Control and Signal Processing), Springer..
[] V. Vapnik, Statistical Learning Theory, Wiley, New York, NY, 998.
[] B. Schölkopf and A. J. Smola, Learning with Kernels, MIT Press, Cambridge, MA,
[7] J. A. Suykens, T. Gestel, J. Brabanter, B. Moor, and J. Vanderwalle, Least squares support vector machines, World Scientific, New Jersey,.
[8] C. C. Chang and C. J. Lin, LIBSVM: A Library for Support Vector Machines,.
[9] M. E. Tipping, Sparse Bayesian learning and the relevance vector machine, Journal of Machine Learning Research, vol., pp.,.
[] Sparse Bayes Version. Software Package for Matlab. (3) [Online]. Available:

Rimvydas Simutis is professor of automation and system engineering at the Process Control Department, Kaunas University of Technology, Lithuania. His research interests include computational intelligence methods and intelligent control systems with application in bioengineering and technical systems.

Vytautas Galvanauskas is professor of control engineering at the Process Control Department, Kaunas University of Technology, Lithuania. His research interests include modern modeling, optimization, and control techniques with application in bioengineering and technical systems.

Donatas Levisauskas is professor of control engineering at the Process Control Department, Kaunas University of Technology, Lithuania. His research interests include adaptive control systems and optimization techniques with application in biotechnology.

Jolanta Repsyte is associate professor of automation and control engineering at the Process Control Department, Kaunas University of Technology, Lithuania. Her research interests include advanced automation systems with application in industrial processes and waste water treatment plants.

Vygandas Vaitkus is associate professor of control engineering and system analysis at the Process Control Department, Kaunas University of Technology, Lithuania. His research interests include computational intelligence and machine learning algorithms with application in financial and technical systems.