OPTIMIZATION OF FED-BATCH BAKER'S YEAST FERMENTATION PROCESS USING LEARNING ALGORITHM


Proceedings of th regional symposium on Chemical Engineering (RSCE)

H. S. E. Chuo, M. K. Tan, H. J. Tham and K. T. K. Teo
School of Engineering and Information Technology, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia

ABSTRACT

Industrial fed-batch fermentation is a highly nonlinear dynamic process that requires good control and monitoring to be optimized. Because of the various uncertainties in the process, obtaining the desired product quality at lower cost and without compromising the environment has always been an issue of interest in chemical process control. This paper reports the optimization of substrate feeding to maximize yeast production and to minimize ethanol production under the influence of process disturbances, using the Q-learning (QL) algorithm. QL is a form of reinforcement learning. It is of interest to investigate the ability of QL to explore and handle the uncertainties of a dynamic nonlinear system without prior experience of the system. The algorithm applies step-by-step learning to estimate the effect of system upsets and to seek the best path to optimize the process. In addition, the ethanol concentration must be monitored to prevent the quality of the yeast from deteriorating as the ethanol concentration increases. The performance of the proposed controller is evaluated in terms of the increase in the final amount of yeast produced and the disturbance rejection.

INTRODUCTION

Fed-batch operation is favoured because its input stream can be manipulated according to process needs and design. Resource wastage, propagation of process errors and extreme conditions can thereby be avoided while product quality is maintained. In recent years, industrial fed-batch fermentation processes have been widely used to boost bioproduction (Hoek et al., ).
In a fermentation process, yeast relies on sufficient sugar (substrate), oxygen and nutrients for growth, producing carbon dioxide and alcohol as metabolites. Good control of substrate feeding, oxygen concentration and broth conditions therefore determines how robustly the yeast grows. However, challenges remain in fermentation process control. Johnson () summarized the difficulties in measuring process variables and determining set-point values, since the process dynamics cause the parameters and the optimal trajectory to shift over time. Because an accurate model is difficult to obtain, controllers such as fuzzy and neural-network controllers have been developed to study the behaviour of yeast metabolism without the need for a mathematical model (Jin et al., ; Hisbullah et al., ). In addition, numerical convergence methods such as evolutionary algorithms (Yüzgeç, ; Franco-Lara and Weuster-Botz, ; Chiou and Wang, ) and dynamic programming, which performs iterative state searching (Peroni et al., ; Berber et al., ), have been suggested for optimal control. This work aims at an unsupervised learning agent that can explore and interact with its environment and learn the best conditions for fermentation. Q-learning (QL) is used to determine the optimal substrate feeding under the influence of process disturbances, and its performance in maximizing yeast production at low ethanol concentration is reported.

MAL-- Page

PROCESS DESCRIPTION

To maximize yeast production, it is necessary to keep the yeast growing in its exponential phase, in which reproduction is favoured. In this phase the substrate, oxygen and nutrients must be sufficient to sustain growth. However, overfeeding of substrate accelerates ethanol production, which eventually affects the yeast quality. This phenomenon is called the bottleneck hypothesis or overflow metabolism (Sonnleitner and Käppeli, ; Pham et al., ). Conversely, a low substrate concentration lowers ethanol production but also lowers the reproduction of yeast. Under substrate insufficiency and starvation, a small amount of ethanol tends to be consumed as food (Sonnleitner and Käppeli, ; Pham et al., ). In this paper, the industrial-based fermentation model developed by Karakuzu et al. (Yüzgeç, ) is used as the simulation plant, as shown in Table, with the plant parameters shown in Table. The process is run for h by manipulating the flow rate of the substrate feeding stream to achieve the optimization.

Table : Kinetic and dynamic model (Yüzgeç, ; Karakuzu et al., ; Pham et al., ; Sonnleitner and Käppeli, )

Parameters (kinetic equations):
  Glucose uptake rate
  Oxidation capacity
  Specific growth rate (limit): oxidative metabolism, reductive metabolism
  Ethanol uptake rate: oxidative metabolism
  Ethanol production rate
  Total specific growth rate
  Carbon dioxide production rate
  Oxygen consumption rate
  Respiratory quotient

Component (dynamic mass balance equations):
  Glucose mass balance
  Oxidation balance
  Ethanol mass balance
  Biomass balance
  Total change of volume
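The bottleneck (overflow) behaviour described above can be illustrated with a short sketch. It is not the plant model of the paper: it assumes Monod glucose uptake, a constant oxidative capacity (that is, oxygen is taken as non-limiting), and fixed yields. The class `KineticParams`, the function `overflow_rates` and every numerical value are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KineticParams:
    """Illustrative values only; NOT the plant parameters of the paper."""
    qs_max: float = 3.0     # max specific glucose uptake rate, g/(g h)
    ks: float = 0.2         # glucose saturation constant, g/l
    qs_ox_cap: float = 1.0  # glucose flux the respiratory bottleneck can oxidize
    y_x_ox: float = 0.5     # biomass yield on oxidized glucose, g/g
    y_x_red: float = 0.05   # biomass yield on fermented glucose, g/g
    y_e: float = 0.48       # ethanol yield on fermented glucose, g/g


def overflow_rates(s: float, p: KineticParams = KineticParams()) -> dict:
    """Split Monod glucose uptake into oxidative and reductive (overflow) parts."""
    qs = p.qs_max * s / (p.ks + s)   # Monod uptake at glucose concentration s
    qs_ox = min(qs, p.qs_ox_cap)     # oxidized up to the bottleneck capacity
    qs_red = qs - qs_ox              # the excess overflows to ethanol
    return {
        "qs": qs,
        "mu": p.y_x_ox * qs_ox + p.y_x_red * qs_red,  # specific growth rate
        "q_ethanol": p.y_e * qs_red,                  # specific ethanol production
    }
```

With these assumed values, a low glucose concentration such as `overflow_rates(0.05)` gives no ethanol production, while an overfed broth such as `overflow_rates(5.0)` pushes the uptake above the oxidative capacity and produces ethanol, which is the trade-off the feeding policy has to manage.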

Table : Numerical values of parameters of fed-batch fermentation (Yüzgeç, )

This simulation was carried out under the following assumptions:
(a) The system is in quasi-steady state, so changes in temperature and pressure are negligible;
(b) The gas phase is insignificant, so only the liquid phase is considered (Pertev et al., ), and the system is well mixed;
(c) The substrate, oxygen and metabolism follow Monod kinetics;
(d) There is no deficiency in nutrient supply.
To achieve the goal of maximum yeast and minimum ethanol production, it is essential to monitor the ethanol concentration and keep it low. The development of Q-learning is discussed in the following section.

Q-LEARNING ALGORITHM

The Q-learning update sums three main components: past experience, the current reward and the future state, as in equation (). The learning agent needs to accumulate experience by exploring and interacting with the environment until it finds the best pathway to its goal, which is the purpose of the fermentation process. This is governed by α, the learning rate of the system, which sets the balance between past experience and newly observed information. As the learner makes more attempts, its experience grows over time and the time taken to learn and make decisions shortens. Along the learning process, rewards are accumulated, indicated by the term R_t, and the final state is determined by the route that gives the most reward. The contribution of the future state is scaled by γ, the discount factor, which indicates how much weight is placed on where the process is heading. In other words, the Q-learning algorithm attempts to learn a state-action pair value Q(s,a) by starting in state s, taking an action a, and following the optimal policy thereafter using value iteration (Huang et al., ).
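For reference, the tabular Q-learning update in its standard textbook form is written out below. This is an assumption, since the paper's own equation may use a different but equivalent notation. Here $s_t$ and $a_t$ are the state and action at step $t$, $R_t$ is the reward, $\alpha$ is the learning rate and $\gamma$ is the discount factor.

```latex
Q(s_t, a_t) \leftarrow (1 - \alpha)\, Q(s_t, a_t)
  + \alpha \left[ R_t + \gamma \max_{a} Q(s_{t+1}, a) \right]
```

The first term carries the past experience, $R_t$ is the current reward, and the $\gamma$ term is the discounted estimate for the future state. In this convention $\alpha = 0$ retains only the past estimate and $\alpha = 1$ keeps only the newest observation.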
The iteration steps follow Fig. :

1. Let the current state be s.
2. Select an action a to perform.
3. Let the reward received for performing a be R, and the resulting state be Q_(t+1).
4. Update Q(s,a) to reflect the observation <s, a, R, Q_(t+1)>.
5. Go back to step 1.

Fig. The Q-learning algorithm (Dearden et al., ).
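The five steps above can be sketched as a tabular Q-learning loop. This is a minimal sketch and not the controller used in the paper: `toy_plant` is a hypothetical stand-in for the fermentation plant, with five discretized substrate levels, three actions (decrease, hold or increase the feed) and a reward of 1 whenever the level lands on an assumed target level. Actions are drawn uniformly at random, which is permissible because Q-learning is off-policy; the values of `alpha`, `gamma` and `steps` are arbitrary choices.

```python
import random

ACTIONS = (-1, 0, +1)   # decrease, hold, or increase the feed level
N_LEVELS = 5            # discretized substrate levels 0..4 (hypothetical)
TARGET = 2              # level assumed to give growth without ethanol overflow


def toy_plant(state: int, action: int) -> tuple[int, float]:
    """Hypothetical stand-in for the plant: returns (next_state, reward)."""
    nxt = min(max(state + ACTIONS[action], 0), N_LEVELS - 1)
    return nxt, 1.0 if nxt == TARGET else 0.0


def q_learning(steps=20000, alpha=0.5, gamma=0.9, seed=0):
    """Learn a Q-table for toy_plant with a purely exploratory behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_LEVELS)]
    s = 0                                        # step 1: current state
    for _ in range(steps):
        a = rng.randrange(len(ACTIONS))          # step 2: select an action
        s_next, r = toy_plant(s, a)              # step 3: reward and resulting state
        target = r + gamma * max(q[s_next])
        q[s][a] += alpha * (target - q[s][a])    # step 4: update Q(s, a)
        s = s_next                               # step 5: repeat from the new state
    return q


def greedy_policy(q):
    """Index of the highest-valued action in every state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(q_learning())` on this toy problem yields a policy that increases the feed below the target level, holds it at the target and decreases it above the target.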

The role of Q-learning is to study the dynamics of the fermentation plant described in the process description and to determine the optimal feeding profile. The plant inputs and outputs are sent to the Q-learning controller, which updates the Q-values at each iteration. The maximum Q-value at each iteration is chosen to determine the optimal path. Here, feeding rates in the range from L/h to L/h were tried at each iteration so that the best feeding rate could be chosen. The sampling time was . h and the total fermentation time was h. The reactor volume was m with an initial broth volume of m. The flow of the entire process is simplified in Fig. .

Fig. The simulation plant with Q-learning controller.

The initial substrate, yeast and oxygen concentrations in the broth were g/l, g/l and mg/l respectively, and the substrate feeding stream had a concentration of g/l. No ethanol or carbon dioxide is present in the reactor at the beginning of the process. To evaluate the performance of the Q-learning controller, the process was run under four different conditions: (i) open loop with nominal exponential feeding, F = e.t, (ii) optimal feeding using Q-learning control, (iii) nominal exponential feeding under the influence of disturbance, and (iv) optimal feeding under disturbance.

RESULTS AND DISCUSSION

The substrate, ethanol and yeast profiles under nominal exponential feeding are shown in Fig. (a). The initial substrate concentration ( g/l) was lower than the initial yeast concentration ( g/l) to avoid ethanol generation in the broth. The initial concentration must be high enough to support biomass reproduction without starvation. At the initial stage of the process, the reproduction rate of yeast was low, and ethanol was produced before h but was reconsumed and remained at zero thereafter. Fig. (a) shows the exponential growth of yeast throughout the process. At h, the total yeast obtained is approximately g/l. The substrate feed flow curve is shown in Fig. (b), together with the increasing culture volume throughout the process.

Fig. (a) Concentration profiles of substrate, ethanol and yeast throughout the process and (b) exponential feed flow rate and change of volume of the system.
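The nominal exponential feed and the resulting change of culture volume can be sketched as follows. This minimal example assumes a feed law of the form F(t) = f0·exp(k·t) and the volume balance dV/dt = F, integrated with explicit Euler steps. The names `f0`, `k`, `v0`, `t_end` and `dt` are hypothetical, and the feed rate and volume are assumed to use consistent units; the paper's actual coefficients are not reproduced here.

```python
import math


def exponential_feed(t: float, f0: float, k: float) -> float:
    """Feed rate F(t) = f0 * exp(k * t); f0 and k are hypothetical coefficients."""
    return f0 * math.exp(k * t)


def volume_profile(v0: float, f0: float, k: float, t_end: float, dt: float) -> list:
    """Integrate dV/dt = F(t) from 0 to t_end with explicit Euler steps of size dt."""
    n_steps = round(t_end / dt)
    volumes = [v0]
    for i in range(n_steps):
        feed = exponential_feed(i * dt, f0, k)   # feed rate at the start of the step
        volumes.append(volumes[-1] + feed * dt)  # volume only grows in fed-batch mode
    return volumes
```

For this feed law the volume also has the closed form V(t) = v0 + (f0/k)(exp(k·t) − 1), which gives a quick check on the numerical integration when a small `dt` is used.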

Under the same initial conditions, Q-learning was applied to determine the substrate feeding rate and so test the performance of the controller; the results are shown in Fig. . At the end of the process, the yeast produced showed steady growth and reached approximately double the amount obtained with exponential feeding, i.e. approximately g/l, as seen in Fig. (a). The optimal feeding profile suggested by Q-learning is shown in Fig. (b). With the optimal profile, the amount of ethanol produced was reduced, as shown in Fig. (a).

Fig. (a) Concentration profiles of substrate, ethanol and yeast using Q-learning control and (b) optimal feed flow rate suggested using Q-learning with the respective change of volume of the system.

Additional glucose added into the feed stream can upset the system, resulting in extra ethanol formation. At h and h, an increased concentration of glucose caused ethanol to overshoot and affected the growth of yeast, as shown in Fig. (a). In a fed-batch system, the additional contaminant can only be removed at the end of the process, so an overshoot in ethanol concentration is undesirable and will affect the quality of the yeast. Exponential feeding, as in Fig. (b), can only be stopped or tuned by a human operator by trial and error once undesirable conditions occur.

Fig. (a) Concentration profiles of substrate, ethanol and yeast throughout the process under process disturbance and (b) exponential feed flow rate and respective change of volume of the system.

To tackle this problem, Fig. shows the optimal trajectory obtained with Q-learning, which reduces the effect of the process disturbance while continuing to increase the yeast. The corrective action was taken before the situation deteriorated, thereby maintaining the steady growth of yeast. As soon as the additional glucose and the ethanol overshoot were encountered, the supply of substrate was minimized, as shown in Fig. (b). Approximately g/l of yeast was produced over the h, and Q-learning showed a controlled trajectory and a satisfactory response.

Fig. (a) Concentration profiles of substrate, ethanol and yeast under process disturbance using Q-learning control and (b) optimal feed flow rate suggested by Q-learning to tackle disturbance and the respective change of volume of the system.

CONCLUSION

This work showed that an unsupervised Q-learning algorithm can be applied to a highly nonlinear dynamic bioprocess. The optimized feeding profile maximized the yeast obtained at the end of the process while minimizing ethanol production. The results also show that the proposed approach performed well in rejecting disturbances.

REFERENCES

A. Johnson, Automatica, Vol., No. (), pp. -.
B. Q. Huang, G. Y. Cao and M. Guo, in Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, (), pp. -.
B. Sonnleitner and O. Käppeli, Journal of Biotechnology and Bioengineering, Vol. XXVIII (), pp. -.
C. Pertev, M. Türker and R. Berber, Journal of Computers and Chemical Engineering, Vol., Suppl. (), pp. S-S.
C. V. Peroni, N. S. Kaisare and J. H. Lee, IEEE Transactions on Control System Technology, Vol., (), pp. -.
E. Franco-Lara and D. Weuster-Botz, Journal of Bioprocess and Biosystem Engineering, Vol. (), pp. -.
H. T. B. Pham, G. Larsson and S. O. Enfors, Journal of Biotechnology and Bioengineering, Vol. () (), pp. -.
Hisbullah, M. A. Hussain and K. B. Ramachandran, Trans IChemE, Vol. (c) (), pp. -.
J. P. Chiou and F. S. Wang, Journal of Computers and Chemical Engineering, Vol. (), pp. -.
P. V. Hoek, A. Aristidou, J. J. Hahn and A. Patist, Chemical Engineering Progress, January (), pp. S S.
R. Dearden, N. Friedman and S. Russell, in Proceedings of American Association for Artificial Intelligence (AAAI), (), pp. -.
S. Jin, K. M. Ye, K. Shimizu and J. Nikawa, Journal of Fermentation and Bioengineering, Vol., (), pp. -.
U. Yüzgeç, M. Türker and A. Hocalar, ISA Transactions, Vol. (), pp. -.