APPLICATION OF ARTIFICIAL NEURAL NETWORKS FOR WATER QUALITY PREDICTION

International Journal of Systems and Technologies (IJST), Vol. 1, No. 2, KLEF 2008

P. Sirisha 1, K. N. Sravanti 2, V. Ramakrishna 3*
Birla Institute of Technology and Science, Pilani, peyyetisirisha@gmail.com, karrasravanti@gmail.com
K L College of Engineering, Vaddeswaram

ABSTRACT: Many domestic and industrial users are concerned about the hardness of water, since it affects the consumption of soap in laundry and the formation of scale in boilers, respectively. In the present study, empirical models based on multiple regression and artificial neural networks are developed to predict hardness from the corresponding chloride, fluoride and calcium contents of a groundwater sample, using region-specific data. A thirty-point data set of chloride, calcium, fluoride and hardness values is used to develop the physical models. Initially, a Multiple Regression Model is developed using the Multiple Regression technique. The accuracy of the model is verified on a ten-point data set by calculating the Standard Deviation (SD). The SD value in this study is found to be high (.44). Techniques such as Artificial Neural Networks (ANNs) can predict the output from the data set with better accuracy than the Regression technique. Hence, ANNs are used in the present study to predict the hardness of water from the same database. A Back Propagation Network is used for the study, and the SD value obtained with the ANN model is encouraging (.54).

Keywords: Artificial Neural Network Modeling, Multiple Regression Modeling, Water Quality Prediction, Hardness, Back Propagation Network, Physical Modeling.
INTRODUCTION

Industrial and domestic users of water are concerned about the hardness of water and its effect on water quality. The acceptable range of water hardness depends on its usage. Predicting the hardness of a water sample from a few other parameters is therefore significant in this context. The models usually attempted for this type of study are physical models. Physical modeling for any engineering application is usually based on proposing empirical relations from a large amount of experimental data and the relevant non-dimensional parameters using Regression techniques. Physical modeling consists in seeking nearly the same similarity criterion for the model and the real process. It is widely adopted by researchers for obtaining the trends of research results (Abbasi et al., 1996; Kumaresan and Bagavathiraj, 1996; Garg et al., 1998; Aravinda et al., 1998; Pande and Sharma, 1998; Babu and Ramakrishna, 2002; Saxena et al., 2005). A number of approaches are available for physical modeling, such as Regression (Aishwath, 2005; Lingeswara Rao et al., 2005) and Artificial Neural Networks (Basheer and Yacoub, 1996; Behera, 1997; Babu and Ramakrishna, 2002; Babu et al., 2003; Debashish and Dharmappa, 2004), all of which depend on a large amount of experimental data. The Regression model is very simple in its approach, while ANNs have the ability to infer solutions from the data presented to them, capturing subtle relationships between the input and output information.

In the present study, the Hardness of a water sample is predicted from the values obtained for other parameters, namely Chloride, Calcium and Fluoride content. Two different physical models, using multiple regression and ANN, are developed for predicting the Hardness value. Groundwater quality data available from a small village in Rajasthan is considered for the study.

CASE STUDY

The study zone in Rajasthan relies heavily on groundwater to meet the needs of the population. Hardness is an important factor in deciding the purpose for which water can be used. A database comprising a forty-point data set is collected. It covers ten wells in the study zone and includes all four parameters of the present study, viz., Hardness, Chloride, Calcium and Fluoride content. The range of the database is given in Table-1.

Table-1. Range of variables used in the study

Variable           Upper Limit   Lower Limit
Chloride (mg/l)
Hardness (mg/l)
Calcium (mg/l)
Fluoride (mg/l)    1.8           .6

The Physical Model is developed assuming three independent variables (Calcium, Chloride and Fluoride content) and one dependent variable (Hardness). The database consists of forty data points, which are used to find a relation between water hardness and calcium, chloride and fluoride content. A Multiple Regression Model (MRM) is developed using the above variables. The MRM is written as

Hardness = f (Calcium, Chloride, Fluoride)

which is mathematically expressed as

$$\text{Hardness} = a\,(\text{Calcium})^{b_1}\,(\text{Chloride})^{b_2}\,(\text{Fluoride})^{b_3}$$

The values of $a$, $b_1$, $b_2$ and $b_3$ are found from regression analysis. The values are [...], -.582, .989 and [...] respectively. The regression of the MRM is carried out using SPSS for Windows (version 7.5.1). Thirty data points are used to develop the Regression Model, and ten points (the ten-point data set, TPDS) are used to test the relation thus found and to calculate the standard deviation. The accuracy of the prediction is determined using the Standard Deviation (SD) calculated with reference to the actual data. The SD used in the present study (Babu and Ramakrishna, 2002; Ramakrishna, 2004) is determined using:

$$SD = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{y_1 - y_2}{y_1}\right)^{2}}$$

where $y_1$ is the expected (actual) hardness, $y_2$ is the predicted hardness and $n$ is the number of data points. The results yielded an SD of .44 for the database. The results are shown in Fig. 1. It may be noticed from Fig. 1 that most of the values predicted by the MRM are lower than the actual values.
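The regression step and the SD check described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the SPSS procedure used in the study: it assumes the power-law model is fitted by ordinary least squares after taking logarithms, it assumes the SD is the root-mean-square relative deviation between actual and predicted values, and it runs on synthetic, noise-free data. The function names (`fit_power_law`, `predict_power_law`, `relative_sd`) and all numbers in it are hypothetical.

```python
import numpy as np

def fit_power_law(X, y):
    """Fit y = a * prod_j X[:, j] ** b[j] by least squares on log-transformed data."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # ln y = ln a + sum_j b_j * ln x_j is linear in (ln a, b_1, ..., b_k).
    A = np.column_stack([np.ones(len(y)), np.log(X)])
    coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
    return np.exp(coef[0]), coef[1:]

def predict_power_law(a, b, X):
    """Evaluate the fitted power-law model for each row of X."""
    return a * np.prod(np.asarray(X, dtype=float) ** b, axis=1)

def relative_sd(y_actual, y_pred):
    """Root-mean-square relative deviation (assumed form of the SD measure)."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean(((y_actual - y_pred) / y_actual) ** 2))

# Synthetic stand-in for the forty-point database (NOT the study's data).
# Columns play the role of calcium, chloride and fluoride content.
rng = np.random.default_rng(0)
X = rng.uniform(0.5, 5.0, size=(40, 3))
true_a, true_b = 2.0, np.array([0.8, -0.3, 0.5])   # made-up coefficients
y = true_a * np.prod(X ** true_b, axis=1)          # noise-free "hardness"

# Thirty points to develop the model, ten points to test it.
a, b = fit_power_law(X[:30], y[:30])
sd = relative_sd(y[30:], predict_power_law(a, b, X[30:]))
print("a =", a, "b =", b, "SD on test points =", sd)
```

Because the synthetic data follow the power law exactly, the fit recovers the made-up coefficients and the SD is essentially zero. With real measurements the SD would reflect how far the data depart from the assumed functional form.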

Fig.1: Comparison of results obtained for prediction of hardness values with that of actual values using MRM (series: actual value, predicted value)

ANN MODELING

The success of empirical modeling depends heavily on a thorough understanding of the physical phenomenon taking place in the given system. The present study shows that Regression techniques are limited in accurately predicting the output when the physical phenomena are incompletely understood. Under these circumstances, one may try techniques such as Artificial Neural Networks (ANNs), which can serve as an alternative and have been successfully applied for Physical Modeling purposes (Basheer and Yacoub, 1996; Babu et al., 2003; Debashish and Dharmappa, 2004; Ramakrishna, 2004). ANNs can be used to predict the output from the data set with better accuracy than the Regression technique. An ANN is a form of artificial intelligence designed, from a blueprint of the brain and the central nervous system, to simulate the brain's capability to think and learn through perception, reasoning and interpretation. ANNs have the ability to infer solutions from the data presented to them, capturing subtle relationships between the input and output information (Behera, 1997). Specifically, they can handle problems involving complex non-linear mappings or relationships. The results obtained in the present study using the MRM are not encouraging (SD .44). This makes the ANN modeling approach a rational choice for the prediction of hardness.

A model based on a three-layer Back Propagation Network (BPNM-1) of ANN, available in Pythia, The Neural Network Design software, version 1.2, is used to verify the accuracy of the hardness prediction against the results obtained from the MRM. The network is trained by varying the learning parameters, namely the Number of Neurons in the Hidden Layer (NNHL), the Learning Rate (LR) and the number of Epochs, with the error tolerance (ET) fixed at .1. The network is trained in two different Trials, viz., Trial-1 and Trial-2, using a combination of learning parameters. The details are given in Table-2.

Table-2: Combination of Learning parameters used in the study

Learning Parameters   Trial-1               Trial-2
NNHL                  2,3,4,5,6,7,8,9,10    2,3,4,5,6,7,8,9,10
Epochs                3, 6, 1,              3, 6, 1,
Error Tolerance       .1                    .1
LR                    .25                   .5

The network is tested for each of the Trials, i.e., Trial-1 and Trial-2, with the same ten-point data set (TPDS) that is used for testing the accuracy of the MRM. The SD values are calculated from the results. The SD values obtained vs. NNHL are plotted (refer Fig. 2) for each of the two values of LR tested. The best combination of the network is chosen based on the lowest SD value obtained from these two trials. It may be noted from Fig. 2 that the combination of LR = .5 and NNHL = 6 shows the lowest value of SD (.54). Hence the optimum combination of learning parameters arrived at from the study is: NNHL = 6; Epochs = 1,; ET = .1; LR = .5. The predicted values obtained for this optimum combination are compared with the actual values of hardness. The results are shown in Fig. 3.

Fig 2: Comparative values showing SD vs. NNHL for different values of LR for BPNM-1 (x-axis: Number of Neurons in Hidden Layer (NNHL); y-axis: Standard Deviation (SD); series: SD for LR = .5, SD for LR = .25)
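The trial procedure above (train a three-layer back propagation network for each NNHL and LR, test on the ten held-out points, keep the combination with the lowest SD) can be illustrated with a small NumPy sketch. It is not the Pythia software used in the study and makes several assumptions of its own: sigmoid hidden and output units, full-batch gradient descent on the mean squared error, an SD defined as the root-mean-square relative deviation, synthetic data scaled into (0, 1), and illustrative values for the learning rates, epoch count and tolerance. All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bpn(X, y, n_hidden, lr, epochs, tol, seed=0):
    """Train an inputs -> n_hidden -> 1 sigmoid network by batch back-propagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)            # hidden-layer activations
        out = sigmoid(h @ W2 + b2)          # network output in (0, 1)
        err = out - y
        if np.mean(err ** 2) < tol:         # stop once the error tolerance is met
            break
        # Back-propagate the output error through both layers.
        d_out = err * out * (1.0 - out)
        d_hid = (d_out @ W2.T) * h * (1.0 - h)
        W2 -= lr * h.T @ d_out / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_hid / len(X)
        b1 -= lr * d_hid.mean(axis=0)
    return W1, b1, W2, b2

def predict_bpn(params, X):
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)

def relative_sd(y_actual, y_pred):
    """Root-mean-square relative deviation (assumed form of the SD measure)."""
    return np.sqrt(np.mean(((y_actual - y_pred) / y_actual) ** 2))

# Synthetic stand-in for the forty-point database (NOT the study's data):
# three inputs in [0, 1] and a smooth non-linear target scaled into (0.2, 0.8).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(40, 3))
y = (0.2 + 0.6 * (0.5 * X[:, 0] + 0.3 * X[:, 1] ** 2 + 0.2 * X[:, 2]))[:, None]
X_train, y_train, X_test, y_test = X[:30], y[:30], X[30:], y[30:]

# Sweep hidden-layer size and learning rate; values here are illustrative only.
results = {}
for lr in (0.25, 0.5):
    for n_hidden in range(2, 11):
        params = train_bpn(X_train, y_train, n_hidden, lr, epochs=2000, tol=1e-4)
        results[(lr, n_hidden)] = relative_sd(y_test, predict_bpn(params, X_test))

best = min(results, key=results.get)
print("lowest SD", results[best], "at (LR, NNHL) =", best)
```

Selecting the configuration on the same ten points that are later used to report accuracy, as in this sketch, tends to give an optimistic SD. With more data, a separate validation set would be the more cautious choice.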

Fig.3. Comparison of results obtained for prediction of hardness values with that of actual values for optimum combination of BPNM-1 (x-axis: data points; y-axis: hardness; series: Actual Values, Predicted Values)

It may be noted from Fig. 3 that the predicted values of hardness are very close to the actual values, indicating that prediction of hardness using ANN is more effective than prediction using the MRM. Theoretically, hardness depends upon the chloride and calcium contents of a water sample. This relationship is explored with a second BPN model (BPNM-2) using the available database, excluding the values available for Fluoride. The network is tested for each of the Trials, i.e., Trial-1 and Trial-2, with the same ten-point data set (TPDS) that is used for BPNM-1. The SD values obtained vs. NNHL are plotted (refer Fig. 4) for each of the two values of LR tested.

Fig 4: Comparative values showing SD vs. NNHL for different values of LR for BPNM-2 (x-axis: Number of Neurons in Hidden Layer (NNHL); y-axis: Standard Deviation (SD); series: SD for LR = .5, SD for LR = .25)

It may be noted from Fig. 4 that the combination of LR = .5 and NNHL = 8 shows the lowest value of SD (.34). Hence the optimum combination of learning parameters arrived at from the study is: NNHL = 8; Epochs = 1,; ET = .1; LR = .5. The predicted values obtained for this optimum combination are compared with the actual values of hardness. The results obtained (refer Fig. 5) are also encouraging but show a relatively high error in terms of SD when compared to that obtained using BPNM-1.

Fig.5. Comparison of results obtained for prediction of hardness values with that of actual values for optimum combination of BPNM-2 (x-axis: data points; y-axis: hardness; series: Actual Values, Predicted Values)

This suggests that the relationship of hardness with chloride and calcium alone is probably not complete, and that data from a few other ions is also essential to establish the relationship accurately. The present study highlighted that:
(1) the Regression approach is limited in capturing the relation among the variables used in modeling, resulting in erratic prediction, whereas ANN has the ability to map the input values to the output values to bring out the best possible prediction;
(2) the ANN has the ability to back propagate the error obtained at the output neuron(s) until a minimum error specified by the user is reached, leading to more accurate predictions. Similar observations are reported in the literature (Babu and Ramakrishna, 2002; Ramakrishna, 2004);
(3) the theoretical relationship among the parameters of water quality is ascertained from the study.

SUMMARY AND CONCLUSIONS

Physical models based on the multiple regression technique and on ANN are developed to predict the hardness of a water sample. Groundwater quality data available from a village in Rajasthan is used for this purpose. Data sets consisting of thirty and ten data points are used to develop and to test the validity of the models, respectively. The results show that physical modeling using the Regression approach does not give encouraging results, owing to incomplete understanding of the relationship among the variables involved in modeling. Prediction using ANN is relatively better than that of the regression model because of its flexibility in mapping the inputs to the outputs.

REFERENCES

Abbasi S.A., D.S. Arya, A.S. Ahmed, and Naseema Abbasi. (1996). Water quality of a typical river of Kerala: Punnurpuzha, Pollution Research, 15(2).

Aishwath O.P. (2005). Coefficient of Variation and Correlation Coefficient in underground Water Quality parameters in and adjoining municipal area of Boriavi, Gujarat, India, Pollution Research, 24(4).

Aravinda H.B., S. Manjappa, and E.T. Puttaih. (1998). Correlation coefficients of some physico-chemical parameters of River Thunga Bhadra, Karnataka, Pollution Research, 17(4).

Babu B.V. and V. Ramakrishna. (2002). Applicability of Regression Technique for Physical Modeling, Proceedings of International Symposium & 55th Annual Session of IIChE (CHEMCON-2002), O.U. College of Engineering, Hyderabad, December 19-22.

Babu B.V., V. Ramakrishna and K. Kalyan Chakravarthy. (2003). Artificial Neural Networks for Modeling of Adsorption, Second International Conference on Computational Intelligence, Robotics, and Autonomous Systems (CIRAS-2003), Singapore, December.

Basheer A. I. and N. M. Yacoub. (1996). Predicting dynamic response of adsorption columns with neural nets, Journal of Computing in Civil Engineering, ASCE, 10(1).

Behera L. (1997). Artificial Neural Networks & Applications, EDD Notes, Educational Development Division, Birla Institute of Technology & Science, Pilani.

Debashish R. and D. Dharmappa. (2004). Predicting Effluent Oil and Grease (O & G) for a primary Sewage Treatment plant using Artificial Neural Networks (ANN), Indian Chemical Engineer, Section B, Vol. 46, No. 2.

Garg V.K., I. S. Sharma, and M. S. Bishnoi. (1998). Fluoride in underground waters of Uklana town, District Hisar, Haryana, Pollution Research, 17(2).

Kumaresan A. and B. K. Bagavathiraj. (1996). Physico-chemical and Microbiological aspects of Courtallam water, Pollution Research, 15(2).

Lingeswara Rao S.V., T. Sambasiva Rao and S. Sreenivasulu. (2005). Analysis of ground water quality of Nellore coast by Correlation Technique, Nature Environmental Pollution Technology, 4(4).

Pande K.S. and S.D. Sharma. (1998). Natural purification capacity of Ramganga river at Moradabad (U.P), Pollution Research, 17(4).

Ramakrishna V. (2004). Modeling for Wastewater Treatment by Adsorption using Analytical-, Regression-, and Neural Network- Approaches, Ph.D. Thesis, Birla Institute of Technology and Science, Pilani.

Saxena S., N. Jain and R.K. Shrivastava. (2005). Fluoride pollution in ground water region of Jabalpur region: Part I - Correlation between occurrences of Sodium and Fluoride, Pollution Research, 24(4).