DATA ANALYTICS FOR OPTIMAL WATER USAGE IN AGRICULTURAL LANDS OF SOUTHERN PARTS OF INDIA

International Journal of Civil Engineering and Technology (IJCIET), Volume 9, Issue 10, October 2018, Article ID: IJCIET_09_10_070. IAEME Publication. Scopus Indexed.

Deepa Kanmani. S
Assistant Professor, Karunya Institute of Technology and Sciences

Shamila Ebenezer. A
Assistant Professor, Karunya Institute of Technology and Sciences

Leela. S
Assistant Professor, Karunya Institute of Technology and Sciences

Mahima. S
mahimglory@gmail.com

ABSTRACT

Water scarcity has been a major problem in the southern parts of India for the past few years, leaving people facing a major water crisis. Water demand depends on the region's climate. Agricultural water demand is a serious issue, particularly in southern states such as Tamil Nadu, Kerala and Andhra Pradesh. This paper aims to enable optimal water usage in the agricultural fields of southern India in order to reduce water scarcity and to produce an efficient water management system. By predicting the evapotranspiration level, water usage in agriculture can be reduced to a large extent, which translates into effective water conservation. In this paper, the linear regression-based model ARIMA is combined with the Penman, Hargreaves-Samani, Blaney-Criddle, Turc and McGuinness-Bordne methods in order to predict the evapotranspiration level.

Keywords: Analytics, linear regression, evapotranspiration level, optimal usage

Cite this Article: Deepa Kanmani. S, Shamila Ebenezer. A, Leela. S and Mahima. S, Data Analytics for Optimal Water Usage in Agricultural Lands of Southern Parts of India, International Journal of Civil Engineering and Technology, 9(10), 2018.

editor@iaeme.com

1. INTRODUCTION

Agriculture requires large volumes of water, a demand that is hard to satisfy because of the water crisis in South India. Optimal use of water resources in agriculture in the southern region has become increasingly important because of the rapid depletion of water resources, industrial and population growth, drought conditions, and the poor quality of ground and surface water in many parts of India. Evaporation and evapotranspiration are crucial parts of the water cycle and water balance. Evapotranspiration (ETo) is one of the key components of the water cycle, and predicting it is central to this work.

This paper focuses on the water scarcity faced by people in South India and how it affects agriculture. A model that considers the main climatic factors influencing evapotranspiration is developed, which could be implemented to reduce this water crisis. An efficient method that predicts the amount of water required by crops and plants would help people use water according to need and thereby conserve it.

Characterizing climate-related water demand plays a major part in the development of irrigated farmland across India. Gathering sufficient information to improve land use through optimal use of the available water resources depends mainly on accurate knowledge of crop water demand. ET is a variable that quantifies water loss. These issues are discussed in [1, 2, 3].

The study in [4] compares methods for calculating lake evaporation using a 30-year dataset from Dickie Lake in south-central Ontario, Canada. Lake evaporation is critical for the study and management of water resources and ecosystems. Evaporation during the ice-free period was calculated independently using seven evaporation methods, based on field meteorology, hydrology and lake water temperature data. The seven methods are Hamon (HM), Penman (PM), Priestley-Taylor (PT), DeBruin-Keijman (DK), Jensen-Haise (JH), Makkink (MK) and water balance (WB).

The authors of [5] note that planning and managing water resource systems requires data on the magnitude and variation of evaporative losses. There is a multitude of methods for measuring and calculating evaporation, which can be grouped into seven classes: (i) empirical, (ii) water budget, (iii) energy budget, (iv) mass transfer, (v) combination, (vi) radiation and (vii) measurement. In that paper, eight radiation-based equations for estimating evaporation were evaluated using meteorological data from the Changins station in Switzerland: the Turc, Makkink, Jensen-Haise, Hargreaves, Doorenbos and Pruitt, McGuinness and Bordne, Priestley and Taylor, and Abtew methods.

The authors of [6] note that the climate in Georgia and other southeastern states of the United States is considered humid, with annual precipitation usually greater than annual potential evapotranspiration (ET). They used the Priestley-Taylor (PT) equation to compute ET in Georgia and, for a site in the humid southeastern United States, found that PT overestimated ET.

The authors of [7] investigated the evaporation and evapotranspiration measured on Căldăruşani Lake and their influence on the lake's water volume. The paper analyses (i) the temporal variation of evaporation and evapotranspiration and the climatic parameters that control them, (ii) the relationship between evaporation and evapotranspiration, and (iii) the degree of water loss through evaporation and evapotranspiration.
The study draws on climatic, hydrologic and morphometric data for Căldăruşani Lake. The analysis was performed over the study period using statistical and linear-correlation methods.

The authors of [8] investigated the El Haouareb dam (Merguellil catchment) in semi-arid central Tunisia. They point out that most estimates of evaporative loss from water bodies in semi-arid environments suffer from insufficient data or biased field measurements. It is therefore important for hydrologists to assess the comparative performance of the available methods used to calculate this water loss, as well as their uncertainties.

The main objective of this paper is to achieve optimal water usage in the agricultural fields of southern India in order to reduce water scarcity, meet demand and produce an efficient water management system. The data is collected and analyzed to determine the method that most efficiently predicts evapotranspiration.

2. PROPOSED SYSTEM DESIGN

Figure 1 Overall design of proposed system

Figure 1 gives the overall design of the proposed system. The original data is preprocessed by removing noisy and redundant data, which reduces the processing time and the size of the dataset. Feature selection is the process of selecting the attributes that have an impact on the output attribute; the selected features differ from method to method. The dataset is first analyzed to verify the output obtained using the code and the sensors. Based on the features selected, a model is developed for each method. Each model is a simple linear equation that describes the relationship between the output attribute and the input predictor variables. The accuracy of these models is computed to identify the most accurate one.
An ARIMA model and a model combining these methods are developed, and their accuracies are compared to find the most accurate model that could be implemented in the agricultural fields of South India.

Module implementation

Data pre-processing

Data is retrieved from the overall dataset by year, i.e. 2013, 2014, 2015 and 2016. First, data cleaning is performed: duplicate, unknown and inconsistent values are removed by filtering the required values from each year-wise dataset. The 2013 data contained many unknown values (coded as -999), which were removed. The 2014 data contained inconsistent records, which were also removed. This decreases the processing time and the size of the dataset imported into RStudio. The cleaned data is then imported into RStudio for further processing.
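The cleaning step described above can be sketched in a few lines. The paper carries it out on data imported into RStudio; the Python snippet below is only an illustrative stand-in. The record layout (`date`, `tmax`, `tmin`), the function name `clean_records` and the consistency rule `tmax >= tmin` are assumptions made for the example; only the -999 sentinel comes from the text.

```python
MISSING = -999  # sentinel for unknown values, as described for the 2013 data


def clean_records(records):
    """Drop unknown, inconsistent and duplicate rows from a list of dicts.

    The keys 'date', 'tmax' and 'tmin' are hypothetical column names.
    """
    seen_dates = set()
    cleaned = []
    for row in records:
        # Unknown values: any field equal to the -999 sentinel.
        if any(value == MISSING for value in row.values()):
            continue
        # Inconsistent values (assumed rule): daily maximum below daily minimum.
        if row["tmax"] < row["tmin"]:
            continue
        # Duplicates: keep only the first record seen for each date.
        if row["date"] in seen_dates:
            continue
        seen_dates.add(row["date"])
        cleaned.append(row)
    return cleaned


# Made-up sample rows, one for each kind of problem.
sample = [
    {"date": "2013-01-01", "tmax": 30.1, "tmin": 21.4},
    {"date": "2013-01-01", "tmax": 30.1, "tmin": 21.4},  # duplicate
    {"date": "2013-01-02", "tmax": -999, "tmin": 20.9},  # unknown value
    {"date": "2013-01-03", "tmax": 19.0, "tmin": 22.0},  # inconsistent
    {"date": "2013-01-04", "tmax": 31.0, "tmin": 22.3},
]
cleaned_sample = clean_records(sample)
```

Filtering before import, as the paper does, keeps the sentinel values from distorting any averages computed later.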

2.2. Feature selection

Many attributes influence the evapotranspiration value. In the given dataset, these are the factors that contribute to the computed value of evapotranspiration. Feature selection shows that each method used for analyzing the data and developing the model requires certain features. These features are listed in Table 1.

Table 1 Features selected for each method

Method               Features selected
Penman               Tmax, Tmin, RHmax, RHmin, Rs, n and uz
Hargreaves-Samani    Tmax and Tmin
Blaney-Criddle       Tmax, Tmin, RHmin, n and uz
Turc                 Tmax, Tmin and n
McGuinness-Bordne    Tmax and Tmin

2.3. Model development

Model development is the process of expressing the relationship between the input predictor variables (input attributes) and the output attribute (outcome variable) as an equation. Consider Y as the outcome variable and X as the input predictor variable. Then the model is:

Y = β1 + β2X

where β1 is the intercept and β2 is the slope.

2.4. Penman function

This is a method in R that takes the features Tmax, Tmin, RHmax, RHmin, uz and Rs as input to predict evapotranspiration. The output generated is the daily, monthly and annual estimates of ET. To analyze a full year of data, the arguments required by the Penman method must be specified, such as the time series, the treatment of missing values, the wind function and the humidity level.

Figure 2 shows the daily and monthly evapotranspiration for the 2013 dataset using the Penman method.
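The linear model of Section 2.3, Y = β1 + β2X, can be fitted by ordinary least squares. The paper does not state how its coefficients were estimated beyond "linear regression", so the closed-form fit below is a minimal sketch of the standard technique rather than the authors' procedure. The function name `fit_simple_linear` and the sample numbers are made up for illustration.

```python
def fit_simple_linear(xs, ys):
    """Return (beta1, beta2) minimising squared error for y = beta1 + beta2 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta2 = sxy / sxx                # slope
    beta1 = mean_y - beta2 * mean_x  # intercept
    return beta1, beta2


# Made-up numbers lying exactly on y = 1 + 0.2 * x.
temps = [20.0, 25.0, 30.0, 35.0]
et_values = [5.0, 6.0, 7.0, 8.0]
beta1, beta2 = fit_simple_linear(temps, et_values)
```

With several predictors, as in the Penman feature set, the same idea extends to multiple linear regression, where one slope is estimated per feature.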

Figure 3 shows the daily and monthly evapotranspiration for the 2014 dataset using the Penman method.

Figure 4 shows the daily and monthly evapotranspiration for the 2015 dataset using the Penman method.

Figure 5 Open water evaporation daily analysis (2016)

A model is built using linear regression for this method based on the features it takes as input. The model developed for the Penman method: (1)

2.5. Hargreaves-Samani Method

This is a method in R that takes the features Tmax and Tmin as input to predict evapotranspiration. The output generated is the daily, monthly and annual estimates of ET. To analyze a full year of data, the arguments required by the Hargreaves-Samani method must be specified, such as the time series needed, the way to compensate for missing values, the wind function and the humidity level. The following outputs were obtained.

Figure 6 shows the daily evapotranspiration for the 2013 dataset using the Hargreaves-Samani method.

Figure 7 shows the daily evapotranspiration for the 2014 dataset using the Hargreaves-Samani method.

Figure 8 shows the monthly evapotranspiration for the 2015 dataset using the Hargreaves-Samani method.
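The paper does not print the Hargreaves-Samani equation itself. A minimal sketch, assuming the commonly published FAO-56 form ETo = 0.0023 (Tmean + 17.8) (Tmax - Tmin)^0.5 Ra, is given below in Python rather than R. Here Ra is the extraterrestrial radiation, converted from MJ m-2 day-1 to mm/day by the factor 0.408. The function names and the sample inputs are hypothetical, and the R function the authors used may differ in detail.

```python
import math

GSC = 0.0820  # solar constant in MJ m-2 min-1 (FAO-56)


def extraterrestrial_radiation(latitude_deg, day_of_year):
    """Daily extraterrestrial radiation Ra in MJ m-2 day-1 (FAO-56 form).

    Valid for latitudes where the sun rises and sets every day,
    which covers South India.
    """
    phi = math.radians(latitude_deg)
    # Inverse relative Earth-Sun distance and solar declination.
    dr = 1 + 0.033 * math.cos(2 * math.pi * day_of_year / 365)
    delta = 0.409 * math.sin(2 * math.pi * day_of_year / 365 - 1.39)
    # Sunset hour angle.
    omega_s = math.acos(-math.tan(phi) * math.tan(delta))
    return (24 * 60 / math.pi) * GSC * dr * (
        omega_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(omega_s)
    )


def hargreaves_samani(tmax, tmin, ra):
    """Reference ET in mm/day from Tmax, Tmin (deg C) and Ra (MJ m-2 day-1)."""
    if tmax < tmin:
        raise ValueError("tmax must not be below tmin")
    tmean = (tmax + tmin) / 2
    return 0.0023 * (tmean + 17.8) * math.sqrt(tmax - tmin) * 0.408 * ra


# Illustrative inputs only: a site at 11 degrees north on day 100 of the year.
ra_example = extraterrestrial_radiation(11.0, 100)
et_hs = hargreaves_samani(tmax=34.0, tmin=24.0, ra=ra_example)
```

Because Ra depends only on latitude and date, the method needs no measured input beyond the two temperatures, which matches the feature list given for it.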

Figure 9 Open water evaporation daily analysis (2016)

A model is built using linear regression for this method based on the features it takes as input. The model developed for the Hargreaves-Samani method: (2)

2.6. Blaney-Criddle Function

This is a method in R that takes the features p and Tmean as input to predict evapotranspiration. The output generated is the daily, monthly and annual estimates of ET. To analyze a full year of data, the arguments required by the Blaney-Criddle method must be specified, such as the time series needed, the way to compensate for missing values, the wind function and the humidity level. The following outputs were obtained.

Figure 10 shows the daily and monthly evapotranspiration for the 2013 dataset using the Blaney-Criddle method.

Figure 11 shows the daily evapotranspiration for the 2014 dataset using the Blaney-Criddle method.
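Since this section names p and Tmean as the inputs, the simplified Blaney-Criddle relation ETo = p (0.46 Tmean + 8) is a natural illustration. Here p is the mean daily percentage of annual daytime hours and Tmean is in degrees Celsius. This is a sketch under that assumption: the feature list in Table 1 suggests the R function used by the authors may implement a fuller variant with humidity, sunshine and wind corrections. The function name and the values of p and Tmean below are made up.

```python
def blaney_criddle(p, tmean):
    """Reference ET in mm/day from the simplified Blaney-Criddle relation.

    p     -- mean daily percentage of annual daytime hours
    tmean -- mean daily temperature in degrees Celsius
    """
    return p * (0.46 * tmean + 8)


# Illustrative monthly inputs (not taken from the paper's dataset).
monthly_p = [0.27, 0.27, 0.28]
monthly_tmean = [25.0, 27.0, 29.0]
monthly_et = [blaney_criddle(p, t) for p, t in zip(monthly_p, monthly_tmean)]
```

The relation is linear in Tmean for a fixed p, which is why a simple linear regression on temperature can approximate it closely.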

Figure 12 shows the daily evapotranspiration for the 2015 dataset using the Blaney-Criddle method.

Figure 13 Open water evaporation monthly analysis (2016)

A model is built using linear regression for this method based on the features it takes as input. The model developed for the Blaney-Criddle method: (3)

2.7. Turc Method

This is a method in R that takes the features Tmean, Rs, uz and C as input to predict evapotranspiration. The output generated is the daily, monthly and annual estimates of ET. To analyze a full year of data, the arguments required by the Turc method must be specified, such as the time series needed, the way to compensate for missing values, the wind function and the humidity level. The following outputs were obtained.
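The Turc equation is not reproduced in the text. A minimal Python sketch, assuming the commonly published daily form ET = 0.013 Tmean / (Tmean + 15) (Rs + 50) with Rs in cal cm-2 day-1, is shown below. Two points are assumptions of this example and are not stated in the paper: the conversion of Rs from MJ m-2 day-1, and the humidity correction applied when relative humidity falls below 50 %. The function name and sample inputs are hypothetical, and the sketch ignores the uz and C inputs listed above.

```python
MJ_TO_CAL_PER_CM2 = 23.8846  # 1 MJ m-2 expressed in cal cm-2


def turc(tmean, rs_mj, rh_mean):
    """Daily ET in mm/day from the Turc relation.

    tmean   -- mean daily temperature in degrees Celsius
    rs_mj   -- solar radiation in MJ m-2 day-1
    rh_mean -- mean relative humidity in percent
    """
    if tmean <= 0:
        return 0.0  # sketch only: treat freezing temperatures as no ET
    rs_cal = rs_mj * MJ_TO_CAL_PER_CM2
    et = 0.013 * tmean / (tmean + 15) * (rs_cal + 50)
    # Dry-air correction, applied only when RH is below 50 %.
    if rh_mean < 50:
        et *= 1 + (50 - rh_mean) / 70
    return et


# Illustrative inputs only.
et_humid = turc(tmean=25.0, rs_mj=20.0, rh_mean=60.0)
et_dry = turc(tmean=25.0, rs_mj=20.0, rh_mean=40.0)
```

For the humid conditions typical of much of the study region, the correction term is inactive and the estimate depends only on temperature and radiation.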

Figure 15 shows the monthly evapotranspiration for the 2014 dataset using the Turc method.

Figure 16 shows the daily evapotranspiration for the 2015 dataset using the Turc method.

Figure 17 Open water evaporation daily analysis (2016)

Figure 17 shows the daily analysis using the Turc method. A model is built using linear regression for this method based on the features it takes as input. The model developed for the Turc method: (4)

2.8. McGuinness-Bordne Function

This is a method in R that takes the features Tmean, Rs, uz and C as input to predict evapotranspiration. The output generated is the daily, monthly and annual estimates of ET. To analyze the entire dataset, the arguments required by this method must be specified, such as the time series needed, the way to compensate for missing values, the wind function and the humidity level. The following outputs were obtained.
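The McGuinness-Bordne equation is likewise not printed in the text. As a hedged illustration, the sketch below assumes the form often quoted in the literature, ET = (Ra / (λρ)) (Tmean + 5) / 68, with Ra in MJ m-2 day-1, latent heat λ = 2.45 MJ/kg and water density ρ = 1000 kg/m3. The function name and sample values are made up, and Tmean is taken as the average of Tmax and Tmin, following Table 1 rather than the input list above.

```python
LATENT_HEAT = 2.45      # MJ per kg of water evaporated
WATER_DENSITY = 1000.0  # kg per cubic metre


def mcguinness_bordne(tmax, tmin, ra):
    """Daily ET in mm/day from Tmax, Tmin (deg C) and Ra (MJ m-2 day-1)."""
    tmean = (tmax + tmin) / 2
    # Ra / (lambda * rho) is a depth in metres per day; convert to millimetres.
    depth_mm = ra / (LATENT_HEAT * WATER_DENSITY) * 1000.0
    # Clamp at zero so very cold days do not give negative ET.
    return max(0.0, depth_mm * (tmean + 5) / 68)


# Illustrative inputs only.
et_mb = mcguinness_bordne(tmax=30.0, tmin=20.0, ra=30.0)
```

Like Hargreaves-Samani, this form needs only temperature plus a radiation term that follows from latitude and date.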

Figure 19 shows the daily evapotranspiration for the 2014 dataset using the McGuinness-Bordne method.

Figure 20 shows the daily and monthly evapotranspiration for the 2015 dataset using the McGuinness-Bordne method.

Figure 21 Open water evaporation daily analysis (2016)

A model is built using linear regression for this method based on the features it takes as input. The model developed for the McGuinness-Bordne method: (5)

2.9. ARIMA (Autoregressive Integrated Moving Average) Model

ARIMA models are used for forecasting stationary time series. The series is made stationary by differencing, using the diff() function. The ARIMA model takes arguments that specify the order of autoregression and the order of the moving average, given by the p and q values. These orders are identified using ACF and PACF plots. Using these plots, the best model can be fitted to the given dataset. The accuracy of the models is computed using AIC (Akaike information criterion) and BIC (Bayesian information criterion). The model used is ARIMA(2,0,1), where p = 2, d = 0 and q = 1.

Two models combining all five methods are developed. In the first, the evapotranspiration value predicted by McGuinness-Bordne is taken as the output attribute, as it has the maximum accuracy. In the second, the mean of the evapotranspiration values predicted by all five methods is taken as the output attribute.
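The two preparatory steps named in Section 2.9, differencing and inspecting the autocorrelation function, can be sketched without any library. The paper uses R's diff() and ACF/PACF plots; the Python functions below are hypothetical stand-ins that compute the same kind of quantities on made-up series. They do not fit an ARIMA model.

```python
def difference(series, lag=1):
    """Return the lagged differences series[i] - series[i - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def autocorrelation(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag around the overall mean."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    acf = []
    for k in range(max_lag + 1):
        num = sum(
            (series[t] - mean) * (series[t + k] - mean) for t in range(n - k)
        )
        acf.append(num / denom)
    return acf


# A linear trend is non-stationary; one round of differencing removes it.
trend = [2.0 * t + 1.0 for t in range(10)]
diffed = difference(trend)

# An alternating series shows strong negative correlation at lag 1.
wave = [1.0, -1.0] * 6
acf_wave = autocorrelation(wave, max_lag=2)
```

A slowly decaying autocorrelation suggests further differencing is needed, while the lags at which the ACF and PACF cut off guide the choice of q and p respectively.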

Combined model with McGuinness-Bordne evapotranspiration prediction as output attribute:

Combined model with mean evapotranspiration prediction as output attribute: (6)

3. TESTING

The model is developed using the training data and then tested with the testing dataset to analyze its accuracy. The accuracy is computed using mean squared error, mean absolute percentage error, AIC (Akaike information criterion) and BIC (Bayesian information criterion). The results obtained from the model built using linear regression are cross-validated using k-fold cross-validation.

ARIMA model: (7)

Figure 23 Comparison between actual and predicted values by ARIMA model

4. CONCLUSION

Water scarcity has been a big problem in South India for the past few years, especially in agriculture, where the majority of the water supply is needed. Hence, a solution to this problem is needed. By predicting the evapotranspiration level from the climatic factors that contribute to it, the amount of water needed can be predicted, and water can be conserved accordingly. Time series forecasting is an efficient way to forecast future values on a daily, monthly and annual basis. This will enable farmers to understand the pattern of evapotranspiration. ARIMA is considered an efficient method for time series forecasting. It is also used here to develop a model for comparison, in order to find the most accurate model that could be implemented. Linear regression has helped identify the most significant factors that influence the evapotranspiration value, so that these values can be measured in order to compute evapotranspiration.

These models were developed considering only the most significant climatic factors and were verified using measures such as mean squared error, mean absolute percentage error, AIC and BIC. The results obtained from the model built using linear regression are cross-validated using k-fold cross-validation. According to these verification methods, the combined model developed by considering the mean of the evapotranspiration values generated by the Penman, Hargreaves-Samani, Blaney-Criddle, Turc and McGuinness-Bordne methods has the maximum accuracy of %. Hence, this combined model is considered the most efficient method that could be implemented in the agricultural field.

REFERENCES

[1] Bapuji Rao Bodapati, Sandeep Vm, P. Shantibhushan Chowdary et al., Reference crop evapotranspiration over India: A comparison of estimates from open pan with Penman-Monteith method, Journal of Agrometeorology, December 2013.
[2] S. Goroshi, R. Pradhan, R. P. Singh, K. K. Singh et al., Trend analysis of evapotranspiration over India: Observed from long-term satellite measurements, Journal of Earth System Science, 126(8), 113.
[3] Jensen, M. E., Burman, R. D., Allen, R. G., Evapotranspiration and Irrigation Water Requirements. ASCE Manuals and Reports on Engineering Practices, No. 70. ASCE, New York, p. 360.
[4] H. Yao, Long-Term Study of Lake Evaporation and Evaluation of Seven Estimation Methods: Results from Dickie Lake, South-Central Ontario, Canada, Journal of Water Resource and Protection, Vol. 1, No. 2, 2009.
[5] C. Y. Xu, V. P. Singh, Evaluation and generalization of radiation-based methods for calculating evaporation, Research Article, Wiley Online Library.
[6] Ayman A. Suleiman, Gerrit Hoogenboom, Comparison of Priestley-Taylor and FAO-56 Penman-Monteith for Daily Reference Evapotranspiration Estimation in Georgia, Journal of Irrigation and Drainage Engineering, Vol. 133, Issue 2, April 2007.
[7] Florentina Iuliana Stan, Gianina Neculau, Liliana Zaharia, Gabriela Ioana-Toroimac, Study on the Evaporation and Evapotranspiration Measured on the Căldăruşani Lake (Romania), Procedia Environmental Sciences, Volume 32, 2016.
[8] M. Alazard, C. Leduc, Y. Travi et al., Estimating evaporation in semi-arid areas facing data scarcity: Example of the El Haouareb dam (Merguellil catchment, Central Tunisia), Journal of Hydrology: Regional Studies, Volume 3, March 2015.
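Supplementary sketch for Section 3 (Testing): the accuracy measures and the k-fold split named there can be written compactly. The paper computes them in R and gives no formulas, so the Python functions below use the standard definitions. The function names, the contiguous (unshuffled) fold layout and the sample numbers are assumptions made for illustration.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent; undefined if any actual value is zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Made-up actual and predicted ET values in mm/day.
actual = [4.0, 5.0, 8.0]
predicted = [5.0, 5.0, 6.0]
mse = mean_squared_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
folds = list(k_fold_indices(5, 2))
```

Each fold serves once as the test set while the model is refitted on the remaining indices, and the per-fold errors are then averaged.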