Comparison of Probability Distributions for Frequency Analysis of Annual Maximum Rainfall

Abstract

Estimation of rainfall for a desired return period is of utmost importance for the planning, design and management of hydraulic structures at a project site. This can be achieved by fitting probability distributions to the recorded Annual 1-day Maximum Rainfall (AMR) data through Rainfall Frequency Analysis (RFA). The method of moments is used to determine the parameters of the probability distributions used in RFA. Chi-square and Kolmogorov-Smirnov tests are applied to check the adequacy of fit of the distributions to the recorded AMR data. A diagnostic test based on the D-index is used to select a suitable distribution for estimation of rainfall. The paper shows that, among the six distributions studied, the Extreme Value Type-1 distribution is better suited for rainfall estimation at Banswara, whereas the Generalized Extreme Value distribution is better suited for Visakhapatnam.

Keywords: Chi-square, D-index, Probability distribution, Kolmogorov-Smirnov, Rainfall

1. INTRODUCTION

Extreme rainfall events and the resulting floods usually cause considerable damage to life and property. Determining the frequencies and magnitudes of these events is important for flood plain management, design of hydraulic structures, civil protection plans, etc. However, the length of the available records is often not large enough to define the risk of flood, extreme rainfall, low flow, drought, etc. In such cases, magnitude-frequency analysis, which involves fitting samples to a frequency distribution, permits estimation of how often a specific event will occur, or of the frequency of events greater than those observed during the period of record (Rao and Hamed, 2000).
N. Vivekanandan, Central Water and Power Research Station, Pune, anandaan@rediffmail.com

Frequency analysis rests on three underlying assumptions: (i) the extremes are random variables and can therefore be described by a probability distribution; (ii) the data series is independent; and (iii) the probability distribution does not change from sample to sample (homogeneity). A number of probability distributions, such as Exponential (EXP), Extreme Value Type-1 (EV1), Extreme Value Type-2 (EV2), Generalized Extreme Value (GEV), Generalized Pareto (GPA) and Normal (NOR), are generally used in RFA. The Method of Moments (MOM) is commonly used to determine the parameters of these distributions.

In the recent past, a number of studies have been carried out by different researchers on the adoption of probability distributions for RFA. Topaloglu (2002) reported that the frequency analysis of the largest, or the smallest, of a sequence of hydrologic events has long been an essential part of the design of hydraulic structures. Guevara (2003) carried out hydrologic analysis using a probabilistic approach to estimate the engineering design parameters of storms in Venezuela. Lee (2005) carried out RFA to analyse the rainfall distribution characteristics of the Chia-Nan plain area. Bhakar et al. (2006) studied the frequency analysis of consecutive days' maximum rainfall at Banswara, Rajasthan, India. Fang et al. (2007) proposed an approach based on the peak-over-threshold sampling method and a non-identical Poisson distribution to model the flood occurrence within each season. Chen et al. (2010) proposed the use of a copula function to jointly model the distributions of flood magnitude and date of occurrence. Mujere (2011) applied the Gumbel distribution for modelling flood data for the Nyanyadzi River, Zimbabwe. Baratti et al. (2012) carried out a flood frequency analysis on seasonal and annual time scales for the Blue Nile River adopting the Gumbel distribution. Olumide et al.
(2013) adopted the Normal and Gumbel distributions for the prediction of rainfall and runoff at the Tagwai dam site in Minna, Nigeria. However, there is no general agreement on applying a particular distribution for RFA in different regions or countries. Moreover, when different distributional models are used for modelling a rainfall data series, a common problem is how to determine which model best fits a given set of data. This can be answered by formal statistical procedures involving Goodness-of-Fit (GoF) and diagnostic tests, whose results are quantifiable and reliable. Qualitative assessment is made from the plot of the recorded and estimated rainfall. For quantitative assessment of rainfall within the recorded range, Chi-square ($\chi^2$) and Kolmogorov-Smirnov (KS) tests are applied. A diagnostic test based on the D-index is used to select a suitable probability distribution for RFA. The study compares the six probability distributions used in RFA and illustrates the applicability of GoF and diagnostic test procedures in identifying which distributional model is best for estimation of rainfall for Banswara and Visakhapatnam.

Eoryx Publications ISSN: Page 50

2. METHODOLOGY

In the present study, MOM is adopted to determine the parameters of the six probability distributions, which are further used in RFA. Table 1 gives the details of the quantile
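function and parameters of the six probability distributions (using MOM) considered in the study.

Table 1: Quantile function and parameters of probability distributions

1. EXP
   Quantile function: $R_T = \xi - \alpha \log(1 - F)$
   Parameters by MOM: $\xi$ (known); $\mu = \bar{R}$

2. EV1
   Quantile function: $R_T = \xi - \alpha \log(-\log F)$
   Parameters by MOM: $\xi = \bar{R} - 0.5772157\,\alpha$; $\alpha = (\sqrt{6}/\pi)\, S_R$

3. EV2
   Quantile function: $R_T = \alpha\, e^{-\ln(-\ln F)/k}$
   Parameters by MOM: using the logarithmic transformation of the recorded data, the parameters of EV1 are initially computed by MOM and used to determine the parameters of EV2 from $\alpha = e^{\xi}$ and $k = 1/(\text{scale parameter of EV1})$, where $\xi$ is the EV1 location parameter of the log-transformed data.

4. GEV
   Quantile function: $R_T = \xi + \alpha \left(1 - (-\log F)^k\right)/k$
   Parameters by MOM:
   $\xi = \bar{R} + \alpha \left(\Gamma(1+k) - 1\right)/k$
   $\alpha = S_R\, |k| \,/\, \left[\Gamma(1+2k) - \Gamma^2(1+k)\right]^{1/2}$
   $C_S = \mathrm{sign}(k)\, \dfrac{-\Gamma(1+3k) + 3\,\Gamma(1+k)\,\Gamma(1+2k) - 2\,\Gamma^3(1+k)}{\left[\Gamma(1+2k) - \Gamma^2(1+k)\right]^{3/2}}$

5. GPA
   Quantile function: $R_T = \xi + \alpha \left(1 - (1 - F)^k\right)/k$
   Parameters by MOM:
   $\bar{R} = \xi + \alpha/(1+k)$
   $S_R^2 = \alpha^2 / \left[(1+k)^2 (1+2k)\right]$
   $C_S = 2(1-k)(1+2k)^{1/2}/(1+3k)$

6. NOR
   Quantile function: $R_T = \mu + \sigma\, \Phi^{-1}(F)$
   Parameters by MOM: $\mu = \bar{R}$; $\sigma = S_R$

In Table 1, $F(R)$ (or $F$) is the cumulative distribution function (CDF) of $R$; $P$ is the probability of exceedance; $\Phi^{-1}$ is the inverse of the standard normal distribution function, with $\Phi^{-1}(P) = \left(P^{0.135} - (1-P)^{0.135}\right)/0.1975$; $\xi$, $\alpha$ and $k$ are the location, scale and shape parameters respectively; $\mu$ (or $\bar{R}$), $\sigma$ (or $S_R$) and $C_S$ (or $\gamma$) are the average, standard deviation and coefficient of skewness of the recorded rainfall data; $\mathrm{sign}(k)$ is plus or minus 1 depending on the sign of $k$; and $R_T$ is the rainfall estimated by the probability distribution for a return period $T$.

Goodness-of-Fit Tests

The GoF tests $\chi^2$ and KS are applied to check the adequacy of fit of the probability distributions to the series of recorded rainfall data. The $\chi^2$ statistic is defined by:

$$\chi^2 = \sum_{j=1}^{NC} \frac{\left(O_j(R) - E_j(R)\right)^2}{E_j(R)} \qquad (1)$$

where $O_j(R)$ is the observed frequency of the $j$-th class, $E_j(R)$ is the expected frequency of the $j$-th class and $NC$ is the number of frequency classes. The rejection region of the $\chi^2$ statistic at the desired significance level ($\eta$) is $\chi^2_C \ge \chi^2_{1-\eta,\,NC-m-1}$, where $\chi^2_C$ is the computed value of the statistic and $m$ denotes the number of parameters of the distribution.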
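As a hedged illustration of the EV1 row of Table 1, the following Python sketch fits EV1 by MOM using the standard relations $\alpha = (\sqrt{6}/\pi) S_R$ and $\xi = \bar{R} - 0.5772\,\alpha$, and then evaluates the quantile function. It assumes the usual link $F = 1 - 1/T$ between non-exceedance probability and return period, which the text does not state explicitly. The function names and the rainfall values are hypothetical and are not the study data.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def fit_ev1_mom(sample):
    """Return (xi, alpha), the EV1 location and scale, by method of moments."""
    r_bar = mean(sample)
    s_r = stdev(sample)  # sample standard deviation (n - 1 divisor): an assumption
    alpha = math.sqrt(6.0) / math.pi * s_r
    xi = r_bar - EULER_GAMMA * alpha
    return xi, alpha


def ev1_quantile(xi, alpha, return_period):
    """Estimated rainfall R_T for a return period T in years (T > 1)."""
    f = 1.0 - 1.0 / return_period  # assumed convention: F = 1 - 1/T
    return xi - alpha * math.log(-math.log(f))


# Hypothetical annual 1-day maximum rainfall values in mm (not the study data).
amr = [78.0, 95.5, 120.3, 64.2, 150.8, 88.1, 102.4, 133.0, 71.6, 110.9]
xi, alpha = fit_ev1_mom(amr)
estimates = {t: ev1_quantile(xi, alpha, t) for t in (2, 10, 100, 1000)}
```

The same pattern (moments first, then the quantile function) would apply to the other rows of Table 1, although the 3-parameter distributions require solving the skewness relation for $k$ numerically.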
The KS statistic is defined by:

$$KS = \max_{i=1}^{N} \left| F_e(R_i) - F_D(R_i) \right| \qquad (2)$$

Here, $F_e(R_i) = (i - 0.44)/(N + 0.12)$ is the empirical CDF of $R_i$, in which $i$ is the rank assigned to the sample values arranged in ascending order, and $F_D(R_i)$ is the computed CDF of $R_i$. If the computed value of a GoF test statistic for a distribution is less than the theoretical value at the desired significance level, then the distribution is acceptable for estimation of rainfall (Zhang, 2002).

Diagnostic Test

The selection of a suitable distribution for rainfall estimation is performed through the D-index, which is defined by:

$$\text{D-index} = \frac{1}{\bar{R}} \sum_{i=1}^{6} \left| R_i - R_i^* \right| \qquad (3)$$

Here, $\bar{R}$ is the average of the series of recorded rainfall, $R_i$ ($i$ = 1 to 6) are the six highest values in the series of recorded rainfall and $R_i^*$ is the corresponding rainfall estimated by the probability distribution. The distribution with the least D-index is considered the better suited distribution for rainfall estimation (USWRC, 1981).

3. APPLICATION

An attempt has been made to estimate the rainfall for the Banswara and Visakhapatnam stations by adopting six probability distributions (using MOM). Daily rainfall data recorded at Banswara and Visakhapatnam are used to derive the series of Annual 1-day Maximum Rainfall (AMR), which is further considered for Extreme Value Analysis (EVA). Table 2 gives the summary statistics of the AMR recorded at the stations under study.
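The KS statistic of Eq. (2) and the D-index of Eq. (3) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the absolute difference is used in the KS statistic, the estimated values $R_i^*$ for the six highest observations are taken at their plotting-position probabilities $(i - 0.44)/(N + 0.12)$, and the EV1 parameters, function names and rainfall values are hypothetical rather than taken from the study.

```python
import math


def plotting_position(i, n):
    """Empirical CDF of the i-th smallest of n values: (i - 0.44) / (n + 0.12)."""
    return (i - 0.44) / (n + 0.12)


def ks_statistic(sample, cdf):
    """Largest absolute gap between empirical and fitted CDF, as in Eq. (2)."""
    ordered = sorted(sample)
    n = len(ordered)
    return max(abs(plotting_position(i, n) - cdf(r))
               for i, r in enumerate(ordered, start=1))


def d_index(sample, quantile):
    """D-index of Eq. (3) over the six highest recorded values (needs n >= 6)."""
    ordered = sorted(sample)
    n = len(ordered)
    r_bar = sum(ordered) / n
    top_ranks = range(n - 5, n + 1)  # ranks of the six largest values
    # Assumption: R_i* is the fitted quantile at the plotting position of rank i.
    deviation = sum(abs(ordered[i - 1] - quantile(plotting_position(i, n)))
                    for i in top_ranks)
    return deviation / r_bar


# Hypothetical EV1 parameters (mm) and rainfall series; not the study data.
XI, ALPHA = 90.0, 25.0


def ev1_cdf(r):
    return math.exp(-math.exp(-(r - XI) / ALPHA))


def ev1_quantile(f):
    return XI - ALPHA * math.log(-math.log(f))


amr = [78.0, 95.5, 120.3, 64.2, 150.8, 88.1, 102.4, 133.0, 71.6, 110.9]
ks = ks_statistic(amr, ev1_cdf)
d = d_index(amr, ev1_quantile)
```

Computing both values for each candidate distribution and comparing them reproduces the selection logic described above: the KS value is checked against its theoretical value, and the D-index values are ranked.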

Table 2: Summary statistics of AMR
Station | Mean, $\bar{R}$ (mm) | SD (mm) | Skewness | Kurtosis
Banswara |
Visakhapatnam |
SD: Standard Deviation

4. RESULTS AND DISCUSSIONS

By applying the procedures described above, a computer program was developed and used to fit the AMR recorded at Banswara and Visakhapatnam. The program computes the parameters of the probability distributions (using MOM), the GoF test statistics and the D-index values. Tables 3 and 4 give the rainfall estimates obtained from the six distributions for the stations under study.

Table 3: Rainfall estimates given by six probability distributions for Banswara
Return period (year) | Estimated rainfall (mm): EXP | EV1 | EV2 | GEV | GPA | NOR

Table 4: Rainfall estimates given by six probability distributions for Visakhapatnam
Return period (year) | Estimated rainfall (mm): EXP | EV1 | EV2 | GEV | GPA | NOR

From Tables 3 and 4, it may be noted that the rainfall estimated using the EV2 distribution is higher than the corresponding values of the other five distributions for return periods of 100 years and above, for both Banswara and Visakhapatnam.

Rainfall Frequency Curves (RFCs)

The rainfall estimates obtained from the six probability distributions were used to develop the RFCs, which are presented in Figures 1 and 2. From Figures 1 and 2, it can be seen that the RFCs of the five distributions other than EV2 are linear in form for Banswara and Visakhapatnam. From the trend lines of the fitted curves, it can also be seen that the recorded and estimated rainfall agree closely in the upper and lower tail regions when the rainfall is estimated by the EV1 distribution for Banswara and by GEV for Visakhapatnam.

Analysis Based on GoF Tests

For the present study, the degrees of freedom ($NC - m - 1$) are taken as two for the 3-parameter distributions (GEV and GPA) and three for the 2-parameter distributions (EXP, EV1, EV2 and NOR) while computing the $\chi^2$ statistic values for Banswara and Visakhapatnam. The GoF test statistics were computed from Eqs.
(1) and (2), and are given in Table 5 for the stations under study.
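The $\chi^2$ statistic of Eq. (1) can be sketched in Python as below. The paper does not say how the frequency classes were formed, so this sketch assumes six equal-probability classes under the fitted distribution; six classes is consistent with the stated degrees of freedom ($NC - m - 1 = 3$ for $m = 2$ and $2$ for $m = 3$). The critical values are the standard upper 5 percent points of the $\chi^2$ distribution. The EV1 parameters, function names and rainfall values are hypothetical.

```python
import math

# Standard upper 5 percent points of the chi-square distribution, by degrees of freedom.
CHI2_CRITICAL_5PCT = {2: 5.991, 3: 7.815}


def chi_square_statistic(sample, cdf, n_classes=6):
    """Eq. (1) with equal-probability classes (an assumed class definition)."""
    expected = len(sample) / n_classes  # same expected count in every class
    observed = [0] * n_classes
    for r in sample:
        # Class index from the fitted CDF; clamp so that cdf(r) == 1 stays in range.
        j = min(int(cdf(r) * n_classes), n_classes - 1)
        observed[j] += 1
    return sum((o - expected) ** 2 / expected for o in observed)


# Hypothetical EV1 parameters (mm) and rainfall series; not the study data.
XI, ALPHA = 90.0, 25.0


def ev1_cdf(r):
    return math.exp(-math.exp(-(r - XI) / ALPHA))


amr = [78.0, 95.5, 120.3, 64.2, 150.8, 88.1,
       102.4, 133.0, 71.6, 110.9, 84.7, 99.3]
chi2 = chi_square_statistic(amr, ev1_cdf)
m = 2  # EV1 has two parameters
acceptable = chi2 < CHI2_CRITICAL_5PCT[6 - m - 1]
```

With equal-probability classes the expected frequency is the same in every class, which keeps the example short; classes defined on fixed rainfall intervals would need the expected frequencies computed from the fitted CDF at the class limits.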

Figure 1: Plots of recorded rainfall and RFCs for Banswara

Figure 2: Plots of recorded rainfall and RFCs for Visakhapatnam

Table 5: Computed and theoretical values of GoF tests statistic
Probability distribution | Computed $\chi^2$: Banswara | Visakhapatnam | Computed KS: Banswara | Visakhapatnam | Theoretical values at 5 percent level: $\chi^2$ | KS (separate values for Banswara and for Visakhapatnam)
Rows: EXP, EV1, EV2, GEV, GPA, NOR

Based on the $\chi^2$-test results, it may be observed that the GPA distribution is acceptable for estimation of rainfall for Banswara, whereas the five distributions other than NOR are acceptable for Visakhapatnam. Similarly, from the KS test results, it may be observed that the five distributions other than EV2 (for Banswara) and GPA (for Visakhapatnam) are acceptable for estimation of rainfall.

Analysis Based on Diagnostic Test

For the selection of the most suitable distribution for estimation of rainfall, the D-index values of the six probability distributions were computed from Eq. (3) and are given in Table 6.

Table 6: D-index values of six probability distributions
Station | EXP | EV1 | EV2 | GEV | GPA | NOR
Banswara |
Visakhapatnam |

From Table 6, it may be noted that the D-index values obtained using GPA for Banswara and using EXP for Visakhapatnam are the lowest among the six probability distributions. However, the rainfall estimates given by the EXP and GPA distributions are generally less accurate than those of the other distributions in the tail regions. By considering the trend lines of the curves fitted by the probability distributions in the lower and upper tail regions, EV1 is identified as the most appropriate distribution for estimation of rainfall for Banswara, and GEV for Visakhapatnam.

5. CONCLUSIONS

The paper presented a computer-aided procedure for determining the parameters of six probability distributions (using MOM) for estimation of rainfall for Banswara and Visakhapatnam. The selection of a suitable probability distribution was evaluated by GoF tests (using $\chi^2$ and KS) and a diagnostic test (using the D-index).
The $\chi^2$-test results showed that the GPA distribution is acceptable for estimation of rainfall for Banswara, whereas the five distributions other than NOR are acceptable for Visakhapatnam. The KS test results showed that the distributions other than EV2 (for Banswara) and GPA (for Visakhapatnam) are acceptable for estimation of rainfall. By considering the trend lines of the curves fitted by the probability distributions in the tail regions, the study identified EV1 as the most appropriate distribution for estimation of rainfall for Banswara, and GEV for Visakhapatnam. The study suggested that the 1000-year return period rainfall of about 665 mm (using EV1) and 500 mm (using GEV) may be considered for the design of hydraulic structures at the Banswara and Visakhapatnam stations respectively.

ACKNOWLEDGEMENTS

The author is grateful to the Director, Central Water and Power Research Station, Pune, for providing the research facilities to carry out the study. The author is thankful to M/s Nuclear Power Corporation of India Limited, Mumbai, and the India Meteorological Department, Pune, for the supply of rainfall data.

REFERENCES

[1] Rao AR and Hamed KH, Flood frequency analysis, CRC Publications, New York, 2000.
[2] Topaloglu F, Determining suitable probability distribution models for flow and precipitation series of the Seyhan River basin, Turkish Journal of Agriculture and Forestry, Vol. 26, 2002.
[3] Guevara E, Engineering design parameters of storms in Venezuela, Hydrology Days, 2003.
[4] Lee C, Application of rainfall frequency analysis on studying rainfall distribution characteristics of Chia-Nan plain area in Southern Taiwan, Journal of Crop, Environment & Bioinformatics, Vol. 2, 2005.
[5] Bhakar SR, Bansal AK, Chhajed N and Purohit RC, Frequency analysis of consecutive days maximum rainfall at Banswara, Rajasthan, India, ARPN Journal of Engineering and Applied Sciences, Vol. 1, pp. 64-67, 2006.
[6] Fang B, Guo S, Wang S, Liu P and Xiao Y, Nonidentical models for seasonal flood frequency analysis, Journal of Hydrological Sciences, Vol. 52, 2007.

[7] Chen L, Guo S, Yan B, Liu P and Fang B, A new seasonal design flood method based on bivariate joint distribution of flood magnitude and date of occurrence, Journal of Hydrological Sciences, Vol. 55, 2010.
[8] Mujere N, Flood frequency analysis using the Gumbel distribution, Journal of Computer Science and Engineering, Vol. 3, 2011.
[9] Baratti E, Montanari A, Castellarin A, Salinas JL, Viglione A and Bezzi A, Estimating the flood frequency distribution at seasonal and annual time scales, Hydrological Earth System Science, Vol. 16, 2012.
[10] Olumide BA, Saidu M and Oluwasesan A, Evaluation of best fit probability distribution models for the prediction of rainfall and runoff volume (Case Study Tagwai Dam, Minna-Nigeria), Journal of Engineering and Technology, Vol. 3, 2013.
[11] Zhang J, Powerful goodness-of-fit tests based on the likelihood ratio, Royal Statistical Society, Vol. 64, 2002.
[12] United States Water Resources Council (USWRC), Guidelines for determining flood flow frequency, Bulletin No. 17B, 1981.