An entropy based method for flood forecasting


New Directions for Surface Water Modeling (Proceedings of the Baltimore Symposium, May 1989). IAHS Publ. no. 181, 1989.

P.F. Krstanovic
Louisiana Geological Survey, P.O. Box G, University Station, Baton Rouge, LA 70893

V.P. Singh
Department of Civil Engineering, Louisiana State University, Baton Rouge, LA 70893

ABSTRACT The entropy theory is used to develop a univariate model for forecasting of long-term streamflow. The model modifies the original Burg equations from maximum entropy spectral analysis (MESA) and uses the autocovariance matrix of the streamflow record. The model is verified on five rivers from different regions of the world. The model forecasts are found to be comparable to those of the ARIMA model. It is shown that this model can be applied to other univariate hydrologic processes.

INTRODUCTION

Development of stochastic models for streamflow forecasting started with time series analysis (Box & Jenkins, 1976). Such models, of the autoregressive (AR), autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) types, proliferated in the 1970s. The AR model forecasts streamflow well without rainfall input, but only for certain values of the sampling time interval (STI), and it is not generally used for complex hydrologic processes. The ARMA model can represent mixed hydrologic processes, but its parameter estimation is complex, especially in the MA part. Among other models, fractional Gaussian noise (FGN) models are confined to long-term (a year or more) forecasting; state-space models, such as the Kalman filter (KF) and the extended Kalman filter (EKF), require precise calibration; and multivariate models require knowledge of additional processes such as rainfall or snowmelt.

Currently, there are three forecast models based on the entropy theory. The first model is general and was not specifically developed for hydrological forecasting (Souza, 1978).
The second is based on the entropy-minimax approach and has been applied to long-term annual forecasting of drought using seven stations in northern California (Christensen, 1981). The third employs maximum entropy spectral analysis (MESA) and was tested on Spring Creek in Louisiana (Krstanovic and Singh, 1987). To our knowledge, entropy does not appear to have been used for flood forecasting. The objective of this study is to develop a model for real-time streamflow forecasting using entropy.

DEVELOPMENT OF STREAMFLOW FORECAST MODEL

This development is based on MESA (Burg, 1975), which is usually used for reconstruction in the frequency domain. In the time domain, the Burg method has received little attention. However, many hydrologic problems, such as flood forecasting, require solutions in the time domain. Thus, the Burg method is adapted to develop the forecasting scheme discussed below.

Streamflow values ($x_t$) are known for a period $\langle 0, T \rangle$, but not thereafter. Future values of the streamflow are predicted from the past values and the autocorrelation function (acf). In real-time forecasting (with updating), after the value $x_{T+1}$ has been forecasted, it becomes available through measurements. In forecasting the value $x_{T+2}$, we then use the updated record $\langle 0, T+1 \rangle$ and the corrected acf. This procedure is repeated for subsequent time intervals.

Forecasting involves the acf matrix R, written for m lags as

$$
R = \begin{bmatrix} \rho(0) & \cdots & \rho(-m) \\ \vdots & \ddots & \vdots \\ \rho(m) & \cdots & \rho(0) \end{bmatrix} \tag{1}
$$

This matrix is symmetric and non-negative definite, and its principal minors are either positive or zero. In computing a new acf value (that is, extending the dimensions of the R matrix by 1), the extended acf matrix must preserve these properties. This is enabled by three theorems (Burg, 1975). Specifically, the "Minimum Forecasting Error Theorem (MEET)" allows weighting of every $\rho(k)$ value by some coefficient $a_k$, such that the solution of the matrix equation

$$
\begin{bmatrix} \rho(0) & \cdots & \rho(-m) \\ \vdots & \ddots & \vdots \\ \rho(m) & \cdots & \rho(0) \end{bmatrix}
\begin{bmatrix} 1 \\ a_1 \\ \vdots \\ a_m \end{bmatrix}
=
\begin{bmatrix} E_m \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{2}
$$

is unique, where $E_m > 0$. To extend the acf matrix and predict the unknown streamflow values $x_t$, we need to solve equation (2) for $a_k$ ($k = 1, \ldots, m$) and $E_m$. The solution is obtained by using the recursive algorithm developed originally by Levinson (Wiener, 1950) and modified by Burg (1975). Using the Levinson-Burg algorithm, and taking m = N-1, the R matrix is extended as:

$$
\begin{bmatrix} \rho(0) & \cdots & \rho(N) \\ \vdots & \ddots & \vdots \\ \rho(N) & \cdots & \rho(0) \end{bmatrix}
\left\{
\begin{bmatrix} 1 \\ b_1^{N-1} \\ \vdots \\ b_{N-1}^{N-1} \\ 0 \end{bmatrix}
+ c_N
\begin{bmatrix} 0 \\ b_{N-1}^{N-1} \\ \vdots \\ b_1^{N-1} \\ 1 \end{bmatrix}
\right\}
=
\begin{bmatrix} E_{N-1} \\ 0 \\ \vdots \\ 0 \\ \Delta_N \end{bmatrix}
+ c_N
\begin{bmatrix} \Delta_N \\ 0 \\ \vdots \\ 0 \\ E_{N-1} \end{bmatrix} \tag{3}
$$

where $\Delta_N$, $b_N$ and $c_N$ are algorithm coefficients. The complete solution of equation (3) is given by Burg (1975). In the extended acf matrix, the most recent autocorrelations have the greatest weights, i.e., in computing $\rho(N)$, $\rho(N-1)$ is weighted the most, then $\rho(N-2)$, etc. The number of acf coefficients weighted is also the model order, i.e., in the equation

$$
\rho(N) = -\sum_{n=1}^{m} \rho(N-n)\, b_n \tag{4}
$$

the model order is m. The extension of the acf matrix R given by equation (4) corresponds to the extension by maximum entropy (Burg, 1975). The necessary conditions for the extension are $E_m \ge 0$ and $|c_N| < 1$. Using this algorithm, streamflow values are forecasted according to

$$
\hat{x}_T(\ell) = \sum_{n=1}^{m} (-a_n)\, x_{T+\ell-n} \tag{5}
$$

for the $\ell$-step ahead forecast ($\ell \ge 1$), where "^" denotes a forecast. Values $x_{T+\ell-n}$ with $n < \ell$ lie beyond T and, when no updating is available, are replaced by their own forecasts. We call the $a_n$ coefficients (equation (5)) and the $b_n$ coefficients (equation (4)) the extension coefficients. These coefficients are the optimum prediction coefficients by virtue of the algorithm.

However, in this entropy model, we also need to establish the importance of the Lagrange multipliers. The Lagrange multipliers ($\lambda$'s) measure the importance of the acf at the associated lags. Burg (1975) derived the $\lambda$'s for MESA, and Krstanovic (1988) modified them to the $\langle -1, +1 \rangle$ domain such that their pattern is similar to the pattern of the acf. Specifically,

$$
\lambda_j = \frac{\sum_{i=0}^{N-j} a_i\, a_{i+j}}{\sum_{i=0}^{N} a_i^2} \tag{6}
$$

where $j \in (-N, N)$ and $\lambda_0 = 1$ is the maximum Lagrange multiplier.

VERIFICATION OF THE MODEL

The streamflow model was tested on five data sets representing different climatological areas of the world, as shown in Table 1. We chose River Orinoco and River Caroni in South America, River Krishna and River Godavari, governed by the monsoon climate in India, and Spring Creek in Louisiana.
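The recursion and forecast of equations (2) and (5) in the model development above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: it uses a biased sample acf, the textbook Levinson recursion on that acf (rather than Burg's data-driven estimate of the reflection coefficients), and it applies equation (5) to deviations from the record mean. The function names `sample_acf`, `levinson` and `forecast` are illustrative.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Biased sample autocorrelations rho(0), ..., rho(max_lag)."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    c0 = np.dot(d, d)
    return np.array([np.dot(d[:len(d) - k], d[k:]) / c0
                     for k in range(max_lag + 1)])


def levinson(rho, m):
    """Solve R [1, a_1, ..., a_m]^T = [E_m, 0, ..., 0]^T (equation (2))."""
    rho = np.asarray(rho, dtype=float)
    a = np.zeros(m + 1)
    a[0] = 1.0
    err = float(rho[0])                      # E_0
    for k in range(1, m + 1):
        prev = a.copy()
        # Mismatch between rho(k) and what the order-(k-1) model implies
        delta = np.dot(prev[:k], rho[k:0:-1])
        c = -delta / err                     # reflection coefficient
        # Order update: forward vector plus c times the reversed vector
        a[1:k + 1] = prev[1:k + 1] + c * prev[k - 1::-1]
        err *= 1.0 - c * c                   # E_k = E_{k-1} (1 - c^2)
    return a, err


def forecast(x, a, steps):
    """Equation (5) without updating, on deviations from the mean (assumption)."""
    x = np.asarray(x, dtype=float)
    m = len(a) - 1
    if m < 1:
        raise ValueError("model order must be at least 1")
    mean = x.mean()
    hist = list(x[-m:] - mean)               # last m observed deviations
    out = []
    for _ in range(steps):
        recent_first = hist[::-1][:m]        # x_{T+l-1}, x_{T+l-2}, ...
        nxt = -np.dot(a[1:], recent_first)
        hist.append(nxt)                     # forecast feeds later steps
        out.append(nxt + mean)
    return np.array(out)
```

A typical call would be `a, err = levinson(sample_acf(x, m), m)` followed by `forecast(x, a, steps)`. Real-time forecasting with updating would instead append each new measurement to `x` and recompute the acf before the next step.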
According to the streamflow pattern, the test data can be classified as: strongly periodic seasonal (River Orinoco); periodic seasonal, but irregular with respect to flood peaks and volumes (River Krishna and River Godavari); periodic seasonal, but irregular with respect to flood peaks, volumes and shapes of flood hydrographs (River Caroni); and completely irregular (Spring Creek). We emphasized accuracy in the prediction of peaks, volumes and times to peaks. All tested forecasts are long-term forecasts with no updating (feedback). For evaluating model performance, we also tested possible residual dependencies by employing Anderson's correlogram test (Anderson, 1941). According to the test, the residual autocorrelations must be inside the confidence limits of an independent time series.

Table 1 The streamflows used in testing the univariate forecasting model.

| No. | Data | Gage site | Drainage area | Record | Source |
|---|---|---|---|---|---|
| 1 | Rio Orinoco, Venezuela | Palua | 95, km² | monthly ( ) | Laboratorio Nacional de Hidraulica (1981) |
| 2 | Krishna River, India | Vijayawada (Andhra Pradesh) | 231,355 km² | monthly ( ) | Kumar and Chander (1985) |
| 3 | Godavari River, India | Dowleswaram (Andhra Pradesh) | 299,32 km² | monthly ( ) | Kumar and Chander (1985) |
| 4 | Rio Caroni, Venezuela | Guarampo | Not available | monthly ( ) | Vecchia (1985) |
| 5 | Spring Creek, Louisiana, USA | Rapides Parish, hydrologic unit | mi² | daily ( ) | USGS (1986) |

We studied the chosen rivers with respect to climatological area. The Venezuelan rivers, River Orinoco and its tributary River Caroni, have diverse properties. The Orinoco is among the giant fluvial systems of the world, with 1,000,000 km² of drainage area and 28,000 m³/s of annual mean flow. It exhibits strong seasonal behavior with a characteristic annual cycle of 12 months. River Caroni is smaller in both drainage area and discharge capacity. It also exhibits seasonal behavior, but with greater irregularities in flood seasons. The test data indicated that the strong seasonal behavior of River Orinoco was modeled satisfactorily with a low-order streamflow model (only two significant Lagrange multipliers). The River Caroni was fitted well only with a high model order (over 28 Lagrange multipliers). A possible reason is its more erratic streamflow behavior. A part of the long record is shown in Fig. 1. Given the data base of 2 years, we evaluated the capability of the model to forecast two different seasonal floods (in years 1970 and 1971), as shown in Fig. 1a.
For the first flood season of the year 1970, the flood volume was overpredicted by 12%, and the flood peak was underpredicted by 6%. The time of the maximum flood peak was lagged by 1 month, but the peak was inside the multiple flood peak interval. The forecasting errors were higher in the second flood season.

The Indian rivers, River Krishna and River Godavari, also exhibit periodic behavior, with maximum discharges concentrated in June and July during the monsoon season. In the winter season, contributions to discharge are minimal and result in extremely low flows in both rivers. Despite the periodicity, the peak flows vary widely. Thus, the streamflow model had to be of higher order for forecasting. For the River Krishna, assuming the earlier record to be known and forecasting the season 1954, over 3 Lagrange multipliers were needed. For the River Godavari, assuming the earlier record to be known and forecasting the seasons 1950 and 1951, 14 to 15 Lagrange multipliers were necessary. The first case is shown in Fig. 2a. The first three months of 1954 were forecasted exactly on time (0% error). The time of the highest peak was forecasted with a lag of 1 month after its occurrence, with a forecast error of almost 25%. The rest of the flood hydrograph was forecasted with 1 to 5% accuracy.
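The numbers of Lagrange multipliers quoted for each river rest on equation (6), which is the autocorrelation of the extension coefficients normalized by its lag-zero value. A minimal Python sketch is given below. The coefficient vector is assumed to include $a_0 = 1$, and the cut-off used to call a multiplier "significant" is a hypothetical parameter, since the text does not state the criterion that was applied.

```python
import numpy as np


def lagrange_multipliers(a):
    """Equation (6): lambda_j for j = 0, ..., N, with lambda_0 = 1."""
    a = np.asarray(a, dtype=float)           # a[0] is assumed to be 1
    n = len(a) - 1
    num = np.array([np.dot(a[:n + 1 - j], a[j:]) for j in range(n + 1)])
    return num / num[0]                      # num[0] is the sum of a_i^2


def count_significant(lam, threshold):
    """Count lags j >= 1 with |lambda_j| >= threshold (hypothetical rule)."""
    lam = np.asarray(lam, dtype=float)
    return int(np.sum(np.abs(lam[1:]) >= threshold))
```

Because the multipliers are symmetric in the lag, only non-negative lags are computed; in a seasonal monthly series one would expect the larger values near lags 12 and 24.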

For the River Krishna forecast, the flood volume was underpredicted by 2%.

Fig. 1 River Caroni - monthly forecasts (1949-50): (a) forecast by the streamflow model, and (b) forecasts by ARIMA and state-space models. Horizontal axis: time (months). Symbols: * observed streamflow; X reconstructed streamflow; B error bound; Y ARIMA model; Z state-space model.

Spring Creek typifies many Louisiana streams and is subject to short-duration, intense summer rainfall and to long-duration, moderate-intensity winter rainfall. As a result, winter floods are longer, and summer floods are shorter but may be excessive. Excessively wet winters or completely dry summers sometimes occur. The drainage area is relatively small and responds quickly to rainfall. We assumed the earlier record to be known and forecasted the subsequent years, as shown in Fig. 3a. The volume of the hydrograph in this dry season was predicted with 12% error, but the predicted shape was not good. The dry extremes before and after the hydrograph peak were overpredicted. This highly irregular streamflow was fitted with over 6 Lagrange multipliers in the streamflow model.

Consistency of the streamflow model was checked by examining the variance of the forecast residuals. Thus, we computed the error bounds around the mean forecasted value, as shown in all figures (denoted by "B"). The error bound intervals were found to be constant in all cases.

Fig. 2 River Krishna - monthly forecasts (1954): (a) forecast by the streamflow model, and (b) forecasts by ARIMA and state-space models. Horizontal axis: time (months). Symbols: * observed streamflow; X reconstructed streamflow; B error bound; Y ARIMA model; Z state-space model.

COMPARATIVE EVALUATION

We compared the forecasting results of the streamflow model with the ARIMA and state-space results. The ARIMA and state-space models were fitted using, respectively, the Box-Jenkins procedure (Box & Jenkins, 1976) and the Akaike canonical correlation technique with the recursive Kalman filter algorithm (Akaike, 1976). Two criteria were employed for comparison of the forecasting models: (a) graphical or visual fit of the forecasted values to the data, and (b) numerical criteria (WMO, 1975): the mean squared error of the forecasts (MSE), the coefficient of variation of the residual error ($v_1$), the ratio of the relative error to the mean ($v_2$), and the ratio of the absolute error to the mean ($v_3$).
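The numerical criteria listed above can be computed in a few lines. The Python sketch below is illustrative only: the text names the four measures but does not give their formulas, so the definitions used here (errors taken as forecast minus observed, and each ratio scaled by the mean of the observations) are assumptions, and the function name `forecast_errors` is hypothetical.

```python
import numpy as np


def forecast_errors(observed, forecasted):
    """Return MSE, v1, v2 and v3 for paired observed and forecasted flows."""
    obs = np.asarray(observed, dtype=float)
    fc = np.asarray(forecasted, dtype=float)
    if obs.shape != fc.shape:
        raise ValueError("observed and forecasted must have the same shape")
    e = fc - obs                              # forecast error (assumed sign)
    mean_obs = obs.mean()
    mse = np.mean(e ** 2)
    return {
        "MSE": mse,
        "v1": np.sqrt(mse) / mean_obs,        # coefficient of variation of the error
        "v2": np.mean(e) / mean_obs,          # relative (signed) error over the mean
        "v3": np.mean(np.abs(e)) / mean_obs,  # absolute error over the mean
    }
```

Under these assumed definitions, $v_2$ reveals systematic over- or underprediction of volume, while $v_1$ and $v_3$ measure the overall spread of the errors regardless of sign.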

Fig. 3 Spring Creek - monthly forecasts ( ): (a) forecast by the streamflow model, and (b) forecasts by ARIMA and state-space models. Horizontal axis: time (months). Legend: observed streamflow; reconstructed streamflow; error bound; ARIMA model; state-space model.

For three sample rivers, River Caroni, River Krishna and Spring Creek, the forecasting results of the ARIMA and state-space models are shown in Figs 1b, 2b and 3b. Forecasts for the South American rivers showed overwhelming graphical and numerical similarity between the streamflow model and the ARIMA model. For River Caroni, the long record results in more ARIMA parameters and a higher-order streamflow model.

Forecasts for the Indian rivers were comparable. For the River Krishna, the streamflow forecasting model had the lowest MSE, $v_1$, $v_2$ and $v_3$, but the ARIMA model predicted the shape of the flood hydrograph better, as shown in Fig. 2. However, its prediction of both the peak and the recession limb was delayed. The state-space model did not perform well. The ARIMA model had three AR parameters, one of which accounted for seasonality, while the streamflow model had significant $\lambda$'s occurring at the 12th and 24th lags.

For forecasting at Spring Creek, the streamflow model forecasted much better, as shown in Fig. 3. The MSE value was also considerably lower. The ARIMA model forecasted poorly despite the inclusion of 4 parameters. The reason may be high data irregularity. The streamflow model used over 6 significant Lagrange multipliers to account for this irregularity, but it fitted the record much better.

CONCLUSIONS

The following conclusions can be drawn from this study:
(a) The streamflow model predicted periodicity and trend reasonably well.
(b) The streamflow model and the ARIMA model were comparable in forecasting streamflows that are not very irregular. For such cases, theoretical equivalency between the two models existed.
(c) The streamflow model and the ARIMA model, despite a long record, gave completely different forecasts for very irregular streamflows. For such cases, theoretical equivalency between the two models does not hold.
(d) The streamflow periodicity in the ARIMA model was accounted for by either AR(12) or MA(12) parameters. The streamflow model accounted for this seasonality by including the seasonal Lagrange multipliers (i.e., $\lambda_{12}$, $\lambda_{24}$, etc.).

The streamflow model can be applied to any other univariate stationary hydrologic process whose stochasticity is not of high order (as it is in, e.g., minute rainfall).

ACKNOWLEDGEMENTS This study was supported by the Geological Survey, U.S. Department of the Interior, through the Louisiana Water Resources Research Institute, and by the Department of Civil Engineering, Louisiana State University.

REFERENCES

Akaike, H. (1976) Canonical correlation analysis of time series and the use of an information criterion. In: Advances and Case Studies in System Identification (ed. by R. Mehra & D.G. Lainiotis), 27-96. Academic Press, New York, USA.
Anderson, R.L. (1941) Distribution of the serial correlation coefficients. Annals of Mathematical Statistics 8(1).
Box, G.E.P. & Jenkins, G. (1976) Time series analysis, forecasting and control. Holden-Day, San Francisco, California, USA.
Burg, J.P. (1975) Maximum entropy spectral analysis. Ph.D. Thesis, Stanford University, Palo Alto, California, USA. University Microfilms 75-25,499.
Christensen, R.A. (1981) An exploratory application of entropy minimax to weather prediction: estimating the likelihood of multi-year droughts in California. In: Entropy minimax sourcebook, Vol. IV: applications (ed. by R.A. Christensen). Entropy Limited, Lincoln, Massachusetts, USA.
Krstanovic, P.F. (1988) Application of entropy theory to multivariate hydrologic analysis. Ph.D. Dissertation, Vols. I and II, Louisiana State University, Baton Rouge, Louisiana, USA.

Krstanovic, P.F. & Singh, V.P. (1987) A multivariate stochastic flood analysis using entropy. In: Hydrologic frequency modeling (ed. by V.P. Singh). D. Reidel Publishing Co., Dordrecht, Holland.
Souza, R.C. (1978) A Bayesian entropy approach to forecasting. Ph.D. Thesis, University of Warwick, Coventry, UK.
Wiener, N. (1950) Extrapolation, interpolation and smoothing of stationary time series with engineering applications. John Wiley and Sons, Inc., New York, New York, USA.
World Meteorological Organization (WMO) (1975) Intercomparison of conceptual models used in operational hydrologic forecasting. Operational Hydrology Report No. 7, Secretariat of WMO, Geneva, Switzerland.
