Forecasting Cash Withdrawals in the ATM Network Using a Combined Model based on the Holt-Winters Method and Markov Chains


Mikhail Aseev, Sergei Nemeshaev, and Alexander Nesterov
National Research Nuclear University MEPhI, Department of Cybernetics, Kashirskoye Highway, 31, Moscow, 115409, Russian Federation

Abstract

The article describes a method for forecasting time series of cash withdrawals from ATMs. The hybrid model combines two methods: the additive Holt-Winters method and a Markov chain-based model. The two are combined through weight coefficients calculated from the past performance of each model. The Holt-Winters method forecasts time series with trend and seasonal variations. The Markov chain makes it possible to forecast patterns of the underlying time series, such as peaks or troughs. Combining the two approaches should allow banks and other financial organizations to predict cash withdrawals more accurately than either method used separately.

Keywords: Markov Chains, Holt-Winters Method, Time Series, Forecasting, Statistics

INTRODUCTION

Many companies have accumulated statistical data that can be used to solve a range of business analytics and forecasting tasks. Examples include identifying demand for goods, detecting market trends, and forecasting future sales. The information obtained is also applied to logistics planning, development of company strategy, and evaluation of financial risks. How well these tasks are solved depends directly on the quality of the data analysis and the accuracy of the forecast.

The banking sector is one of the economic sectors where business forecasting methods are required. Currently, most banks are expanding their ATM networks [1], and the question of minimizing network maintenance costs is becoming increasingly pressing.
Optimizing cash collection requires mechanisms for determining the dates and amounts of each replenishment across the ATM network. This is possible only if future cash withdrawals can be predicted with high accuracy. The problem is addressed in a number of recent scientific articles, which confirms its relevance [2-4].

Different forecasting methods can be used to estimate future demand. The Holt-Winters model is one of the most common among them [5-7]. It was designed as a fast method that uses little memory yet forecasts sales volumes quite efficiently, taking trend and seasonality into account [8].

Another interesting approach to forecasting future values is a model based on Markov chains. If cash withdrawal from ATMs is assumed to be a random process, Markov chains can be used to forecast future values of the series. This approach has also been used in other studies [9, 10].

However, experience has shown that no single method always gives good results. Recent research often focuses on combining several models [11, 12]. The results of these studies show that the approach is efficient and produces better results than any of the methods used separately.

There are different ways to combine several models, for example, adaptive composition of models [8]. In method composition, the forecast is a weighted sum of the forecasts produced by the individual models. The weight coefficients are calculated from the past history of each method, depending on how well each model fitted the data.

This study describes the Holt-Winters model and the forecast method based on Markov chains, analyzes their possible combinations, and describes the proposed method based on adaptive method composition.
The rest of the paper is organized as follows. Section 2 summarizes the theory behind the methods that form the basis of the proposed method. Section 3 describes data pre-processing and the proposed hybrid model. Section 4 presents the numerical results of the model and compares the different methods. Section 5 concludes the study.

THEORY

Holt-Winters additive method

The Holt-Winters method forecasts time series while taking into account trend and seasonal variations. There are two variants of the approach, with additive or multiplicative seasonality. This study uses the additive seasonal model as one of the techniques in the combination. This choice was made during the preliminary data analysis, which included decomposition of the original series and evaluation of the extracted components. The model is described below in terms of cash collection forecasting.

Given the time series $y_1, \dots, y_t$ of daily cash withdrawals from an ATM over $t$ days, where $y_t$ is the last known value, it is necessary to calculate the forecast $\hat{y}_{t+p}$ for $p$ days ahead. If $s$ is the length of the seasonal period, the Holt-Winters additive model can be written using the following four equations:

$$L_t = \alpha (y_t - S_{t-s}) + (1 - \alpha)(L_{t-1} + T_{t-1}), \tag{1}$$
$$T_t = \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1}, \tag{2}$$
$$S_t = \gamma (y_t - L_t) + (1 - \gamma) S_{t-s}, \tag{3}$$
$$\hat{y}_{t+p} = L_t + p T_t + S_{t-s+p}, \tag{4}$$

where
$L_t$ is the smoothed value of the current series level;
$T_t$ is the trend estimate;
$S_t$ is the seasonality estimate;
$\alpha$ is the smoothing constant for the current series level ($0 \le \alpha \le 1$);
$\beta$ is the smoothing constant for the trend estimate ($0 \le \beta \le 1$);
$\gamma$ is the smoothing constant for the seasonality estimate ($0 \le \gamma \le 1$).

The model assumes that the length $s$ of the seasonal period is known. Spectral analysis of the source time series showed a period of 7 days for each of them, so the seasonal period is one week.

The weights $\alpha$, $\beta$, $\gamma$ can be selected either subjectively or by minimizing the forecast errors. The higher the smoothing parameters, the faster the model responds to changes, which means that recent data contribute more to the forecast. The lower the weights, the weaker the model's response to changes; random deviations are then smoothed out, and the predicted values are more stable [13]. In this study, the parameters $\alpha$, $\beta$, $\gamma$ are determined by minimizing the Mean Squared Error (MSE):

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \tag{5}$$

where $\hat{y}_i$ is the forecast of the observed value $y_i$ and $n$ is the number of observations.

Markov chains

A Markov chain is a random process, i.e. a sequence of random events, characterized by the fact that the probability of transition to the next state depends only on the current state, that is, on the present and not on the past. An event here means a transition from one state to another. At any time the process is in one of $n$ states. The set of states $S = \{s_1, \dots, s_n\}$ is finite or countable. The probability of transition from state $s_i$ to state $s_j$ is denoted $p_{ij}$. The matrix

$$P = \begin{pmatrix} p_{11} & \cdots & p_{1n} \\ \vdots & \ddots & \vdots \\ p_{n1} & \cdots & p_{nn} \end{pmatrix} \tag{6}$$

is called the transition matrix, or transition probability matrix. The following conditions must be met:

$$p_{ij} \ge 0, \tag{7}$$
$$\forall i: \quad \sum_{j} p_{ij} = 1. \tag{8}$$

In this case the matrix $P$ is called stochastic.
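Returning briefly to the Holt-Winters recursions (1)-(4) and the MSE criterion (5): the sketch below shows one way they could be implemented. It is an illustration under stated assumptions, not the authors' code. The function names, the start values for level, trend and seasonal terms, the coarse grid search over $\alpha$, $\beta$, $\gamma$, the wrapping of the seasonal index for $p > s$, and the synthetic series are all assumptions introduced here; the text does not specify any of them.

```python
import numpy as np


def holt_winters_additive(y, s, alpha, beta, gamma, horizon):
    """Run recursions (1)-(3) over y, then forecast `horizon` steps with (4).

    Returns (fitted, forecast). fitted[t] is the one-step-ahead forecast of
    y[t]; it is NaN for the first season, which is used for initialisation.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    if n < 2 * s:
        raise ValueError("need at least two full seasons")
    # Assumed start values (the text does not specify an initialisation).
    level = y[:s].mean()
    trend = (y[s:2 * s].mean() - y[:s].mean()) / s
    season = list(y[:s] - level)          # season[t % s] stores S_{t-s}
    fitted = np.full(n, np.nan)
    for t in range(s, n):
        s_old = season[t % s]
        fitted[t] = level + trend + s_old
        new_level = alpha * (y[t] - s_old) + (1 - alpha) * (level + trend)  # (1)
        trend = beta * (new_level - level) + (1 - beta) * trend             # (2)
        season[t % s] = gamma * (y[t] - new_level) + (1 - gamma) * s_old    # (3)
        level = new_level
    # (4); for p > s the latest seasonal term of the matching weekday is reused.
    forecast = np.array([level + p * trend + season[(n - 1 + p) % s]
                         for p in range(1, horizon + 1)])
    return fitted, forecast


def mse(actual, predicted):
    """Mean squared error (5), ignoring positions without a fitted value."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mask = ~np.isnan(predicted)
    return float(np.mean((actual[mask] - predicted[mask]) ** 2))


def fit_by_grid(y, s, grid=None):
    """Pick (alpha, beta, gamma) minimising the one-step MSE on a coarse grid."""
    grid = np.linspace(0.1, 0.9, 5) if grid is None else grid
    best = None
    for a in grid:
        for b in grid:
            for g in grid:
                fitted, _ = holt_winters_additive(y, s, a, b, g, horizon=1)
                err = mse(y, fitted)
                if best is None or err < best[0]:
                    best = (err, a, b, g)
    return best


# Synthetic weekly pattern without trend, used only to exercise the functions.
pattern = np.array([5.0, -3.0, 0.0, 2.0, -4.0, 1.0, -1.0])
demo = 50.0 + np.tile(pattern, 6)
best_mse, a, b, g = fit_by_grid(demo, s=7)
demo_fitted, demo_forecast = holt_winters_additive(demo, 7, a, b, g, horizon=21)
```

A grid search is used only because it is easy to read; any numerical optimiser that minimises (5) would serve the same purpose.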
At any time $t$, the Markov chain can be characterized by a row vector $p^{(t)} = (p^{(t)}_1, \dots, p^{(t)}_n)$, where $p^{(t)}_i$ is the probability that at step $t$ the process is in state $s_i$. This vector is also called a probability vector.

Suppose $p^{(i)}$ is the probability distribution at the current step $i$. To find the distribution over states at the next step, the current probability vector is multiplied by the transition matrix:

$$p^{(i+1)} = p^{(i)} P. \tag{9}$$

By the associativity of matrix multiplication, the probability distribution after $t$ steps is obtained by multiplying $p^{(i)}$ by the transition matrix raised to the power $t$:

$$p^{(i+t)} = p^{(i)} P^{t}. \tag{10}$$

Thus, once a process is represented as a Markov chain, it can be simulated over time and a forecast can be built. To do this, it is first necessary to decide how the set of process states is defined and how the transition matrix is constructed, along with some other issues. In this paper, the Markov chain approach is used to forecast future ATM demand; it is described in Section 3.2.

PROPOSED METHOD

Data pre-processing

Transactional ATM statistics are daily records of cash withdrawals or deposits, including the currency in which the transaction was conducted and the current balance. The primary statistics used to test the models are daily data on cash withdrawals from 11 ATMs owned by one of the Ekaterinburg banks, covering 8 months.

The problem of missing values in the ATM transaction statistics was solved as part of data pre-processing. The gaps appear to arise when an ATM is switched off or out of order for technical reasons on a given day, so that no statistics are recorded for that day. Such situations are rare: the time series of cash withdrawals contain only a small number of omissions (on average, 10 omissions per 244 values of the original series).
Therefore, the missing data can be restored by replacing it with predicted values. The forecast of the first missing value is based on the preceding statistics. Each subsequent missing value is treated in the same way, with all earlier gaps already filled and included in the forecast calculation. In this study, the Holt-Winters additive model was used to fill in the missing values.

Markov chain-based model

To apply the Markov chain method to ATM demand forecasting, it is first necessary to define the set of states that the process can take. Suppose the statistics of an ATM are selected for a period equal to the training set. The withdrawal amounts are grouped into a set of intervals. Denote by $y_{\min}$ the minimum withdrawal from the ATM during the period and by $y_{\max}$ the maximum withdrawal during the period. Choose a step $\Delta$. Then the set of chain states can be defined as follows:

$$\begin{aligned}
s_1 &= [y_{\min};\; y_{\min} + \Delta) \\
s_2 &= [y_{\min} + \Delta;\; y_{\min} + 2\Delta) \\
&\;\;\vdots \\
s_i &= [y_{\min} + (i-1)\Delta;\; y_{\min} + i\Delta) \\
&\;\;\vdots \\
s_n &= [y_{\min} + (n-1)\Delta;\; y_{\min} + n\Delta)
\end{aligned} \tag{11}$$

where $n = \lceil (y_{\max} - y_{\min}) / \Delta \rceil + 1$ (the brackets denote rounding up to the nearest integer).

Each state is a half-open interval, and together the intervals cover the original series of cash withdrawals completely. To move from the series of cash withdrawals to the series of states, every withdrawal $y_t$ is replaced with a state: $y_t$ is replaced with $s_i$ when $y_t \in s_i$.

Once the set of states has been determined, the next problem is how to build the matrix of transition probabilities. The transition matrix is also built from the training set. Count the number of transitions from state $s_i$ to state $s_j$ for all $i, j$ and denote it $m_{ij}$. Then the probability of transition from $s_i$ to $s_j$ is

$$p_{ij} = \frac{m_{ij}}{\sum_{k=1}^{n} m_{ik}}. \tag{12}$$

If the original statistics contain no transitions out of state $s_k$ (in particular, when the series never enters $s_k$), define

$$p_{kj} = \frac{1}{n}, \quad j = 1, \dots, n. \tag{13}$$

This choice does not affect the transition probabilities of the other states. It is needed only to satisfy the mandatory conditions of a stochastic matrix:

$$p_{ij} \ge 0, \tag{14}$$
$$\forall i: \quad \sum_{j} p_{ij} = 1. \tag{15}$$

Depending on the source statistics and the selected step, the transition matrix $P$ composed of the $p_{ij}$ will be filled to a greater or lesser degree. The smaller the data set and the smaller the step, the more uninformative rows the matrix contains: rows for states that are never entered, and rows consisting of a single one and zeros.
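The state construction (11), the transition matrix (12)-(13) and the $t$-step propagation (10) described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the toy series are hypothetical, and the number of states uses one reading of the rounding rule, $n = \lceil (y_{\max} - y_{\min})/\Delta \rceil + 1$. How a predicted state is mapped back to a cash amount is not specified in the text, so the sketch stops at the state distribution.

```python
import math

import numpy as np


def build_states(y, step):
    """Map each value to the 0-based index of its half-open interval (11)."""
    y = np.asarray(y, dtype=float)
    y_min, y_max = y.min(), y.max()
    n = math.ceil((y_max - y_min) / step) + 1   # assumed reading of the rule
    states = np.floor((y - y_min) / step).astype(int)
    return states, n


def transition_matrix(states, n):
    """Estimate P from consecutive state pairs, equations (12) and (13)."""
    counts = np.zeros((n, n))
    for i, j in zip(states[:-1], states[1:]):
        counts[i, j] += 1                        # m_ij
    P = np.full((n, n), 1.0 / n)                 # (13): rows without data
    seen = counts.sum(axis=1) > 0
    P[seen] = counts[seen] / counts[seen].sum(axis=1, keepdims=True)  # (12)
    return P


def n_step_distribution(P, current_state, t):
    """Distribution over states t steps ahead, p P^t as in (10)."""
    p0 = np.zeros(P.shape[0])
    p0[current_state] = 1.0
    return p0 @ np.linalg.matrix_power(P, t)


# Toy daily withdrawals (made-up numbers) with step 10.
toy = [10, 25, 12, 31, 27, 10, 33]
toy_states, toy_n = build_states(toy, step=10)
toy_P = transition_matrix(toy_states, toy_n)
two_days_ahead = n_step_distribution(toy_P, toy_states[-1], t=2)
```

For a weekly variant along the lines of the text, the same functions could be applied to each of the seven sub-series `y[d::7]`, one per day of the week.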
Once the state system has been defined and the matrix of transition probabilities has been built, it is possible to predict which state the process will be in over time. If $y_t$ is the last known value of cash withdrawals on day $t$, and $p^{(i)}$ is the current probability vector, the probability distribution over the chain states at the moment $i + t$ is calculated according to the known formula:

$$p^{(i+t)} = p^{(i)} P^{t}. \tag{16}$$

Based on this approach, two models were built: the first considers day-to-day transitions between states, and the second week-to-week transitions. In the latter case, the original series was divided into 7 series, one for each day of the week, and the whole procedure was repeated for each of them. The second approach is expected to predict the seasonal component of the series better.

Combining the Holt-Winters and the Markov chain models

Once several models have been built, it is necessary to decide how to combine them. Given a time series $y_1, \dots, y_t$ of daily ATM cash withdrawals, suppose there are $k$ forecast models and $\hat{y}_{j,t+d}$ is the forecast of the $j$-th model for $d$ days ahead.

First, a criterion is needed for rating the models at each step. In this study, weight coefficients are used to evaluate how well each model performs. Let $w_{jt}$ be the weight coefficient of the $j$-th model on day $t$. One way of choosing the coefficients $w_{jt}$ is adaptive weight selection [8]:

$$w_{jt} = \frac{(\tilde{\varepsilon}_{jt})^{-1}}{\sum_{s=1}^{k} (\tilde{\varepsilon}_{st})^{-1}}, \tag{17}$$

where $\varepsilon_{jt}$ is the absolute forecast error of the $j$-th model on day $t$:

$$\varepsilon_{jt} = |y_t - \hat{y}_{jt}|, \tag{18}$$

and $\tilde{\varepsilon}_{jt}$ is the exponentially smoothed forecast error of the $j$-th model on day $t$:

$$\tilde{\varepsilon}_{jt} = \gamma\, \varepsilon_{jt} + (1 - \gamma)\, \tilde{\varepsilon}_{j,t-1}, \tag{19}$$

where $\gamma$ is a smoothing constant (distinct from the seasonal smoothing constant in the Holt-Winters model). The parameter $\gamma$ has to be selected; the recommended range is $0.01 \le \gamma \le 0.1$ [8].

The point of the weight coefficient is that the smaller the smoothed error $\tilde{\varepsilon}_{jt}$, the more weight the $j$-th model has on day $t$.
Thus, the coefficients are non-negative, and their sum over all models equals one on any day $t$:

$$w_{jt} \ge 0, \tag{20}$$
$$\sum_{j=1}^{k} w_{jt} = 1, \quad \forall t. \tag{21}$$

In addition to the training and test sets, a validation sample was set aside to implement the combined model. In this paper, its length was chosen to be 42 days. Given this length of the validation sample, the study settled on $\gamma = 0.02$. The validation sample is used to test the models: the weaknesses and strengths of each model are determined, and the weight coefficients $w_{jt}$ are calculated on this basis. Each weight depends on how adequately the model handles this data and shows what contribution the model should make to the forecast.
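The adaptive weighting of equations (17)-(21) can be sketched as follows, assuming that the error in (18) is taken in absolute value (otherwise the weights could turn negative) and that the smoothed error starts from the first observed error. Both points, along with the function names and the toy numbers, are assumptions made for this illustration; the text does not state them.

```python
import numpy as np


def adaptive_weights(actual, forecasts, gamma=0.02, eps=1e-12):
    """Weights w_jt from exponentially smoothed absolute errors.

    actual    : shape (T,)   observed values on the validation days
    forecasts : shape (k, T) forecasts of k models for the same days
    Returns an array of shape (k, T); each column sums to one.
    """
    actual = np.asarray(actual, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    abs_err = np.abs(actual - forecasts)                       # (18)
    smoothed = np.empty_like(abs_err)
    smoothed[:, 0] = abs_err[:, 0]                             # assumed start
    for t in range(1, abs_err.shape[1]):
        smoothed[:, t] = (gamma * abs_err[:, t]
                          + (1 - gamma) * smoothed[:, t - 1])  # (19)
    inverse = 1.0 / (smoothed + eps)       # eps guards against zero error
    return inverse / inverse.sum(axis=0, keepdims=True)        # (17)


def combine(forecasts, weights):
    """Weighted sum of the k model forecasts with one weight per model."""
    return np.asarray(weights, dtype=float) @ np.asarray(forecasts, dtype=float)


# Toy validation data (made-up): model 0 is off by 1, model 1 is off by 3.
observed = np.array([10.0, 10.0, 10.0, 10.0])
model_forecasts = np.array([[11.0, 11.0, 11.0, 11.0],
                            [13.0, 13.0, 13.0, 13.0]])
w = adaptive_weights(observed, model_forecasts)
final_weights = w[:, -1]                   # weights after the last day
blended = combine(model_forecasts, final_weights)
```

In this toy case the model with one third of the error receives three times the weight, which is the behaviour the inverse-error rule is meant to produce.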

RESULTS

Every time series in the statistics used has a length of 244 days (8 months). The first 181 values were taken as the training sample, values 182 to 223 as the validation sample, and values 224 to 244 as the test set. Thus, the forecasting horizon is 21 days.

The Mean Absolute Percentage Error (MAPE) was used to estimate the forecast accuracy:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{|y_i|}. \tag{22}$$

This measure is widely used in similar studies to assess the quality of a forecast model, and in such studies it ranges, on average, from 20% to 45% [14]. MAPE expresses the size of the forecast errors relative to the actual series values. It can also be used to compare the accuracy of the same or different methods on two entirely different series.

The test results of the models are given in Table 1 (Appendix A). The models in the table are denoted by the following abbreviations: AHW is the Holt-Winters additive model; MCD is the Markov chain-based model per day; MCW is the Markov chain-based model per week. Column HW + MCD + MCW corresponds to the combined model. Hybrid-model errors that are lower than the corresponding error of the Holt-Winters additive method are highlighted in bold for every ATM.

The average MAPE of the Holt-Winters additive method is 40.45%. The model tries to identify a trend and seasonal variations for the specified seasonal period and builds a forecast on that basis. However, because the original time series often lack a clearly pronounced seasonal component, the model does not always reproduce the seasonal variations in the test set accurately. In general, the MAPE value is quite high.

For the remaining models, the table reports the minimum, maximum and mean MAPE, because the daily and weekly Markov chain models, and hence their combination with the Holt-Winters model, are nondeterministic.
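The data split and the accuracy measure (22) described above can be sketched as follows. The function names are hypothetical, and the factor of 100 simply expresses the result in percent, as in Table 1. Note that MAPE is undefined when an actual value is zero, so days without withdrawals would need separate handling; the text does not say how such days were treated.

```python
import numpy as np


def split_series(y, n_train=181, n_valid=42):
    """Split a series into training, validation and test parts in time order."""
    y = np.asarray(y, dtype=float)
    return y[:n_train], y[n_train:n_train + n_valid], y[n_train + n_valid:]


def mape(actual, predicted):
    """Mean absolute percentage error (22), returned in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted) / np.abs(actual)) * 100.0)


# Placeholder series of the same length as the ATM data (values are arbitrary).
series = np.arange(1.0, 245.0)
train, valid, test = split_series(series)
example_error = mape([100.0, 200.0], [110.0, 180.0])
```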
As the table shows, most of the approaches based on Markov chains (per day and per week) perform worse than the Holt-Winters method. Analysis of the Markov chain models showed that their forecasts have a greater dispersion than those of the Holt-Winters method, so large forecast errors occur more often. Nevertheless, the mechanism of the Markov chain models makes it possible to predict some patterns of the original time series (for example, peaks or troughs) to a good approximation.

The combination of the two methods under consideration gives a better result in some cases. In 8 cases, the mean MAPE is lower than the corresponding error of the Holt-Winters method, and for 4 ATMs even the maximum MAPE is lower than the Holt-Winters error. The average MAPE of the hybrid model is 35.72%. However, this value is still quite high.

It is also worth noting that moving to the model combination reduces the nondeterminism of the forecasting algorithm. The mean difference between the best and the worst forecast of the hybrid model is 9.31% (against 28.37% and 26.09% for the Markov chain models per day and per week, respectively).

A series of graphs illustrating the forecast built by each model for one of the ATMs is given below. A thin dotted line in the graphs shows the raw data of the test set. The thick blue line corresponds to the forecast.

Picture 1: Forecast using the Holt-Winters additive model
Picture 2: Model forecast based on the Markov chain per day
Picture 3: Model forecast based on the Markov chain per week
Picture 4: Forecast of the combined model

CONCLUSION

Two different approaches to forecasting ATM demand were considered in this paper: the Holt-Winters additive method and a Markov chain-based method. The first method has long been widely used for forecasting time series. The second approach is less traditional; it treats cash withdrawal as a random process. Using the machinery of Markov chain theory, two models were built, the first of which considers day-to-day transitions between states, and the second week-to-week transitions. Although its results are less accurate than those of the Holt-Winters additive method, this model makes it possible to predict some patterns of the original time series.

A hybrid model was built on the basis of both approaches. Testing has shown that combining the models with weight coefficients gives, in some cases, a better result than a single approach. Despite this, high forecast accuracy could not be achieved: the current average MAPE of 35.72% is still high.

Further work is planned to build a more complex Markov chain model to increase the forecast accuracy, taking into account not only the daily and weekly state transitions; to determine the behavior of the Markov chain model using special mechanisms; and to make the weight system used in the method combination more elaborate, that is, to use a more complex combination of weights instead of one weight coefficient per model.

REFERENCES

[1] D. Prytin, Lead underwriters according to the number of ATMs as of July 1, 2013 [electronic resource] // RBC.Rating, 01.07.2013. URL: http://rating.rbc.ru/article.shtml?2013/08/27/34016229 (reference date: 11.04.2016).
[2] S. D. Teddy and S. K. Ng, Forecasting ATM cash demands using a local learning model of cerebellar associative memory network, International Journal of Forecasting, Vol. 27, No. 3, 2011, pp. 760-776.
[3] K. Venkatesh et al., Cash demand forecasting in ATMs by clustering and neural networks, European Journal of Operational Research, Vol. 232, No. 2, 2014, pp. 383-392.
[4] Y. Ekinci, J. C. Lu and E. Duman, Optimization of ATM cash replenishment with group-demand forecasts, Expert Systems with Applications, Vol. 42, No. 7, 2015, pp. 3480-3490.
[5] L. F. Tratar, Forecasting method for noisy demand, International Journal of Production Economics, Vol. 161, pp. 64-73.
[6] G. Sudheer and A. Suseelatha, Short term load forecasting using wavelet transform combined with Holt Winters and weighted nearest neighbor models, International Journal of Electrical Power & Energy Systems, Vol. 64, 2015, pp. 340-346.
[7] A. M. De Livera, R. J. Hyndman and R. D. Snyder, Forecasting time series with complex seasonal patterns using exponential smoothing, Journal of the American Statistical Association, Vol. 106, No. 496, 2011, pp. 1513-1527.
[8] U. P. Lukashin, Adaptive methods of short-term forecasting, Statistics, 1979.
[9] M. Rezaeianzadeh, A. Stein and J. P. Cox, Drought Forecasting using Markov Chain Model and Artificial Neural Networks, Water Resources Management, 2016, pp. 1-15.
[10] S. M. Martinez et al., Wind Power Forecast Error Probabilistic Model Using Markov Chains, Environment, Energy and Climate Change II, Springer International Publishing, 2014, pp. 55-70.
[11] G. Sudheer and A. Suseelatha, Short term load forecasting using wavelet transform combined with Holt Winters and weighted nearest neighbor models, International Journal of Electrical Power & Energy Systems, Vol. 64, 2015, pp. 340-346.
[12] O. Claveria, E. Monte and S. Torra, Combination forecasts of tourism demand with machine learning models, Applied Economics Letters, 2015, pp. 1-4.
[13] D. Hank, A. D. Writes and D. U. Wichern, Business forecasting, Williams, 2003.
[14] Y. Ekinci, J. C. Lu and E. Duman, Optimization of ATM cash replenishment with group-demand forecasts, Expert Systems with Applications, Vol. 42, No. 7, 2015, pp. 3480-3490.

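APPENDIX A

Table 1: The Forecast Errors (MAPE) For Different ATMs. All values are in percent. Hybrid errors below the AHW error of the same ATM are in bold.

| ATM | AHW | MCD Min | MCD Max | MCD Mean | MCW Min | MCW Max | MCW Mean | HW + MCD + MCW Min | HW + MCD + MCW Max | HW + MCD + MCW Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 56.8 | 46.9 | 60.9 | 54.8 | 34.3 | 63.7 | 42.0 | **39.1** | **47.4** | **42.1** |
| 2 | 25.2 | 29.9 | 42.5 | 35.6 | 44.3 | 61.9 | 53.9 | **20.6** | 27.0 | **23.5** |
| 3 | 32.3 | 41.2 | 54.6 | 49.2 | 43.5 | 75.2 | 57.5 | **28.6** | 34.5 | **31.7** |
| 4 | 38.7 | 35.7 | 55.9 | 46.3 | 58.5 | 74.8 | 63.5 | **38.6** | 44.5 | 40.4 |
| 5 | 46.7 | 37.9 | 55.3 | 45.0 | 37.1 | 62.1 | 53.9 | **32.7** | **42.0** | **38.0** |
| 6 | 47.0 | 32.4 | 58.3 | 43.5 | 50.4 | 75.3 | 60.6 | **30.8** | **41.2** | **34.7** |
| 7 | 25.2 | 23.8 | 39.4 | 30.4 | 36.4 | 51.3 | 45.2 | **20.1** | 28.1 | **24.4** |
| 8 | 48.7 | 63.7 | 96.1 | 75.3 | 63.6 | 121.3 | 86.2 | **46.1** | 60.4 | 50.5 |
| 9 | 20.5 | 44.1 | 61.1 | 50.7 | 35.7 | 61.3 | 41.7 | **17.9** | 27.0 | 22.6 |
| 10 | 56.5 | 38.8 | 106.1 | 63.0 | 46.9 | 63.2 | 57.2 | **36.4** | **48.4** | **39.8** |
| 11 | 47.4 | 46.6 | 100.9 | 66.9 | 57.3 | 84.9 | 68.4 | **40.8** | 53.6 | **45.3** |
| Average | 40.45 | 40.09 | 68.46 | 51.64 | 46.18 | 72.27 | 57.28 | 31.97 | 41.28 | 35.72 |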
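Several summary figures quoted in the Results section can be re-derived from Table 1. The short script below is a worked check, not part of the original study: it copies the AHW and hybrid columns from the table and recomputes the AHW average, the hybrid average, the number of ATMs where the hybrid mean or maximum beats AHW, and the mean gap between the worst and best hybrid forecast. The variable names are illustrative only.

```python
import numpy as np

# MAPE values (%) copied from Table 1, one entry per ATM (1..11).
ahw = np.array([56.8, 25.2, 32.3, 38.7, 46.7, 47.0,
                25.2, 48.7, 20.5, 56.5, 47.4])
hyb_min = np.array([39.1, 20.6, 28.6, 38.6, 32.7, 30.8,
                    20.1, 46.1, 17.9, 36.4, 40.8])
hyb_max = np.array([47.4, 27.0, 34.5, 44.5, 42.0, 41.2,
                    28.1, 60.4, 27.0, 48.4, 53.6])
hyb_mean = np.array([42.1, 23.5, 31.7, 40.4, 38.0, 34.7,
                     24.4, 50.5, 22.6, 39.8, 45.3])

ahw_average = ahw.mean()                      # average AHW error
hybrid_average = hyb_mean.mean()              # average hybrid (mean) error
mean_wins = int((hyb_mean < ahw).sum())       # ATMs where hybrid mean < AHW
max_wins = int((hyb_max < ahw).sum())         # ATMs where hybrid max < AHW
hybrid_spread = (hyb_max - hyb_min).mean()    # mean worst-minus-best gap

print(f"AHW average: {ahw_average:.2f}")
print(f"Hybrid average: {hybrid_average:.2f}")
print(f"Hybrid mean below AHW for {mean_wins} ATMs, max below AHW for {max_wins}")
print(f"Mean hybrid spread: {hybrid_spread:.2f}")
```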