Ilmenauer Beiträge zur Wirtschaftsinformatik. Herausgegeben von U. Bankhofer; P. Gmilkowsky; V. Nissen und D. Stelzer

Size: px
Start display at page:

Download "Ilmenauer Beiträge zur Wirtschaftsinformatik. Herausgegeben von U. Bankhofer; P. Gmilkowsky; V. Nissen und D. Stelzer"

Transcription

1 Ilmenauer Beiträge zur Wirtschaftsinformatik Herausgegeben von U. Bankhofer; P. Gmilkowsky; V. Nissen und D. Stelzer Automatic Forecast Model Selection in SAP Business Information Warehouse BPS an Empirical Investigation Technical Report No. -, July Technische Universität Ilmenau Fakultät für Wirtschaftswissenschaften Institut für Wirtschaftsinformatik

2 Autor: Titel: Automatic Forecast Model Selection in SAP Business Information Warehouse BPS an Empirical Investigation Ilmenauer Beiträge zur Wirtschaftsinformatik No. -, Technical University of Ilmenau, July ISSN ISBN Institut für Wirtschaftsinformatik, TU Ilmenau Anschrift: Technische Universität Ilmenau, Fakultät für Wirtschaftswissenschaften, Institut für Wirtschaftsinformatik, PF 155, D-988 Ilmenau. - i -

3 Table of Contents 1 Introduction... 1 Methodology Results Initial Experiments Additional Experiments with Wrong Seasonal Factors... 7 Discussion Conclusion... 1 References... 1 Appendix: Results from Section 3.1 in Detail ii

4 Abstract: The feature of automatic model selection is empirically investigated in the context of SAP BW Business Planning & Simulation (BPS). The automatic model selection is a software feature routinely used by many companies in their ing, for instance in sales planning. Our empirical tests are based on data for time series showing trend, seasonal and trend-seasonal patterns. We use this data first in its 'pure', unaltered form. Afterwards, we add varying amounts of normally-distributed noise to see how this affects the quality of model selection and the ing result. We, thereby, mimic real-time data quality which very seldom comes in the form of undisturbed trends, seasonal or other patterns. Our findings indicate that automatic model selection should be used with care, and that some planners should reconsider their rather unreflective attitude towards automatic model selection in practical ing. Key Words: Data Warehouse, SAP Business Information Warehouse, Forecasting, Noise iii

5 1 Introduction Various SAP tools offer ing methods as part of their business functionality. Examples include SAP R/3 TM, SAP Advanced Planner & Optimizer (APO TM ), and SAP Business Information Warehouse (BW TM ). The ing component examined here is the automatic model selection in SAP BW Business Planning & Simulation (BPS). The automatic model selection fits a model to the available historical data while minimizing some error measure, which is Akaike s Information Criterion in the case of SAP BW [AkPa1998, SAP]. It is a software feature routinely used by many companies in their ing, for instance in sales planning. At first sight, this feature seems to relieve the planner from choosing an adequate statistical model for historical data himself, which can be a timeconsuming and difficult task. Because it is so often applied, one should aim for a good understanding of the strengths and weaknesses of automatic model selection in order to make a well-informed decision when to use it. The experiments outlined briefly in this paper are a contribution to this goal. Methodology Three basic functions were used in our ing experiments performed with SAP BW BPS release 3.5. The empirical tests are based on time series data showing trend, seasonal and trend-seasonal patterns. The total horizon were 9 of which 8 were treated as historical data and the other 8 were the planning horizon. Figure 1 gives an illustration of the time series patterns that were used. The data was initially used in an ideal deterministic pattern which should provide a rather trivial task for automatic model selection. Thereafter, varying amounts of normally-distributed noise are added to the historical data in order to see how this affects the quality of model selection and the ing result. The additive noise component mimics real-life data quality which can be characterized as a stochastic influence on trend, seasonal or other patterns. Practically, each historical data point x of the basic functions was modified in the following way: 1

6 (,1) ) x = x + a N, where N(,1) represents a standard-normally distributed random number and a is a noise factor that was varied according to a = [ ; 1 ; ; 5 ]. Thus, we conducted four sets of ing experiments for all functions. To remove arbitrary results, at each noise level and for each function 3 runs with different random number seeds for the time series modification were performed. (1) Trend Model base (history) 1 8 (ideal curve) () Seasonal Model (symmetric) 7 base (history) (ideal curve) (3) Trend-Seasonal Model (symmetric) base (history) (ideal curve) Fig. 1: The basic functions discussed here 3 Results 3.1 Initial Experiments Results are presented in the form of graphs and tables for each function and noise-level. The figures refer to the first run performed in each individual experiment to give a visual impression of the ing results. 1 The tables report on data averaged over all 3 runs for each experiment. The mean squared error (MSE) and the mean absolute percentage 1 It should be noted that a figure giving aggregated results is not sensible here.

7 error (MAPE) are employed as standard error measures. Furthermore, the percentage of correct models chosen by the automatic model selection is given in the tables. Set 1: Forecasting without noise (noise factor a = ) (1) Trend Model base (history) 1 8 () Seasonal Model 7 base (history) (3) Trend-Seasonal Model 18 base (history) Fig. : Results from the first run in each experiment Function MSE MAPE (%) correct model (%) (1) trend,, 1, () seasonal,, 1, (3) trend-seasonal,, 1, Table 1: Forecast results with original historical data (noise level = ), avg. from 3 runs Without noise added, the automatic model selection chose the correct model in every case. 3

8 Set : Forecasting with noise factor a = 1 (1) Trend Model 1 1 base (history) original function ht P b i () Seasonal Model 1 base (history) 8 - original function (3) Trend-Seasonal Model base (history) original function Fig. 3: Results from the first run in each experiment Function MSE (min, max) MAPE (%) (min, max) correct model (%) (1) trend 3,1 (, ;,) 8,5 (,5 ; 3,5) 8 () seasonal, (,1 ;,3) 19, (8,9 ; 3,) 9 (3) trend-seasonal,7 (, ; 1,3) 5,7 (,8 ; 1,) 1 Table : Forecast results with modified historical data (noise level = 1), avg. from 3 runs For the trend function, the system chose times the trend model, 5 times a trend-seasonal and once a constant model. For the seasonal function, the system chose a seasonal model 7 times and a trend-seasonal model 3 times. For the trend-seasonal function, the trendseasonal model was chosen in each case.

9 Set 3: Forecasting with noise factor a = 1 (1) Trend Model base (history) 1 1 () Seasonal Model base (history) original function - - original function 15 (3) Trend-Seasonal Model base (history) original function Fig. : Results from the first run in each experiment Function MSE (min, max) MAPE (%) (min, max) correct model (%) (1) trend 19,3 (, ; 1,3) 5, (1,5 ; 73,8) () seasonal 1,1 (,5 ;,) 33, (19, ; 9,9) 1 (3) trend-seasonal 5,5 (,7 ; 31,9) 1, (5,5 ;,) 8 Table 3: Forecast results with modified historical data (noise level = ), avg. from 3 runs For the trend function, the system chose 1 times a trend, times a trend-seasonal and 1 times a constant model. For the seasonal function, the system chose a seasonal model in all cases. For the trend-seasonal function, the trend-seasonal model was chosen times (with times the correct additive model and two times a multiplicative model), while a seasonal and a constant model where both chosen in two cases each. 5

10 Set : Forecasting with noise factor a = (1) Trend- Model base (history) original function () Seasonal Model base (history) original function (3) Trend-Seasonal Model base (history) original function Fig. 5: Results from the first run in each experiment Function MSE (min, max) MAPE (%) (min, max) correct model (%) (1) trend, (,5 ; 18,), (1, ; 9,3) () seasonal 5,1 (,9 ; 15,) 78, (55,5 ; 1,) 7 (3) trend-seasonal,3 (,5 ; 8,3) 35,5 (11,9 ; 7,5) 7 Table : Forecast results with modified historical data (noise level = 5), avg. from 3 runs For the trend function, the system chose times a trend, once a trend-seasonal, once a seasonal and times a constant model. For the seasonal function, the system chose a seasonal model 8 times and a constant model times. For the trend-seasonal function, the trend-seasonal model was chosen in two cases, a seasonal model was chosen times, a trend model 7 times, and a constant model 17 times.

11 3. Additional Experiments with Wrong Seasonal Factors In addition to the basic experiments described above, we wanted to investigate the influence of the seasonal factor in the seasonal ing models. The seasonal factor must be manually customized before using the respective models. This also applies for automatic model selection. Therefore, it is considered interesting to see how wrong choice of the seasonal factor influences quality. Two new asymmetric functions with seasonal patterns were introduced for this purpose which are depicted in figure. While the length of a season is 1 in the symmetric seasonal function used in section 3.1, the length of a season is 1 in both of the functions used here. However, the seasonal length was deliberately kept to 1 in customizing when the experiments with automatic model selection described below were conducted (b) Seasonal Model (asymmetric front) base (history) (ideal curve) 5 (c) Seasonal Model (asymmetric end) base -5 (history) (ideal curve) -1 Fig. : Seasonal function used in additional series of experiments Set 5: Forecasting with noise factor a = and wrong seasonal factor 1 (instead of 1) (b) Seasonal Model (asymmetric front) base (history) (c) Seasonal Model (asymmetric end) base (history) Fig. 7: Results from the first run in each experiment 7

12 Function MSE MAPE (%) correct model (%) (a) seasonal, symmetric (original) (b) seasonal, asymmetric front (c) seasonal, asymmetric end 1 8,8, 77, 85, Table 5: Forecast results with original historical data (noise level =, seasonal factor 1 instead of 1), avg. from 3 runs It suffices to conduct the experiments for the case without noise to recognize the dramatic influence of setting the seasonal factor correctly. While in the case of function (b) the automatic model selection frequently chooses the constant model, it is the negative trend model that is chosen for function (c). Discussion The performance of the automatic model selection is optimal when no noise is added to the underlying patterns. However, this situation never occurs in practical business ing as stochastic influence in historical data is abundant. On a general level, the correctness of the automatic model selection with respect to the underlying function pattern and the quality deteriorates with the amount of noise a result that can be expected. A closer look at the different noise levels and functions, though, reveals some interesting details. When the underlying pattern is a trend, the relative performance appears to be the worst. Even with the low noise factor a = 1, in % of the test cases the wrong model was chosen by the automatic model selection. Moreover, the error measures MSE and MAPE each display results within a wide interval. The maximum MAPE was as high as 3,5 % even at this comparatively low level of noise. This may be interpreted as a significant risk for a bad automatic. The situation is aggravated at higher noise levels. 8

13 To a somewhat lesser degree the same applies to the trend-seasonal function, where we find a wide range of MAPE- and MSE-values at the noise level a = and higher. Here, two MAPE-results higher than % are found, and in three out of 3 cases the MSE is higher than 5, thus again indicating a performance risk for a purely technical ing approach. When analyzing the performance for the seasonal function, one should bare in mind that the MAPE-value depends on the underlying range of correct function values. As these values on average are smaller for the seasonal than for the trend and the trend-seasonal function, a high MAPE should not be overinterpreted in the seasonal case. The data indicates that the MAPE-error is particularly high at lower turning points of the curve. For the seasonal function, the MSE gives a more reliable account of quality. Here, the performance appears reasonable up to a noise factor of a =. However, it is worth mentioning that the performance in the seasonal (and the trendseasonal) case very much depends on the customized seasonal length for the planning object in SAP BW. Our experiments with slightly wrong seasonal length factors in section 3. led to very bad results. Even without noise the automatic model selection then generally does not find the correct model. The necessity to enter the correct seasonal length in advance of the limits the practical value of any ing tool. This is not an SAP-specific problem, though. One should also consider that in business practice the length of a season may vary over time and often differs for different planning objects, once more demonstrating the necessity to closely monitor and supervise automated planning. For any pattern, the automatic model selection generally fails at the high noise factor a = 5, both in terms of the model chosen and the quality achieved. Frequently, the automatic model selection chooses the constant model in this situation as no particular pattern can be identified by the system. Great errors at all noise levels often occur when the automatic model selection chooses a model that does not correspond to the true underlying patterns in the historical data. When classifying this choice as wrong prior knowledge of the original function is required that the system does not have. From a purely statistical point of view, based on the available historical data in the individual run, the system s choice is therefore justified. This demonstrates the limits of a technical approach to ing when really business 9

14 knowledge is necessary to help deciding the right model, tune it s parameters or manually adapt the results. Many companies routinely use automatic model selection, for instance in sales planning, where the number of planning objects is very high. The argument is often that manual planning requires a high effort. A lack of statistical knowledge is another important reason why planers turn to automatic model selection instead of analyzing the data themselves. Our results remind us that a blind trust in the results of a tool can lead to suboptimal performance. 5 Conclusion The results presented here may be interpreted as a warning for practical planners not to automate where their individual business knowledge can help to improve the. Automatic model selection is useful where the variance of historical data around clear statistical patterns is relatively small and the importance of the results is not exceptionally high. In all other cases, the planers knowledge and skills are important input in ing. This suggests that some company planners should reconsider their rather unreflective attitude towards automatic model selection in practical ing. References [AkPa1998] Akaike, H.; Parzen, E.: Selected Papers of Akaike Hirotugu. Springer, New York, [SAP] SAP AG: SAP Help Portal: SAP BW BPS Automatic Model Selection, (1-18-) 1

15 Appendix: Results from Section 3.1 in Detail Results for noise factor a = 1 Run Trend Pattern MSE MAPE Model Selected 1 1,59,91 Trend,39 1,39 Trend 3,973 5,7 Trend-Saisonal(add) 5,585 1,7 Trend 5,575 5,55 Trend 11,1895,15 Trend 7,8979,5 Trend 8 1,7397 8,15 Trend 9,839,87 Trend 1 3,9793 3,51 Trend 11 7, , Trend 1,598,57 Trend 13,8,13 Trend-Saisonal(add) 1,117,73 Trend 15,38939,15 Trend 1,9957, Trend 17 5,833 15,8 Trend 18,7 1,3 Trend 19 3,57 13,5 Trend,181 1, Trend 1,737 1,33 Trend,8735,17 Trend 3,75 5,71 Trend-Saisonal(add),3189,3 Trend 5,1591 1,95 Trend 1,885 9,81 Trend-Saisonal(add) 7,31 5,1 Trend-Saisonal(add) 8 1,33 8,18 Trend 9 1,533,733 Konstant 3,7515 1, Trend 11

16 Run Seasonal Pattern MSE MAPE Model Selected 1 1,551 3,199 Trend Saisonal (add.),831 1, Saisonal 3,151 15,313 Saisonal,375 17,37 Saisonal 5,318 3,3 Trend Saisonal (add.),71 19,395 Saisonal 7,1998 9,17 Saisonal 8,93 17,57 Saisonal 9,1317 1, Saisonal 1,357 18,188 Saisonal 11,5859 1,988 Saisonal 1,33 1,8 Saisonal 13,7 1,91 Saisonal 1, 15,33 Saisonal 15,5378 7,93 Saisonal 1 1,97,35 Trend Saisonal (add.) 17,119 1,8 Saisonal 18,313 19,918 Saisonal 19, ,19 Saisonal,8 13,19 Saisonal 1, ,19 Saisonal,339 18,578 Saisonal 3,119 8,9 Saisonal, ,787 Saisonal 5,1978 1,33 Saisonal,515 31,919 Saisonal 7,119 13,59 Saisonal 8,335 19,3 Saisonal 9,99 17,53 Saisonal 3,53 17,85 Saisonal 1

17 Run Trend Seasonal Pattern MSE MAPE Model Selected 1,57,11 Trend-Saisonal,513,853 Trend-Saisonal 3,9871 7,77 Trend-Saisonal 1,13 8,7 Trend-Saisonal 5,595 5,315 Trend-Saisonal,1751,759 Trend-Saisonal 7 1,1 7,78 Trend-Saisonal 8,778 3,8 Trend-Saisonal 9,53,95 Trend-Saisonal 1 1, ,135 Trend-Saisonal 11,11 3,77 Trend-Saisonal 1,957,5 Trend-Saisonal 13,31833,1 Trend-Saisonal 1,7,81 Trend-Saisonal 15,3919,5 Trend-Saisonal 1,77 3,8 Trend-Saisonal 17,13 5,113 Trend-Saisonal 18,3 5,53 Trend-Saisonal 19,383 3,19 Trend-Saisonal 1,319 8,53 Trend-Saisonal 1,,7 Trend-Saisonal,17 3,391 Trend-Saisonal 3,153,91 Trend-Saisonal,9595 5,33 Trend-Saisonal 5,311, Trend-Saisonal,855 7,9 Trend-Saisonal 7 1,33 8,15 Trend-Saisonal 8,383,7 Trend-Saisonal 9, ,593 Trend-Saisonal 3 1,355 8,98 Trend-Saisonal 13

18 Results for noise factor a = Run Trend Pattern MSE MAPE Model Selected 1 1,81 7,3 Trend 13,17 1,58 Konstant 3 1, ,7 Trend 31,319 38,8 Konstant 5 31,883 38,79 Konstant,3937 8,93 Konstant 7,1 3,11 Konstant 8,998 33,191 Konstant 9 3,73 13,15 Trend 1 33,757,57 Konstant 11 9, ,9 Trend 1,1,98 Trend 13 1,87381,88 Trend-Saisonal (mult.) 1, 1,3 Trend 15, ,1 Konstant 1,8 3,51 Konstant 17 3,3333 3,55 Trend 18 7,89 35,7 Konstant 19 1,5858 7,5 Trend,7591,171 Trend 1 3,51 31,913 Konstant 3,517 1,31 Trend 3, ,55 Trend-Saisonal(add) 1,311 8,598 Trend 5 7,183 35,5 Konstant 7, ,51 Trend-Saisonal(add) 7,587 11, Trend-Saisonal(add) 8 5,371 17,1 Trend 9 15,998 3,58 Konstant 3 1,381 9,5 Konstant 1

19 Run Seasonal Pattern MSE MAPE Model Selected 1,977 3,339 Saisonal 1,331, Saisonal 3,831,593 Saisonal,9137 3,188 Saisonal 5,773,57 Saisonal 1, ,179 Saisonal 7, ,181 Saisonal 8 1,9 37,7 Saisonal 9, ,5 Saisonal 1,79 7,35 Saisonal 11 1,99 33,89 Saisonal 1 1,517,988 Saisonal 13,391 33,59 Saisonal 1 1,85 37,3 Saisonal 15 1,98 3,71 Saisonal 1,9855 7,78 Saisonal 17,8878 8,835 Saisonal 18 1,35 3,7 Saisonal 19 1,5199 9,881 Saisonal,9875 5,9 Saisonal 1 1,7778 3,8 Saisonal 1, ,9 Saisonal 3 1,7 3,83 Saisonal 1,1353 3,153 Saisonal 5,7,8 Saisonal 1,53 3,9 Saisonal 7 1,7155 5,8 Saisonal 8 1,13 3,13 Saisonal 9,553,31 Saisonal 3 1,5 37,77 Saisonal 15

20 Run Trend Seasonal Pattern MSE MAPE Model Selected 1,383 1,8 Trend-Saisonal,31 1,1 Trend-Saisonal 3 1,7957 1,3 Trend-Saisonal, ,15 Trend-Saisonal 5 31,815 1,985 Konstant,8157 5,515 Trend-Saisonal 7,873 15,51 Trend-Saisonal 8 1,9975 7,735 Trend-Saisonal 9,188 9,9 Trend-Saisonal 1,39 19,19 Trend-Saisonal (mult.) 11 1,58 7,15 Trend-Saisonal 1, ,13 Trend-Saisonal 13 1,773 8,119 Trend-Saisonal 1 1,7875 9,3 Trend-Saisonal 15, ,71 Trend-Saisonal (mult.) 1 1,9818,935 Trend-Saisonal 17,5559 1,5 Trend-Saisonal 18,3 9,91 Trend-Saisonal 19,9389 7,3 Trend-Saisonal 5,8 17,85 Trend-Saisonal 1 1,13 9,33 Trend-Saisonal,87,3 Trend-Saisonal 3 7,5 1,113 Saisonal 15,15981,38 Konstant 5 1,375 9, Trend-Saisonal 3,351 1,19 Trend-Saisonal 7 5,181 1,83 Trend-Saisonal 8 1,57 8,99 Trend-Saisonal 9 5,9 39,1 Saisonal 3 1,73 7,8 Trend-Saisonal 1

21 Results for noise factor a = 5 Run Trend Pattern MSE MAPE Model Selected 1,387 17,8 Trend 5,577 33,9 Konstant 3 55,55 9,917 Trend 37, ,8 Konstant 5 75,339 5,133 Konstant 3,38,5 Konstant 7 9, ,377 Konstant 8,58 1,593 Trend 9 1,375 7,881 Trend 1 31,5 38,7 Konstant 11 5,888 33,91 Konstant 1,913 7,8 Konstant 13,5818 8,85 Konstant 1 31,89 38,83 Konstant 15 33,111,3 Konstant 1 83,3 9,5 Konstant 17 11, ,5 Trend 18 3,185 37,8 Konstant 19 51,8 5,79 Konstant 13,553,33 TrendSaison(add) 1 18,137 9,38 Trend 3,75 39,7 Konstant 3 3,379,97 Konstant 57, ,998 Konstant 5 38,887 3,98 Konstant 3,157,3 Konstant 7 38,817,119 Konstant 8 3, ,139 Konstant 9 53,5518 5,57 Saisonal 3 1,7159,17 Konstant 17

22 Run Seasonal Pattern MSE MAPE Model Selected 1 3,55 87,8 Konstant 3,31 5,11 Konstant 3 3,9577 5,7 Konstant 3,889 57,79 Konstant 5,377 1,8 Konstant 5,773 81,13 Saisonal 7,913 7,789 Konstant 8, , Konstant 9 3, ,8 Konstant 1,837 97,7 Konstant 11 1,18 115,37 Saisonal 1 3,751 7,9 Saisonal 13 7,359 7,93 Saisonal 1 11, 7,39 Konstant 15, , Konstant 1 3,531 3,9 Konstant 17 5,875 77,1 Saisonal 18 15,3577 1, Saisonal 19 3,17 79,813 Konstant,319 8,885 Saisonal 1, ,98 Konstant 11, ,58 Konstant 3 5,113,387 Saisonal,853 55,58 Konstant 5 3,8979 5,351 Konstant 3,97 58,5 Konstant 7 3,113 79,935 Konstant 8 3, ,918 Konstant 9 3,15 9,39 Konstant 3 3, ,985 Konstant 18

23 Run Trend Seasonal Pattern MSE MAPE Model Selected 1 39,381 8,3 Konstant 17,78 8,39 Konstant 3 5,9198 5,53 Saisonal 8,15 7,85 Trend 5 35,5889 5,57 Konstant 9,77 5,31 Trend-Saisonal 7 3, ,937 Trend 8,318 13,3 Trend 9 5,91 17,8 Konstant 1 5,793 3,51 Konstant 11 19, ,71 Konstant 1 5,3985 5,5 Konstant 13 1,579 5,7 Saisonal 1,589 5,9 Saisonal 15 11,73,7 Konstant 1,338 33,88 Saisonal 17,975 5,111 Konstant 18 35,731 5,18 Trend 19,5538 1,5 Konstant 7,9779 5,31 Konstant 1 8, ,77 Konstant 8, ,3 Trend 3,5 11,89 Trend, ,5 Trend 5,5,58 Konstant 5,8 5,137 Konstant 7 3,151 3,5 Konstant 8 1,877 1, Konstant 9 9,5151,7 Trend-Saisonal 3 9,991,3 Konstant 19