Int. J. Engineering Systems Modelling and Simulation, Vol. 1, Nos. 2/3, 2009

Optimising a help desk performance at a telecommunication company

Fawaz AbdulMalek* and Ali Allahverdi
Department of Industrial and Management Systems Engineering, College of Engineering and Petroleum, Kuwait University, P.O. Box 5969, Safat, Kuwait
E-mail: abdulmalek@kuniv.edu.
E-mail: ali.allahverdi@ku.edu.kw
*Corresponding author

Abstract: One of the problems faced by a mobile telecommunication company in Kuwait is the excessive waiting time to fix PC or software related problems of its over 8 employees. We successfully developed, validated and employed a simulation model to determine the number of technicians needed at the company to minimise total cost by considering both the cost of idle time of the employees and the wages of technicians. The simulation results indicated that the total cost can be reduced significantly by hiring just one more technician.

Keywords: discrete event simulation; telecommunications; optimisation.

Reference to this paper should be made as follows: AbdulMalek, F. and Allahverdi, A. (2009) 'Optimising a help desk performance at a telecommunication company', Int. J. Engineering Systems Modelling and Simulation, Vol. 1, Nos. 2/3, pp.160–164.

Biographical notes: Fawaz AbdulMalek is an Assistant Professor in the Industrial and Management Systems Engineering Department at Kuwait University. He received his BS in Mechanical Engineering from the University of South Carolina and his MS and PhD in Industrial Engineering from the University of Pittsburgh. His areas of interest are production planning, supply chain management and simulation.

Ali Allahverdi received his BS from Istanbul Technical University and his MSc and PhD from Rensselaer Polytechnic Institute, USA. He received the Distinguished Researcher Award and Research Performance Award from Kuwait University in 2003 and 2004, respectively, and the Dissertation Prize from Rensselaer Polytechnic Institute in 1993.
He has published 75 papers in well known international journals. He is the Editor of European Journal of Industrial Engineering and has served as Guest Editor for EJOR and IJOR. He is currently serving as an Associate Editor of IJOR, CEJOR and IJAM. He is also on the Editorial Board of IJPR, JOL and JCIIE.

1 Introduction

Many companies seek to outperform their competitors by providing superior customer service. One industry where the competition is fierce is the call centre industry, in particular the mobile telecommunication industry. The application of operations research modelling (queuing, simulation, linear programming) to the call centre industry has received significant attention recently (see, for example, Duder and Rosenwein, 2001 and Mehrotra, 1997). It is important to note that the cost of providing trained agents online to answer customer enquiries accounts for over 5% of total operations costs (Duder and Rosenwein, 2001). This is because it has been recognised that a critical factor in business success is being able to respond quickly to customer requests (Whitt, 1999). Hence, the costs of staffing telephone call centres have become a significant part of business expense. Therefore, many companies have used operations research modelling to staff their call centres efficiently (Andrews and Parsons, 1989, 1993; Brigandi et al., 1994; Lin et al., 2000; Cezik et al., 2001; Tych et al., 2002).

In administering the help desk (HD) of a mobile telecommunication company in Kuwait, the management would like to determine the proper number of technicians, who are responsible for fixing PC and software related problems for over 8 employees of the company. The company seeks ways to increase its market share and to improve its services. In order to do so, the company first has to solve its internal problems, one of which is to determine the optimum number of technicians needed.
It has been found that currently an employee waits, on average, three and a half hours until the problem related to that employee's PC or software is resolved. This waiting time is excessive since it has been found that about 2% of the time an employee cannot do his/her job when the employee's PC is not functioning. Therefore, it is important to find the appropriate number of technicians in order to minimise the cost of providing service (wage of technicians) and the cost of waiting time (idle time of employees).

The problem is a queuing problem with a single queue and s servers (technicians) in parallel. After collecting an extensive amount of data for both interarrival times and service times, we found that neither interarrival times nor service times follow an exponential distribution and hence, the commonly accepted M/M/s queuing model may not be appropriate. Therefore, we used simulation to model the problem. We first validated the simulation model we developed. Then, we successfully employed it to determine the optimum number of technicians needed at the HD of the mobile telecommunication company.

Copyright © 2009 Inderscience Enterprises Ltd.

2 Problem description

The information technology (IT) department is one of the main departments at the telecommunication company. The IT department is divided into four sub departments. This research focuses on the HD division, which falls under the operations sub department. The major task of the HD is to receive technical requests from all the company's employees and try to resolve them. The requests received by the HD staff are mostly related to PC problems and applications. Typically a request is received by the staff through a computer program called Remedy. Once the request is received by the HD staff, it is treated as a case, which is assigned a date, type, description, priority and other attributes. On the other hand, if a request is received via phone, fax or e-mail, the HD staff log it into Remedy and take the required action to turn it into a case. Next, the HD staff try to solve the case. If the case is resolved, all the information related to it must be completed and the case should be closed. However, if the case cannot be resolved by the HD staff, it is assigned to another specialised team within the IT department.
Problems are assigned to other teams based on their relevance: system and database problems are sent to the data centre team; application problems, such as billing screens, are sent to the application support team; report problems are sent to the operation team; and server problems are sent to the network team.

The HD staff first try to solve the problem by phone. If the problem persists, a technician is sent to the department where the case exists. The HD technicians are involved with hardware, software and most of the network cases. If the problem is still not resolved, it is assigned to other teams.

The HD staff consists of employees, who are primarily responsible for receiving cases and logging them into Remedy, and technicians, who are accountable for solving cases for all of the company's over 8 employees. Currently, technicians receive a large number of jobs, which take significant time to solve. It was found that an employee waits, on average, three and a half hours for his/her PC problem to be fixed. This waiting time is considered very high since it has been estimated that an employee would remain idle 2% of the time when his/her PC does not function. The company is trying to increase its market share and customer satisfaction. For that to happen, the company must first solve its internal problems. Next, a simulation model is developed in order to find the number of technicians required to achieve an acceptable waiting time which minimises the total cost of the system.

3 Building the HD model

The simulation model for the HD was developed using the Arena package, which offers flexibility in modelling and analysing systems with stochastic behaviour and other realistic characteristics. All the statistical distributions used in the simulation, including call interarrival times, case process times, delay times and others, were determined by using the Input Analyser of Arena.
The Input Analyser uses three types of goodness-of-fit measures, namely mean square error, the Chi-square test and the Kolmogorov-Smirnov test. Based on these tests, the best fit was chosen (Table 1). Some of the data were obtained from the company, which had already collected a lot of data. The remaining required data were obtained through observation on site.

Table 1  Estimated process time distributions

Process                                   Process time distribution (min)
Calls interarrival time                   0.1 + WEIB(2.97, 0.947)
Answering calls                           LOGN(1.57, 2.9)
Reading Remedy                            EXPO(0.694)
Reading e-mails and faxes                 ERLA(0.629, 2)
Logging in/modifying Remedy               LOGN(1.72, 1.97)
Solving cases by phone                    LOGN(3.34, 5.41)
Solving cases by sending technicians      5 + WEIB(43.6, 0.858)

The model starts with a case arriving into the system according to the specified distribution. Upon its arrival, it goes to a decide module where it takes one of the branches labelled phone, Remedy and others, which specify the medium via which it arrived. The percentage of cases arriving by each medium was estimated by dividing the number of cases received by that medium by the total number of cases collected during the data collection phase. For example, the percentage of cases arriving by phone was calculated by dividing 253 (cases arrived by phone) by 341 (total number of cases arrived) and found to be 74.2%. All percentages used in the model can be found in Table 2.

A call has a higher priority than other media types. An incoming call first checks to see if there is a line
available; if there is, the call will seize it. Currently there are three lines, which were modelled as resources. If, on the other hand, all lines are busy, the call is disposed as a hang up call. Once a call has seized a line, it must also seize an employee. Currently there are three employees, who were also modelled as resources. If the employee is busy, for example with reading Remedy or e-mails, the case has to wait until he finishes. As mentioned previously, the phone has a higher priority than Remedy and other media types, so the employee takes the call as soon as he finishes the case he is currently handling. Note that some of the calls are questions and follow ups, so these are disposed immediately after they finish. The rest are logged into Remedy and defined as cases.

Table 2  Percentages of different processes

Process                                        Percentage (%)
Cases received via phone                       74.2
Cases received via Remedy                      22.3
Cases received via others                      3.5
Follow up calls                                59.3
New cases received via phone                   40.7
Cases to be solved by HD team                  68.6
Cases sent directly to technicians             25.2
Cases tried to be solved by phone by tech.     74.8
Cases solved by phone by tech.                 34.7
Cases failed to be solved by phone             65.3
Cases solved by technician                     95
Cases solved by other teams                    5

Next, another decide module is used to decide whether to send the case directly to other teams or to have it solved by the HD team. For cases handled by the HD team, a further decide module is used to decide whether the technician can solve the case by phone or should report to the department where the case exists. Currently there are six technicians, modelled as resources in the model. Cases solved by phone are disposed out of the system. Cases that could not be solved by technicians are sent to other teams outside the scope of the model.
Finally, cases that are solved by a technician who has to report to the department, and cases that were sent directly to other teams, are disposed out of the system. The simulation was run for an eight-hour shift for 300 days (replications). Three hundred replications were deemed sufficient according to the criteria described by Law and Kelton (2000).

4 Simulation verification and validation

Verification is the process that makes certain that the simulation model mimics the real system (Law and Kelton, 2000). The first method used to verify the model was a trace study. A careful trace study was carried out by tracing an entity from the time it is created until it is disposed from the system. The Step feature provided by Arena was used to control the execution of the model, and each entity was stepped through the different modules in the system. The trace study verified the model logic and proper system behaviour. Next, a terminating condition (number of cases out) was applied to check whether the percentages of the incoming call types are similar to the observed ones. Table 3 shows that these two sets of percentages are very close to each other. Finally, a detailed animation was used and it was verified that the model sufficiently replicated the real system.

Table 3  Comparison between percentages

Percentages    Calculated percentages    Terminated run percentages
Phone          74.2%                     71.56%
Remedy         22.3%                     24.77%
Others         3.5%                      3.67%

Table 4  Comparison between actual data collected and the results of the current model

Process                                                Actual data collected (minute)    Result of current model (minute)
Answering calls                                        1.533                             1.576
Reading Remedy                                         0.683                             0.67
Reading e-mails and faxes (others)                     1.25                              1.258
Logging and modifying cases in Remedy                  1.667                             1.749
Solving cases by phone                                 2.417                             3.36
Solving cases by sending technicians to departments    52                                53.55

Validation of the model calls for comparing outputs of the simulation to those from the actual system.
In order to check the validity of the model, we compared the average process times of the actual data collected with the results of the current model. Table 4 shows the results, from which it is clear that the results of the current model are validated by the actual data collected.
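The single-queue, s-server structure of the technician pool can be illustrated with a few lines of Python. This is not the Arena model described above; it is a minimal first-come-first-served sketch in which the function name `simulate_waits` and the exponential arrival stream in the demo are hypothetical. Only the technician visit time, 5 + WEIB(43.6, 0.858), is taken from Table 1, and the delay measured is the time until a technician starts work, not the time until the problem is fixed.

```python
import heapq
import random
from statistics import mean
from typing import Callable, List


def simulate_waits(n_servers: int,
                   interarrival: Callable[[], float],
                   service: Callable[[], float],
                   horizon: float) -> List[float]:
    """Return the queueing delay (minutes) of every case arriving up to `horizon`.

    One FIFO queue feeds `n_servers` identical servers working in parallel.
    """
    # Min-heap holding the time at which each server next becomes free.
    free_at = [0.0] * n_servers
    heapq.heapify(free_at)
    waits: List[float] = []
    t = interarrival()                      # arrival time of the first case
    while t <= horizon:
        earliest = heapq.heappop(free_at)   # server that frees up first
        start = max(t, earliest)            # service starts once both are ready
        waits.append(start - t)
        heapq.heappush(free_at, start + service())
        t += interarrival()
    return waits


if __name__ == "__main__":
    rng = random.Random(1)
    # Visit time from Table 1: 5 + WEIB(43.6, 0.858); arguments are scale, shape.
    visit = lambda: 5.0 + rng.weibullvariate(43.6, 0.858)
    # ASSUMPTION (not from the paper): exponential arrivals, mean 10 minutes.
    arrive = lambda: rng.expovariate(1 / 10.0)
    for c in range(6, 11):                  # six to ten technicians
        daily = []
        for _ in range(300):                # 300 replications of an 8-hour shift
            w = simulate_waits(c, arrive, visit, horizon=480.0)
            daily.append(mean(w) if w else 0.0)
        print(c, round(mean(daily), 2))
```

Because the samplers are passed in as plain callables, the same function can be driven by any fitted distribution; the printed figures depend entirely on the assumed arrival stream and should not be compared with the results reported for the Arena model.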
5 Experiments and results

Five different scenarios were evaluated by using the simulation model. The first scenario represents the current situation of the HD with six technicians. In each of the following four scenarios, the number of technicians is incremented by one in order to find the number of technicians needed. Two performance measures are used for comparison, namely waiting times and service cost. These measures are converted into monetary terms so that the comparison is made easier.

The average waiting time of employees until their PCs are fixed is given for each scenario in Figure 1. As expected, the average waiting time decreases as the number of technicians increases. It is clear that just by adding one more technician, the average waiting time decreases from three and a half hours to about half an hour, which is a reduction of 87%. If two, three or four more technicians are added, then a reduction of 95%, 98% and 99% is possible, respectively.

Figure 1  Average waiting times (see online version for colours)
[Chart: average waiting time (minutes) by scenario. Scenario 1: 217.85; 2: 27.24; 3: 9.47; 4: 4.9; 5: 1.96]

Figure 2 shows the average utilisation for each scenario. As expected, the utilisation decreases as the number of technicians increases. The figure also shows that just by adding one more technician, the utilisation is reduced from 95.5% to 84.9%. Of course, in order to make a decision on the number of technicians required, we have to consider both waiting time cost and service cost, which are analysed in the next section.

Figure 2  Technicians utilisation (see online version for colours)
[Chart: utilisation by scenario. Scenario 1: 0.955; 2: 0.849; 3: 0.756; 4: 0.693; 5: 0.63]

6 Cost analysis

When an employee is faced with a problem related to his/her computer, the employee sometimes cannot work and remains idle until the PC is fixed.
In other words, when an employee's PC has a problem, the employee does not necessarily remain idle the whole time until the PC is fixed. In order to find this idle time, a questionnaire was prepared and distributed to each employee in the company. It was found that, on average, the employee would be idle 30% of the time until the employee's PC is fixed. Knowing the employee idle time helps in computing the average waiting cost. The average salary of an employee was found to be $1,700 per month, while the average salary of a technician was $1,360 per month. The number of working hours per day is eight and the number of working days in a month is about 21. Therefore, the per minute average salaries of an employee and a technician are $0.169 and $0.135, respectively. Let

E(WC) = expected waiting cost per minute
E(SC) = expected service cost per minute
E(TC) = expected total cost per minute
C = number of technicians
L = average number of employees waiting

The expected waiting cost, the expected service cost and the total cost are calculated as follows:

E(WC) = $(0.3)*(0.169)*L
E(SC) = $0.135*C
E(TC) = E(WC) + E(SC)

The value of the expected number of employees waiting (L) for each C is given in Table 5. As can be seen from the table, just adding one more technician reduces the average number of waiting employees from 15 to 1.85, which is a significant reduction. Of course, we are interested in the expected total cost. It is clear from Table 5 that when both
the waiting time cost and service cost are considered, the best solution is when there are seven technicians. Recall that currently there are six technicians, so there is a need to add one more. The cost given in Table 5 is per unit time, i.e., per minute. Figure 3 illustrates the yearly expected waiting, service and total costs. It is clear that the total cost is minimised when there are seven technicians. The company can save about $65,000 yearly by just adding one more technician.

Table 5  Expected total cost

C     L        E(TC)
6     15.09    1.58
7     1.85     1.04
8     0.67     1.11
9     0.3      1.23
10    0.15     1.36

Figure 3  Expected costs (see online version for colours)
[Chart: expected yearly cost ($) versus number of technicians (6 to 10); series E(WC), E(SC) and E(TC)]

7 Conclusions

The problem of excessive waiting time until an employee's PC or software related problem is fixed in a mobile telecommunication company in Kuwait has been addressed. The company currently has over 8 employees. It was found that neither interarrival times nor service times were exponential random variables and hence, it was not appropriate to use the M/M/s queuing model. Therefore, a simulation model was developed, and the model was then verified and validated. The simulation results showed that the average waiting time can be reduced from the current three and a half hours to half an hour by hiring just one more technician. It was shown that this scenario is more economical when the costs of both waiting times and providing service are taken into account. Moreover, if the number of technicians is increased by two or three, then the average waiting time can be reduced to about ten or four minutes, respectively. Although it is unclear that more technicians are needed, it is left for the company to decide the number of technicians to be hired.

References

Andrews, B. and Parsons, H. (1989) 'L.L. Bean chooses a telephone agent scheduling system', Interfaces, Vol. 19, pp.1–9.
Andrews, B. and Parsons, H. (1993) 'Establishing telephone-agent staffing levels through economic optimization', Interfaces, Vol. 23, pp.14–20.
Brigandi, A.J., Dargon, D.R., Sheehan, M.J. and Spencer, T. (1994) 'AT&T's call processing simulator (CAPS) operational design for inbound call centers', Interfaces, Vol. 24, pp.6–28.
Çezik, T., Günlük, O. and Luss, H. (2001) 'An integer programming model for the weekly tour scheduling problem', Naval Research Logistics, Vol. 48, pp.607–624.
Duder, J.C. and Rosenwein, M.B. (2001) 'Towards zero abandonments in call center performance', European Journal of Operational Research, Vol. 135, pp.50–56.
Law, A.M. and Kelton, W.D. (2000) Simulation Modeling and Analysis, 3rd ed., McGraw-Hill, New York, NY.
Lin, C.K.Y., Lai, K.F. and Hung, S.L. (2000) 'Development of a workforce management system for a customer hotline service', Computers & Operations Research, Vol. 27, pp.987–1004.
Mehrotra, V. (1997) 'Ringing up big business', OR/MS Today, Vol. 24, No. 4, pp.18–24.
Tych, W., Pedregal, D.J., Young, P.C. and Davies, J. (2002) 'An unobserved component model for multi-rate forecasting of telephone call demand: the design of a forecasting support system', International Journal of Forecasting, Vol. 18, pp.673–695.
Whitt, W. (1999) 'Dynamic staffing in a telephone call center aiming to immediately answer all calls', Operations Research Letters, Vol. 24, pp.205–212.
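As a check on the cost analysis of Section 6, the short Python sketch below recomputes E(WC), E(SC) and E(TC) from the per minute salaries and the idle fraction given there. It rests on stated assumptions: the function and constant names are hypothetical, a year is taken as 12 months of 21 eight-hour days, and L for six technicians is read as 15.09, the value that agrees with E(TC) = 1.58 and with the figure of about 15 waiting employees quoted in the text.

```python
from typing import Dict

EMPLOYEE_COST_PER_MIN = 0.169        # $ per minute, average employee (Section 6)
TECH_COST_PER_MIN = 0.135            # $ per minute, average technician (Section 6)
IDLE_FRACTION = 0.3                  # share of the waiting time an employee is idle
MINUTES_PER_YEAR = 60 * 8 * 21 * 12  # ASSUMPTION: 12 months of 21 eight-hour days

# Average number of employees waiting (L) for each number of technicians (C).
# ASSUMPTION: values as read from Table 5, with L = 15.09 for C = 6.
L_BY_C: Dict[int, float] = {6: 15.09, 7: 1.85, 8: 0.67, 9: 0.30, 10: 0.15}


def waiting_cost(l_waiting: float) -> float:
    """E(WC): expected cost per minute of employees idling in the queue."""
    return IDLE_FRACTION * EMPLOYEE_COST_PER_MIN * l_waiting


def service_cost(c: int) -> float:
    """E(SC): expected wage cost per minute of c technicians."""
    return TECH_COST_PER_MIN * c


def total_cost(c: int, l_waiting: float) -> float:
    """E(TC) = E(WC) + E(SC), in dollars per minute."""
    return waiting_cost(l_waiting) + service_cost(c)


def cheapest(l_by_c: Dict[int, float]) -> int:
    """Number of technicians with the lowest expected total cost."""
    return min(l_by_c, key=lambda c: total_cost(c, l_by_c[c]))


def yearly_saving(c_from: int, c_to: int, l_by_c: Dict[int, float]) -> float:
    """Yearly reduction in expected total cost when moving from c_from to c_to."""
    per_minute = total_cost(c_from, l_by_c[c_from]) - total_cost(c_to, l_by_c[c_to])
    return per_minute * MINUTES_PER_YEAR


if __name__ == "__main__":
    for c, l_waiting in L_BY_C.items():
        print(c, round(total_cost(c, l_waiting), 2))
    print("cheapest:", cheapest(L_BY_C))
    print("yearly saving 6 -> 7:", round(yearly_saving(6, 7, L_BY_C)))
```

Under these assumptions the minimum falls at seven technicians, and the yearly saving from adding the seventh technician is of the same order as the figure reported in Section 6.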