PARALLELIZATION OF HYBRID SIMULATED ANNEALING AND GENETIC ALGORITHM FOR SHORT-TERM PRODUCTION SCHEDULING


Karl Kurbel #, Bernd Schneider #, Kirti Singh +

# Institute of Business Informatics, University of Muenster, Grevener Strasse 91, D-Muenster, Germany
+ Institute of Computer Science, Electronics and Instrumentation, Devi Ahilya University, Khandwa Road, Indore, M.P., India

ABSTRACT

In short-term production planning, jobs are assigned to machines and scheduled, taking into consideration that operations must be performed in pre-defined sequences. Since the problem is NP-hard, heuristics have to be used. Simulated annealing, neural networks and genetic algorithms are some of the recent approaches. We have tried to improve those methods by using a hybrid of simulated annealing and genetic algorithms called PRSA (parallel recombinative simulated annealing). This paper describes the ideas underlying PRSA and our implementation on a parallel computer (a transputer system).

KEYWORDS: Production planning and control, scheduling, genetic algorithms, simulated annealing.

1 INTRODUCTION

Job-shop scheduling is an important part of short-term production planning. A schedule specifies the dates and sequences in which given jobs will be processed on a limited number of manufacturing resources (machines). A job generally consists of several operations to be performed in a specific, pre-defined order. Objective functions are, for example, minimization of order-flow times, total elapsed time, or machine idle times.

In Operations Research, many exact models for job-shop scheduling and sequencing have been developed. However, most of them have failed in practice because the problem is NP-hard. Instead, heuristics such as dispatching rules (shortest-operation-time rule, first-come-first-serve rule, etc.) have to be used. They are fast, but solution quality is only moderate. More powerful techniques have emerged recently, especially simulated annealing, artificial neural networks and genetic algorithms. They are capable of approximating optimal solutions in reasonable time.

In our earlier work, we developed solutions for scheduling based on simulated annealing and genetic algorithms. Results were encouraging [5, 6, 7]. In order to further improve the performance of those methods, we then took a hybrid approach of simulated annealing and genetic algorithms and parallelized it on a transputer system.

In section 2, a short introduction to simulated annealing is given. Section 3 describes genetic algorithms for scheduling. In section 4, we discuss general properties of the two heuristics and the features adopted for the hybrid algorithm; the implementation on a parallel machine is also outlined. Section 5 discusses parameter settings and how to tune them. Section 6 concludes with results and an outlook.

2 SIMULATED ANNEALING

Simulated annealing is a stochastic heuristic method for optimization problems, motivated by the physical process of crystallization [1]. It is similar to traditional gradient-descent methods, but it has a stochastic component to prevent termination in a local minimum. This component is controlled by an external parameter called temperature. When the temperature is high, the stochastic influence is strong; as the temperature goes down, the stochastic component becomes less important, so the process gradually turns from stochastic behavior into normal gradient descent. A significant feature of simulated annealing is that a formal proof of convergence exists [1, 7].
The algorithm starts with an initial consistent solution. This solution is modified by exchanging two randomly selected operations, and the new value of the objective function is calculated. If the new solution is better than the former one, it will be accepted. If it is worse, it might also be accepted, but only with probability

p = \frac{1}{1 + e^{(E_{new} - E_{old})/T}}

where E_old and E_new denote the respective values of the objective function (which is to be minimized); p depends on the temperature T. Iterations continue until a given termination criterion is satisfied, e.g. until T comes close to zero. The best solution encountered during the process is kept in any case, in analogy to the elitism strategy in genetic algorithms [2, 3, 10], although it is not necessarily the optimal one.
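To make the procedure concrete, the following sketch shows the acceptance rule and the surrounding annealing loop in Python. It is an illustration under our reading of the formula above, not the paper's implementation; the function names, the neighbor/energy callables, and the geometric cooling step are assumptions.

import math
import random

def accept(e_old, e_new, temperature):
    # Better (or equal) solutions are always accepted; worse ones are
    # accepted with the logistic probability from the formula above.
    if e_new <= e_old:
        return True
    diff = (e_new - e_old) / temperature
    if diff > 700:            # avoid math.exp overflow; p is effectively 0
        return False
    return random.random() < 1.0 / (1.0 + math.exp(diff))

def anneal(initial, neighbor, energy, t=100.0, alpha=0.95, t_min=1e-3):
    # Minimal skeleton: 'neighbor' would exchange two randomly selected
    # operations of a schedule; 'energy' is the objective function value.
    current = best = initial
    while t > t_min:
        candidate = neighbor(current)
        if accept(energy(current), energy(candidate), t):
            current = candidate
            if energy(current) < energy(best):
                best = current          # elitism: keep the best ever seen
        t *= alpha                      # simple geometric cooling (assumed)
    return best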

3 GENETIC ALGORITHMS FOR SCHEDULING

Genetic algorithms are heuristics derived from biological evolution. Whereas simulated annealing moves from one point of the solution space to another, genetic algorithms deal with many points simultaneously. Each point represents one individual (a schedule). All individuals together are called a population. New individuals are created by genetic operators (mutation and cross-over), and their "fitness" values are computed. Then a new generation of the population is constructed by selection from newly generated and old individuals.

The crucial point when genetic algorithms are applied to scheduling is to choose an appropriate representation and to develop specific cross-over and mutation operators. We adopted an approach described by Nakano and Yamada [9] and improved it. Their representation is based on relationships between pairs of jobs. As an example, consider a small problem of six jobs, each one consisting of six operations. There are also six machines, and each job is to be processed on each machine exactly once, taking into account that operations have pre-defined orders.

Fig. 1: Machine sequences (the table of machine numbers per job and operation could not be recovered)

Figure 1 shows the machine sequences of the six jobs. Row 1 reads as follows: the first operation of job 1 has to be performed on machine 3, the second operation on machine 1, etc. These sequences are transformed into a binary representation by looking at relationships between pairs of jobs and expressing them by a priority function:

prior(op1, op2) = 1, if op1 will be served before op2; 0, otherwise

Figure 2 shows the transformed representation for job 1. Row 1 states that operations 1, 2 and 4 of job 1 will be served on machines 3, 1 and 4, respectively, before the corresponding operations of job 2 (i.e. operations 2, 5 and 6 of job 2) are processed there.

Fig. 2: Binary representation for job 1 (rows "job 1 before job 2" through "job 1 before job 6", one bit per machine; the bit values could not be recovered)

"Individuals" can now be constructed by concatenating the binary strings of all pairs of jobs. They represent the order in which operations are assigned to machines. The length of an individual is

length = 1/2 * m * n * (n-1)

where m is the number of machines and n the number of jobs. The example above would thus require a 90-bit string.

The genetic algorithm starts by generating a number of individuals as the initial population. Since this is done randomly, some individuals may represent infeasible solutions. Consistency checking and repair are done by two algorithms called local and global harmonizers [9]. Then the fitnesses of all individuals are computed and the "genetic" part starts: individuals are selected for mutation and cross-over. Before a new population can be formed, the modified individuals have to undergo harmonization again; then their fitnesses can be calculated, and so on. This process continues until the population converges to a sufficiently good fitness value.
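The size and layout of this encoding are easy to state in code. The following sketch is our illustration: the 0-based, pair-major bit layout is one possible convention, not necessarily the one used by the authors.

def individual_length(m, n):
    # one bit per machine and per unordered pair of jobs
    return m * n * (n - 1) // 2

def pair_rank(i, j, n):
    # rank of the job pair (i, j), i < j, among all n*(n-1)/2 pairs
    return i * (n - 1) - i * (i - 1) // 2 + (j - i - 1)

def bit_index(i, j, machine, n, m):
    # position of the prior bit "job i before job j on this machine"
    return pair_rank(i, j, n) * m + machine

# the six-job, six-machine example from the text needs a 90-bit string:
assert individual_length(6, 6) == 90
assert bit_index(4, 5, 5, 6, 6) == 89   # last pair, last machine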
The approach by Nakano and Yamada shows how genetic algorithms can be applied in principle, but their quadratic representation (identical numbers of jobs, operations per job, and machines) is not adequate for real-world problems. Furthermore, their assumption that operations are arbitrarily permutable on a machine only holds if no two operations of the same job have to be processed on the same machine. Otherwise, the pre-defined technological sequence of operations has to be observed, implying that those operations are no longer permutable as before.

To overcome this restriction, information about jobs needing machines more than once has to be stored. This is done by means of a vector of flags. A flag is "0" if the corresponding bit can be mutated, i.e. the operation may be moved; otherwise it is "1". The mutation operator has to examine the vector of flags when an individual is selected; mutation is carried out only if the flag is zero. Figure 3 shows an example: in the flag vector, positions two and six are flagged, so the corresponding bits of the original individual must not be changed. The mutation operator may choose bit five, for example, and change it to "1".

Fig. 3: Flags, original and mutated individuals (the bit strings could not be recovered)
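A flag-gated mutation operator of this kind might look as follows. This is a sketch only: the per-bit mutation rate and the list-of-bits representation (with flags and individual of equal length) are our assumptions.

import random

def mutate(individual, flags, rate=0.05):
    # Flip bits only where the flag is 0, i.e. where the operation may
    # be moved; flagged positions are left untouched.
    child = individual[:]
    for pos, flag in enumerate(flags):
        if flag == 0 and random.random() < rate:
            child[pos] ^= 1
    return child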

The cross-over operator selects two individuals (parents) from the population, cuts them at the same positions (cross-over points) and constructs two new individuals (children). This is done by recombining the parts in such a way that their former positions are retained [2, 3]. Figure 4 shows two individuals selected for cross-over. They are cut after the second and before the sixth bit, and the parts cut out are then recombined. Since solutions can become inconsistent through the genetic operations [9], the two harmonizing functions mentioned above are applied to the individuals after cross-over and mutation to eliminate the inconsistencies.

Fig. 4: Two selected individuals before and after cross-over (the bit strings of parents 1 and 2 and children 1 and 2 could not be recovered)
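In code, this two-point cross-over amounts to exchanging the middle segments between the cut points. A sketch with assumed list-of-bits individuals; harmonization is only hinted at in a comment.

def crossover(parent1, parent2, cut1, cut2):
    # Exchange the segments between the two cut points; all remaining
    # parts keep their former positions, as described for figure 4.
    child1 = parent1[:cut1] + parent2[cut1:cut2] + parent1[cut2:]
    child2 = parent2[:cut1] + parent1[cut1:cut2] + parent2[cut2:]
    return child1, child2   # both children still need harmonization

# cut after the second and before the sixth bit, as in the example:
c1, c2 = crossover([0, 1, 1, 0, 1, 0], [1, 0, 0, 1, 0, 1], 2, 5)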
4 PARALLEL RECOMBINATIVE SIMULATED ANNEALING

4.1 General Ideas Underlying the Hybrid Approach

Simulated annealing and genetic algorithms are both powerful search techniques. They have some features in common, and some are different. Both methods are iterative, starting with consistent solutions, and perturb those solutions in some way to find better ones. In other words, both techniques generate new points in the search space by applying operators to current points, and they move probabilistically towards optimal regions of the search space. In either method, one has to choose a problem-specific representation and develop specific operators for perturbing solutions and generating new ones.

Dissimilar features are that simulated annealing treats one solution at a time, while genetic algorithms have the concept of a population and deal with a number of solutions simultaneously. That is why genetic algorithms, in contrast to simulated annealing, exhibit explicit parallelism. The risk that simulated annealing terminates in a local minimum can be reduced by running the algorithm several times, or by running more than one instance on different processors of a parallel computer. In contrast to genetic algorithms, simulated annealing has an externally set steering mechanism, the temperature; its influence is initially very strong but gradually decreases. One advantage of simulated annealing is that a formal proof of convergence exists, whereas convergence of a genetic algorithm cannot be assured.

Another difference is that genetic algorithms possess a good "memory". Since they handle large numbers of solutions at a time, they can use all this "knowledge" for further search. In simulated annealing, there is only one solution available at a time, so previous information may be lost and not regained. Although chances are that crucial information is lost in genetic algorithms too, the probability that this happens can be reduced by decreasing the number of samples taken from above-average regions of the solution space.

4.2 Basic Parallel Recombinative Simulated Annealing

In parallel recombinative simulated annealing (PRSA), useful features of both methods are combined [8]: the convergence property of simulated annealing and the explicit parallelism of genetic algorithms. PRSA starts with a very high temperature and generates a large number of random solutions (the initial population).

Then cross-over and mutation operators are applied to generate a new population. This is done as follows: in one iteration, a small sub-population is generated by cross-over between two randomly chosen individuals. Afterwards, mutation is applied once to each child, and fitness values are calculated. Individuals for the new population are selected by competition among parents and children. Winners are determined by Boltzmann trials [1]. Whereas in "normal" genetic algorithms only new individuals of better fitness are accepted, here worse ones might also be accepted, with probability

p = \frac{1}{1 + e^{(F_p - F_c)/T}}

F_p and F_c stand for the fitnesses of the parents and the children, respectively. Their exact interpretations depend on the strategy of selection. Strategies can be, for example:

a) Each child competes against one of its own parents.
b) All children compete against parents in such a way that the best child meets the worst parent, the second best child meets the second worst parent, etc.
c) Each child competes against one randomly selected individual from the set of parents.
d) Parent-child teams compete against other parent-child teams. In this case, each winning individual gets a positive score, and the individuals with the highest totals are adopted for the new population.

The process of generating sub-populations in this way is repeated until every individual has been selected for cross-over exactly once. The algorithm has to ensure that no individual is missed or becomes a parent more than once. Subsequently, the temperature is lowered by a small amount, and the creation of a new generation starts as described above.

The termination criterion can be static, dynamic or both. A static criterion is based on the temperature, for example: if T is very low, the system is "frozen" and worse solutions can no longer be reached. In other words, a frozen system cannot leave a local minimum. To improve this behavior, dynamic criteria based on the state of the solution process may be adopted from genetic algorithms, e.g. the number of iterations without (significant) changes of the population.

Figure 5 shows that PRSA resembles genetic algorithms inasmuch as multiple individuals are handled and new individuals are created by genetic operators. Instead of accepting new individuals only according to their fitnesses, however, a stochastic component as in simulated annealing is used.

Set temperature T to a sufficiently high value.
Build an initial population of size n by generating n individuals
  which represent consistent solutions, evaluating the fitness of
  each individual. Mark all individuals as "unused".
REPEAT
  DO n/2 times:
    Select two "unused" individuals as parents.
    Create two children by applying cross-over to the parents.
    Apply mutation to each child once.
    Evaluate fitness of both children.
    Mark individuals as "parent" and "child", resp.
  DO n times:
    Select one "parent" and one "child" for a Boltzmann trial.
    Keep the winner and mark it as "unused". Remove the loser.
  Lower T.
UNTIL termination criterion is reached.

Fig. 5: Basic PRSA algorithm
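The Boltzmann trial itself is a small decision rule. The following Python sketch is our illustration of it, assuming fitness is maximized and using strategy a) for pairing; the function names are not taken from the paper.

import math
import random

def child_wins(f_parent, f_child, temperature):
    # The child wins with probability 1 / (1 + exp((F_p - F_c) / T)):
    # likely if it is fitter, still possible if it is worse while T is high.
    diff = (f_parent - f_child) / temperature
    if diff > 700:            # avoid math.exp overflow; p is effectively 0
        return False
    return random.random() < 1.0 / (1.0 + math.exp(diff))

def boltzmann_selection(parents, children, fitness, temperature):
    # Strategy a): each child competes against one of its own parents;
    # the winners form the next population.
    return [child if child_wins(fitness(parent), fitness(child), temperature)
            else parent
            for parent, child in zip(parents, children)]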
4.3 Implementation of PRSA on a Parallel Machine

A transputer system, a Parsytec MultiCluster/2, was used as the hardware platform for the implementation. It has 32 processor nodes (Inmos T805/25), each one equipped with 4 MB of RAM. The nodes are connected via links acting as a high-speed bus system. An efficient way to use this type of system is to maximize the number of tasks that can be executed in parallel and to minimize the communication between tasks.

PRSA is easy to parallelize because, like genetic algorithms, it has inherently parallel features. In our parallel implementation, all nodes of the transputer system run their own PRSA on local populations. Cross-over and mutation are applied there as before to generate new local populations. From time to time, individuals are exchanged between nodes. The number of individuals to be sent depends on an external parameter (the migration rate). Migrating individuals are integrated into the respective local populations. To keep population sizes constant, individuals received from other nodes replace the individuals sent off; a new population is thus built from external and local individuals. The algorithm of figure 6 ensures that no individual is missed or becomes a parent more than once; this is achieved by marking individuals as "unused", "parent" or "child", respectively.
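The exchange step on one node might look as follows. This is a sketch only: outbox and inbox are hypothetical stand-ins for the transputer link communication, which is platform-specific, and the random choice of migrants corresponds to migration strategy 2) introduced below.

import random

def migration_step(population, migration_rate, outbox, inbox):
    # Select migrants (here: at random), send them to the neighboring
    # nodes, and replace them in place by the individuals received, so
    # the local population size stays constant.
    chosen = random.sample(range(len(population)), migration_rate)
    outbox([population[i] for i in chosen])
    for i, newcomer in zip(chosen, inbox()):
        population[i] = newcomer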

Selection of individuals for migration follows some pattern (the migration strategy). Various strategies are conceivable. The ones we used in our experiments are:

1) Choose the best individuals (i.e. those with the highest fitness).
2) Choose individuals at random.

Whereas the first strategy seems intuitively favorable, the second one might be expected to conform better to the principle of merging sub-populations and getting good results by evolution in the long run.

FOR all nodes DO in parallel:
  Set temperature T to a sufficiently high value.
  Build a population of size n by generating n individuals
    which represent consistent solutions, evaluating the fitness of
    each individual. Mark all individuals as "unused".
  REPEAT
    DO n/2 times:
      Select two "unused" individuals as parents.
      Create two children by applying cross-over to the parents.
      Apply mutation to each child once.
      Evaluate fitness of both children.
      Mark individuals as "parent" and "child", resp.
    DO n times:
      Select one "parent" and one "child" for a Boltzmann trial.
      Keep the winner and mark it as "unused". Remove the loser.
    Lower T.
    Select individuals for migration and send them to other nodes.
    Accept individuals from other nodes.
    Replace sent individuals by received ones.
  UNTIL termination criterion is satisfied.

Fig. 6: Parallel version of PRSA algorithm

One important prerequisite for good parallelization is an appropriate communication topology, which defines how the nodes are logically connected. To minimize processing overhead, communication paths should be as short as possible (no "via-routing"). Since each link can only serve one communication task at a time, communication between processes should be minimal; this means that the number of parallel processes running independently of each other should be maximized. Furthermore, a "natural" communication topology reflecting the hardware topology both facilitates implementation and reduces communication overhead.

In our first approach, we started with a grid structure (see figure 7) because it is easy to implement and simpler to debug. The grid consists of a set of processors arranged in rectangular form. Each processor is linked with its nearest neighbors; in this way, inner nodes are connected with four neighbors, whereas outer nodes have links to three or two neighbors, respectively. Migration in the parallel method means that individuals are sent to the immediate neighbors of each node.

Fig. 7: Grid communication topology (the drawing of the transputer nodes could not be recovered)
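For illustration, the neighborhood relation of such a grid is easy to compute. The sketch below is our own; the row-by-row numbering of nodes is an assumption.

def grid_neighbors(node, rows, cols):
    # Nearest neighbors in a rows x cols grid, nodes numbered row by
    # row: inner nodes get four neighbors, border nodes three, and
    # corner nodes two.
    r, c = divmod(node, cols)
    return [rr * cols + cc
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if 0 <= rr < rows and 0 <= cc < cols]

# e.g. a 4 x 8 grid covers the 32 nodes of the MultiCluster/2:
assert len(grid_neighbors(0, 4, 8)) == 2   # corner node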
5 TUNING THE HYBRID APPROACH

The quality of the results obtained by PRSA, and also by its parallel implementation, depends on several parameters which are set externally, such as the cooling schedule, population size, selection strategy, migration rate, and migration strategy.

The cooling schedule has a major impact on convergence and solution quality. If the temperature is decreased quickly, the algorithm converges fast, but results tend to get worse. Slow cooling, on the other hand, makes the algorithm slow but gives better results (see the sketch at the end of this section).

If the population size is set to a very high value, chances are that PRSA will converge soon with good results. However, this requires large amounts of memory, and the time to compute one generation will also be very long. If the population size is small, little memory is needed and computation time per generation is short, but it may take many generations to explore the solution space and arrive at a good solution.

The migration rate (the number of migrants sent from one node) influences convergence as follows: if a large number of individuals are sent and received, all local populations tend to become similar, so PRSA might converge fast, but the solution is likely to be only moderate. Processing speed will also suffer, because sending many individuals uses up a lot of communication time. On the other hand, if the migration rate is low, the processors will search more or less independently and will probably not find the global minimum within reasonable time. Migrating more individuals can guide the search into new regions and thus help to direct the evolution of sub-populations towards the best solution.
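As an illustration of the cooling-schedule trade-off described above, the following geometric schedule is a common choice; it is our example, as the paper does not specify its own schedule.

def geometric_cooling(t_start, alpha, t_min):
    # Yields a falling temperature sequence T <- alpha * T. An alpha
    # close to 1 cools slowly (better results, longer runs); a smaller
    # alpha converges fast but tends to give worse schedules.
    t = t_start
    while t > t_min:
        yield t
        t *= alpha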

6 RESULTS AND CONCLUSIONS

There exists no comprehensive model of the interrelations between these parameters, of their effects on solution quality and computation effort, or of their interactions. Effects can only be examined in an exploratory manner. For this purpose, we conducted a set of tests. In particular, we varied the selection strategy (see 4.2), the migration rate and migration strategy (see 4.3), as well as the number of processing nodes.

As a general result, we found that basic PRSA is much faster than simulated annealing and that solution quality (total elapsed time) is slightly better. Compared to a genetic algorithm, PRSA is also better, but it takes longer. The parallel version of PRSA showed further improvements with regard to both computing time and the objective function. However, the average improvement of solution quality compared to simulated annealing was less than 2 %. Furthermore, parameter settings were found to have some impact on both performance and solution quality. Most combinations of selection strategies a) to c) with migration strategies 1) and 2) lead to comparably good results. Only the constellation b)-2), i.e. selection with high-fitness children competing against low-fitness parents combined with random choice of migrating individuals, extended the total elapsed time by an average of more than 30 %.

REFERENCES

[1] Aarts, E.H.L., Korst, J.H.M.: Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing; New York 1989.

[2] Davis, L.: Handbook of Genetic Algorithms; New York 1991.

[3] Goldberg, D.E.: Genetic Algorithms in Search, Optimization, and Machine Learning; Reading, MA 1989.

[4] Kirkpatrick, S., Gelatt, C.D. (Jr.), Vecchi, M.P.: Optimization by Simulated Annealing; Science 220 (1983), pp. 671-680.

[5] Kurbel, K., Rohmann, T.: Ein Vergleich von Verfahren zum Problem der Maschinenbelegungsplanung: Simulated Annealing, Genetische Algorithmen, mathematische Optimierung [A comparison of methods for the machine scheduling problem: simulated annealing, genetic algorithms, mathematical optimization] (to appear).

[6] Kurbel, K., Ruppel, A.: Integrating Intelligent Job-scheduling into a Real-world Production-scheduling System (to appear).

[7] Kurbel, K.: Production Scheduling in a Leitstand System Using a Neural-net Approach; in: Balagurusamy, E., Sushila, B. (eds.): Artificial Intelligence Technology - Applications and Management; New Delhi, New York et al. 1993.

[8] Mahfoud, S.W., Goldberg, D.E.: Parallel Recombinative Simulated Annealing: A Genetic Algorithm; Technical Report, University of Illinois.

[9] Nakano, R., Yamada, T.: Conventional Genetic Algorithm for Job Shop Problems; in: Belew, R.K., Booker, L.B. (eds.): Proceedings of the Fourth International Conference on Genetic Algorithms, San Mateo, CA 1991.

[10] Schöneburg, E., Heinzmann, F., Feddersen, S.: Genetische Algorithmen und Evolutionsstrategien [Genetic algorithms and evolution strategies]; Bonn et al. 1994.