Introduction Evolutionary Algorithm Implementation

Introduction

Traditional optimization methods fail when there are complex, nonlinear relationships between the parameters and the value to be optimized, when the goal function has many local extrema, and when resources are limited. Modern heuristic optimization methods are employed in such cases.

Evolutionary Algorithm

In artificial intelligence, an evolutionary algorithm (EA for short) is a subset of evolutionary computation: a generic population-based metaheuristic optimization algorithm. An EA uses mechanisms inspired by biological evolution, such as reproduction, mutation, recombination, and selection. Candidate solutions to the optimization problem play the role of individuals in a population, and a fitness function determines the quality of the solutions. Evolution of the population then takes place through repeated application of these operators. Evolutionary algorithms often perform well at approximating solutions to all types of problems because, ideally, they make no assumptions about the underlying fitness landscape. Techniques from evolutionary algorithms applied to the modeling of biological evolution are generally limited to explorations of micro-evolutionary processes and planning models based upon cellular processes. In most real applications of EAs, computational complexity is a prohibiting factor; this complexity is chiefly due to fitness function evaluation. Fitness approximation is one way to overcome this difficulty.

Implementation

This slide describes how EAs work.
Step 1) Generate the initial population of individuals randomly.
Step 2) Evaluate the fitness of each individual in that population.
Step 3) Repeat the following regenerational steps until termination:
Step 3-1) Select the best-fit individuals for reproduction (the parents).
Step 3-2) Breed new individuals through crossover and mutation operations to give birth to offspring.
Step 3-3) Evaluate the fitness of the new individuals.
Step 3-4) Replace the least-fit individuals of the population with the new individuals.
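
As a concrete illustration (not from the slides), here is a minimal Python sketch of this loop. The helpers random_individual, fitness, crossover, and mutate are hypothetical problem-specific functions; crossover is assumed here to return a single child.

import random

def evolutionary_algorithm(random_individual, fitness, crossover, mutate,
                           pop_size=50, generations=100):
    # Step 1: generate the initial population randomly.
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):  # Step 3: repeat until termination.
        # Step 2 / 3-3: evaluate the fitness of each individual.
        population.sort(key=fitness, reverse=True)
        # Step 3-1: select the best-fit individuals as parents.
        parents = population[:pop_size // 2]
        # Step 3-2: breed offspring through crossover and mutation.
        offspring = []
        while len(offspring) < pop_size - len(parents):
            p1, p2 = random.sample(parents, 2)
            offspring.append(mutate(crossover(p1, p2)))
        # Step 3-4: replace the least-fit individuals with the offspring.
        population = parents + offspring
    return max(population, key=fitness)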

Type

Similar techniques differ in genetic representation, other implementation details, and the nature of the particular problem to which they are applied.

The genetic algorithm is the most popular type of EA. One seeks the solution of a problem in the form of strings of numbers, evolved by applying operators such as recombination and mutation. Traditionally a binary string representation is used, although the best representations are usually those that reflect something about the problem being solved. This type of EA is often used in optimization problems.

The next instantiation is genetic programming. The solutions are in the form of computer programs, and their fitness is determined by their ability to solve a computational problem.

Evolutionary programming is similar to genetic programming, but the structure of the program is fixed and its numerical parameters are allowed to evolve.

Gene expression programming (GEP for short) is also similar to genetic programming. GEP likewise evolves computer programs, but it explores a genotype-phenotype system, where computer programs of different sizes are encoded in linear chromosomes of fixed length.

Evolution strategy works with vectors of real numbers as representations of solutions, and typically uses self-adaptive mutation rates.

Differential evolution is based on vector differences and is therefore primarily suited for numerical optimization problems.

Neuro-evolution is similar to genetic programming, but the genomes represent artificial neural networks by describing their structure and connection weights. The genome encoding can be direct or indirect.

The last one is the learning classifier system (LCS for short). In this approach, the solution is a set of classifiers (rules or conditions). A Michigan-LCS evolves at the level of individual classifiers, whereas a Pittsburgh-LCS uses populations of classifier sets. Initially, classifiers were only binary, but they now include real-valued or neural-network representations. Fitness is typically determined with either a strength- or accuracy-based reinforcement learning or supervised learning approach.

Related techniques

Because EAs have received significant attention from the research community, many related variants exist.

Particle swarm optimization is based on the ideas of animal flocking behaviour. It is also primarily suited for numerical optimization problems.

Ant colony optimization is based on the idea of ants foraging by pheromone communication to form paths. It is primarily suited for combinatorial optimization and graph problems.

Cuckoo search is inspired by the brood parasitism of the cuckoo species. It also uses Lévy flights, and thus it suits global optimization problems.

The firefly algorithm is inspired by the behavior of fireflies, which attract each other by flashing light. This is especially useful for multimodal optimization.

Harmony search is based on the ideas of musicians' behavior in searching for better harmonies. This algorithm is suitable for combinatorial optimization as well as parameter optimization.

The runner-root algorithm (RRA for short) is inspired by the function of runners and roots of plants in nature.

The artificial bee colony algorithm is based on honey bee foraging behaviour. It was primarily proposed for numerical optimization and has been extended to solve combinatorial, constrained, and multi-objective optimization problems.

The bees algorithm is based on the foraging behaviour of honey bees. It has been applied in many applications such as routing and scheduling.

The memetic algorithm is a hybrid method, inspired by Richard Dawkins' notion of a meme, which emphasizes the exploitation of problem-specific knowledge and tries to orchestrate local and global search in a synergistic way. It commonly combines a population-based algorithm with learning procedures that perform local refinements.

Summary

EAs transpose the notions of natural evolution to the world of computers and imitate natural evolution. EAs evolve solutions to a problem by maintaining a population of potential solutions. EAs are based on the principle of survival of the fittest: fit individuals live to reproduce, and weak individuals die off. The population of a genetic algorithm is maintained by using unary search operators (mutation), binary search operators (crossover), and reproduction and selection through several generations, in the hope of finding good enough solutions. Briefly, good parts of previously evolved solutions can be transferred to subsequent generations through crossover. Among the many EAs, the genetic algorithm is the most popular owing to its simplicity, and the memetic algorithm is known to be effective for solving many problems, including the TSP. Thus, we will focus on the genetic algorithm and the memetic algorithm in more detail.

Genetic Algorithm

In computer science and operations research, a genetic algorithm (GA for short) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of EAs. Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on bio-inspired operators such as mutation, crossover, and selection.

Overview

In a genetic algorithm, a population of candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem is evolved toward better solutions. Each candidate solution has a set of properties (its chromosomes or genotype) which can be mutated and altered. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and is an iterative process, with the population in each iteration called a generation. In each generation, the fitness of every individual in the population is evaluated; the fitness is usually the value of the objective function in the optimization problem being solved. The more fit individuals are stochastically selected from the current population, and each individual's genome is modified (recombined and possibly randomly mutated) to form a new generation. The new generation of candidate solutions is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population.

Genetic Algorithm Design: Representation

A typical genetic algorithm requires a genetic representation of the solution domain and a fitness function to evaluate it. The representation is the first step of designing a GA. The representation, together with the genetic operators, bounds the exploration of the search space. A standard representation of each candidate solution is an array of bits. Accordingly, the basic representation in a GA is to encode the individuals, or candidate solutions, as fixed-length bit strings. Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size. Variable-length representations may also be used, but crossover implementation is more complex in this case. Tree-like representations are explored in genetic programming and graph-form representations in evolutionary programming; a mix of both linear chromosomes and trees is explored in gene expression programming. Incorporating domain knowledge into the representation helps guide the evolutionary process toward good solutions.
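
As an illustration (not from the slides), a fixed-length bit-string individual and a randomly initialized population might look like this in Python; the string length and population size are arbitrary choices:

import random

def random_bitstring(n):
    # A fixed-length bit string: the basic GA representation.
    return [random.randint(0, 1) for _ in range(n)]

# A randomly generated initial population of 100 individuals of length 20.
population = [random_bitstring(20) for _ in range(100)]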

Genetic Algorithm Design: Fitness Function

In the case of GAs, the evaluation value of an individual is also called its fitness. We define a fitness function and incorporate it into the GA to force the direction of evolution. The fitness function is defined over the genetic representation and measures the quality of the represented solution. The fitness function is always problem dependent. For instance, in the knapsack problem one wants to maximize the total value of objects that can be put in a knapsack of some fixed capacity. A representation of a solution might be an array of bits, where each bit represents a different object, and the value of the bit (0 or 1) represents whether or not the object is in the knapsack. Not every such representation is valid, as the size of the objects may exceed the capacity of the knapsack. The fitness of the solution is the sum of the values of all objects in the knapsack if the representation is valid, or 0 otherwise (see the sketch at the end of this slide). In some problems, it is hard or even impossible to define the fitness expression. In these cases, a simulation may be used to determine the fitness value of a phenotype, or interactive genetic algorithms may be used, in which a human evaluator assigns fitness values manually. In addition, fitness can also be assigned by comparing the individuals in the current population.

Genetic Algorithm Design: Initialization

The population size depends on the nature of the problem, but typically contains several hundreds or thousands of possible solutions. Often, the initial population is generated randomly, allowing the entire range of possible solutions (the search space). Occasionally, the solutions may be "seeded" in areas where optimal solutions are likely to be found.

Genetic Algorithm Design: Genetic Operators

Once the genetic representation and the fitness function are defined, a GA proceeds to initialize a population of solutions and then to improve it through repetitive application of the mutation, crossover, inversion, and selection operators. For each new solution to be produced, a pair of "parent" solutions is selected for breeding from the pool selected previously. By producing a "child" solution using the above methods of crossover and mutation, a new solution is created which typically shares many of the characteristics of its "parents". New parents are selected for each new child, and the process continues until a new population of solutions of appropriate size is generated. These processes ultimately result in a next-generation population of chromosomes that is different from the initial generation. Generally the average fitness of the population will have increased by this procedure, since only the best organisms from the first generation are selected for breeding, along with a small proportion of less fit solutions. These less fit solutions ensure the genetic diversity of the subsequent generation of children.
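
Returning to the knapsack example above, a minimal sketch of its fitness function; the item values, weights, and capacity used here are hypothetical:

def knapsack_fitness(bits, values, weights, capacity):
    # Total weight of the selected objects (bit = 1 means "in the knapsack").
    total_weight = sum(w for b, w in zip(bits, weights) if b == 1)
    # Invalid solutions (capacity exceeded) get fitness 0.
    if total_weight > capacity:
        return 0
    # Otherwise the fitness is the total value of the selected objects.
    return sum(v for b, v in zip(bits, values) if b == 1)

# Hypothetical instance: three objects, knapsack capacity 10.
print(knapsack_fitness([1, 0, 1], values=[6, 5, 4], weights=[7, 4, 3], capacity=10))  # 10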

The most common crossover is the one-point crossover, used here for the basic bit-string representation. It is described by the figure on the right side of this slide. Consider the two parent bit strings shown in the figure and apply a one-point crossover between the fifth and sixth bit. What are the children, or offspring, of this crossover? Child 1 is obtained by taking the first five bits from Parent 1 and the next five bits from Parent 2. Child 2 is obtained by taking the first five bits from Parent 2 and the next five bits from Parent 1. Mutation consists of applying minor changes to one individual, for example flipping a bit; in the figure, the mutated string is obtained by flipping the fourth bit. Although crossover and mutation are known as the main genetic operators, it is possible to use other operators such as regrouping, colonization-extinction, or migration in genetic algorithms. It is worth tuning parameters such as the mutation probability, crossover probability, and population size to find reasonable settings for the problem class being worked on. A very small mutation rate may lead to genetic drift, which is non-ergodic in nature. A recombination rate that is too high may lead to premature convergence of the genetic algorithm. A mutation rate that is too high may lead to loss of good solutions, unless elitist selection is employed.

Genetic Algorithm Design: Natural Selection

In GAs, only selected individuals of a population are allowed to have offspring. Specifically, during each generation, a portion of the existing population is selected to breed a new generation. Individual solutions are selected through a fitness-based process, where fitter solutions are typically more likely to be selected. Although selection is conducted based on fitness, there are several ways to choose which chromosomes survive. Certain selection methods rate the fitness of each solution and preferentially select the best solutions; other methods rate only a random sample of the population, as the former process may be very time-consuming.

The first one is fitness-proportional selection. Individuals are selected based on the value of their fitness: individuals with a higher fitness are more likely to be selected. If f_i is the fitness of individual i in the population, its probability of being selected is p_i = f_i / (f_1 + f_2 + ... + f_N), where N is the number of individuals in the population. Fitness-proportional selection is not good when the fitnesses are too different. For example, if one individual has a probability of 90% of being selected, i.e., p_1 = 0.9, then the other individuals have a very limited chance, less than 10% in total, because the probabilities sum to 1. In this case, a ranked selection is needed. The individuals are ranked by fitness and selected with a probability proportional to their rank.
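
A minimal sketch of one-point crossover and bit-flip mutation as described above; the cut point and mutation rate are hypothetical parameters:

import random

def one_point_crossover(p1, p2, point):
    # Swap the tails of the two parents after the cut point.
    child1 = p1[:point] + p2[point:]
    child2 = p2[:point] + p1[point:]
    return child1, child2

def bit_flip_mutation(bits, rate=0.01):
    # Flip each bit independently with a small probability.
    return [1 - b if random.random() < rate else b for b in bits]

# Cut between the fifth and sixth bit, as in the slide's example.
c1, c2 = one_point_crossover([0] * 10, [1] * 10, point=5)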

The drawback of this selection method is that convergence is slower, because the selection probabilities of the best individuals do not differ much from those of the other ones.

Last, in tournament selection, a few individuals are chosen randomly; in other words, a tournament is organized, and the best one is selected for crossover. The process is repeated several times: several tournaments are organized, and the winner of each one is selected for crossover. In order to avoid selecting weak individuals, the tournament size can be increased. To diversify the search, increase exploration, and have a higher chance of finding good individuals, a chance can be given to individuals weaker than the best one in each tournament. For example, the best individual in a tournament can be selected with a high probability p, for example p = 0.9, the second best with probability p(1-p), the third best with probability p(1-p)^2, and in general the k-th best with probability p(1-p)^(k-1).

Genetic Algorithm Design: Termination

This generational process is repeated until a termination condition has been reached. Common terminating conditions are:
- A solution is found that satisfies minimum criteria
- A fixed number of generations is reached
- The allocated budget (computation time/money) is reached
- The highest-ranking solution's fitness is reaching or has reached a plateau, such that successive iterations no longer produce better results
- Manual inspection
- Combinations of the above

Genetic Algorithm Design: Heuristics

In addition to the main operators above, other heuristics may be employed to make the calculation faster or more robust. For example, we can employ the 2-interchange or 2-swap operator to refine solutions stochastically created by the genetic reproduction process. This inherently leads to the memetic algorithm.
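
A minimal sketch of tournament selection with the weaker-individual chances described above; the tournament size k and the winner probability p are hypothetical parameters:

import random

def tournament_select(population, fitness, k=3, p=0.9):
    # Organize one tournament among k randomly chosen individuals.
    contestants = random.sample(population, k)
    contestants.sort(key=fitness, reverse=True)
    # Select the i-th best with probability p * (1 - p)^i (i = 0 is the best);
    # the weakest contestant absorbs any remaining probability mass.
    for i, individual in enumerate(contestants):
        if random.random() < p or i == len(contestants) - 1:
            return individual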

Memetic Algorithm

Memetic algorithms (MAs for short) represent one of the recently growing areas of research in evolutionary computation. The term MA is now widely used for a synergy of an evolutionary or any population-based approach with separate individual learning or local improvement procedures for problem search. Inspired by both Darwinian principles of natural evolution and Dawkins' notion of a meme, the term "memetic algorithm" was introduced by Moscato in 1989, who viewed the MA as being close to a form of population-based hybrid GA coupled with an individual learning procedure capable of performing local refinements.

Memetic Algorithm: 1st Generation

The first generation of MA refers to hybrid algorithms: a combination of a population-based global search, often in the form of an evolutionary algorithm, coupled with a cultural evolutionary stage. Although this first generation of MA encompasses characteristics of cultural evolution (in the form of local refinement) in the search cycle, it may not qualify as a true evolving system according to Universal Darwinism, since the core principles of inheritance/memetic transmission, variation, and selection are missing. The pseudocode of an MA looks like this:

Procedure Memetic Algorithm
    Initialize: Generate an initial population;
    while stopping conditions are not satisfied do
        Evaluate all individuals in the population.
        Evolve a new population using stochastic search operators.
        Select the subset of individuals, Ω, that should undergo the individual improvement procedure.
        for each individual in Ω do
            Perform individual learning using meme(s) with frequency or probability f, for a period t.
            Proceed with Lamarckian or Baldwinian learning.
        end for
    end while
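
A minimal Python rendering of this pseudocode, reusing the same hypothetical problem-specific helpers as before (random_individual, fitness, crossover, mutate) plus a local_search improvement procedure; Lamarckian learning (writing the refined individual back into the population) is assumed here:

import random

def memetic_algorithm(random_individual, fitness, crossover, mutate,
                      local_search, pop_size=50, generations=100, f=0.1):
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):  # while stopping conditions are not satisfied
        population.sort(key=fitness, reverse=True)  # evaluate all individuals
        # Evolve a new population using stochastic search operators.
        parents = population[:pop_size // 2]
        offspring = [mutate(crossover(*random.sample(parents, 2)))
                     for _ in range(pop_size - len(parents))]
        population = parents + offspring
        # Select the subset Ω with probability f and perform individual learning.
        population = [local_search(x) if random.random() < f else x
                      for x in population]  # Lamarckian: x is replaced by its refinement
    return max(population, key=fitness)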

Memetic Algorithm: 2nd Generation

Multi-meme, hyper-heuristic, and meta-Lamarckian MAs are referred to as second-generation MAs, exhibiting the principles of memetic transmission and selection in their design. In a multi-meme MA, the memetic material is encoded as part of the genotype. Subsequently, the decoded meme of each respective individual/chromosome is used to perform a local refinement. The memetic material is then transmitted through a simple inheritance mechanism from parent to offspring. On the other hand, in hyper-heuristic and meta-Lamarckian MAs, the pool of candidate memes compete, based on their past merits in generating local improvements, through a reward mechanism that decides which meme is selected for future local refinements. Memes with a higher reward have a greater chance of being replicated or copied.

Memetic Algorithm: 3rd Generation

Co-evolution and self-generating MAs may be regarded as third-generation MAs, where all three principles satisfying the definition of a basic evolving system are considered. In contrast to second-generation MAs, which assume that the memes to be used are known a priori, third-generation MAs utilize a rule-based local search to supplement candidate solutions within the evolutionary system, thus capturing regularly repeated features or patterns in the problem space.

Memetic Algorithm: Design Issues

The frequency and intensity of individual learning directly define the degree of evolution (exploration) against individual learning (exploitation) in the MA search, for a given fixed, limited computational budget. More intense individual learning provides a greater chance of convergence to the local optima but limits the amount of evolution that may be expended without incurring excessive resources. Thus, care should be taken when setting these two parameters to balance them for maximum search performance. When only a portion of the population undergoes learning, the issue of which subset of individuals to improve needs to be considered to maximize the utility of the MA search.

How often should individual learning be applied? One of the first issues pertinent to memetic algorithm design is to consider how often individual learning should be applied, i.e., the individual learning frequency. In one case, the effect of the individual learning frequency on MA search performance was studied by investigating various configurations of the frequency at different stages of the MA search.

Conversely, it was shown that it may be worthwhile to apply individual learning to every individual if the computational complexity of the individual learning is relatively low.

On which solutions should individual learning be used? On the issue of selecting appropriate individuals among the EA population that should undergo individual learning, fitness-based and distribution-based strategies have been studied for adapting the probability of applying a meme to the population of chromosomes.

How long should individual learning be run? The individual learning intensity, t, is the amount of computational budget allocated to an iteration of individual learning; in other words, the maximum computational budget allowable for individual learning to expend on improving a single solution.

What meme should be used for a particular problem or individual? In the context of continuous optimization, individual learning exists in the form of local heuristics or conventional exact enumerative methods. Examples of individual learning strategies include hill climbing, the gradient method, and other local heuristics. In combinatorial optimization, on the other hand, individual learning methods commonly exist in the form of heuristics tailored to the specific problem of interest. For example, when we design an MA for the TSP, 2-interchange and 2-swap can be used.

GA for TSP

Now let us return to the GA, which is one of the most fundamental and also most widely-used search methods among EAs. What kind of GA should be applied to our TSP? Depending on the representation and variation operations, a series of GAs can be implemented. Although the evaluation function for the TSP is very simple (the length of a tour), the ways of encoding tours and the variation operators corresponding to these representations are less obvious. Because the most natural representation of a tour is a permutation listing the cities in order of their appearance in the tour, we only consider the path representation. Note that the terms genetic operator, variation operator, and evolutionary operator have the same meaning, usually standing for a crossover or a mutation, although more complicated operators are also possible.
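
To make this concrete, a sketch of the path representation, the tour-length evaluation, and the 2-swap move mentioned above; the distance matrix and tour are hypothetical:

import random

def tour_length(tour, dist):
    # Total length of a closed tour in path representation.
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def two_swap(tour):
    # 2-swap: exchange the positions of two randomly chosen cities.
    i, j = random.sample(range(len(tour)), 2)
    neighbor = tour[:]
    neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
    return neighbor

# Hypothetical 5-city distance matrix (symmetric).
dist = [[0, 2, 9, 10, 7],
        [2, 0, 6, 4, 3],
        [9, 6, 0, 8, 5],
        [10, 4, 8, 0, 1],
        [7, 3, 5, 1, 0]]
tour = [0, 2, 4, 1, 3]  # path representation: a permutation of city indices
print(tour_length(tour, dist), tour_length(two_swap(tour), dist))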

GA for TSP: Partially-mapped crossover

Let us talk about how to design a crossover operator for our TSP. The first one is partially-mapped crossover, or PMX for short. This crossover is described through the example shown in the figure above. Consider two cut points, represented as vertical lines. For the offspring o1, take the middle part from the parent p2, while for the offspring o2, take the middle part from the parent p1. Here x denotes the positions that have not been filled in yet. Map the corresponding elements of the parents' middle parts to each other: 1↔4, 8↔5, 7↔6, 6↔7. For the offspring o1, fill in the no-conflict unfilled positions x from p1 (the ones that would not lead to premature cycles), while for the offspring o2, fill in the no-conflict unfilled positions x from p2. Using the mappings for the remaining x's, we obtain two legal offspring.

GA for TSP: Order crossover

Let us consider the order crossover. First, consider two cut points. Similar to the PMX example, x denotes the positions that haven't been filled in yet. Starting from the second cut point, the cities from the other parent (i.e., p2 for o1 and p1 for o2) are copied in the same order, omitting the ones already present. This is the procedure for creating two offspring using the order crossover. A huge amount of effort was spent on inventing suitable representations and recombination operators that preserve partial tours. Between these two operators, it is unknown which one is better for our TSP problem. Unfortunately, a GA alone cannot compete with the Lin-Kernighan algorithm, in either quality or time. However, local search can be incorporated into the genetic algorithm: specifically, before evaluation, a local optimization is applied to each individual. Many studies have reported that this hybridization leads to better results.
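
A minimal sketch of the order crossover for one offspring, with hypothetical parent tours and cut points; the second child is obtained by swapping the roles of the parents:

def order_crossover(p1, p2, cut1, cut2):
    # The offspring keeps the middle segment of p1 between the two cut points.
    n = len(p1)
    child = [None] * n
    child[cut1:cut2] = p1[cut1:cut2]
    # Starting from the second cut point, copy the cities of p2 in the same
    # order, omitting the ones already present in the kept segment.
    fill = [c for c in p2[cut2:] + p2[:cut2] if c not in child[cut1:cut2]]
    positions = list(range(cut2, n)) + list(range(0, cut1))
    for pos, city in zip(positions, fill):
        child[pos] = city
    return child

# Hypothetical parents over 8 cities, cut points after positions 2 and 5.
print(order_crossover([1, 2, 3, 4, 5, 6, 7, 8], [8, 6, 4, 2, 7, 5, 3, 1], 2, 5))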