On Recombination and Optimal Mutation Rates

Gabriela Ochoa, Inman Harvey, Hilary Buxton
Centre for Computational Neuroscience and Robotics
School of Cognitive and Computing Sciences
The University of Sussex
Falmer, Brighton BN1 9QH, UK

Abstract

We present empirical evidence, from a wide range of problem characteristics, suggesting that the value of optimal mutation rates in GAs differs according to whether recombination is used or not. Without recombination, a regime that starts with a high mutation rate, decreasing it towards the end of the run, appears to be optimal. With recombination, however, the optimal strategy proves to be a constant, sufficiently low mutation rate. Moreover, when recombination is used, the choice of an excessively high mutation rate might degrade the algorithm's performance considerably. These results are supported by recent knowledge from the field of molecular evolution about the effect of recombination on the so-called error thresholds. We conclude by proposing a novel argument favoring the use of recombination in GAs. This argument, which we call the dual-role of recombination, sheds new light on the role of this operator in genetic search.

1 INTRODUCTION

It has long been acknowledged that a GA's performance depends heavily on the choice of its main parameters: mutation rate, crossover rate, and population size. These parameters typically interact with one another in a nonlinear fashion, so they cannot be independently optimized. Optimal parameter settings have been the subject of numerous studies in the GA literature, but there is no conclusive agreement on what is best; most people use what has worked well in previously reported cases. Particular emphasis has been placed on finding optimal mutation rates [Fogarty, 1989, Mühlenbein, 1992, Hesser and Männer, 1991, Bäck, 1993]. Most theoretical studies aimed at finding optimal mutation values, however, neglect recombination in order to simplify the analysis [Bäck, 1993, Hesser and Männer, 1991, Mühlenbein, 1992].
On the other hand, classical empirical studies aimed at finding optimal parameter settings use a fixed set of test problems [DeJong, 1975, Grefenstette, 1986, Schaffer et al., 1989]. One weakness of these classical studies is that their results may not generalize beyond the test problems used. According to Spears [Spears, 1998], there are two ways to strengthen the results obtained from empirical studies. The first is to remove the opportunity to hand-tune algorithms to a particular set of problems. The second is to always show results over the running time of a GA (see section 3).

In this paper we use these methodological guidelines to show that the choice of an optimal mutation scheme depends on whether recombination is used or not. For a GA without recombination, the optimal strategy appears to be the generally acknowledged heuristic of starting with a relatively high mutation rate and reducing it over the course of a single run [Fogarty, 1989, Mühlenbein, 1992, Bäck, 1991, Bäck, 1993]. However, when recombination is used, a fixed, sufficiently low mutation rate proves to be the optimal strategy. Moreover, with recombination, the GA performance is more sensitive to the use of an inappropriately high mutation rate.

These are more than just empirical results: theoretical knowledge from the field of molecular evolution supports them. The argument, explained in more detail in section 2, is that the notion of optimal mutation rates is related to the so-called "error thresholds", and thus the effects of recombination on error thresholds are reflected in optimal mutation rates. This explanation, together with further insight, leads us to propose a new argument favoring the use of recombination in Evolutionary Algorithms (EAs). This argument, which we call the dual-role of recombination, helps us in understanding the role of this complex operation in EAs (see section 5).

In the remainder of the paper we summarize the knowledge from molecular evolution relevant to our argument, describe the empirical methodology used, present the experimental results obtained, and discuss the insight gained.

2 ERROR THRESHOLDS

The error threshold, a notion from molecular evolution, is a critical mutation rate beyond which structures obtained by the evolutionary process are destroyed more frequently than selection can reproduce them. With mutation rates above this critical value, an optimal solution would not be stable in the population, i.e., the probability that the population loses these structures is not negligible.

The notion of error threshold, then, seems to be intuitively related to the idea of an optimal balance between exploitation and exploration in genetic search. Too low a mutation rate implies too little exploration; in the limit of zero mutation, successive generations of selection remove all variety from the population, and once the population has converged to a single point in genotype space all further exploration ceases. On the other hand, clearly, mutation rates can be too high; in the limit where mutation places a randomly chosen allele at every locus on an offspring genotype, the evolutionary process has degenerated into random search with no exploitation of the information acquired in preceding generations. Any optimal mutation rate must lie between these two extremes, but its precise position will depend on a number of factors including, in particular, the structure of the fitness landscape under consideration. It can, however, be hypothesized that a mutation rate just below the error threshold is the optimal mutation rate for the landscape under study. The close correspondence between error thresholds and optimal mutation rates may be assessed empirically.
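For orientation, the classical form of the error threshold can be written down explicitly. The following is only a sketch under assumptions that are not made in this paper: a single-peak landscape from Eigen's quasispecies model, in which one master sequence of length $L$ replicates $\sigma$ times faster than every other sequence, each locus mutates independently with probability $p$, and back mutations are neglected. The symbols $L$, $\sigma$, $p$ and $p_c$ are introduced here purely for illustration.

```latex
% The master sequence persists only if its error-free copies outgrow the rest:
\sigma \, (1 - p)^{L} > 1
\quad\Longrightarrow\quad
p < p_{c} = 1 - \sigma^{-1/L} \approx \frac{\ln \sigma}{L}
```

Under these assumptions the critical rate $p_c$ falls as the string length grows and rises only logarithmically with the selective advantage, which is one way to see why the precise position of the threshold depends on the landscape.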
Given that mutation rates should not be above error thresholds, it cannot be immediately assumed that optimal mutation rates are related to this upper bound; however, experiments where the error threshold and the optimal mutation rates could be assessed independently showed that there was such a relationship. These experiments will be reported in detail elsewhere.

Some biological evidence supports the relationship between error thresholds and optimal mutation rates. Eigen and Schuster [Eigen and Schuster, 1979] have pointed out that viruses, which are very efficiently evolving entities, live within and close to the error thresholds given by the known rates of nucleotide mutations. This correspondence has also been noticed before in the GA community: Hesser and Männer [Hesser and Männer, 1991] devised a heuristic formula for optimal setting of mutation rates inspired by Nowak and Schuster's work on error thresholds [Nowak and Schuster, 1989]. Moreover, Kauffman [Kauffman, 1993] (p. 107), talking about an optimum mutation rate, suggests that "That rate is likely to occur when populations are just beginning to melt from peaks".

2.1 RECOMBINATION AND ERROR THRESHOLDS

A relatively recent work from the evolutionary biology literature [Boerlijst et al., 1996] reports interesting results about the role of recombination in evolving populations of viruses. In particular, the authors study the effect of recombination on the magnitude of the error threshold. A mathematical model with infinite populations was used. Their results may be summarized as follows: for low mutation rates, recombination can focus the population around a fitness optimum and thus enhance overall fitness. For high mutation rates, however, recombination can push the population over the error threshold, and therefore cause a loss of genetic information. In other words, recombination shifts the error threshold to lower mutation rates and, in addition, makes this transition sharper. The explanation given by the authors for this phenomenon is as follows [Boerlijst et al., 1996] (p. 1581):

    Near the error threshold, without recombination, the fittest strain only makes up a small percentage of the total population [Eigen and Schuster, 1979]. Under such conditions recombination acts as a diverging operation, driving the population beyond the error threshold. There can be selection for recombination if fitness is correlated and if the mutation rate is sufficiently small.

In [Ochoa and Harvey, 1998] we reproduce, using GAs and hence finite populations, some of the results obtained by Boerlijst et al. GA simulation results were strikingly similar qualitatively to those obtained analytically. Thus, the main results described above for infinite populations also hold for an evolving (finite) population of bit-strings using a standard GA.

3 METHODS

Recently, De Jong, Spears, and Potter proposed a new empirical methodology for studying the behaviour of EAs [DeJong et al., 1997, Spears, 1998]. This approach employs the so-called problem generators. A problem generator is an abstract model capable of producing randomly generated problems on demand. The advantages of using problem generators are two-fold. First, they allow us to report results over a randomly generated set of problems rather than a few hand-chosen examples, increasing in this way the predictive power of the results for the problem class as a whole. Secondly, problem generators are quite easy to parameterize, allowing the design of controlled experiments where particular features of a class of problems can be varied systematically to study the effects on the EA behavior. For our study, we adopted this methodology and selected two problem generators: (i) the NK-Landscape generator (section 3.1), and (ii) the Multimodal generator (section 3.2).

3.1 THE NK-LANDSCAPE GENERATOR

Kauffman [Kauffman, 1989] describes a family of fitness landscapes determined by two parameters: N and K. The points of the NK-Landscape are bit strings of length N. The parameter K represents the degree of epistatic interaction between the bits, that is, the number of linkages each locus has to other loci in the same string. To compute the fitness of the entire string s, the fitness contribution from each locus is averaged as follows:

$$f(s) = \frac{1}{N} \sum_{i=1}^{N} f(\mathrm{locus}_i),$$

where the fitness contribution of each locus, $f(\mathrm{locus}_i)$, is determined by using the (binary) value of gene i together with the values of the K interacting loci as an index into a table $T_i$ of size $2^{K+1}$ of uniformly distributed random numbers over [0.0, 1.0]. For a given locus i, the set of K linked loci may be randomly selected or consist of the immediately adjacent loci. An interesting property of the NK-landscapes is that the ruggedness of the fitness landscape can be tuned by changing the parameter K.
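The table lookup described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used for the experiments: the function names `make_nk_landscape` and `nk_fitness` are hypothetical, the linked loci are chosen at random (one of the two options mentioned in the text), and the way the K + 1 bits are packed into a table index is an arbitrary choice.

```python
import random


def make_nk_landscape(n, k, rng):
    """Return (neighbours, tables) for a random NK landscape.

    Assumption: the k linked loci of each locus are drawn at random,
    without replacement, from the other n - 1 loci.
    """
    neighbours = [
        rng.sample([j for j in range(n) if j != i], k) for i in range(n)
    ]
    # One table per locus with 2**(k+1) uniform random contributions.
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]
    return neighbours, tables


def nk_fitness(bits, neighbours, tables):
    """Average the per-locus contributions, as in f(s) = (1/N) * sum_i f(locus_i)."""
    total = 0.0
    for i, bit in enumerate(bits):
        # Pack the locus value and its k linked loci into a table index.
        index = bit
        for j in neighbours[i]:
            index = (index << 1) | bits[j]
        total += tables[i][index]
    return total / len(bits)
```

With K = 0 every table has only two entries, so each locus can be optimized on its own. The total table storage is N * 2^(K+1) numbers, which grows quickly with K.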
From a practical perspective, however, the NK-landscape presents some difficulties (in particular the large space required to store the tables to compute the fitness) which restrict its use to relatively small models.

3.2 THE MULTIMODAL GENERATOR

The multimodal generator was proposed recently by De Jong, Potter, and Spears [DeJong et al., 1997]. The idea is to generate P random N-bit strings, which represent the location of the P peaks in the space. To evaluate any bit string s, first locate the nearest peak (in Hamming space). Then the fitness of s is the number of bits s has in common with that nearest peak, divided by N:

$$f(s) = \frac{1}{N} \max_{i=1}^{P} \left( N - \mathrm{Hamming}(s, \mathrm{Peak}_i) \right)$$

Problems with a small/large number of peaks are weakly/strongly epistatic. The multimodal generator is very efficient in terms of memory storage (only the P peaks need to be stored). However, the computation of fitness becomes very slow as the number of peaks is increased.

4 EXPERIMENTAL RESULTS

Following the guidelines of De Jong et al., the experimental methodology used was as follows [DeJong et al., 1997]: for each of the selected settings of the problem generator parameters, 20 problems were randomly generated. The GA was run once per problem, and the results were averaged over those 20 problems.

For all the experiments, a standard generational GA with fitness proportional selection was employed. Population size and chromosome length were set to 100. Two-point crossover and the standard bit mutation operation were used. For the GA with recombination, a crossover rate of 0.6 was selected. These are quite typical settings for GAs. Experiments were run for a maximum of 1000 generations.

To see how the mutation rate value affects the GA performance with and without recombination, we selected three mutation rates (0.001, 0.005, and 0.01), and ran the algorithm in two modes. In the first mode (GA) both mutation and recombination were used. In the second mode (GA-m) only mutation was used. Table 1 summarizes the GA parameter settings used for the experiments.
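The multimodal generator of section 3.2 is simple enough to sketch directly from its definition. The code below is an illustration only and is not the implementation used for the experiments; the names `make_peaks`, `hamming` and `multimodal_fitness` are hypothetical.

```python
import random


def make_peaks(num_peaks, n, rng):
    """Generate P random N-bit strings that mark the peak locations."""
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(num_peaks)]


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def multimodal_fitness(bits, peaks):
    """Fraction of bits shared with the nearest peak (in Hamming distance)."""
    n = len(bits)
    return max(n - hamming(bits, peak) for peak in peaks) / n
```

Each evaluation scans all P peaks, so the cost per fitness call grows linearly with the number of peaks, while storage stays at P strings of N bits.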
The performance metric we monitored is well known, namely "best-so-far" curves that plot the fitness of the best individual that has been seen thus far by generation n. Each curve plots the average best-so-far values of 20 runs. For the sake of clarity, the standard deviations for these curves were not plotted. However, they all proved to be quite low, not exceeding 0.02.
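The experimental loop of section 4 (a generational GA with fitness proportional selection, two-point crossover, bit mutation, and a best-so-far curve) might be sketched as follows. This is a hedged reconstruction from the textual description, not the authors' code: the names `run_ga`, `two_point_crossover` and `mutate` are hypothetical, the fitness function is assumed to be non-negative, and the population size is assumed to be even.

```python
import random


def two_point_crossover(a, b, rng):
    """Swap the segment between two random cut points."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def run_ga(fitness, length=100, pop_size=100, generations=1000,
           crossover_rate=0.6, mutation_rate=0.001, seed=0):
    """Return the best-so-far curve of one run.

    crossover_rate=0.0 gives the mutation-only mode (GA-m).
    Assumes pop_size is even and fitness values are non-negative.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best_so_far = float("-inf")
    curve = []
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        best_so_far = max(best_so_far, max(scores))
        curve.append(best_so_far)
        # Fitness proportional selection; fall back to uniform if all scores are 0.
        weights = scores if sum(scores) > 0 else None
        parents = rng.choices(pop, weights=weights, k=pop_size)
        next_pop = []
        for a, b in zip(parents[0::2], parents[1::2]):
            if rng.random() < crossover_rate:
                a, b = two_point_crossover(a, b, rng)
            next_pop.append(mutate(a, mutation_rate, rng))
            next_pop.append(mutate(b, mutation_rate, rng))
        pop = next_pop
    return curve
```

Averaging the curves returned for 20 independently generated problems, once with `crossover_rate=0.6` and once with `crossover_rate=0.0`, would give plots comparable in form to the ones reported here.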

Table 1: GA parameters

Chromosome length    100
Population size      100
Crossover rate       0.6 (GA), 0.0 (GA-m)
Mutation rate        0.001, 0.005, 0.01
Generations          1000
No. of problems      20

4.1 NK EXPERIMENTS

Note 1: For the NK experiments, we used the freeware implementation due to M. Potter.

Given that we selected relatively long chromosomes, the storage requirements for the NK tables make it difficult to explore large values of K. Thus, we tested NK landscapes for K = 0 and K = 2. For more complex landscapes we relied on the multimodal problem generator results (section 4.2).

The NK model with K = 0 produces a very trivial "Mount Fuji" landscape. We used it, however, as a baseline comparison before moving on to more interesting landscapes. Figures 1 and 2 illustrate results for GA and GA-m on the NK landscape with K equal to zero.

Figure 1: Average best-so-far curves. GA with distinct mutation rates on the NK landscape (K = 0). [Plot omitted; x-axis: Generations.]

Figure 2: Average best-so-far curves. GA-m with distinct mutation rates on the NK landscape (K = 0). [Plot omitted; x-axis: Generations.]

Figures 3 and 4 show the average best-so-far curves for a level of epistasis K of two, with and without recombination.

Figure 3: Average best-so-far curves. GA with distinct mutation rates on the NK landscape (K = 2). [Plot omitted; x-axis: Generations.]

Figure 4: Average best-so-far curves. GA-m with distinct mutation rates on the NK landscape (K = 2). [Plot omitted; x-axis: Generations.]

When recombination is used, it can be clearly noticed that the lowest mutation rate explored (0.001) produces the best results over the entire algorithm run (Figures 1 and 3). This is more evident for the more epistatic landscape when K equals two (Figure 3). On the other hand, for the GA without recombination (GA-m), the higher mutation rates (0.005 and 0.01) noticeably speed up the search process at the beginning and intermediate stages of the search (Figures 2 and 4); however, by the final stages of the run the lowest mutation rate curve (0.001) approaches the other two, and finally reaches the highest fitness values.

4.2 MULTIMODAL GENERATOR EXPERIMENTS

Note 2: For the multimodal generator experiments, we used the freeware implementation due to W. Spears.

Experiments were run for 1, 100, and 500 peaks problems. Figures 5 and 6 show the average best-so-far curves for a GA with and without recombination on 1 peak problems. Figures 7 and 8 show the average best-so-far curves for GA and GA-m on 100 peaks problems, whereas Figures 9 and 10 do so on 500 peaks problems.

Figure 5: Average best-so-far curves. GA with distinct mutation rates on the multimodal landscape (1 peak). [Plot omitted; x-axis: Generations.]

Figure 6: Average best-so-far curves. GA-m with distinct mutation rates on the multimodal landscape (1 peak). [Plot omitted; x-axis: Generations.]

Figure 7: Average best-so-far curves. GA with distinct mutation rates on the multimodal landscape (100 peaks). [Plot omitted; x-axis: Generations.]

Figure 8: Average best-so-far curves. GA-m with distinct mutation rates on the multimodal landscape (100 peaks). [Plot omitted; x-axis: Generations.]

Figure 9: Average best-so-far curves. GA with distinct mutation rates on the multimodal landscape (500 peaks). [Plot omitted; x-axis: Generations.]

Figure 10: Average best-so-far curves. GA-m with distinct mutation rates on the multimodal landscape (500 peaks). [Plot omitted; x-axis: Generations.]

Again, when recombination is used, the lowest mutation rate explored (0.001) produces the best performance over the entire algorithm run for 1, 100 and 500 peaks problems (Figures 5, 7, and 9). Moreover, it can be clearly seen that the effect becomes more pronounced as the number of peaks increases. In other words, the difference between the best-so-far curves is more noticeable.

Without recombination, again, the higher mutation rates explored (0.005 and 0.01) increased performance at early stages (Figures 6, 8 and 10). Note, however, that eventually the performance curves for the lowest mutation rate (0.001) pick up in late generations. This occurs earlier for the more complex landscapes (those with 100 and 500 peaks, Figures 8 and 10). What appears to be happening is that at late stages of the search, only a few bits need to be changed, and a high mutation rate might have a disruptive effect.

5 DISCUSSION

In this paper we used the so-called problem generators to empirically explore optimal mutation rates for GAs with and without recombination. The main conclusion holds for all the scenarios studied: the optimal mutation scheme for a genetic algorithm differs according to whether recombination is used or not. For a GA with mutation and selection only, the search process benefits from starting with a relatively high mutation rate, decreasing it towards the final stages of the search. These results are in agreement with previous observations reported in the literature that a time-dependent variation of the mutation rate may improve GA performance [Fogarty, 1989, Mühlenbein, 1992, Bäck, 1991, Bäck, 1993]. On the other hand, when recombination is used, a constant, relatively low mutation rate seems to be the optimal strategy. In this case, selecting an excessively high mutation rate (over the error threshold) considerably degrades the algorithm's performance. This effect was shown to be more pronounced for more "complex" landscapes (i.e. with higher levels of epistasis or multimodality or both).

The proposed explanation for the observed effect of recombination on optimal mutation rates is as follows: the notion of optimal mutation rates is related to the notion of error thresholds. Thus, the effects of recombination on error thresholds, described in some detail in section 2.1, occur as well on optimal mutation rates. Here, we highlight our interpretation of the observed results, which proposes a novel argument about the important role of recombination in EAs.

Recombination performs a dual-role in genetic search according to the level of genetic convergence of the population. At the beginning of the algorithm's run, when the population is scattered over the search space, recombination acts as a diverging operation, thus increasing the algorithm's search power and speeding up the process. In this role, it can be said that recombination acts as a sort of "macro-mutation" operator. Towards the final stages of the search, however, when the population is more genetically homogeneous, recombination can focus the population around the fitness optimum. In this second role, recombination acts as an error repair mechanism, helping to get rid of deleterious mutations. We conclude, then, that there is no need to implement a time-dependent mutation regime when recombination is used: recombination implicitly does this job for us. This confers a great advantage and encourages, in our opinion, the use of recombination in EAs.

About the generality of these results, we must add that some other more traditional test functions were also investigated: the one-max function, the royal road function, and some functions from classical optimization test suites. Due to space limitations, we can only briefly state that similar results were obtained. We know, however, that despite all these efforts, it cannot be categorically assured that these results apply to all problem domains.
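To make the contrast between a constant and a time-dependent mutation regime concrete, the two could be expressed as schedule functions like the ones below. The paper does not specify the form of the decreasing schedule, so the linear decay here is purely an assumption for illustration, as are the function names `constant_rate` and `decreasing_rate`; the default endpoints simply reuse the highest and lowest rates explored in the experiments.

```python
def constant_rate(generation, total_generations, rate=0.001):
    """Constant, low mutation rate (the regime suggested when recombination is used)."""
    return rate


def decreasing_rate(generation, total_generations, start=0.01, end=0.001):
    """Linearly decaying mutation rate (an assumed form of a time-dependent regime).

    Returns `start` at generation 0 and `end` at the last generation.
    """
    if total_generations <= 1:
        return end
    fraction = generation / (total_generations - 1)
    return start + (end - start) * fraction
```

A GA loop would call one of these once per generation to obtain the per-bit mutation probability for that generation.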
It would be interesting to test these ideas on some real-world applications. Another scenario where these ideas should be explored is on landscapes with neutrality (the extent to which distinct genotypes have the same or very similar fitness values) [Barnett, 1997]. The concept of error thresholds can be extended to such landscapes, and future work will investigate whether there is a similar correlation with optimal mutation rates in this scenario.

Preliminary experiments suggest that these results also hold for other crossover operators, such as one-point and uniform crossover. Higher crossover rates were also tested, and results suggest that the main conclusions not only hold but are more pronounced. What remains to be studied is the effect of changing both population size and chromosome length. We strongly believe that optimal mutation rates depend on the values of the above two parameters. However, the main conclusions presented here most probably hold qualitatively.

In the light of these results, we propose two general heuristics for setting GA parameters:

(i) When recombination is used, the mutation rate must be sufficiently small and constant over the entire run.
(ii) When recombination is not used, a regime that starts with a high mutation rate, decreasing it towards the end of the run, may accelerate the search process.

We argue that these heuristics have to be specially considered when empirically comparing the relative importance of mutation and recombination in genetic search. To be fair, comparisons should be made selecting the optimal mutation scheme for each strategy.

Some final words about methodology are worth mentioning. We strongly support the use of "test-problem generators" as an empirical methodology, due to the advantages mentioned above. In particular, we strongly agree that from both an engineering and a scientific standpoint, it is crucial to consider the dynamic aspects of EAs by including results throughout their entire run.

Acknowledgements

Thanks are due to A. Meier and M. Sordo for help and support during this effort. Thanks to L. Mauro for valuable suggestions and critical reading. Thanks also to M. Potter and W. Spears for making their source code available through the Internet.

References

[Bäck, 1991] Bäck, T. (1991). Self-adaptation in genetic algorithms. In Varela, F. J. and Bourgine, P., editors, Proceedings of the First European Conference on Artificial Life. Toward a Practice of Autonomous Systems, pages 263-271, Paris, France. MIT Press, Cambridge, MA.

[Bäck, 1993] Bäck, T. (1993). Optimal mutation rates in genetic search. In Forrest, S., editor, Proceedings of the 5th International Conference on Genetic Algorithms, pages 2-8, San Mateo, CA, USA. Morgan Kaufmann.

[Barnett, 1997] Barnett, L. (1997). Tangled webs: Evolutionary dynamics on fitness landscapes with neutrality. Master's thesis, School of Cognitive and Computing Sciences.

[Boerlijst et al., 1996] Boerlijst, M. C., Bonhoeffer, S., and Nowak, M. A. (1996). Viral quasi-species and recombination. Proc. R. Soc. London B, 263:1577-

[DeJong, 1975] DeJong, K. A. (1975). An Analysis of the Behavior of a Class of Genetic Adaptive Systems. PhD thesis, University of Michigan, Ann Arbor, MI. Dissertation Abstracts International 36(10), 5140B, University Microfilms.

[DeJong et al., 1997] DeJong, K. A., Potter, M. A., and Spears, W. M. (1997). Using problem generators to explore the effects of epistasis. In Bäck, T., editor, Proceedings of the 7th International Conference on Genetic Algorithms, pages 338-345, San Francisco. Morgan Kaufmann.

[Eigen and Schuster, 1979] Eigen, M. and Schuster, P. (1979). The Hypercycle: A Principle of Natural Self-Organization. Springer-Verlag.

[Fogarty, 1989] Fogarty, T. C. (1989). Varying the probability of mutation in the genetic algorithm. In Schaffer, J. D., editor, Proceedings of the 3rd International Conference on Genetic Algorithms, pages 104-109, George Mason University. Morgan Kaufmann.

[Grefenstette, 1986] Grefenstette, J. J. (1986). Optimisation of control parameters for genetic algorithms. IEEE Trans. SMC, 16(1):122-128.

[Hesser and Männer, 1991] Hesser, J. and Männer, R. (1991). Towards an optimal mutation probability for genetic algorithms. In Schwefel, H.-P. and Männer, R., editors, Parallel Problem Solving from Nature. Springer-Verlag, Lecture Notes in Computer Science.

[Kauffman, 1989] Kauffman, S. (1989). Adaptation on rugged fitness landscapes. In Stein, D., editor, Lectures in the Sciences of Complexity, pages 527-618. Addison-Wesley, Reading, MA.

[Kauffman, 1993] Kauffman, S. A. (1993). The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press.

[Mühlenbein, 1992] Mühlenbein, H. (1992). How genetic algorithms really work: I. Mutation and hillclimbing. In Männer, B. and Manderick, R., editors, Parallel Problem Solving from Nature, 2: Proceedings of the Second Conference on Parallel Problem Solving from Nature, Brussels, pages 15-25. North-Holland.

[Nowak and Schuster, 1989] Nowak, M. and Schuster, P. (1989). Error thresholds of replication in finite populations: Mutation frequencies and the onset of Muller's ratchet. J. Theor. Biol., 137:375-395.

[Ochoa and Harvey, 1998] Ochoa, G. and Harvey, I. (1998). Recombination and error thresholds in finite populations. In Banzhaf, W. and Reeves, C., editors, Foundations of Genetic Algorithms (FOGA-5), San Francisco, CA. Morgan Kaufmann.

[Schaffer et al., 1989] Schaffer, J., Caruana, R., Eshelman, L., and Das, R. (1989). A study of control parameters affecting online performance of genetic algorithms for function optimization. In Schaffer, J. D., editor, Proceedings of the 3rd ICGA, San Mateo, CA. Morgan Kaufmann.

[Spears, 1998] Spears, W. M. (1998). The Role of Mutation and Recombination in Evolutionary Algorithms. PhD thesis, George Mason University, Fairfax, Virginia.