Analysis of Emergent Properties in a Hybrid Bio-inspired Architecture for Cognitive Agents

Oscar J. Romero and Angélica de Antonio

Politécnica de Madrid University, LSIIS Department, Boadilla del Monte, 28660, Madrid, Spain
ojrlopez@hotmail.com, angelica@fi.upm.es

E. Corchado et al. (Eds.): Innovations in Hybrid Intelligent Systems, ASC 44, pp. 1-8, springerlink.com, Springer-Verlag Berlin Heidelberg 2007

Abstract. In this work, a hybrid, self-configurable, multilayered and evolutionary architecture for cognitive agents is developed. Each layer of the subsumption architecture is modeled by a different Machine Learning System (MLS) based on bio-inspired techniques. An evolutionary mechanism based on Gene Expression Programming is proposed to self-configure the behaviour arbitration between layers. In addition, a co-evolutionary mechanism is used to evolve behaviours in an independent and parallel fashion. The proposed approach was tested in an animat environment using a multi-agent platform, and it exhibited several learning capabilities and emergent properties for self-configuring the agent's internal architecture.

Keywords: Subsumption Architecture, Hybrid Behaviour Co-evolution, Gene Expression Programming, Artificial Immune Systems, Extended Classifier Systems, Neuro Connectionist Q-Learning Systems.

1 Introduction

In recent decades, Cognitive Architectures have been an area of study that brings together disciplines such as artificial intelligence, human cognition, psychology and more, to determine the necessary, sufficient and optimal distribution of resources for the development of agents exhibiting emergent intelligence. One of the most referenced is the Subsumption Architecture proposed by Brooks [1].

According to Brooks [1], the Subsumption Architecture is built in layers. Each layer gives the system a set of pre-wired behaviours, where the higher levels build upon the lower levels to create more complex behaviours: the behaviour of the system as a whole is the result of many interacting simple behaviours. Another characteristic is its lack of a world model, which means that its responses are always and only reflexive, as proposed by Brooks.

However, the Subsumption Architecture results in a tight coupling of perception and action, producing high reactivity, poor adaptability, no learning of new environments, no internal representation and the need for all behaviour patterns to be pre-wired. The present research focuses on developing a Hybrid Multilayered Architecture for Cognitive Agents based on Subsumption theory. Additionally, this work proposes an Evolutionary Model which allows the agent to self-configure and evolve its processing layers through the definition of inhibitory and suppressor processes between behaviour layers, instead of having a pre-configured structure. On the other hand, instead of using an Augmented Finite State Machine system as Subsumption theory states [1], where no internal representation is built, in this paper we propose that each behaviour layer is driven by a different bio-inspired machine learning system (chosen from a repertoire where behaviour co-evolution occurs) which learns from the environment and generates an internal world model by means of unsupervised and reinforcement learning.

The remainder of the paper is organized as follows. The evolutionary approach for self-configurable cognitive agents is detailed in Section 2. Section 3 discusses the experimental results and the emergent properties obtained. Finally, concluding remarks are given in Section 4.

2 Proposed Hybrid, Self-configurable and Evolutionary Approach for Cognitive Agents

In order to design a hybrid, self-configurable, scalable, adaptable and evolutionary architecture for cognitive systems which exhibits emergent behaviours and learning capabilities, the proposed work is explained as follows. Consider a virtual environment where several agents interact with objects, food, each other, etc. Some major questions and constraints arise:

- Environmental conditions change, i.e. for objects: quantity, type, location, size, etc.; for other agents: intentions, desires, goals, etc.
- There is a variable number of desired behaviours: avoiding-obstacles, wandering, feeding, hunting, escaping-from-depredators, etc. How many behaviours can be integrated into a single agent? And how can agents arbitrate these behaviours?
- When does a single agent know whether it has to inhibit or suppress a behaviour without using a pre-established applicability predicate or rule?
- How can a behaviour that drives one of the layers in a single multilayered agent generate a model of the world, couple with the environment via the agent's sensors and actuators, learn from its own interaction with the environment and receive reinforcement for its actions, so that the internal state of the behaviour evolves?
These questions motivate the following proposed approach of a hybrid, self-configurable and bio-inspired architecture for cognitive agents. Fig. 1 shows a hybrid architecture designed to address all the questions mentioned above. An internal architecture based on subsumption principles, with a few variations, can be observed in every agent:

- Each processing layer is connected randomly with a different bio-inspired machine learning system (Extended Classifier System XCS [2], Artificial Immune System AIS [3], Neuro Connectionist Q-Learning System NQL [4], Learning Classifier System LCS [5], and scalable to others), which replaces the typical Augmented Finite State Machines of Brooks' architecture [1].
- After being trained in the agent, every behaviour is sent to a behaviour repertoire according to its category, where a co-evolutionary mechanism based on [6] is applied so that every behaviour not only learns locally inside each agent but also evolves globally, and afterwards is selected by another agent in the next generation.
- There is an evolutionary process driven by a Gene Expression Programming algorithm GEP [7], which is in charge of self-configuring the agent's behaviour arbitration: defining the number of layers, behaviours, connections and hierarchies between them (inhibition, suppression, aggregation, etc.). The GEP algorithm generates a set of applicability predicates per agent which determine which behaviour will be activated in a given situation and for how long.

[Figure 1: a multi-agent system in which each subsumption-architecture agent has processing layers (Behaviour 1 to Behaviour n) driven by machine learning systems (XCS, AIS, NQL, LCS, others); an evolutionary process (GEP) supplies the applicability predicates, behaviour arbitration and layer hierarchy; a behaviour co-evolution mechanism operates across agents.]
Fig. 1. Hybrid and Evolutionary Architecture for Cognitive Agents

2.1 Hybrid Architecture: Behaviours Driven by Machine Learning Systems

Every behaviour layer in the multilayered architecture is associated with a Machine Learning System (MLS). This makes the architecture hybrid and not only reactive, since each behaviour is able to carry out deliberative processes using the acquired knowledge. Besides, this mechanism gives plasticity to the architecture because every behaviour learns in an unsupervised, independent and parallel way through its interaction with the environment, generating internal representations, rules and both specific and generalized knowledge. This mechanism is favoured by the characteristics of the MLSs: robustness, fault tolerance, use of bio-inspired techniques, adaptability, and no need for a previous definition of knowledge (unsupervised learning).

Two principles formulated by Stone [8] have motivated the proposed layered learning approach. Layered learning is designed for domains that are too complex for learning a mapping directly from an agent's sensory inputs to its actuator outputs. Instead, the layered learning approach consists of breaking a problem down into several behavioural layers and using MLSs at each level. Layered learning uses a bottom-up incremental approach to hierarchical task decomposition. MLS is used as a central part of layered learning to exploit data in order to train and/or adapt the overall system. MLS is useful for training behaviours that are difficult to fine-tune manually.
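As an illustration of a behaviour layer driven by an interchangeable learning system, the following minimal sketch may help. It is not the authors' implementation: the method names `select_action` and `reinforce`, the class `BehaviourLayer` and the toy `TabularLearner` (a simple value table that stands in for XCS, AIS, NQL or LCS) are all assumptions made for this example.

```python
import random
from typing import Hashable, Protocol, Sequence


class MachineLearningSystem(Protocol):
    """Hypothetical common MLS interface; the method names are assumptions."""

    def select_action(self, state: Hashable) -> str: ...

    def reinforce(self, state: Hashable, action: str, reward: float) -> None: ...


class TabularLearner:
    """Toy stand-in for an MLS; it is not XCS, AIS, NQL or LCS."""

    def __init__(self, actions: Sequence[str], epsilon: float = 0.1,
                 learning_rate: float = 0.5, seed: int = 0) -> None:
        self.actions = list(actions)
        self.epsilon = epsilon              # exploration probability
        self.learning_rate = learning_rate
        self.values = {}                    # (state, action) -> estimated reward
        self.rng = random.Random(seed)

    def select_action(self, state: Hashable) -> str:
        # Explore with probability epsilon, otherwise pick the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values.get((state, a), 0.0))

    def reinforce(self, state: Hashable, action: str, reward: float) -> None:
        # Move the stored estimate towards the observed reward.
        old = self.values.get((state, action), 0.0)
        self.values[(state, action)] = old + self.learning_rate * (reward - old)


class BehaviourLayer:
    """One layer of the agent; any object with the interface above fits."""

    def __init__(self, name: str, mls: MachineLearningSystem) -> None:
        self.name = name
        self.mls = mls

    def step(self, state: Hashable, reward_fn) -> str:
        # Act, observe the reinforcement, and let the MLS update itself.
        action = self.mls.select_action(state)
        self.mls.reinforce(state, action, reward_fn(state, action))
        return action
```

Because the layer only depends on the two interface methods, a different learner can be plugged into the same layer without changing the agent's structure.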

To make the hybridization process consistent, a common interface for all MLSs (XCS [2], AIS [3], NQL [4], LCS [5], etc.) is proposed. Although each MLS has a different internal process, they all share a similar structure, which lets the system scale by introducing new MLSs if required and connecting them easily to each behaviour layer in the agent's multilayered architecture.

2.2 Hybrid Behaviour Co-evolution: Evolving Globally

A co-evolutionary mechanism [6] is proposed to evolve each type of behaviour separately in its own genetic pool. Most evolutionary approaches use a single population where evolution is performed; here, instead, the behaviours are divided into categories and evolve in separate behaviour pools without any interaction between pools. First, each agent defines a specific set of behaviours that builds its own multilayered structure. For each behaviour the agent requires, a behaviour instance is chosen from the corresponding pool (this instance is connected to one MLS). Subsequently, each agent interacts with the environment, and each of its behaviours learns a set of rules and generates its own knowledge base. After a certain period of time, the co-evolutionary mechanism is activated. In each behaviour pool, a probabilistic selection method is applied in which the behaviours with the best performance (fitness) have a higher probability of reproducing. Then, a crossover genetic operator is applied to each pair of selected behaviours: a portion of the knowledge acquired by each behaviour (through its MLS) is exchanged with the other one. Finally, new random rules are generated until the maximum number of rules a behaviour can hold in its knowledge base is reached, so a new pair of behaviours is created and left in the corresponding behaviour pool to be selected by an agent in the next generation.
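The co-evolution step described above can be sketched roughly as follows. This is an illustrative reading, not the authors' code: the `Behaviour` record, the string rule format, the `exchange` fraction and the choice of one offspring pair per pool and step are assumptions, and roulette-wheel selection is used as one possible probabilistic selection method.

```python
import random
from dataclasses import dataclass


@dataclass
class Behaviour:
    category: str        # e.g. "wandering"; there is one pool per category
    rules: list          # knowledge base learned through the behaviour's MLS
    fitness: float = 0.0


def random_rule(rng):
    # Placeholder: a real rule format depends on the MLS (assumption).
    return f"random-rule-{rng.randrange(1_000_000)}"


def select_pair(pool, rng):
    """Fitness-proportional selection of two distinct parents (pool size >= 2)."""
    weights = [max(b.fitness, 0.0) + 1e-9 for b in pool]
    first = rng.choices(range(len(pool)), weights=weights, k=1)[0]
    rest = [i for i in range(len(pool)) if i != first]
    second = rng.choices(rest, weights=[weights[i] for i in rest], k=1)[0]
    return pool[first], pool[second]


def crossover(parent_a, parent_b, max_rules, rng, exchange=0.5):
    """Swap a portion of each parent's rules, then pad with random rules."""
    cut_a = int(len(parent_a.rules) * (1 - exchange))
    cut_b = int(len(parent_b.rules) * (1 - exchange))
    children = []
    for kept, received in (
        (parent_a.rules[:cut_a], parent_b.rules[cut_b:]),
        (parent_b.rules[:cut_b], parent_a.rules[cut_a:]),
    ):
        rules = (kept + received)[:max_rules]
        while len(rules) < max_rules:
            rules.append(random_rule(rng))
        children.append(Behaviour(parent_a.category, rules))
    return children


def coevolve(pools, max_rules, rng):
    """Evolve every behaviour pool separately; pools never exchange material."""
    for pool in pools.values():
        parent_a, parent_b = select_pair(pool, rng)
        pool.extend(crossover(parent_a, parent_b, max_rules, rng))
```

Keeping one list per category in `pools` mirrors the idea that behaviours of different types never recombine with each other.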
2.3 Self-configurable Architecture: Behaviour Arbitration

If each agent has an arbitrary set of behaviours, how can we determine the interaction between them, the hierarchy levels, the Subsumption process (inhibition and suppression) and the layers necessary for adequate processing? These questions are addressed next.

The internal multilayered structure of each agent is decomposed into atomic components which can be estimated and used to find the optimal organization of behaviours during the agent's lifetime [8]. The main goal is for the agent to self-configure its own behaviour structure automatically. The GEP algorithm proposed by Ferreira [7] is used to evolve the internal structures of each agent and generate a valid arbitration of behaviours. GEP uses two sets: a function set and a terminal set [7]. In our work, the proposed function set is {AND, OR, NOT, IFMATCH, INHIBIT, SUPPRESS}. The AND, OR and NOT functions are logic operators used to group and exclude subsets of objects, behaviours, etc. The conditional function IFMATCH is a typical applicability predicate that matches a specific problem situation. This function has four arguments; the first three belong to the rule's antecedent: the first indicates which object is sensed, the second is the activated sensor and the third is the behaviour currently running on the agent. If the first three arguments are applicable, then the fourth argument, the rule's consequent, is executed. The fourth argument should be an INHIBIT or SUPPRESS function, or an AND/OR function if more elements are necessary. The INHIBIT/SUPPRESS functions have two arguments (behaviourA, behaviourB) and indicate that behaviourA inhibits/suppresses behaviourB.

The terminal set, on the other hand, is composed of the behaviour set, the environmental element set (objects, agents, food, etc.) and the agent's sensor set. Additionally, wildcard elements are included so that any sensor, behaviour or object can be referenced.

Each agent has a chromosome with information about its own structure; for example, agent A can have the following chromosome (applicability predicate):

IFMATCH: There is a wall AND looking-for-food behaviour is activated AND reading by sensor1
THEN: avoiding-obstacle INHIBIT wandering AND looking-for-food behaviours

Analyzing this rule, we can infer that the agent has three behaviour layers: avoiding-obstacle, wandering and looking-for-food, and that the last two are inhibited by the first one when sensor1 identifies a wall in front of the agent. However, these chromosomes do not always have a valid syntax, so the GEP mechanism is used to evolve the chromosome until it becomes a syntactically valid rule. Each gene is converted to a tree representation, and then a set of genetic operators is applied between genes of the same agent and genes of other agents [7]: selection, mutation, root transposition, gene transposition, two-point recombination and gene recombination. After a certain number of evolutionary generations, valid and better adapted chromosomes are generated.

2.4 Emergent Properties of the Architecture

Brooks postulates in his paper [1] the possibility that intelligence can emerge out of a set of simple, loosely coupled behaviours; that emergent properties arise (if at all) due to the complex dynamics of interactions among the simple behaviours; and that this emergence is to a large extent accidental. Accordingly, the proposed architecture articulates a set of behaviours that learn about environmental conditions in an independent and parallel manner and, in addition, evolve inside a categorized pool.
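Returning to the applicability predicate example of Section 2.3, a rough sketch of how such a rule could be represented and evaluated is given here. The data classes, the `*` wildcard symbol and the `arbitrate` function are assumptions made for illustration; only the INHIBIT consequent is modeled, since the text does not spell out how SUPPRESS differs operationally.

```python
from dataclasses import dataclass

WILDCARD = "*"  # assumed symbol for the wildcard terminal


@dataclass(frozen=True)
class Inhibit:
    source: str      # behaviour that inhibits
    targets: tuple   # behaviours that are inhibited


@dataclass(frozen=True)
class IfMatch:
    sensed_object: str      # first antecedent argument
    sensor: str             # second antecedent argument
    active_behaviour: str   # third antecedent argument
    consequent: Inhibit     # fourth argument, executed when the rule applies

    def applies(self, sensed_object, sensor, active_behaviour):
        # Every antecedent term must equal the observation or be a wildcard.
        pairs = (
            (self.sensed_object, sensed_object),
            (self.sensor, sensor),
            (self.active_behaviour, active_behaviour),
        )
        return all(pattern in (WILDCARD, value) for pattern, value in pairs)


def arbitrate(rule, layers, sensed_object, sensor, active_behaviour):
    """Return the behaviours that remain free to act in this situation."""
    if not rule.applies(sensed_object, sensor, active_behaviour):
        return list(layers)
    blocked = set(rule.consequent.targets)
    return [name for name in layers if name not in blocked]


# The example chromosome of agent A from Section 2.3.
rule = IfMatch(
    sensed_object="wall",
    sensor="sensor1",
    active_behaviour="looking-for-food",
    consequent=Inhibit("avoiding-obstacle", ("wandering", "looking-for-food")),
)
layers = ["avoiding-obstacle", "wandering", "looking-for-food"]
print(arbitrate(rule, layers, "wall", "sensor1", "looking-for-food"))
# -> ['avoiding-obstacle']
```

When the antecedent does not match, for instance when sensor1 reads food instead of a wall, all three layers stay active in this sketch.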
Each simple behaviour can be applied to a subset of specific situations but not to the whole problem space. However, the interaction between behaviours at the individual level (inside each agent) allows a wider range of problem states to be covered, and some characteristics are generated: robustness, knowledge redundancy, fault tolerance and a high level of plasticity, so emergent properties appear in the individual and inside the society (multi-agent system). The emergent properties thus arise from three points of view in a bottom-up approach:

- Atomic: in each behaviour of the multilayered architecture, when the associated MLS learns from the environment automatically.
- Individual: the agent self-configures its internal structure (chromosome), hierarchy and arbitration of behaviours through an evolutionary process driven by GEP.
- Social: a hybrid behaviour co-evolution mechanism is applied to all agents' behaviours, so behaviours learn not only by themselves via the associated MLS but also by cooperating and sharing the acquired knowledge between them.

It is important to notice that emergence at different levels, from the atomic to the social point of view, provokes an overall emergence of the system, from which we hope some kind of intelligence will arise. The experimentation focused on discovering some characteristics of identity in the animats, i.e. we expected to see some animat agents behaving like predators and others behaving like prey after the system has evolved for several generations. Predators should include behaviours like avoiding-obstacles, looking-for-water, persecuting-preys, rounding-up, hunting-preys, etc., and prey should include behaviours like avoiding-obstacles, looking-for-food, looking-for-water, hiding, escaping, etc. Nevertheless, the expected emergent properties can vary according to the environment and the behaviour set.

3 Experimentation

An artificial life environment called Animat (animal + robot) [5] is proposed to run the experiments. The environment simulates virtual agents competing for food and water, avoiding obstacles, hunting, escaping from predators, etc. This animat environment was selected because it makes emergent behaviours easier to observe, but it is not the only environment where the proposed architecture could be applied.

[Figure 2: the simulated environment (animats, food, water deposit, wall and tree obstacles) and an animat's sensor layout with its safe zone, danger zone and a sensed object.]
Fig. 2. Simulated Animat Environment and animat sensor distribution

Each animat controlled by an agent has a set of 14 proximity sensors (see Fig. 2) simulating a limited sense of sight. Twelve sensors read a safe zone and two sensors read a danger zone (to avoid collisions), as proposed by D. Romero [9]. Additionally, an environment with objects, food, water deposits, animats, obstacles, traps, etc. is simulated. Several performance experiments on all algorithms (XCS, AIS, NQL, LCS and GEP), including analysis of learning convergence, fitness progression and syntactically well-formed genes in GEP, were carried out in our research; however, we present here only the relevant results on the analyzed emergent properties.
Analysis of evolved architectures: after the whole system had evolved for a specific number of generations, we analyzed the final structures of the best adapted agents, where emergent properties arose. Fig. 3 shows the genotype (Expression Tree, ET, in GEP) and phenotype, respectively, of the initial architecture of a random agent before any evolutionary phase; in contrast, Fig. 4 shows the genotype and phenotype of the evolved architecture of the same agent. In Fig. 3.b the chromosome represents four behaviours: looking-for-water (l-f-w), looking-for-food (l-f-f), avoiding-obstacles (a-o) and hiding, where l-f-w inhibits l-f-f and hiding, and l-f-w suppresses a-o. There is, however, a contradictory process: l-f-f tries to suppress l-f-w although l-f-f has already been inhibited by l-f-w (see Fig. 3.b). This is solved in the evolved architecture in Fig. 4.b, which proposes a new structure that adds the escaping-from-depredators behaviour and excludes the hiding behaviour.

[Figure 3: Generation 0, Agent 116. (a) Expression tree with the nodes IFMATCH, INHIBIT, SUPPRESS, AND, Tree, sensor7 and the behaviours looking-for-water, looking-for-food, avoiding-obstacles and hiding. (b) Layered phenotype with looking-for-water, looking-for-food, avoiding-obstacles and hiding, and a marked subsumption conflict.]
Fig. 3. Genotype and Phenotype of an initial Agent's Architecture

[Figure 4: Generation 326, Agent 116. (a) Expression tree with the nodes OR, IFMATCH, INHIBIT, AND, sensor1, Wall, Tree and the behaviours avoiding-obstacles, escaping-from-depredators, looking-for-food and looking-for-water. (b) Layered phenotype with avoiding-obstacles, escaping, looking-for-food and looking-for-water.]
Fig. 4. Genotype and Phenotype of the Agent's Architecture after 326 evolutionary generations

As depicted in Fig. 4.b, the initial contradictory inhibitory/suppressor processes in the agent's architecture (see Fig. 3.b) are solved, and only hierarchical inhibitory processes are proposed by the evolved architecture. Furthermore, we can also deduce that the evolved architecture has collected a set of specific behaviours that turn the agent into an animat with the identity of a prey. It is important to notice that, in the evolved architecture, the escaping-from-depredators behaviour inhibits the looking-for-food and looking-for-water behaviours, but if the animat is escaping and its sensor7 reads a wall or a tree, then the escaping-from-depredators behaviour is inhibited by the avoiding-obstacles behaviour until the obstacle is no longer in front of the animat; after that the animat continues its getaway. In this sense we can say that emergent behaviour arises.

Finally, the experimentation demonstrates that specific parameter configurations in the MLSs, GEP and the co-evolutionary mechanism are required to reach a certain robustness, learning and adaptation capacity in the overall system.
Nevertheless, emergent properties did not always arise, nor did they arise quickly; in several experiments the animats died quickly and could not learn to survive.
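The contradiction reported for the initial architecture of Fig. 3.b can be read as a cycle in the graph of inhibit/suppress relations, which suggests a simple mechanical check. The sketch below is only one possible interpretation: treating a conflict as a directed cycle is an assumption, and the function `find_conflict` and the triple format are hypothetical. The relation lists are transcribed from the description of Fig. 3.b and Fig. 4.b in the text.

```python
def find_conflict(relations):
    """Return one cycle of behaviours as a list, or None if there is none.

    `relations` is a list of (source, kind, target) triples, where kind is
    "INHIBIT" or "SUPPRESS". Both kinds are treated alike (assumption).
    """
    graph = {}
    for source, _kind, target in relations:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])

    state = {}   # behaviour -> "visiting" or "done"
    path = []    # current depth-first path

    def visit(node):
        state[node] = "visiting"
        path.append(node)
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                # Back edge: the path from nxt to node closes a cycle.
                return path[path.index(nxt):] + [nxt]
            if nxt not in state:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        state[node] = "done"
        return None

    for node in graph:
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Relations described for the initial architecture (Fig. 3.b).
initial = [
    ("looking-for-water", "INHIBIT", "looking-for-food"),
    ("looking-for-water", "INHIBIT", "hiding"),
    ("looking-for-water", "SUPPRESS", "avoiding-obstacles"),
    ("looking-for-food", "SUPPRESS", "looking-for-water"),
]

# Relations described for the evolved architecture (Fig. 4.b).
evolved = [
    ("escaping-from-depredators", "INHIBIT", "looking-for-food"),
    ("escaping-from-depredators", "INHIBIT", "looking-for-water"),
    ("avoiding-obstacles", "INHIBIT", "escaping-from-depredators"),
]

print(find_conflict(initial))
# -> ['looking-for-water', 'looking-for-food', 'looking-for-water']
print(find_conflict(evolved))
# -> None
```

Under this reading, the evolved architecture is conflict-free because its inhibitory relations form a strict hierarchy.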

4 Conclusions

The integration of multiple Machine Learning Systems to control the behaviour layers of a hybrid Subsumption Architecture approach, instead of the typical Augmented Finite State Machines, has demonstrated important advantages in learning about the agent's world, building internal knowledge representations and adapting to environmental changes. The evolutionary and learning mechanisms used in this work provide a plasticity feature that allows the agent to self-configure its own multilayered behaviour-based architecture; thus it can avoid the need for exhaustive and extensive knowledge bases, pre-wired behaviour-based structures and pre-constrained environments. In our future work we expect to continue designing more adaptive and self-configurable architectures, using fuzzy techniques in the MLSs to improve the sensor readings. One concrete future application of this research will be the development of a Cognitive Module for Emotive Pedagogical Agents, where the agent will be able to self-learn about its own perspectives, beliefs, desires, intentions, emotions, skills and perceptions.

Acknowledgments. Supported by the Programme Alban, the European Union Programme of High Level Scholarships for Latin America, scholarship No. E05D056455CO.

References

1. R.A. Brooks, A Robust Layered Control System for a Mobile Robot, IEEE Journal of Robotics and Automation, RA-2 (1986)
2. S.W. Wilson, State of XCS Classifier System Research, Lecture Notes in Computer Science, 1813 (2000)
3. L.N. de Castro, J. Timmis, Artificial Immune Systems: A New Computational Intelligence Approach, Ed. Springer (2002)
4. V. Kuzmin, Connectionist Q-learning in Robot Control Task, Proceedings of Riga Technical University (2002)
5. J.H. Holland, Induction, Processes of Inference, Learning and Discovery, Mich: Addison-Wesley (1953)
6. Farahmand, Hybrid Behavior Co-evolution and Structure Learning in Behavior-based Systems, IEEE Congress on EC, Vancouver (2006)
7. Ferreira, Gene Expression Programming: A new adaptive algorithm for solving problems, Complex Systems, forthcoming (2001)
8. P. Stone, Layered Learning in Multiagent Systems, Thesis CS (1998)
9. D. Romero, L. Niño, An Immune-based Multilayered Cognitive Model for Autonomous Navigation, IEEE Congress on EC, Vancouver (2006)