Bayesian Inference Driven Behavior Network Architecture for Avoiding Moving Obstacles *


Hyeun-Jeong Min and Sung-Bae Cho

Dept. of Computer Science, Yonsei University
134 Shinchon-dong, Sudaemoon-ku, Seoul, Korea
{solusea,sbcho}@cs.yonsei.ac.kr

* This research was performed for the Intelligent Robotics Development Program, one of the 21st Century Frontier R&D programs funded by the Ministry of Commerce, Industry and Energy of Korea.

R. Khosla et al. (Eds.): KES 2005, LNAI 3682, Springer-Verlag Berlin Heidelberg 2005

Abstract. This paper presents a technique that lets an intelligent robot behave adaptively in unforeseen and dynamic circumstances. Since traditional methods used relatively reliable information about the environment to control intelligent robots, they were robust but could not behave adaptively in a complex and dynamic world. In contrast, the behavior-based approach is suitable for generating autonomous behaviors in the environment, but it still lacks the capability to infer dynamic situations for high-level behaviors. This paper proposes a 2-layer control architecture to generate adaptive behaviors that perceive and avoid dynamic moving obstacles as well as static obstacles. The first level generates reflexive and autonomous behaviors with the behavior network, and the second level infers the dynamic situation of the mobile robot with a Bayesian network. Experimental results in various situations show that, with the proposed architecture, the robot reaches the goal points while avoiding static or moving obstacles.

1 Introduction

It is difficult to avoid moving obstacles because a mobile robot has to perceive situations in real time using only its sensors. Predictability is central to collision avoidance in non-stationary conditions. If there is no need to predict, we can rely entirely on what is sensed, resulting in a purely reactive approach [1]. However, it is difficult to predict in the real world, even in the best of circumstances. At the MIT AI Lab, a method of learning each circumstance was proposed for avoiding moving obstacles [2]. Hashimoto utilized evolutionary computation and a fuzzy system for avoiding moving obstacles [3], and Inoue presented a behavior learning method based on Bayesian networks and the experience of interaction between humans and robots for avoiding moving obstacles [4]. Nicolescu and Mataric dealt with changing situations by learning and constructing a hierarchical structure of previous behaviors to solve the problem of avoiding moving obstacles [5].

The behavior network proposed by Maes [6] can acquire global goals as well as autonomously select behaviors, by bestowing goals on the behavior-based system. Although the behavior network was proposed for solving problems with goals in uncertain environments, it is insufficient for generating adaptive behaviors in changing and complex situations. To cope with this problem, we propose a control architecture in which
the Bayesian network controls the behavior network. This control architecture selects the behavior of the highest weight, which is computed by the inference of the Bayesian network in the behavior network. The Bayesian network has the advantage that it can be applied independently of the environment, because it is designed using only the information from sensors.

2 Bayesian Inference Driven Behavior Network

The variables required for the mathematical notation of the proposed Bayesian inference driven behavior network are as follows. These variables are also applied to the function for the Bayesian inference [7], the computation of the activation level of each node, and the selection of an active node in the behavior modules.

$\alpha$ : Activation level
$\theta$ : The initial value of the global threshold, which is reduced by 10% if no executable node has an activation greater than it
$\phi$ : The weight of environmental sensor inputs and successor links
$\gamma$ : The weight of goal inputs and predecessor links
$\delta$ : The weight of protected goal inputs and conflictor links
$t$ : Current time
$B$ : A set of behavior nodes
$B_{PS}$ : A set of preconditions of behavior node $b$ (see Table 1)
$B_{AS}$ : A set of add lists of behavior node $b$ (see Table 1)
$B_{DS}$ : A set of delete lists of behavior node $b$ (see Table 1)

The function for selecting the active node $b_i$ at time $t$ equals 1 if $b_i$ satisfies the following conditions.

\[ \alpha(b_i, t) > \theta \tag{1} \]
\[ \mathrm{executable}(b_i, t) = 1 \tag{2} \]
\[ \alpha(b_i, t) \ge \alpha(b_j, t) \quad \forall j \text{ satisfying (1) and (2)} \]
\[ \mathrm{executable}(b_i, t) = 1 \quad \text{if } b_i \text{ is executable at time } t \]

The function for the activation level $\alpha$ of behavior node $b_i$ is defined as follows.

\[ \alpha(b_i, t) = \begin{cases} 0 & \text{if } t = 0 \\ D(b_i, t) + \beta(b_i, t) & \text{otherwise} \end{cases} \]

From this function, the activation value $D(b_i, t)$ of each behavior is defined as follows.

\[ D(b_i, t) = S(b_i, t) + G(b_i, t) - P(b_i, t) + \big( SP_{EW}(b_j, b_k, t) + SP_{FW}(b_j, b_k, t) - SP_{PG}(b_j, b_k, t) \big) \tag{3} \]
\[ SP_{EW}(b_j, b_k, t) = 1 \quad \text{if } \sigma \in B_{PS} \text{ of } b_j,\ \sigma \in B_{AS} \text{ of } b_k \]
\[ SP_{FW}(b_j, b_k, t) = 1 \quad \text{if } \sigma \in B_{AS} \text{ of } b_j,\ \sigma \in B_{PS} \text{ of } b_k \]
\[ SP_{PG}(b_j, b_k, t) = 1 \quad \text{if } \sigma \in B_{DS} \text{ of } b_j,\ \sigma \in B_{PS} \text{ of } b_k \]

In the above activation function, $\beta(b_i, t)$ is the weight affecting the activation in behavior nodes after the agent infers the situation with the Bayesian network, and it is defined as follows.

\[ \beta(b_i, t) = \xi \quad \text{if } b_i \in r_{k,i},\ i \in I,\ \Psi(k, i) = \sigma_k \tag{4} \]
\[ r_{k,i} = \{ s_i \mid s \in \mathrm{effect\_node}(k),\ s_i : i\text{th state} \} \]
\[ I = \{ x \mid x \in SI,\ 1 \le x \le \#(\text{Effect Nodes}) \}, \quad SI : \text{cause at time } t \]
\[ \Psi(k, i) = P(b_{k,i} \mid c_{1j} \cdots c_{m}) : \text{conditional probability}, \quad m : \#(\text{cause nodes}) \]
\[ \sigma_k = \max\{ \Psi(k, i) \mid 1 \le i \le \#(\text{states in } r_k) \}, \quad k : \text{effect node} \]

In equation (3), $S(b_i, t)$ is the activation value from sensors at time $t$, $G(b_i, t)$ is the activation value from goals at time $t$, and $P(b_i, t)$ is the activation value to be deleted because of protected goals at time $t$. In addition, $SP_{EW}(b_j, b_k, t)$ is the backward spreading from node $b_j$ to $b_k$, and it applies to a proposition that is in both the preconditions of $b_j$ and the add list of $b_k$. $SP_{FW}(b_j, b_k, t)$ is the forward spreading from node $b_j$ to $b_k$, and it applies to a proposition that is in both the add list of $b_j$ and the preconditions of $b_k$. $SP_{PG}(b_j, b_k, t)$ is the activation taken away from node $b_j$ to $b_k$, and it applies to a proposition that is in both the delete list of $b_j$ and the preconditions of $b_k$.

A Bayesian network represents all the events with a DAG (Directed Acyclic Graph), and each node in the DAG models probabilistic independency. It consists of the relations between cause and effect nodes with their probabilities, and it can infer results from the conditional probabilities of the cause nodes. In a Bayesian network, each node corresponds to a probabilistic variable ($C_i$ and $C_j$ for the causes, and $E$ for the effect node), and each link is associated with the conditional probabilities between the linked variables.

In equation (4), $r_{k,i}$ means the result node of the Bayesian network at the $i$th state in the $k$th effect, and $I$ is the set of indexes of result nodes whose preconditions are satisfied at time $t$. $\Psi(k, i)$ is the conditional probability at the $i$th state in the $k$th result node, and $\sigma_k$ is the state of the highest probability in the $k$th result node.

\[ \text{IF the selection function of } b_i \text{ equals } 1 \text{ THEN execute } b_i \text{ ELSE } \theta = \theta \times 0.9 \text{ and repeat the procedure} \tag{5} \]
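
The selection procedure of equations (1), (2), (4) and (5), together with the spreading indicators of equation (3), can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation. The names `Behavior`, `sp_ew`, `sp_fw`, `sp_pg`, `executable`, `beta` and `select` are hypothetical, as are all numeric values. The sketch further assumes that a node is executable when all of its preconditions hold, that $\beta$ is 0 when equation (4) does not apply, that effect-node states carry the same names as behaviors, and that the threshold is lowered without re-spreading activation between rounds.

```python
from dataclasses import dataclass


@dataclass
class Behavior:
    name: str
    preconditions: frozenset = frozenset()  # B_PS
    add_list: frozenset = frozenset()       # B_AS
    delete_list: frozenset = frozenset()    # B_DS
    activation: float = 0.0                 # alpha(b, t)


def sp_ew(bj, bk):
    """Backward spreading: some proposition is in B_PS of bj and B_AS of bk."""
    return int(bool(bj.preconditions & bk.add_list))


def sp_fw(bj, bk):
    """Forward spreading: some proposition is in B_AS of bj and B_PS of bk."""
    return int(bool(bj.add_list & bk.preconditions))


def sp_pg(bj, bk):
    """Take-away: some proposition is in B_DS of bj and B_PS of bk."""
    return int(bool(bj.delete_list & bk.preconditions))


def executable(b, true_props):
    """Assumed reading of equation (2): every precondition currently holds."""
    return b.preconditions <= true_props


def beta(b, effect_nodes, xi):
    """Equation (4): xi if b is the most probable state of an effect node.

    Returning 0.0 otherwise is an assumption of this sketch.
    """
    for states in effect_nodes.values():
        if max(states, key=states.get) == b.name:
            return xi
    return 0.0


def select(behaviors, true_props, theta, max_rounds=100):
    """Equations (1), (2), (5): lower theta by 10% until a node qualifies."""
    for _ in range(max_rounds):
        ready = [b for b in behaviors
                 if b.activation > theta and executable(b, true_props)]
        if ready:
            return max(ready, key=lambda b: b.activation), theta
        theta *= 0.9
    return None, theta


# Hypothetical example: two behaviors and one effect node, left_turn.
near_obstacle = frozenset({"near_obstacle"})
go_straight = Behavior("GoStraight", add_list=frozenset({"moving"}))
turn_left = Behavior("TurnLeft", preconditions=near_obstacle,
                     add_list=frozenset({"obstacle_avoided"}))
effect_nodes = {"left_turn": {"NoTurn": 0.3, "TurnLeft": 0.7}}
d_values = {"GoStraight": 0.4, "TurnLeft": 0.3}  # hypothetical D(b, t)

for b in (go_straight, turn_left):
    b.activation = d_values[b.name] + beta(b, effect_nodes, xi=0.2)

chosen, theta = select([go_straight, turn_left], near_obstacle, theta=1.0)
print(chosen.name, round(theta, 4))
```

With these hypothetical numbers the sketch prints `TurnLeft 0.4783`: the threshold is lowered seven times before Turn Left, boosted by the Bayesian weight, exceeds it. Without that weight, Go Straight would have the higher activation and would be selected instead.
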

Table 1. Internal links and external links

Internal links
  Predecessor link: (ρ = false) ∧ (ρ ∈ preconditions of node A) ∧ (ρ ∈ add lists of B)
  Successor link: (ρ = false) ∧ (ρ ∈ add lists of node A) ∧ (A is executable) ∧ (ρ ∈ preconditions of B)
  Conflictor link: (ρ = true) ∧ (ρ ∈ preconditions of node A) ∧ (ρ ∈ delete lists of B)
External links
  Sensors: (ρ = true) ∧ (ρ ∈ preconditions of node A)
  Goals: (ρ 0) ∧ (ρ ∈ add lists of node A)

3 Experimental Results

For the experiments, we use YAKS, a 3D robot simulator; the experimental environment is shown in Figure 1 (a). The environment of this simulator has the mobile robot (number 0), which generates behaviors using the proposed method, and two robots (numbers 1 and 2), which act as the moving obstacles.
These two obstacle robots can only detect and avoid the pens, and cannot avoid other robots when they collide with them. The simulator allows the viewing angle to be changed. In Figure 1 (a), the white cylinders represent the static obstacles, and their number can be changed in the experiments. For the comparative analysis of the behavior network alone and the proposed method, the robot starts from random positions; for the other analyses of the proposed method, it starts from a fixed position. We compared the success rates of avoiding moving obstacles with the behavior network alone and with the proposed method, and we investigated the avoiding direction in each case when using the proposed method. The goals in these experiments are to avoid the moving obstacles and to reach the goal position, marked by the light area in the corner of the pens. We randomly add 5% noise to the sensor values because the situation cannot be predicted accurately in the real world.

3.1 Behavior Network and Bayesian Network

Figure 1 (b) shows the behavior network architecture used in this paper, which has primitive behaviors such as Follow Light, Go Straight, Turn Left, and Turn Right. In the network, the predecessor links of Follow Light are Go Straight and Near Light, and the successor link is Reach Goal. The predecessor links of Go Straight are Nothing, Turn Left, and Turn Right, and the successor links are Reach Goal, Avoid Obstacle, and Follow Light. The predecessor links of Turn Right are Shade Area, Near Obstacle, and Go Straight, and the successor links are Avoid Obstacle and Go Straight.

[Figure 1 (b) node labels: Near Light, Follow Light, Nothing, Reach Goal, Near Obstacle, Go Straight, Avoid Obstacle, Shade Area, Turn Right, Turn Left; legend: Successor Link, Predecessor Link]

Fig. 1. (a) Experimental environment (left) (b) Behavior network (right)

The Bayesian network designed for avoiding moving obstacles as well as static obstacles is as follows. The cause nodes from sensors are distance 0, distance 1, distance 2, distance 3, distance 4, distance 5, distance 6, and distance 7. The cause nodes for inferring the changing situations from previous behaviors are distance of obstacles, position of obstacles, previous behavior, and the directional change of obstacles, namely rear_object, front_object, left_object, and right_object. The initial probabilities of each node are normalized as 1/n for the n conditions in each node. Accordingly, the initial probabilities of distance 0 are P(Near0)=0.5 and
P(Far0)=0.5. The nodes distance of obstacles, position of obstacles, previous behavior, left_object, right_object, front_object, and rear_object are important factors for inferring dynamic situations. For example, if the left sensors of the robot detect an obstacle in the current state, the probabilities P(Near0) and P(Near1) in the nodes distance 0 and distance 1 get higher. The probabilities P(Approach) and P(GoAway) in the node distance of obstacles, and P(Front2Left), P(Left), and P(Rear2Left) in the node left_object, are decided from the previous information gathered by the robot. If P(Approach) > P(GoAway) in the node distance of obstacles, the highest probability among P(NoTurn), P(TurnLeft), and P(Rear2Left) in the node left_turn is selected. The node left_object consists of P(Front2Left), P(Left), and P(Rear2Left), which mean that the obstacle moves from front to left, stays on the left, and moves from rear to left, respectively. The conditional probabilities of the effect node left_turn are determined from the above probabilities of the cause nodes.

3.2 Comparison

In the experiments on avoiding moving obstacles with the behavior network alone and with the proposed method, we observed that a robot may fail to avoid a moving obstacle that approaches from the opposite direction. Since a mobile robot avoids collisions according to the direction and position of an obstacle relative to the robot, Turn Left or Turn Right is an important behavior. When starting at the same position and direction, the robot collides with moving obstacles when using the behavior network alone, but avoids the obstacle and reaches the goal when using the proposed method. We analyzed the behavior for avoiding moving obstacles that approach from various positions and directions using the behavior network and the proposed method. We verified success rates of 52% and 90% over 60 trials using the behavior network and the proposed method, respectively.
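
As a quick consistency check on these rates, the percentages can be mapped back to whole-trial counts. The sketch below assumes that each method was run for 60 trials and that the reported percentages are rounded from integer counts; neither assumption is stated explicitly in the text, and the helper name `nearest_count` is hypothetical.

```python
def nearest_count(rate_percent, trials):
    """Whole number of successful trials closest to a reported percentage."""
    return round(rate_percent * trials / 100)


TRIALS = 60  # assumed to apply to each method separately
for label, pct in (("behavior network only", 52), ("proposed method", 90)):
    successes = nearest_count(pct, TRIALS)
    print(f"{label}: {successes}/{TRIALS} = {100 * successes / TRIALS:.1f}%")
```

Under these assumptions the reported rates correspond to 31 and 54 successful trials out of 60, that is, 51.7% and 90.0%.
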
The proposed method fails when the obstacle approaches from the front or changes its direction abruptly. Figure 2 shows the success rates for avoiding moving obstacles with the behavior network alone and with the proposed method. In this figure, the y-axis is the success rate, with its range starting at 0.

Fig. 2. The success rates for avoiding moving obstacles (Y-axis: Success rate, Dark bar: proposed)

Figure 3 shows the comparison of the behavior network with the proposed method: (a) compares the behavior sequences and (b) compares the angles followed by the behaviors. In this figure, the solid line represents the behaviors and angles in
the behavior network and the dashed line represents the behaviors and angles in the proposed method. On the y-axis, 1 is Go Straight, 2 is Turn Left, and 3 is Turn Right. In the figure, the robot selects Turn Left or Turn Right when using the proposed method, but Turn Right when using the behavior network alone.

3.3 Analysis of Results

As mentioned before, we analyzed the behavior selection of the robot at time t from the moved angles between the robot and the obstacles using the proposed method. Figure 4 shows that the robot reaches the goal while avoiding moving obstacles, pens, and static obstacles 2, 3, and 2 times, respectively. In this figure, (a) shows the trajectories of the robot and the moving obstacles, and (b) shows the behavior selection of the robot on its way to the goal, with circles marking the collision avoidances with moving obstacles, static obstacles, and pens. In (a), the blue line represents the trajectory of the mobile robot, and the black and red lines represent the moving obstacles.

We then analyzed the behaviors of the robot when an obstacle moves in the same direction and in a different direction, to compare the behavior selection as the angle between the robot and the obstacles changes. Figure 5 shows the result of avoiding moving obstacles. In this figure, (a) and (b) are the trajectories in the cases of colliding with obstacle 2 moving in the same direction, on the left and on the right of the robot, and (c) represents the behavior sequences in (a) with respect to time. Lastly, (d) represents the behavior sequences in (b) with respect to time. In (c) and (d), the x-axis is the time line and the y-axis is the selected behavior, where 1 is Go Straight, 2 is Turn Left, and 3 is Turn Right. Figure 6 shows the results of avoiding a moving obstacle with a real Khepera II mobile robot. In this figure, (a, c) and (b, d) show the situations before and after avoiding a white box in front of the mobile robot.

Fig. 3. The behaviors (a) and angles (b) in the behavior network and the proposed method
[Axes: behavior versus time in (a), angle versus time in (b); curves: behavior network, proposed method]

4 Conclusions

The experiments have validated that the mobile robot reaches the goal while avoiding moving obstacles using the Bayesian inference driven behavior network. Autonomous behaviors of a mobile robot are generated by the inter-relations of sensors, goals and behaviors in the behavior network, and adaptive behaviors of a mobile robot are generated by extending the behavior network through the inference of each situation with the Bayesian network. The proposed method does not have the limitation of re-constructing the system as the environment changes, unlike systems with hybrid learning or planning, and it overcomes the limitation of re-defining many rules, as in systems using fuzzy rules and fuzzy inference [2].

Fig. 4. The trajectory (a) and each behavior (b) over time while avoiding moving obstacles in the proposed method

Fig. 5. Avoiding obstacles moving in the same direction as the robot using the proposed method. The panels show the avoidance on the right side of the robot (a, c) and on the left side of the robot (b, d)

Fig. 6. Avoiding a moving obstacle with the real robot. The panels show the avoidance in the middle (a, b) and at the left (c, d) of the robot

References

1. R. C. Arkin, Behavior-Based Robotics, MIT Press.
2. W. D. Smart, Making Reinforcement Learning Work on Real Robots, Ph.D. Thesis, 2002.
3. S. Hashimoto, F. Kojima, and N. Kubota, "Perceptual system for a mobile robot under a dynamic environment," Proc. IEEE Int. Symposium on Computational Intelligence in Robotics and Automation.
4. T. Inamura, M. Inaba, and H. Inoue, "User adaptation of human-robot interaction model based on Bayesian network and introspection of interaction experience," Proc. of the 2000 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems.
5. M. N. Nicolescu and M. J. Mataric, "A hierarchical architecture for behavior-based robots," Autonomous Agents and Multi-Agent Systems.
6. P. Maes, "How to do the right thing," Connection Science Journal, vol. 1, no. 3.
7. J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988.