Implicit Coordination in Crowded Multi-Agent Navigation


Julio Godoy and Ioannis Karamouzas and Stephen J. Guy and Maria Gini
Department of Computer Science and Engineering
University of Minnesota
200 Union St SE, Minneapolis MN
{godoy, ioannis, sjguy, gini}@cs.umn.edu

Abstract

In crowded multi-agent navigation environments, the motion of the agents is significantly constrained by the motion of the nearby agents. This makes planning paths very difficult and leads to inefficient global motion. To address this problem, we propose a new distributed approach to coordinate the motions of agents in crowded environments. With our approach, agents take into account the velocities and goals of their neighbors and optimize their motion accordingly and in real-time. We experimentally validate our coordination approach in a variety of scenarios and show that its performance scales to scenarios with hundreds of agents.

Introduction

Decentralized navigation of multiple agents in crowded environments has application in many domains such as swarm robotics, traffic engineering and crowd simulation. This problem is challenging due to the conflicting constraints induced by the other moving agents; as agents plan paths in a decentralized manner, they often need to recompute their paths in real-time to avoid colliding with the other agents and static obstacles. The problem becomes even harder when the agents need to reach their destinations in a timely manner while still guaranteeing a collision-free motion.

A variety of approaches have been proposed to address this problem. Recently, velocity-based approaches have gained popularity due to their robustness and their ability to provide collision-free guarantees about the agents' motions. Such approaches allow each agent to directly choose a new collision-free velocity at each cycle of a continuous sensing-acting loop. However, in crowded environments, velocities that are locally optimal for one agent are not necessarily optimal for the entire group of agents. This can result in globally inefficient and unrealistic behavior, long travel times and, in the worst case, deadlocks.
Ideally, to reach globally efficient motions, a centralized entity needs to compute the best velocity for each agent. In a decentralized domain, where agents have only partial observation of the environment, such an entity is not present. Agents can only use their limited knowledge of the environment to compute their local motions. In this paper, we hypothesize that if an agent could account for the intended motion of its nearby agents, then it could choose a velocity in a way that benefits not only itself but also its neighboring agents.

Copyright © 2016, Association for the Advancement of Artificial Intelligence. All rights reserved.

Figure 1: Two groups of 9 agents each move to the opposite side of a narrow corridor. (a) ORCA agents get stuck in the middle. (b) Using our C-Nav approach, agents create lanes in a decentralized manner.

Consider, for example, the two groups of agents in Figure 1a. Here, two groups of agents try to move past each other in a narrow hallway. The agents navigate using a predictive collision-avoidance technique, but still end up getting stuck in congestion. To address this, we seek to develop a navigation method that encourages coordination to emerge through the agents' interactions. We accomplish this by allowing agents to account for their neighbors' intended velocities during their planning. Figure 1b shows an example of such coordinated motion.

Following this idea, we propose C-Nav (short for Coordinated Navigation), a distributed approach to improve the global motion of a set of agents in crowded environments by implicitly coordinating their local motions. This coordination is achieved using observations of the nearby agents' motion patterns and a limited one-way communication, allowing C-Nav to scale to hundreds of agents. With our approach, agents choose velocities that help their nearby agents to move to their goals, effectively improving the time-efficiency of the entire crowd.

This work makes three main contributions. First, we propose a framework for multi-agent coordination that leads to time-efficient motions.
Second, we show that accounting for nearby agents when selecting an optimal velocity promotes implicit coordination between agents. Third, we experimentally validate our approach via simulations and show that it leads to more coordinated and efficient motions in a variety of scenarios as compared to a state-of-the-art collision avoidance framework (van den Berg et al. 2011), and a recently proposed learning approach for multi-agent navigation (Godoy et al. 2015).

Related Work

Multi-Agent Navigation. In the last two decades, a number of models have been proposed to simulate the motion of agents. At a broad level, these models can be classified into flow-based methods and agent-based approaches. Flow-based methods focus on the behavior of the crowd as a whole and dynamically compute vector-fields to guide its motion (Treuille, Cooper, and Popović 2006; Narain et al. 2009). Even though such approaches can simulate interactions of dense crowds, due to their centralized nature, they are prohibitively expensive for planning the movements of a large number of agents that have distinct goals. In contrast, agent-based models are purely decentralized and plan for each agent independently. After the seminal work of Reynolds on boids (1987), many agent-based approaches have been introduced, including social forces (Helbing and Molnar 1995), psychological models (Pelechano, Allbeck, and Badler 2007) as well as behavioral and cognitive models (Shao and Terzopoulos 2007). However, the majority of such agent-based techniques do not account for the velocities of individual agents, which leads to unrealistic behaviors such as oscillations. These problems tend to be exacerbated in densely packed, crowded environments.

To address these issues, velocity-based algorithms (Fiorini and Shiller 1998) have been proposed that compute collision-free velocities for the agents using either sampling (Ondřej et al. 2010; Moussaïd, Helbing, and Theraulaz 2011; Karamouzas and Overmars 2012) or optimization-based techniques (van den Berg et al. 2011; Guy et al. 2009). In particular, the Optimal Reciprocal Collision Avoidance navigation framework, ORCA (van den Berg et al. 2011), plans provably collision-free velocities for the agents and has been successfully applied to simulate high-density crowds (Curtis et al. 2011; Kim et al. 2012).
However, ORCA and its variants are not sufficient on their own to generate time-efficient agent behaviors, as computing locally optimal velocities does not always lead to globally efficient motions. As such, we build on the ORCA framework while allowing agents to implicitly coordinate their motions in order to improve the global time-efficiency of the crowd.

Coordination in Multi-Agent Systems. The coordination of multiple agents sharing a common environment has been widely studied in the Artificial Intelligence community. When communication is not constrained, coordination can be achieved by casting it as a distributed constraint optimization problem, where agents send messages back and forth until a solution is found (Modi et al. 2003; Ottens and Faltings 2008). Guestrin et al. (2002) assume that agents can directly communicate with each other by using coordination graphs, which also rely on a message-passing system. Other works such as (Fridman and Kaminka 2010) assume similar behavior between the agents and use cognitive models of social comparison. Similarly, our work allows agents to compare motion features, but no socio-psychological theory is used.

Coordination can also be achieved using learning methods, such as in (Melo and Veloso 2011), which learn the value of joint actions when coordination is required, and use Q-learning when it is not. The approaches of (Martinez-Gil, Lozano, and Fernández 2014; Torrey 2010) use reinforcement learning for multi-agent navigation, allowing the agents to learn policies offline that can then be applied to specific scenarios. These works consider a few agents (up to 4), while our work focuses on environments with hundreds of interacting agents. More recently, Godoy et al. (2015) use an online learning approach for adapting the motions of multiple agents with no communication and without the need for offline training. However, the resulting behavior of this approach does not scale well to hundreds of agents, as opposed to the technique we present here.

Problem Formulation

In our problem setting, there are $n$ independent agents $A_1 \dots A_n$, each with a unique start and goal position.
For simplicity, we assume that the agents move on a 2D plane where static obstacles $O$, approximated as line segments, can also be present. We model each agent $A_i$ as a disc with a fixed radius $r_i$. At time $t$, the agent $A_i$ has a position $\mathbf{p}_i$ and moves with velocity $\mathbf{v}_i$ that is subject to a maximum speed $\upsilon^{max}$. Furthermore, $A_i$ has a preferred velocity $\mathbf{v}_i^{pref}$ (commonly directed toward the agent's goal $\mathbf{g}_i$ with a magnitude equal to $\upsilon^{max}$). We assume that an agent can sense the radii, positions and velocities of at most numNeighs agents within a limited fixed sensing range. We further assume that agents are capable of limited one-way communication. Specifically, each agent uses this capability to broadcast its unique ID and preferred velocity. This type of communication scales well, as it is not affected by the size of the agent's neighborhood, unlike two-way communication whose complexity increases proportionally to the number of neighbors.

Our task is to steer the agents to their goals without colliding with each other and with the environment, while reaching their goals as fast as possible. More formally, we seek to minimize the arrival time of the last agent or, equivalently, the maximum travel time of the agents while guaranteeing collision-free motions:

$$
\begin{aligned}
\text{minimize} \quad & \max_i \big(\mathit{TimeToGoal}(A_i)\big) \\
\text{s.t.} \quad & \lVert \mathbf{p}_i^t - \mathbf{p}_j^t \rVert > r_i + r_j, \quad \forall\, i \neq j,\; i, j \in [1, n] \\
& \mathit{dist}(\mathbf{p}_i^t, O_j) > r_i, \quad \forall\, i \in [1, n],\; j = 1 \dots |O| \\
& \lVert \mathbf{v}_i^t \rVert \leq \upsilon^{max}, \quad \forall\, i \in [1, n]
\end{aligned}
\tag{1}
$$

where $\mathit{TimeToGoal}(A_i)$ is the travel time of agent $A_i$ from its start position to its goal and $\mathit{dist}(\cdot)$ denotes the Euclidean distance. Since the agents navigate independently with only limited communication, Eq. 1 has to be solved in a decentralized manner. Therefore, at each time instant, we seek to find for each agent a new velocity that respects the agent's geometric and kinematic constraints while progressing the agent towards its goal.

To obtain a collision-free velocity, we use the ORCA navigation framework (van den Berg et al. 2011). ORCA takes as input a preferred velocity $\mathbf{v}_i^{pref}$ and returns a new velocity $\mathbf{v}_i^{new}$ that is collision-free and as close as possible to $\mathbf{v}_i^{pref}$, by solving a low dimensional linear program. While ORCA guarantees a locally optimal behavior for each agent, it does not account for the aggregate behavior of all the agents. As ORCA agents typically have only a goal-oriented $\mathbf{v}_i^{pref}$, they may get stuck in local minima, which leads to large travel times and, subsequently, globally inefficient motions.

To address the aforementioned issues, in addition to the set of preferred velocities defined in (Godoy et al. 2015) (Figure 2a), we also consider velocities with respect to certain key neighbors (Figure 2b). This creates an implicit coordination between the agents and enables our approach to compute globally efficient paths for the agents as well as scale to hundreds of agents. The ALAN framework (Godoy et al. 2015) achieves good performance in several environments, but the absence of communication limits the ability of agents to coordinate their motions.

Figure 2: Actions. (a) Agent-based actions: A. towards the goal at maximum speed, B. towards the goal at reduced speed, C. stop, D. and E. towards the goal at a fixed angle. (b) Neighborhood-based actions: follow a specific neighbor agent or the goal-oriented motion at maximum speed.

The C-Nav Approach

With our approach, C-Nav, the agents use information about the motion of their neighbors in order to make better decisions on how to move and implicitly coordinate with each other. This reduces the travel time of all the agents.

Algorithm 1 outlines C-Nav. For each agent that has not reached its goal, we compute a new action, i.e., preferred velocity, on average every 0.1 seconds (line 3).
In each new update, the agent computes which of its neighbors move in a similar manner as itself and which neighbors are most constrained in their motions (line 4 and line 5, respectively), and uses this information to evaluate all of its actions (line 7). After this evaluation, the best action is selected (line 9). Finally, this preferred velocity $\mathbf{v}_i^{pref}$ is broadcast to the agent's neighbors (line 10) and mapped to a collision-free velocity $\mathbf{v}_i^{new}$ via the ORCA framework (line 12), which is used to update the agent's position (line 13), and the cycle repeats.

Algorithm 1: The C-Nav framework for agent $i$
 1: initialize simulation
 2: while not at the goal do
 3:   if UpdateAction($t$) then
 4:     compute most similar agents
 5:     compute most constrained agents
 6:     for all $a \in Actions$ do
 7:       $R_a \leftarrow$ SimMotion($a$)
 8:     end for
 9:     $\mathbf{v}_i^{pref} \leftarrow \arg\max_{a \in Actions} R_a$
10:     broadcast ID and $\mathbf{v}_i^{pref}$ to nearby agents
11:   end if
12:   $\mathbf{v}_i^{new} \leftarrow$ CollisionAvoidance($\mathbf{v}_i^{pref}$)
13:   $\mathbf{p}_i^{t} \leftarrow \mathbf{p}_i^{t-1} + \mathbf{v}_i^{new} \cdot \Delta t$
14: end while

Agent neighborhood information

With information obtained by sensing (radii, positions and velocities) and via communication (IDs and preferred velocities) from all the neighbors within the sensing range, each agent estimates the most similar nearby agents and the most constrained ones.

Neighborhood-based actions. Each agent computes how similar the motions of its neighbors are to its own motion (see Algorithm 2). This allows the agent to locate neighbors that are moving faster than itself and in a similar direction. By following such neighbors, the time-efficiency of the agent can be increased. Specifically, the preferred velocities of the nearby agents are used to first select neighbors whose goals lie in the same direction as the agent's own goal (line 4). The actual velocity of each of these neighbors is then compared to $i$'s goal-oriented vector to quantify the similarity between the agents (line 5). Algorithm 2 sorts these similarity values in descending order and returns the corresponding list of the neighbors' indices.
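The similarity ranking described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: positions and velocities are plain 2D tuples, neighbors are keyed by their broadcast ID, and the names `unit`, `dot` and `rank_similar_neighbors` are hypothetical and do not come from the paper.

```python
import math

def unit(vec):
    """Return vec scaled to unit length (a zero vector stays zero)."""
    length = math.hypot(vec[0], vec[1])
    return (vec[0] / length, vec[1] / length) if length > 0 else (0.0, 0.0)

def dot(a, b):
    """2D scalar product."""
    return a[0] * b[0] + a[1] * b[1]

def rank_similar_neighbors(position, goal, neighbors):
    """Rank neighbors by how well their motion matches the agent's goal direction.

    neighbors maps a neighbor ID to (v_pref, v_new): the broadcast preferred
    velocity and the sensed actual velocity. Returns IDs, most similar first.
    """
    goal_dir = unit((goal[0] - position[0], goal[1] - position[1]))
    sim_val = {}
    for j, (v_pref, v_new) in neighbors.items():
        # Keep only neighbors whose preferred velocity points along our goal direction.
        if dot(v_pref, goal_dir) > 0:
            # Similarity: component of the neighbor's actual velocity along our goal direction.
            sim_val[j] = dot(v_new, goal_dir)
    return sorted(sim_val, key=sim_val.get, reverse=True)

# Illustrative data only (not taken from the paper's experiments).
neighbors = {
    "n1": ((1.5, 0.0), (1.2, 0.0)),    # same direction, moving fast
    "n2": ((1.5, 0.0), (0.3, 0.1)),    # same direction, but slowed down
    "n3": ((-1.5, 0.0), (-1.0, 0.0)),  # heading the opposite way: filtered out
}
print(rank_similar_neighbors((0.0, 0.0), (10.0, 0.0), neighbors))  # ['n1', 'n2']
```

In this toy case the fast neighbor heading the same way ranks first, and the neighbor heading the opposite way never enters the ranking.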
Algorithm 2: Compute most similar neighbors of $i$
 1: Input: list of neighbors $Neighs(i)$
 2: Output: $Sim_{rank}$, list of indices of the most similar neighbors
 3: for all $j \in Neighs(i)$ do
 4:   if $\mathbf{v}_j^{pref} \cdot \frac{\mathbf{g}_i - \mathbf{p}_i}{\lVert \mathbf{g}_i - \mathbf{p}_i \rVert} > 0$ then
 5:     $SimVal_j \leftarrow \mathbf{v}_j^{new} \cdot \frac{\mathbf{g}_i - \mathbf{p}_i}{\lVert \mathbf{g}_i - \mathbf{p}_i \rVert}$
 6:   end if
 7: end for
 8: $Sim_{rank} \leftarrow$ Sort($SimVal$)

Once an agent knows which of its nearby agents have a similar motion, it can use this information to choose a velocity with maximum speed towards one of these neighbors (Figure 2b), in addition to the goal-oriented motion. These actions, unlike the ones in Figure 2a, do not have a fixed angle with respect to the agent's goal, as they depend on the position of individual neighbors.
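A minimal sketch of how such a neighborhood-based action set might be assembled is given below, assuming 2D tuples for positions and a precomputed similarity ranking. The helper names `velocity_towards` and `neighborhood_actions` are hypothetical, chosen for illustration only.

```python
import math

def velocity_towards(source, target, speed):
    """Velocity of magnitude `speed` pointing from source to target."""
    dx, dy = target[0] - source[0], target[1] - source[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return (0.0, 0.0)  # already at the target: no preferred direction
    return (speed * dx / dist, speed * dy / dist)

def neighborhood_actions(position, goal, neighbor_positions, sim_rank, s, max_speed):
    """Candidate preferred velocities: goal-oriented, plus one per followed neighbor.

    sim_rank lists neighbor IDs from most to least similar; only the first s
    are considered, mirroring the limit of s neighbors in the text.
    """
    actions = [velocity_towards(position, goal, max_speed)]
    for j in sim_rank[:s]:
        actions.append(velocity_towards(position, neighbor_positions[j], max_speed))
    return actions

# Illustrative data only.
actions = neighborhood_actions(
    position=(0.0, 0.0),
    goal=(10.0, 0.0),
    neighbor_positions={"n1": (3.0, 4.0), "n2": (0.0, -2.0)},
    sim_rank=["n1", "n2"],
    s=1,
    max_speed=1.5,
)
print(actions)
```

Because each following action points at the neighbor's current position, its angle relative to the goal changes as the neighbor moves, which is the difference from the fixed-angle actions noted above.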

Constrained neighborhood motion. Each agent uses Algorithm 3 to evaluate how constrained the motions of its neighbors are and, thus, determine agents that are more likely to slow down the overall progress of the crowd. By yielding to those constrained neighbors, the global time-efficiency of the system increases. To avoid circular dependencies which can give rise to deadlocks, the agent only considers neighbors that are closer than itself to its goal (line 4). This ensures that no two agents with the same goal will simultaneously defer to each other. We estimate how constrained a neighbor is based on the difference between its preferred and actual velocity (line 5). The larger the difference, the more likely it is that its motion is impeded. The agent keeps a list $C$ with the evaluation of how constrained the motion of each of its neighbors is. After all neighbors have been evaluated, Algorithm 3 sorts $C$ in descending order, and returns a list $C_{rank}$ of the indices of the sorted neighbors (line 8). The agent uses this information when evaluating each of its available actions.

Algorithm 3: Compute most constrained neighbors of $i$
 1: Input: list of neighbors $Neighs(i)$
 2: Output: $C_{rank}$, list of indices of the most constrained neighbors
 3: for all $j \in Neighs(i)$ do
 4:   if $\lVert \mathbf{g}_i - \mathbf{p}_j \rVert < \lVert \mathbf{g}_i - \mathbf{p}_i \rVert$ then
 5:     $C_j \leftarrow \lVert \mathbf{v}_j^{pref} - \mathbf{v}_j^{new} \rVert$
 6:   end if
 7: end for
 8: $C_{rank} \leftarrow$ Sort($C$)

Action evaluation and selection

Agents can choose a preferred velocity from two sets of actions, an agent-based (Figure 2a) and a neighborhood-based (Figure 2b) action set. In the latter, agents consider up to $s$ neighbors ($s \leq$ numNeighs). To estimate the fitness of the actions, the agent simulates each action for a number of timesteps and evaluates two metrics: its potential progress towards its goal, and its effect on the motion of its $k$ most constrained neighbors ($k \leq$ numNeighs). Algorithm 4 outlines this procedure.

Motion simulation.
As a first step towards the selection of an action, an agent simulates the evolution of its neighborhood dynamics (line 4), that is, it updates the velocities and positions of itself and its neighbors for each timestep within a given time horizon $T$ (line 3), for each possible action. Two timesteps is the minimum time horizon required to observe the effect of the agent's chosen motion. It should be noted here that in very crowded areas, agents often have no control over their motions, as they are being pushed by other agents in order to avoid collisions. Hence, simulating the dynamics of all the agent's neighbors often results in the same velocity for all simulated actions. This does not help, because the agent would not be able to select a velocity that improves the motion of its most constrained neighbors. Therefore, the agent considers in its simulation only the neighbors that are closer to its goal than itself, ignoring the agents that are behind it. Even if the best valued action is not currently allowed, we expect that the neighboring agents will eventually try to relax the constraints that they impose on the agent.

Algorithm 4: SimMotion($a$) for agent $i$
 1: Input: integer $a \in Actions$, list of neighbors $Neighs(i)$
 2: Output: $R_a$, estimated value of action $a$
 3: for $t = 0, \dots, T-1$ do
 4:   simulate evolution of neighborhood dynamics
 5:   if $t > 0$ then
 6:     for all $j \in Neighs(i)$ do
 7:       if rank($j \mid C_{rank}$) $< k$ then
 8:         $R_a^c \leftarrow R_a^c + \upsilon^{max} - \lVert \mathbf{v}_j^{pref} - \mathbf{v}_j^{new} \rVert$
 9:       end if
10:     end for
11:   end if
12:   $R_a^g \leftarrow R_a^g + \mathbf{v}_i^{new} \cdot \frac{\mathbf{g}_i - \mathbf{p}_i}{\lVert \mathbf{g}_i - \mathbf{p}_i \rVert}$
13: end for
14: $R_a^g \leftarrow \frac{R_a^g}{T \cdot \upsilon^{max}}$, $R_a^c \leftarrow \frac{R_a^c}{(T-1) \cdot k \cdot \upsilon^{max}}$
15: $R_a \leftarrow (1-\gamma) \cdot R_a^g + \gamma \cdot R_a^c$

Neighborhood influence. After simulating a specific action for the given time horizon, the agent can estimate how this action affects each of its $k$ most constrained neighbors. It computes this based on the difference between $j$'s predicted collision-free velocity $\mathbf{v}_j^{new}$ and its communicated preferred velocity $\mathbf{v}_j^{pref}$ (line 8).

Motion evaluation. To decide what motion to perform next, the agent aims at minimizing the amount of constraints imposed by its neighbors, while also ensuring progress towards its goal.
Our reward function balances these two objectives by taking a linear combination of a goal-oriented and a constrained-reduction component (Eq. 2). Each component has an upper bound of 1 and a lower bound of -1 and is weighted by the coordination-factor $\gamma$.

$$
R_a = (1-\gamma) \cdot R_a^g + \gamma \cdot R_a^c \tag{2}
$$

The goal-oriented component $R_a^g$ computes, for each timestep in the time horizon, the scalar product of the collision-free velocity $\mathbf{v}_i^{new}$ of the agent with the normalized vector which points from the position $\mathbf{p}_i$ of the agent to its goal $\mathbf{g}_i$. This component encourages preferred velocities that lead the agent as quickly as possible to its goal. Formally:

$$
R_a^g = \sum_{t=0}^{T-1} \frac{\mathbf{v}_i^{new} \cdot \frac{\mathbf{g}_i - \mathbf{p}_i}{\lVert \mathbf{g}_i - \mathbf{p}_i \rVert}}{T \cdot \upsilon^{max}} \tag{3}
$$

The constrained-reduction component $R_a^c$ averages the amount of constraints introduced in the agent's $k$ most constrained neighbors. This component promotes preferred velocities that do not introduce constraints into these $k$ agents.
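As a rough illustration of how the ranking of constrained neighbors (Algorithm 3) and the reward of Eq. 2 fit together, the Python sketch below scores one simulated action. It is an assumption-laden sketch, not the authors' implementation: the predicted velocities are taken as given inputs (in C-Nav they would come from simulating the neighborhood with ORCA, which is not reproduced here), the goal direction is held fixed over the short horizon, and the names `rank_constrained_neighbors` and `action_reward` are hypothetical.

```python
import math

def norm(v):
    """Euclidean length of a 2D vector."""
    return math.hypot(v[0], v[1])

def rank_constrained_neighbors(position, goal, neighbors):
    """neighbors maps ID -> (p_j, v_pref, v_new). Returns IDs, most constrained first."""
    own_dist = norm((goal[0] - position[0], goal[1] - position[1]))
    constraint = {}
    for j, (p_j, v_pref, v_new) in neighbors.items():
        # Only neighbors closer to the agent's goal than the agent itself are ranked.
        if norm((goal[0] - p_j[0], goal[1] - p_j[1])) < own_dist:
            constraint[j] = norm((v_pref[0] - v_new[0], v_pref[1] - v_new[1]))
    return sorted(constraint, key=constraint.get, reverse=True)

def action_reward(goal_dir, own_velocities, neighbor_steps, max_speed, gamma):
    """Estimate R_a = (1 - gamma) * R_g + gamma * R_c for one simulated action.

    goal_dir: unit vector from the agent to its goal (assumed fixed over the horizon).
    own_velocities: the agent's predicted collision-free velocity for t = 0..T-1.
    neighbor_steps: for t = 1..T-1, a list of (v_pref, v_new) pairs, one for
        each of the k most constrained neighbors.
    """
    T = len(own_velocities)
    k = len(neighbor_steps[0]) if neighbor_steps else 0
    # Goal-oriented component: normalized progress along the goal direction.
    r_goal = sum(v[0] * goal_dir[0] + v[1] * goal_dir[1] for v in own_velocities)
    r_goal /= T * max_speed
    # Constrained-reduction component: how close neighbors stay to their preferred velocity.
    r_con = 0.0
    for step in neighbor_steps:
        for v_pref, v_new in step:
            r_con += max_speed - norm((v_pref[0] - v_new[0], v_pref[1] - v_new[1]))
    if T > 1 and k > 0:
        r_con /= (T - 1) * k * max_speed
    return (1 - gamma) * r_goal + gamma * r_con

# Illustrative data only: T = 2, one ranked neighbor (k = 1).
neighbors = {
    "ahead": ((5.0, 0.0), (1.5, 0.0), (0.2, 0.0)),   # closer to the goal and slowed down
    "behind": ((-3.0, 0.0), (1.5, 0.0), (0.0, 0.0)),  # farther from the goal: ignored
}
c_rank = rank_constrained_neighbors((0.0, 0.0), (10.0, 0.0), neighbors)
own = [(1.5, 0.0), (1.5, 0.0)]
free = action_reward((1.0, 0.0), own, [[((1.5, 0.0), (1.5, 0.0))]], 1.5, 0.9)
blocked = action_reward((1.0, 0.0), own, [[((1.5, 0.0), (0.0, 0.0))]], 1.5, 0.9)
print(c_rank, free, blocked)
```

With a coordination-factor of 0.9, the action that leaves the ranked neighbor at its preferred velocity scores close to 1, while the action that stops that neighbor scores close to 0.1 even though the agent's own progress is identical in both cases.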

Formally, the constrained-reduction component is defined as:

$$
R_a^c = \sum_{t=1}^{T-1} \sum_{j \in C_{rank}} \frac{\upsilon^{max} - \lVert \mathbf{v}_j^{pref} - \mathbf{v}_j^{new} \rVert}{(T-1) \cdot k \cdot \upsilon^{max}} \tag{4}
$$

If an agent only aims at maximizing $R_a^g$, its behavior would be selfish and it would not consider the constraints that its actions impose on its most constrained neighbors. On the other hand, if the agent only tries to maximize $R_a^c$, it might have no incentive to move towards its goal, which means it might never reach it. Therefore, by maximizing a combination of both components, the agent implicitly coordinates its goal-oriented motion with that of its neighbors, resulting in lower travel times for all agents.

Figure 3: Scenarios. Bidirectional: two groups of agents move to the opposite side of a narrow corridor. Crowd: agents' initial and goal positions are placed randomly in a small area. Circle: agents move to their antipodal positions.

Figure 4: Performance comparison between ORCA, ALAN and our C-Nav approach. In all scenarios, agents using our coordination approach have the lowest overhead times. The error bars correspond to the standard error of the mean.

Experiments

We evaluated C-Nav on three different scenarios using a varying number of agents (see Figure 3). Each result corresponds to the average over 3 simulations (see motion.cs.umn.edu/r/cnav/ for videos). The scenarios are as follows:

Circle: Agents are placed along the circumference of a circle and must reach their antipodal positions.
Bidirectional: Agents are clustered in two groups that move to the opposite side of a narrow corridor formed by two walls.
Crowd: Agents are randomly placed in a densely populated area and are given random goals.

To evaluate our approach, we measure the time that the agents spent in order to resolve interactions with each other and the environment.
We estimate this time by computing the difference between the maximum travel time of the agents and the hypothetical travel time if agents were able to just follow their shortest paths to their goals at full speed:

$$
\max_i \big(\mathit{TimeToGoal}(A_i)\big) - \max_i \left( \frac{\mathit{shortestPath}(A_i)}{\upsilon^{max}} \right)
$$

We call this metric interaction overhead. A theoretical property of this metric is that an interaction overhead of 0 represents a lower bound on the optimal travel time for the agents, and it is the best result that an optimal centralized approach could potentially achieve.

Using the interaction overhead metric, we compared C-Nav to vanilla ORCA (greedy goal-oriented motion) and the ALAN approach of Godoy et al. (2015). In all our experiments, we used ORCA's default settings for the agents' radii (0.5 m), sensing range (15 m) and maximum number of agents sensed (numNeighs=10). We set $T$=2 timesteps, $\upsilon^{max}$=1.5 m/s, $\gamma$=0.9, $k$=3 and $s$=3. The timestep duration, $\Delta t$, is set to 25 ms. The number of timesteps $T$ was chosen because it is the minimum needed to observe the effects of the motion and it produces the best results, though even with larger values our approach still outperforms both ALAN and ORCA. All simulations ran in real time for all evaluated methods.

Results

We evaluated the interaction overhead time in the Circle scenario with 128 agents, the Bidirectional scenario with 18 agents, and the Crowd scenario with 300 agents. Results can be seen in Figure 4. Agents using ALAN outperform ORCA agents in the Bidirectional scenario, and are on par with them in the Circle scenario. However, ALAN's performance fails to scale to 300 agents in the Crowd scenario. The interaction overhead of C-Nav is lower than that of ORCA and ALAN in all cases, which indicates that by considering information about their neighborhood, agents can coordinate their motion and improve their time-efficiency. In terms of qualitative results, we observe emergent behavior in the Bidirectional and Circle scenarios, where agents going in the same direction form lanes. Such lanes reduce the constraints on other agents, leading to more efficient simulations.

Scalability.
We analyzed the scalability of our approach in the Bidirectional and Circle scenarios by varying the number of simulated agents. The results are depicted in Figure 5. In both scenarios, the difference between the overhead times of ORCA and C-Nav increases as more agents are added. However, the overhead time introduced by each added agent in the system is lower in our approach than in ORCA.

Figure 5: Scalability results in the (a) Bidirectional and (b) Circle scenarios, in terms of interaction overhead time. In Bidirectional, the number of agents varied from 50 to 200. In the Circle, the number of agents varies from 64 to 512.

Action sets. We evaluated how the use of only agent-based or neighborhood-based action sets compares to the combined action set that C-Nav employs. The results for the three scenarios are shown in Figure 6. For completeness, we also report the performance of ORCA, i.e., the overhead time when only a single goal-oriented action is used. We can observe that the combined set of actions is either better or no worse than using just the neighborhood-based or the agent-based action set. The only exception is in the Bidirectional scenario, where the neighborhood-based set outperforms the combined one (the difference in overhead time is statistically significant). With only neighborhood-based actions, an agent will deviate from its goal-oriented velocity only when it can follow a neighbor which is already moving in a less constrained manner towards the agent's goal.

Figure 6: Performance of the four action selection methods in terms of interaction overhead time, for three scenarios.

Figure 7: Performance of C-Nav agents, with different values of the coordination-factor $\gamma$.

Effect of the coordination-factor ($\gamma$). We evaluated how the balance between the goal-oriented and the constrained-reduction components of our reward function (Eq. 2), controlled by the coordination-factor $\gamma$, affects the performance of our C-Nav approach. We can observe in Figure 7 that using any of the two extremes of $\gamma$, i.e., either a pure goal-oriented reward ($\gamma$=0) or a pure constrained-reduction reward ($\gamma$=1), can reduce the performance. Accounting only for goal-oriented behavior forces agents to choose velocities that in many cases prevent other agents from moving to their goals, reducing the overall time-efficiency in their navigation.
On the other hand, giving preference only to reducing the other agents' constraints does not promote progress to the goal in low-density environments. Because C-Nav agents consider only neighbors that are likely to slow down the global motion, a high value of the coordination-factor ($\gamma$=0.9) helps both the agent and its neighbors to move to their respective goals, which results in the best performance.

Effect of number of constrained neighbors $k$. We also evaluated how the number of constrained neighbors ($k$) in the constrained-reduction component of the optimization function (Eq. 4) affects the performance of our approach. In general, the interaction overhead time decreases while $k$ increases. As agents account for more neighbors upon computing a new velocity, their motion becomes more coordinated and the travel time of the entire system of agents is reduced. In the Crowd scenario, the interaction overhead time decreases nearly linearly as $k$ increases, while in the other two scenarios the overhead time decreases exponentially with increasing $k$. We note, though, that in all of our experiments, considering more than 3 neighbors does not lead to any significant performance improvement.

Conclusions and Future Work

We have proposed C-Nav, a coordination approach for large scale multi-agent systems. C-Nav agents use their sensing input and a limited one-way communication to implicitly coordinate their motions. Each agent takes advantage of the motion patterns of its nearby neighbors to avoid introducing constraints in their motions, and temporarily follows other agents that have similar motion. By doing this, agents in dense environments are able to reach their goals faster than when using a state-of-the-art collision-avoidance framework or an adaptive learning approach for multi-agent navigation.

Our implementation assumes that agents can broadcast their preferred velocities. If this is not the case (i.e., non-communicative agents), C-Nav would still work, though agents would only optimize their motions based on their own goal progress.
To address this limitation, we would like to explore methods to predict the agents' preferred velocities from a sequence of observed velocities, using, e.g., a hidden Markov model. Adapting C-Nav to physical robots moving in human populated environments is another exciting avenue for future work. The recent work of (Choi, Kim, and Oh 2014) and (Trautman et al. 2015) can provide some interesting ideas in this direction.

Acknowledgment: Support for this work is gratefully acknowledged from the University of Minnesota Informatics Institute.

References

Choi, S.; Kim, E.; and Oh, S. 2014. Real-time navigation in crowded dynamic environments using Gaussian process motion control. In Proc. IEEE Int. Conf. on Robotics and Automation.
Curtis, S.; Guy, S. J.; Zafar, B.; and Manocha, D. 2011. Virtual Tawaf: A case study in simulating the behavior of dense, heterogeneous crowds. In Proc. Workshop at Int. Conf. on Computer Vision.
Fiorini, P., and Shiller, Z. 1998. Motion planning in dynamic environments using Velocity Obstacles. Int. J. Robotics Research 17.
Fridman, N., and Kaminka, G. A. 2010. Modeling pedestrian crowd behavior based on a cognitive model of social comparison theory. Computational and Mathematical Organization Theory 16(4).
Godoy, J. E.; Karamouzas, I.; Guy, S. J.; and Gini, M. 2015. Adaptive learning for multi-agent navigation. In Proc. Int. Conf. on Autonomous Agents and Multi-Agent Systems.
Guestrin, C.; Venkataraman, S.; and Koller, D. 2002. Context-specific multiagent coordination and planning with factored MDPs. In Proc. AAAI Conference on Artificial Intelligence.
Guy, S. J.; Chhugani, J.; Kim, C.; Satish, N.; Lin, M.; Manocha, D.; and Dubey, P. 2009. Clearpath: highly parallel collision avoidance for multi-agent simulation. In Proc. ACM SIGGRAPH/Eurographics Symposium on Computer Animation.
Helbing, D., and Molnar, P. 1995. Social force model for pedestrian dynamics. Physical Review E 51(5):4282.
Karamouzas, I., and Overmars, M. 2012. Simulating and evaluating the local behavior of small pedestrian groups. IEEE Trans. Vis. Comput. Graphics 18(3).
Kim, S.; Guy, S. J.; Manocha, D.; and Lin, M. C. 2012. Interactive simulation of dynamic crowd behaviors using general adaptation syndrome theory. In Proc. of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games.
Martinez-Gil, F.; Lozano, M.; and Fernández, F. 2014. Marlped: A multi-agent reinforcement learning based framework to simulate pedestrian groups. Simulation Modelling Practice and Theory 47.
Melo, F. S., and Veloso, M. 2011. Decentralized MDPs with sparse interactions. Artificial Intelligence 175(11).
Modi, P. J.; Shen, W.-M.; Tambe, M.; and Yokoo, M. 2003. An asynchronous complete method for distributed constraint optimization. In Proc. Int. Conf. on Autonomous Agents and Multi-Agent Systems, volume 3.
Moussaïd, M.; Helbing, D.; and Theraulaz, G. 2011. How simple rules determine pedestrian behavior and crowd disasters. Proc. of the National Academy of Sciences 108(17).
Narain, R.; Golas, A.; Curtis, S.; and Lin, M. C. 2009. Aggregate dynamics for dense crowd simulation. ACM Trans. Graphics 28(5):122.
Ondřej, J.; Pettré, J.; Olivier, A.-H.; and Donikian, S. 2010. A synthetic-vision based steering approach for crowd simulation. ACM Trans. Graphics 29(4):123.
Ottens, B., and Faltings, B. 2008. Coordinating agent plans through distributed constraint optimization. In Proc. of the ICAPS-08 Workshop on Multiagent Planning.
Pelechano, N.; Allbeck, J.; and Badler, N. 2007. Controlling individual agents in high-density crowd simulation. In Proc. ACM SIGGRAPH/Eurographics Symposium on Computer Animation.
Reynolds, C. W. 1987. Flocks, herds and schools: A distributed behavioral model. ACM SIGGRAPH Computer Graphics 21(4).
Shao, W., and Terzopoulos, D. 2007. Autonomous pedestrians. Graphical Models 69(5-6).
Torrey, L. 2010. Crowd simulation via multi-agent reinforcement learning. In Proc. Artificial Intelligence and Interactive Digital Entertainment.
Trautman, P.; Ma, J.; Murray, R. M.; and Krause, A. 2015. Robot navigation in dense human crowds: Statistical models and experimental studies of human robot cooperation. The Int. J. of Robotics Research 34(3).
Treuille, A.; Cooper, S.; and Popović, Z. 2006. Continuum crowds. ACM Trans. Graphics 25(3).
van den Berg, J.; Guy, S. J.; Lin, M.; and Manocha, D. 2011. Reciprocal n-body collision avoidance. In Proc. International Symposium of Robotics Research. Springer.