Big Data computation for workshop-based planning support

Size: px
Start display at page:

Download "Big Data computation for workshop-based planning support"

Transcription

1 Big Data computation for worhop-baed planning upport Jianguang Tu International School of Software Wuhan Univerity Wuhan, P.R.China Jianquan Cheng * School of Science and the Environment Liangxiu Han School of Computing, Mathematic and Digital Technology Mancheter Metropolitan Univerity (MMU) Mancheter, UK J.cheng@mmu.ac.u and L.Han@mmu.ac.u Abtract Involving taeholder in deciion-maing at planning worhop require a computer ytem to repond quicly to variou cenario propoed by participant. Due to the increaing complexity of urban ytem, uch planning upport often face the challenge of big data and poor computational performance. Thi paper propoe a new approach uing parallel proceing technique (i.e. MPI Meage Paing Interface) to upport worhop participant in interactively building planning cenario and viualizing output of job acceibility acro the Greater Mancheter. MPI-baed parallel algorithm have been run on a cluter of computer for reducing computational time cot. The teted reult and computational performance are critically evaluated and recommendation for future wor provided. Particularly thi paper alo addree everal ey reearch quetion related to a theoretical framewor of applying big data analytic for urban planning upport. Keyword- Big Data, GIS, MPI, Parallel computation, Planning upport ytem. I. INTRODUCTION Planning i a future-oriented activity, trongly conditioned by the pat and preent. Spatial planning mean the method influencing the future ditribution of people and activitie at variou patial cale, ranging from national, regional, urban to local or neighborhood level. Reaonable and cientific patial planning i a pre-requiite for achieving utainable city and community. Planner have alway ought tool to enhance their analytical, problem-olving and deciion-maing capabilitie. Planning upport can be any geopatial tool or ICT environment to upport whole or partial tage of any profeional patial planning ta [1]. The popular planning upport ytem (PSS) include What-if [2], UrbanSim [3] and CommunityViz, each of which with varied trength and weanee in term of planning tage and target upported, patial cale, data requirement, model tructure, ytem function and oftware development model. Due to the increaing complexity of urban ytem, uch planning upport ytem often face the challenge of big data (high volume, high velocity and highly variable) [4] and poor computational performance. Accordingly, there i an increaing demand for methodological olution to enable efficient viualization of large data et and fat computation of complicated patial problem. Parallel computation technique (e.g. MapReduce and Meage Paing Interface - MPI) have been extenively applied for olving complex patial problem [5], but not for planning upport particularly when facing the challenge of big data. The next ection explain the theoretical framewor of planning upport ytem and job acceibility. Section three introduce the tudy area, data ource, method of meauring job acceibility, and MPI approach for parallelizing analyi algorithm (e.g. networ analyi and acceibility meaurement) over a cluter of computer. Section four preent reult from a tet of worhop-oriented cenario building uing a cae tudy of Greater Mancheter. The paper finihe with general concluion, evaluation of performance, recommendation for future wor and quetion for applying big data for planning upport. II. PLANNING SUPPORT SYSTEM In geopatial term, any S (Spatial) PSS ytem can be generalized a having five interrelated component: patial planning proce, data, model, ytem, and uer (Fig.1). Firtly, the planning proce decribe which tage, which cale, which type of planning the PSS hould upport. Planning procee vary from country to country, from period to period, from neighborhood to urban region, and from land ue planning to tranport planning. Secondly, data cover which data et hould be collected and integrated, in order to upport a pecific planning proce. Data in particular vary with cale (patial, temporal and deciion-maing) and level (aggregate, diaggregate and individual). Thirdly, a the core element of a PSS, a model refer to which analytical function the PSS hould be

2 equipped with. The model varie with the objective of planning upport (e.g. aement, prediction and imulation) and i either data-driven, nowledge-driven or methoddriven. Fourthly, a ytem decribe which environment, platform or interface the PSS hould promie to offer for the communication between the PSS and uer. The ytem need to integrate the module of both data and model and erve the planning proce. Finally, a uer i compoed of divere actor taeholder, planner, deciion-maer, reident, developer and manager, who contribute their input (e.g. nowledge, opinion and comment) to different tage and ta of planning. Proce Model Uer Fig. 1 Five component of a PSS PSS Tranport Infratructure; weather, facility, traffic, travel, Influence Input Data Sytem In thi paper, acceibility-baed planning upport i taen a an example. Acceibility can be defined a the potential of opportunitie for interaction or the eae of reaching preferred place. Job acceibility involving the patial and no-patial interaction between tranport, worer and job ytem (Fig.2) [6] not only relate to commuting but alo ha impact on patial inequity. Acceibility planning, focuing on promoting ocial incluion of diadvantaged group and improving their acce to opportunitie including employment, education and health care (in the UK) or on integrated tranport and land ue planning (in the Netherland), involve the ta or tage of identifying iue, viioning (trategic deign), generating alternative olution and evaluating the alternative. Engaging or involving taeholder in thee tage through online public participation or at worhop, require a computer ytem to repond quicly to a wide range of cenario propoed by divere participant. According to the theoretical framewor of planning upport (Fig.1), a ucceful planning upport ytem need to undertand the requirement from the five component and olve the challenge from each of them. A a preliminary tudy, thi paper i only focued on the ytem and model component. III. DATA AND METHODS A. Data Source and Challenge The tudy area i Greater Mancheter (GM). Conidering edge effect of patial analyi and requirement of job acceibility meaurement (e.g. competition), the patial extent of thi application ha been expanded to cover a large area of England (Fig.3) a people living in other countie may commute to the GM region. Worer Individual characteritic (age, income ), attitude and preference, car ownerhip, temporal flexibility Job Scale, diverity of pot, requirement, wage rate, temporal flexibility Non-patial interaction Spatial interaction Fig 2. A conceptual framewor of job acceibility Fig.3. Study area and patial data coverage

3 The data et include the ordnance urvey (OS) provided ITN (Integrated Tranport Networ) data, commuting data from 2011 cenu via Cider, and demographic data from 2011 cenu via Office for National Statitic (ONS). A car networ data et i built from the road networ hape file converted from the ITN data. The total number of job at output area level are aggregated from the commuting data with detination ite in the tudy area. The total woring population aged at output area level i joined with the job data into a point layer. The total number of node on the networ i 1.5 million and the total number of point i 90,000 for Origin (reidence) and Detination (wor) each (Figure 3). The matrix of travel time between pair of 90,000 point i 30G when repreented a an integer data type. Obviouly, thi i a typical example of Big Data application in term of high data volume. O i n P jz n Ej( i) u 1 Ej f ( tzj) E m 1 f ( t j m j j n z 1 Oi Oi zm ) E P P ji jz W W (4) z i f ( tij) f ( tzj) (6) (5) B. Meaurement of Job Acceibility Job acceibility, a the interface between tranport, worer and job ytem, i thu very much dependent on the degree of their interaction, including both patial and non-patial interaction [6]. The patial interaction between worer and job ytem reult in patial dimenion of acceibility uch a competition between worer or between employer. However, non-patial interaction, which are often neglected in the literature, reult from the varied degree of match or imbalance between the demand and upply ide. To accurately meaure job acceibility, ditance decay, competition on the demand and upply ide, and diverity need to be incorporated into meaurement of job acceibility [7, 8]. For planning upport, place or location baed meaurement i adopted a patial eparation i the main concern. To overcome the hard-to-interpret iue of gravity-baed method, which i a relative meaurement, an abolute meaurement job opportunity (potential number of job) i propoed by conidering ditance decay and two-way competition into Equation 1-6. It i aumed that the total number of reidence ite, job ite and job type (imilarly worer type) are m, n and repectively. Wi Wi 1 Ej Ej f ( t ab ) 1 e β ( ) tab (1) (2) (3) Where W i denote the total woring age population at ite i, among which W i being the population for worer type. E j denote the total number of job at ite j, among which E j being the population for job type. A negative exponential function in Equation 3 i ued to quantify travel time friction at urban level and t ab i the travel time by car from reidence ite a to wor place b. Equation 4 indicate competition for worer between employer and Equation 5 integrate the two competition into the meaurement. In Equation 6, O i i the job opportunity allocated from job type to reidence ite i o Oi i the aggregated overall job opportunity for the ite i. The algorithm complexity in Equation 1-6 i O (m*n 2 ), which i much higher than other meaurement. C. Parallel Computation The big-data challenge in thi tudy include quic viualization of car networ data et and fat computation of travel time between pair of origin and detination and job acceibility (Equation 1-6). The latter i the focu of thi paper a there have been extenive tudie on the former including GPU technique. The cluter of computer ued i compoed of 192 proceor 12 node and 16 proceor at each node. Each proceor (Intel Xeon CPU E GHz) ha up to 64G RAM. The calculation of quicet route t ab in Equation 3 i implemented by the traditional Dijtra algorithm that wa compared with Moore algorithm. A circle with a radiu of 60 m i ued to limit the earch pace, which i alo applied to the calculation of job acceibility baed on the aumption that worer may travel by train intead of by car if the driving ditance i longer than 60 m. Two parallel computation trategie (Fig.4), baed on a parallel approach [9], are deigned to calculate t ab and their election i baed on the data volume of networ and origin and detination point. The calculated matrix of t ab, totally about 30G in torage pace, i ditributed to each proceor with an equal number of origin and detination point. The parallel computation of

4 acceibility meaurement involve four ta of allocating the decompoed calculation in Equation 4-5 to each proceor and then aggregating the reult into a mater proceor uing the MPI technique. For example, Ta 1 i focued on competition for worer between employer in Equation 4, which wa calculated imultaneouly by all proceor. Ta 2 tend to ummarize the calculated reult from each proceor and catter it to each proceor again uing the MPI function. Ta 3 i focued on integration of two competition a hown in Equation 5, calculated imultaneouly by all proceor. Ta 4 i to gather all the reult uing the MPI function. be built and viualized in the ame way and with ame effect. Fig.4. Two trategie of parallelization IV. RESULTS Due to unavailability of job type data, only ditance decay and competition are conidered in the meaurement of job acceibility. A couple of day are needed to wor out the job acceibility in the tudy area on a ingle PC (Intel (R) Core (TM) i5 CPU 4 Core) but the wor ha been maively reduced to 11 minute over the cluter (ee Fig.5). The job opportunity (equation 6) and it ernel denity hown in Figure 6 are eay for communication with taeholder a the calculated job opportunity i an abolute value. Next, Mancheter Airport (Fig. 6) i taen a an example of worhop baed planning upport, in which participant attempt to build a cenario: a new patial pattern of job opportunity after 10,000 job are added to the airport, timulated by the airport expanion plan. Fig. 7 how the increaed job opportunity acro the city and around the zoomed airport area. The cenario building over the cluter only tae 15 minute, which i within the acceptable range of worhop practice. Other cenario uch a change of road networ and traffic will Fig.5. Spatial ditribution (upper) and denity (lower) of job opportunity calculated Fig.6. Mancheter Airport and output area unit

5 1) How do we define and ditinguih big and mall planning procee? 2) What are the difference between big and mall data et for planning? 3) What are big and mall model for planning analyi? 4) What are big and mall ytem for planning application? 5) What are big and mall uer for planning participation? ACKNOWLEDGMENT Funded by the School of Science and the Environment, MMU (taff reearch incentiviation award). Than to Dr Nic Gould for commenting on the draft. Fig.7. Outcome of cenario for planning upport: added job opportunity (upper) and zoomed area around the airport (lower) V. DISCUSSION AND CONCLUSION Thi paper ha demontrated that parallel computation can provide methodological olution to big-data challenge faced by the worhop baed planning upport, particularly in extenive computational time. The method can be alo applied to other complex patial problemolving. In the future, a multiple-mode tranport ytem, temporal element and diverity hould be further conidered into the meaurement of job acceibility. However, thi paper only examined the computational olution to planning upport, which i a mall part of all apect (ee Fig 1). Other form of planning upport uch a more cenario, application ytem development, geo-imulation (job earching behavior), viualization (at building level), and Web-baed public participation hould be further explored in the future. Big data analytic i providing great opportunitie for planning upport but the following quetion hould be anwered theoretically, which will be further explored in the future: REFERENCES [1] S. Geertman and J. Stillwell, Planning Support Sytem: an Inventory of Current Practice, Computer, Environment and Urban Sytem, vol. 28, 2004, pp [2] R.E. Kloterman, The What if? Collaborative Planning Support Sytem, Environment and Planning B: Planning and Deign, vol. 26, 1999, pp [3] P. Waddell, UrbanSim: Modeling Urban Development for Land Ue, Tranportation and Environmental Planning. Journal of the American Planning Aociation vol.68(3), 2002, pp [4] M. Batty, Big data, Smart Citie and City Planning, Dialogue in Human Geography, vol.3(3), 2013, pp [5] L. Yin, S. Shaw, D. Wang, E. Carr, M. Berry, L. Gro, E. J. Comiey, A Framewor of Integrating GIS and Parallel Computing for Spatial Control Problem a Cae Study of Wild-fire Control, International Journal of Geographical In-formation Science, vol.26(4), 2012, pp [6] J. Cheng and L. Berterloni, Meauring Urban Job Acceibility with Competition, Ditance Decay and Diverity, Journal of Tranport Geography, vol.30, 2013, pp [7] J. Cheng, L. Berterloni, F. le Clercq, L. Kapoen, Undertanding Urban Networ: Comparing a Node-, a Denity- and an Acceibility-baed View, Citie: International Journal of Urban Policy and Planning, vol.31, 2013, pp [8] J. Cheng, L. Berterloni, F. le Clercq, Meauring Sutainable Acceibility. Tranportation Reearch Record: Journal of the Tranportation Reearch Board, vol. 2017, 2007, pp [9] L. Han, C.S. Liew, J. van Hemert, M.P. Atinon, A Generic Parallel Proceing Model for Facilitating Data Mining and Data Integration. Journal of Parallel Computing, vol.37(1), 2012, pp