Prediction of bus passenger trip flow based on artificial neural network


Special Issue Article

Advances in Mechanical Engineering 2016, Vol. 8(10) 1-7. © The Author(s) 2016. aime.sagepub.com

Shaoqiang Yu (1), Caiyun Shang (1), Yang Yu (1), Shuyuan Zhang (1) and Wenlong Yu (2)

(1) Transportation and Management College, Dalian Maritime University, Dalian, P.R. China
(2) School of Architecture, Tianjin University, Tianjin, China

Corresponding author: Shaoqiang Yu, Transportation and Management College, Dalian Maritime University, Dalian, P.R. China. Email: yusq_dl@163.com

Creative Commons CC-BY: This article is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (open-access-at-sage).

Abstract

Bus passenger trip flow is the base data for transit route design and optimization, and the characteristics of urban land use are an important factor in transit trips. However, standard land use data do not readily reflect the intensity of transit trips. This research proposes a method that uses the buildings, land use situation, and bus accessibility of each zone to forecast the bus passenger trip flow in a future period. Traffic zones are divided into three categories according to the purpose of residents' travel: residential, commercial, and industrial. The bus passenger trip flow of each category is then forecast with an artificial neural network model. The method is assessed with data from the Dalian developing zone in China, and the results show its feasibility and reliability. Finally, future research directions are discussed.

Keywords: Bus, artificial neural network, land use, passenger, bus passenger trip flow

Date received: 28 June 2016; accepted: 3 October 2016
Academic Editor: Gang Chen

Introduction

Bus traffic volumes are basic information for transportation planning and bus scheduling. As cities expand, the number of bus routes and stops keeps rising, and these frequent changes make the traditional manual survey of bus traffic difficult. Manual methods are also subject to various factors and constraints, so they fail to reflect the long-term dynamics of urban public transport travel.

In the traditional four-phase method, bus travel demand is forecast mainly at the mode split stage, where it is separated from the total traffic demand. The bus passenger trip flow is obtained by forecasting the distribution of residents' trips and their travel modes. The four-phase method can meet the requirements of different forecast periods; however, it needs more data and surveys, involves a heavy workload and a long processing cycle, and is therefore very costly.

This article therefore proposes an artificial neural network (ANN) method that uses the urban land use and bus accessibility of each zone to forecast the bus passenger trip flow in a future period. Because historical traffic flow data are used for the forecast, a large-scale survey is avoided.

Influence factors

In the model that forecasts the bus passenger trip flow of the residents of each zone, the land use and the type of traffic zone are the most important influencing factors; bus accessibility is of equal importance [1-6]. Land use covers the distribution of the various land use patterns within a zone and the location of each zone in the city; these are the main factors affecting the bus passenger trip flow forecast model.

Urban land use is the source of urban traffic demand. The layout, nature, and intensity of urban land use determine the traffic demand and also the volume and layout of trip generation and trip attraction. Traffic zones are divided into three categories according to the purpose of residents' travel: residential, commercial, and industrial. The zones which attract residents to travel are mainly residential zones and commercial zones [7-9].

Public traffic accessibility is the main influence factor of public transport travel. It affects the choice of travel mode, residents' travel time, and other characteristics of urban travel, and through them the passenger flow of public transport. If more bus routes run between two traffic zones, relatively more passengers will travel by bus between them, because accessibility is higher and the trip is more convenient.

In addition to these two factors, the distance between traffic zones also affects the number of bus trips. It determines the time cost of public transport travel. The higher the time cost, the more likely residents are to choose other means of transportation.

Model development

ANN model

An ANN is a model which simulates the behavior characteristics of animal neural networks. It is a mathematical model of distributed parallel information processing. The network relies on the complexity of the system and processes information by adjusting the internal connections between a large number of nodes. The model is capable of self-learning and self-adaptation.
An ANN is a mathematical model of information processing whose structure is similar to the synaptic connections in the brain. A neural network is a computing model which consists of a large number of nodes and the connections between them. Each node represents a specific output function, called the activation (or excitation) function. The connection between two nodes carries a weighting value for the signal passing through it, called the weight. The weights are equivalent to the memory of the ANN. The output of the network changes with the connection mode of the network, the weight values, and the excitation function.

An ANN has three types of layers: an input layer, a hidden layer, and an output layer. In the input layer, a large number of neurons accept a large amount of non-linear input information; this information is called the input vector. In the output layer, the information passed along the links is transmitted, analyzed, and weighed to form the output result; this result is called the output vector. The hidden layer is composed of the many neurons and links that lie between the input and output layers.

A typical artificial neuron model is shown in Figure 1. Let $x_j$ be the $j$-th input of neuron $i$ and $w_{ij}$ the weight of the connection from input $j$ to neuron $i$. $u_i$ is the linear combination of the inputs of neuron $i$, $\theta_i$ is the threshold of neuron $i$, and $v_i$ is the value of $u_i$ after adjustment by the threshold, which is the final input of neuron $i$:

$$u_i = \sum_j w_{ij} x_j \tag{1}$$

$$v_i = u_i + \theta_i \tag{2}$$

$f(\cdot)$ is the excitation function, and $y_i$ is the output of neuron $i$.

Figure 1. Artificial neural network model.
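Equations (1) and (2), followed by the excitation function, can be illustrated with a minimal single-neuron sketch. The paper does not state which excitation function it uses, so the logistic sigmoid below is an assumption, and the function names and sample numbers are purely illustrative.

```python
import math


def sigmoid(v):
    """Logistic excitation function (an assumed choice; the paper does not name f)."""
    return 1.0 / (1.0 + math.exp(-v))


def neuron_output(weights, inputs, theta, activation=sigmoid):
    """Output y_i of one neuron.

    u = sum_j w_ij * x_j   (Eq. 1)
    v = u + theta          (Eq. 2)
    y = f(v)
    """
    u = sum(w * x for w, x in zip(weights, inputs))
    v = u + theta
    return activation(v)


# Hypothetical weights, inputs, and threshold, chosen so that u = 0 and v = 0.
y = neuron_output(weights=[0.5, -0.25], inputs=[2.0, 4.0], theta=0.0)
print(y)
```

A full network stacks many such neurons into the hidden and output layers and adjusts the weights and thresholds during training; the sketch covers only the forward computation of one node.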

Combining equations (1) and (2), the output of neuron $i$ is

$$y_i = f\left(\sum_j w_{ij} x_j + \theta_i\right) \tag{3}$$

Bus passenger trip flow forecast ANN model

The section Influence factors discussed the factors that influence the bus passenger trip flow forecast. Accordingly, the inputs of the model for each traffic zone are its land use (the proportions of residential, commercial, and industrial land), bus accessibility, area, and distance to other zones. The output node is the bus passenger flow from one traffic zone to another. The bus passenger trip flow forecast ANN model is shown in Figure 2.

Figure 2. Bus passenger trip flow forecast ANN model.

Choosing the number of neurons in the hidden layer is a vital step in the ANN method, because it directly affects the results of the model. At present, there is no accurate theory or method for determining it. Generally, the number of hidden neurons $p$ can be chosen with one of the following two formulas.

1. $\sum_{i=0}^{n} C_p^i > k$, where $k$ is the number of samples and $n$ is the number of input neurons. If $i > p$, then $C_p^i = 0$.
2. $p = \sqrt{n + q} + a$, where $n$ is the number of input neurons, $q$ is the number of output neurons, and $a$ is a constant between 1 and 10.

Model data

In this experiment, the bus passenger trip flow forecast ANN model is validated with data from the Dalian economic zone, a small-sized district of 3.78 million inhabitants. The total area of the Dalian economic zone is 168 km² and the built-up area covers 52 km². Following community boundaries, the Dalian economic zone is divided into 27 traffic zones [28,29], as shown in Figure 3.

Figure 3. Traffic zones location in Dalian economic zone.

Land use data of traffic zones

The national Standard for classification of urban land and for planning of constructional land stipulates that urban land use is mainly divided into three categories: residential, commercial, and industrial. There is a complicated relationship between urban transportation and land use.
The land use data are the critical data of the traffic demand model. To capture the status of land use in the Dalian economic zone, this article collected its current land use situation, as shown in Figure 4. The land use of each traffic zone (the proportions of residential, commercial, and industrial land) and its area are then input into the bus passenger trip flow forecast ANN model.

Figure 4. Land use in Dalian economic zone.

Bus accessibility

The connections between traffic zones in the bus network can be recorded in the form of an adjacency matrix. Bus travel inside a single traffic zone is not considered. Let $H$ be the adjacency matrix and $h_{mn}$ its elements, defined as

$$h_{mn} = \begin{cases} 1, & \text{zone } m \text{ can reach zone } n \text{ by bus} \\ 0, & \text{otherwise} \end{cases}$$

The relationship between all the nodes in the urban public transportation network is thus represented by the adjacency matrix $H = [h_{mn}]$.

Let $H_i$ be the initial accessibility matrix of bus route $i$. If traffic zone $m$ can reach traffic zone $n$ on that route without a transfer, then $h_{mn} = 1$; otherwise $h_{mn} = 0$. The accessibility matrix of all bus routes without transfer is the sum $H = \sum_i H_i$.

Because transferring is time-consuming and tiring, it reduces the probability that residents choose public transport. Therefore, when calculating route accessibility, only reachability with one transfer is considered in addition to direct travel. The one-transfer accessibility matrix is defined as $H^{\#} = H \cdot H$. The ultimate bus accessibility is $H^{*} = aH + bH^{\#}$, where $a$ and $b$ are coefficients. Because direct travel is more convenient than travel with a transfer, $a$ is greater than $b$. In the ANN experiments, $a = 10$ and $b = 1$ gave the best results.

Distance between the zones

The distance between two zones is the shortest bus route distance between them. The transit system operates 8 bus lines and 102 stops. The total length of lines is [value missing] km and the average station spacing is 0.5 km. Bus routes in the Dalian economic zone are shown in Figure 5.

Bus passenger trip OD flow

To obtain an accurate bus passenger trip origin-destination (OD) distribution, this research designed a survey method similar to ticket sales. This method captures the distribution of the whole passenger flow, whereas the traditional bus survey method only yields the total number of people getting off at each station. The survey was carried out during the morning peak period in September 2005 and covered 8 bus lines and 160 buses in the Dalian economic zone.

Figure 5. Bus routes in Dalian economic zone.
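The accessibility calculation described in the Bus accessibility section (per-route matrices, their sum, the one-transfer product, and the weighted combination with a = 10 and b = 1) can be sketched as follows. This is a minimal illustration under stated assumptions: each route is given as a hypothetical list of the zone indices it serves, every pair of distinct zones on a route is treated as directly reachable in both directions, and the four-zone network is invented for the example rather than taken from the Dalian data.

```python
def route_matrix(route_zones, n_zones):
    """H_i for one route: h_mn = 1 when distinct zones m and n are both on the route."""
    h = [[0] * n_zones for _ in range(n_zones)]
    for m in route_zones:
        for n in route_zones:
            if m != n:  # travel inside a single zone is not considered
                h[m][n] = 1
    return h


def mat_add(a, b):
    """Element-wise sum of two square matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def mat_mul(a, b):
    """Ordinary matrix product of two square matrices."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def bus_accessibility(routes, n_zones, a=10, b=1):
    """H* = a*H + b*(H.H), with H the sum of the per-route matrices H_i."""
    h = [[0] * n_zones for _ in range(n_zones)]
    for route in routes:
        h = mat_add(h, route_matrix(route, n_zones))
    h_transfer = mat_mul(h, h)  # one-transfer reachability
    return [[a * h[m][n] + b * h_transfer[m][n] for n in range(n_zones)]
            for m in range(n_zones)]


# Hypothetical network: route 0 serves zones 0, 1, 2; route 1 serves zones 2, 3.
H_star = bus_accessibility(routes=[[0, 1, 2], [2, 3]], n_zones=4)
print(H_star[0][1], H_star[0][3], H_star[2][3])
```

In this toy network, zones 0 and 3 share no route, so their score comes only from the one-transfer term through zone 2. Note that the product H·H also fills the diagonal; since the paper excludes travel inside a zone, those entries would presumably be ignored when building model inputs.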
Case study

First, the traffic zones are divided into three categories: (1) residential zones, A-zones; (2) commercial zones, B-zones; and (3) industrial zones, C-zones. The A-zones contain 20 traffic zones, the B-zones contain 3 traffic zones, and the C-zones contain 4 traffic zones.

Because residents travel for different purposes, the bus passenger trip flow forecast in this research is divided into three categories. Travel to A-zones is mainly for shopping or visiting relatives, travel to B-zones is mainly for traveling or entertainment, and travel to C-zones is mainly for working.

To better demonstrate generalization ability, the experimental data are divided into three parts: the first part forms the training dataset, the second part the cross-validation dataset, and the third part the testing dataset.

Result of destination to A-zones prediction

Data on bus travel with A-zones as the destination were collected for 480 groups. To reduce prediction error, the 68 groups in which the number of trips is less than 5 were deleted, leaving 412 effective groups. The best hidden layer size is 6, given 6 neurons in the input layer, 352 sample groups, 500 training iterations, and a target of [value missing]. The other 60 groups were then forecast, as shown in Figure 6. $D_i$ is the relative error between the predicted value and the actual number for group $i$; the error bands and their corresponding proportions are shown in Table 1.

As can be seen from Table 1, the ANN model is more accurate than the non-linear regression (NLIN) model. The relative error is within 5% for 63% of the forecast groups, so the accuracy of the ANN model is relatively high.
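The error bands used in the result tables can be reproduced with a short sketch. The band edges below (at most 5%, above 5% up to 10%, above 10%) are an assumption about how boundary values are treated, and the predicted and actual values in the first example are hypothetical. The second example uses only the ANN counts reported in Table 1 (38, 9, and 13 of 60 groups) to recompute the published proportions.

```python
def relative_error(predicted, actual):
    """Relative error D_i of one forecast group."""
    return abs(predicted - actual) / actual


def band_counts(predicted, actual):
    """Count forecast groups per error band (band edges are an assumption)."""
    counts = {"<=5%": 0, "5-10%": 0, ">10%": 0}
    for p, a in zip(predicted, actual):
        d = relative_error(p, a)
        if d <= 0.05:
            counts["<=5%"] += 1
        elif d <= 0.10:
            counts["5-10%"] += 1
        else:
            counts[">10%"] += 1
    return counts


def proportions(counts):
    """Share of groups in each band, as a rounded percentage."""
    total = sum(counts.values())
    return {band: round(100 * c / total) for band, c in counts.items()}


# Hypothetical forecasts against hypothetical observed trip counts.
demo = band_counts(predicted=[102, 93, 120], actual=[100, 100, 100])

# ANN counts for A-zones as reported in Table 1.
table1_ann = proportions({"<=5%": 38, "5-10%": 9, ">10%": 13})
print(demo, table1_ann)
```

The recomputed shares agree with the 63%, 15%, and 22% given for the ANN model in Table 1.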

Figure 6. Comparison of predicted and actual bus travel to A-zones numbers.

Table 1. Relative error of the models based on bus travel to A-zones numbers.

Di               ANN model   Proportion   NLIN model   Proportion
Di < 5%          38          63%          [missing]    [missing]
5% < Di < 10%    9           15%          [missing]    [missing]
10% < Di         13          22%          [missing]    [missing]

ANN: artificial neural network; NLIN: non-linear regression.

Table 2. Relative error of the models based on bus travel to B-zones numbers.

Di               ANN model   Proportion   NLIN model   Proportion
Di < 5%          [missing]   [missing]    3            38%
5% < Di < 10%    [missing]   [missing]    2            25%
10% < Di         [missing]   [missing]    3            38%

ANN: artificial neural network; NLIN: non-linear regression.

Result of destination to B-zones prediction

Data on bus travel with B-zones as the destination were collected for 69 groups. To reduce prediction error, the 5 groups in which the number of trips is less than 5 were deleted; the number of effective groups was 63. The best hidden layer size is 5, given 6 neurons in the input layer, 55 sample groups, 500 training iterations, and a target of [value missing]. The other eight groups were then forecast, as shown in Figure 7. The relative error between the predicted value and the actual number and its corresponding proportions are shown in Table 2.

Figure 7. Comparison of predicted and actual bus travel to B-zones numbers.

As can be seen from Table 2, the ANN model is more accurate than the NLIN model. The relative error is within 5% for 75% of the forecast groups, so the accuracy of the ANN model is relatively high.

Result of destination to C-zones prediction

Data on bus travel with C-zones as the destination were collected for 23 groups, all of which were effective. The best hidden layer size is 5, given 6 neurons in the input layer, 15 sample groups, 500 training iterations, and a target of [value missing]. The other seven groups were then forecast, as shown in Figure 8. The relative error between the predicted value and the actual number and its corresponding proportions are shown in Table 3.
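The hidden layer sizes reported for the three zone categories (6, 5, and 5, each with 6 input neurons) can be compared with the second sizing rule from the model development section, p = sqrt(n + q) + a with a between 1 and 10. The sketch below assumes a single output neuron (q = 1), which matches the description of the output node as the flow from one zone to another but is not stated as a number in the paper; the function name is illustrative.

```python
import math


def hidden_size_candidates(n_inputs, n_outputs, a_min=1, a_max=10):
    """Integer hidden layer sizes allowed by p = sqrt(n + q) + a, for a in [a_min, a_max]."""
    base = math.sqrt(n_inputs + n_outputs)
    low = math.ceil(base + a_min)
    high = math.floor(base + a_max)
    return list(range(low, high + 1))


# 6 input neurons as in the case study; q = 1 is an assumption.
candidates = hidden_size_candidates(n_inputs=6, n_outputs=1)
print(candidates)
```

Under these assumptions the rule only narrows the search to a range; the specific sizes reported in the case study would still have to be selected by experiment within it.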

Table 3. Relative error of the models based on bus travel to C-zones numbers.

Di               ANN model   Proportion   NLIN model   Proportion
Di < 5%          6           86%          3            43%
5% < Di < 10%    1           14%          2            29%
10% < Di         0           0%           2            29%

ANN: artificial neural network; NLIN: non-linear regression.

Figure 8. Comparison of predicted and actual bus travel to C-zones numbers.

As can be seen from Table 3, the ANN model is more accurate than the NLIN model. The relative error is within 5% for 86% of the forecast groups, so the accuracy of the ANN model is relatively high.

Conclusion

The goal of this work is to devise an accurate method for bus passenger trip flow prediction, so as to support transit route design and optimization. This article proposes a new prediction model that uses the building area and bus accessibility of each zone to forecast the bus passenger trip flow in a future period. With the ANN model, the bus passenger trip flow of the three categories of traffic zone (A-zones, B-zones, and C-zones) is forecast. The ANN model is more accurate than the NLIN model. However, this study did not distinguish between grades of residential quarters or between the locations of traffic zones within the city. These will be the focus of the next step.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Social Sciences Planning Project of Liaoning Province (L14BJY015).

References

1. Elen IV, Matthew GK and John CG. Optimized and meta-optimized neural networks for short-term traffic flow prediction: a genetic approach. Transport Res C: Emer 2005; 11:
2. Wei Y and Chen MC. Forecasting the short-term metro passenger flow with empirical mode decomposition and neural networks.
Transport Res C: Emer 2012; 21:
3. Yu B, Song XL, Guan F, et al. k-nearest neighbor model for multiple-time-step prediction of short-term traffic condition. J Transp Eng: ASCE 2016; 142:
4. Zhang WZ, Lee DH and Shi QX. Short-term freeway traffic flow prediction: Bayesian combined neural network approach. J Transp Eng: ASCE 2006; 132:
5. Yao BZ, Chen C, Cao QD, et al. Short-term traffic speed prediction for an urban corridor. Comput-Aided Civ Inf. Epub ahead of print 21 July DOI: / mice
6. Yu B, Wang YT, Yao JB, et al. A comparison of the performance of ANN and SVM for the prediction of traffic accident duration. Neural Netw World 2016; 26:
7. Yu B, Kong L, Sun Y, et al. A bi-level programming for bus lane network design. Transport Res C: Emer 2015; 55:
8. Yao BZ, Hu P, Lu XH, et al. Transit network design based on travel time reliability. Transport Res C: Emer 2014; 43:
9. Xu WY, Deng CC, Liu B, et al. The bus passenger traffic time series prediction model. J Liaoning Tech Univ 2014; 33:
10. Pinjari AR and Brat CR. Activity-based travel demand analysis. In: De Palma A, Lindsey R, Quinet E, et al. (eds) A handbook of transport economics. Cheltenham: Edward Elgar Publishing, 2011, pp
11. Mandl CE. Evaluation and optimization of urban public transportation networks. Eur J Oper Res 1980; 5:
12. Murray AT and Wli X. Accessibility tradeoffs in public transit planning. J Geogr Syst 2003; 5:
13. Xiong Y and Schneider JB. Transportation network design using a cumulative genetic algorithm and neural network. Transp Res Record 1992; 12:
14. Zhang YD and Wu LN. Stock market prediction of S&P 500 via combination of improved BCO approach and BP neural network. Expert Syst Appl 2009; 36:
15. Lu HP, Zhou Q and Xu W. A neural network model for trip generation forecasting. J Transp Eng Inf 2008; 6:
16. Ishak S, Kotha P and Alecsandru C. Optimization of dynamic neural network performance for short-term traffic prediction. Transp Res Record 2003; 1836:
17. Hussein D. An object-oriented neural network approach to short-term traffic forecasting. Eur J Oper Res 2001; 13:
18. Dharia A and Adeli H. Neural network model for rapid forecasting of freeway link travel time. Eng Appl Artif Intel 2003; 16:
19. Dougherty MS and Cobett MR. Short-term inter-urban traffic forecasts using neural networks. Int J Forecasting 1997; 13:
20. Zhao F, Chow LF, Li MT, et al. Forecasting transit walk accessibility: regression model alternative to buffer method. Transp Res Record 2003; 183:
21. Peng ZX, Shan WX, Guan F, et al. Stable vessel-cargo matching in dry bulk shipping market with price game mechanism. Transport Res E: Log 2016; 95:
22. Yu B, Zhang L, Guan F, et al. Equity based congestion pricing considering the constraint of alternative path. Oper Res. Epub ahead of print 30 January DOI: /s y.
23. Qiao JG and Wu YX. Transit trip generation uncertainty based on fuzzy neural networks. Road Traffic Saf 2014; 14:
24. Kwan MP, Murray AT, O'Kelly ME, et al. Recent advances in accessibility research: representation, methodology and applications. J Geogr Syst 2003; 5:
25. Goh M. Congestion management and electronic road pricing Singapore. J Transp Geogr 2002; 10:
26. Ghosh-Dastidar S and Adeli H. Neural network-wavelet microsimulation model for delay and queue length estimation at freeway work zones. J Transp Eng: ASCE 2006; 132:
27. Guan JF, Yang H and Wirasinghe SC. Simultaneous optimization of transit line configuration and passenger line assignment. Transport Res B: Meth 2006; 40:
28. Yao BZ, Yu B, Hu P, et al. An improved particle swarm optimization for carton heterogeneous vehicle routing problem with a collection depot. Ann Oper Res 2016; 242:
29. Yu B, Zhu HB, Cai WJ, et al. Two-phase optimization approach to transit hub location - the case of Dalian. J Transp Geogr 2013; 33: