Testing schedule performance and reliability for train stations

Journal of the Operational Research Society (2000) 51, No. 6, 666–682. © 2000 Operational Research Society Ltd. All rights reserved. 0160-5682/00 $15.00 www.stockton-press.co.uk/jors

M Carey* and S Carville
University of Ulster, Northern Ireland

On busy congested rail networks, random delays of trains are prevalent, and these delays have knock-on effects which result in a significant or substantial proportion of scheduled services being delayed or rescheduled. Here we develop and experiment with a simulation model to predict the probability distributions of these knock-on delays at stations, when faced with typical patterns of on-the-day exogenous delays. These methods can be used to test and compare the reliability of proposed schedules, or schedule changes, before adopting them. They can also be used to explore how schedule reliability may be affected by proposed changes in operating policies, for example, changes in minimum headways or dwell times, or changes in the infrastructure such as the layout of lines, platforms or signals. The model generates a reliability analysis for each train type, line and platform. We can also use the model to explore some policy issues, and to show how punctuality and reliability are affected by changes in the distributions of exogenous delays.

Keywords: rail transport; timetabling; simulation; reliability

Introduction
Busy complex railway stations, having dozens of platforms and subplatforms and several hundred or more trains arriving and departing per day, are common in Europe and Asia. Trains of different types and speeds arrive and depart on multiple conflicting lines and are subject to restrictions or preferences concerning which lines and platforms they can use. They also have various dwell time and headway requirements, typically have desired or preferred arrival and departure times, and have various costs or penalties for deviating from these times.
To ensure that all of these constraints are met, detailed schedules are usually constructed months in advance, and the timetable is usually published. Various methods have been developed to generate good feasible schedules for such busy stations [1]. However, these are deterministic methods and do not indicate how the schedule will perform when faced with the delays which typically occur. When the schedule is implemented, on-the-day delays (deviations from the schedule) are common, due to passengers boarding and alighting, operating delays, failures of equipment or rolling stock, weather, accidents, etc. These delays in turn cause further knock-on delays: for example, a train arriving or departing late may block and delay the arrival or departure of other trains. It is important to keep delays down to a low level, otherwise knock-on delays can quickly escalate. In Britain, for example, between 5% and 20% of trains arrive or depart late at typical busy stations.
In view of this, the present paper is concerned with developing methods for illustrating and quantifying the behavior of train station schedules when faced with typical patterns of on-the-day exogenous delays. In considering the reliability of a schedule we can take the exogenous delays as given, so that a reliable schedule is then one in which exogenous delays cause the least knock-on delays. We therefore introduce typical patterns of exogenous delays and use a simulation approach to obtain the distributions of knock-on delays. We explore how delays, platform allocations and reliability are affected by increasing the average size of exogenous delays, by increasing the number of trains affected by such exogenous delays, or by some scheduling rules.
*Correspondence: Prof M Carey, Faculty of Business and Management, University of Ulster, BT37 0QB, Northern Ireland. E-mail: m.carey@ulst.ac.uk
Previous work on train scheduling has mainly been concerned with developing deterministic methods and algorithms for constructing schedules [1–9]. But deterministic scheduling is not our concern here. Previous stochastic simulation models of train movements have been mainly concerned with simulating rail freight movements, particularly through marshalling yards, or with simulating trains meeting and passing on single-track lines, and generally assume the trains are not timetabled. They are not concerned with scheduled traffic at busy complex stations, which is the concern here. Halloway and Harker [10] describe an interesting simulation for scheduled traffic but deal with trains meeting and passing on tracks without stations. Chen and Harker [11] estimate delays for scheduled trains, but again for meeting and passing on lines rather than at stations. One reason for this is that in North America

busy multi-platform rail stations are almost nonexistent, whereas in Europe and Asia they are common. There are also some deterministic simulation models of train movements, including movements through junctions and stations, but these do not consider the effects of the random delays which are prevalent in train services. Such deterministic models are useful for feasibility testing of schedules, which is not the purpose of this paper.
We assume throughout that the schedule for which we wish to estimate reliability is already a feasible schedule. By a feasible schedule we mean that if there are no unscheduled delays of any kind to any of the trains in the schedule, then there are no conflicts between the times (arrival or departure times, dwell times, platform occupation times, etc.) of any of the trains in the schedule, so that all trains in the schedule can run exactly according to the schedule. Furthermore, we assume that a feasible schedule satisfies all minimum headways which are required between trains arriving or departing, connecting, dwelling at platforms, etc. Train operators or planners often use words other than feasible to refer to a feasible schedule. In British rail operations such a schedule is referred to as a 'proven' schedule, and the process of checking the schedule for feasibility is called proving the schedule. Deterministic simulation programmes are sometimes used for detailed proving of a schedule, that is, for detailed checking for feasibility, and this is sometimes referred to as testing the reliability of the schedule. However, such schedule proving packages are deterministic: they do not consider exogenous unscheduled delays or disturbances. In contrast, in this paper we are concerned with testing the reliability of already proven feasible schedules, by considering how reliable they are in the presence of typical patterns of delays and disturbances.
We use the phrase 'exogenous delays' or 'initial delays' to refer to the dozens of causes of delay recorded daily on rail networks: delays due to breakdown or underperformance of rolling stock, points failures, crew lateness, line maintenance, obstacles on lines, delays in passenger boarding or alighting, etc. These exogenous delays frequently cause knock-on delays to other trains. For example, if a train is late leaving a station platform this may delay the arrival of the next train scheduled to use the platform, which may in turn delay further trains. Or, if a train arrives late its scheduled platform may be already occupied, so that the train has to be sent to a different platform, which may delay trains scheduled for that platform. In this paper we are concerned with knock-on delays caused at a single station. In view of this, the delays incurred by trains prior to arriving at the station are treated as exogenous delays, even though they may be due to knock-on delays incurred at earlier stations. Therefore knock-on delays here means only the knock-on delays caused at the current station.
Finally, we should note the limitations of this paper, and what is not covered in it. We are not attempting to model the full complexity of the rail industry but are simulating a component of it, namely a busy station. The general train planning problem involves a much wider range of conflicting objectives and constraints. It includes matching train services to travel demands, minimising journey times, avoiding trains conflicting not just at one station but at all stations, junctions and track over which they pass, planning for different speed and stopping patterns, and producing schedules that make efficient use of rolling stock and train crews. We do not deal with these issues in this paper.
Nor do we discuss the sophisticated systems and decision processes that are used in signalling and control, nor the role of train planners, signalmen and train controllers, who make very complex decisions based on years of experience, and without whom the system could not operate.
The research in this paper involved cooperation with many people in the rail industry. It was carried out over several years, dating back to a few years before the rail industry in Britain was privatised in the mid-1990s. This involved meetings and discussions with train planners, operators and managers at all levels in British Rail and then, after privatisation, in Railtrack and Train Operating Companies. Because of this time-span, and personnel changes in the industry, our contacts changed over time: some moved within the industry and some moved out of it. From train planners we collected detailed data on train operations, preferences, station layout, etc. (see below and Carey and Carville [1] and Carey [12]) for a number of stations, in particular Leeds, Manchester and York. The experiments reported here are based on Leeds station since we had the most complete data set for that station. Much of this data was not available in published or printed form.
To verify the train scheduling rules that we used, we took a draft annual timetable produced by train planners at British Rail (BR), using their existing, partially manual, timetabling methods, and showed that our scheduling/simulation programme could generate almost exactly the same timetable. We checked the differences between our timetable and theirs and found that they were due to the planners having made exceptions to their own stated timetabling rules. However, it should also be noted that the results of some of the main experiments in this paper could not be verified by comparing with observed results from BR or Railtrack, since there are no comparable observed results.
For example, we experimented (see below) with varying the distribution of exogenous delays, varying the percentage of trains that suffer exogenous delays, and with not allowing platform changes on-the-day. These are experiments that the train companies would not wish to conduct with actual trains. However, we discussed the results of these experiments with train planners, and they found the results were consistent with their expectations. In some cases they had no firm prior expectations or the results were considered somewhat different than they may have

expected, for example, average knock-on delays were larger or smaller than expected. However, even in these cases the results were found equally interesting and informative. Finally, we should note that this was not a project commissioned within the rail industry and the initiative for conducting it came from the authors.
The simulation approaches illustrated in this paper may be used in various ways. They may be used by train planners, managers or operators in testing the reliability of proposed station schedules before adopting them. They may also be used in exploring how schedule reliability would be affected by proposed changes in operating policies, for example, changes in minimum headways or dwell times; or proposed changes in train services, for example, numbers or times of trains; or proposed changes in the infrastructure, for example, layout of lines, platforms, signals, etc.; or major incidents or accidents, etc. To assess the effects of proposed changes, we run the simulation programme with and without the proposed changes, to generate distributions of knock-on delays for the before and after scenarios, and compare these. The distributions to compare, and the costs or benefits of changes in these distributions, will depend on the context and on the change being considered. Furthermore, the decision for or against any change may be affected by the issues referred to in the previous paragraph. As examples we consider how reliability is affected by permitting, or not permitting, platform assignments to be changed on-the-day, or by allowing late trains to depart after less than their normal minimum dwell times.

Outline of the rescheduling or dispatching model
To estimate how exogenous delays affect the train schedule (the train timetable and platform allocation) at a station, we used the ATTPS (automatic train timetabling and platforming system) programme [1], which is outlined below.
Before we can introduce exogenous delays we also need an initial schedule to which to apply them. To generate this initial schedule we again used the ATTPS programme [1]. (As a further test of our results, we also re-ran all of the experiments using the BR published schedule for the station as the initial schedule. The results which we obtained when starting from this initial schedule were similar to those reported here based on starting from our ATTPS based schedule.)
The simplest and most useful way to describe the ATTPS model is to state what it takes as input data, what it produces as outputs, and to outline how the outputs are obtained from the input data. The input data for ATTPS consists of:
(i) The minimum headways required between trains (which depend on whether their paths intersect, the train types, whether each train is arriving or departing, and from which platforms).
(ii) The minimum dwell time required for each train (which depends on the train type, and on whether it is a through train or terminating train).
(iii) The station layout. This includes: the numbers of in-lines and out-lines, the number of through platforms and terminating platforms and sub-platforms, which lines are connected to which platforms, and which lines conflict (intersect or share a portion of track, or track circuit section).
(iv) A draft timetable. That is, a list of trains indicating the train type, the lines on which the train is expected to arrive and depart respectively, and an approximate or desired arrival and departure time for each train.
(v) Platform preferences. For example, for each train type, there may be a different cost or penalty depending on the platform or subplatform to which a train is assigned.
(vi) Train delay costs. For example, for each train or train type, there may be a different cost or penalty for each minute by which the scheduled arrival or departure time deviates from the desired times.
The actual data values that we used in the present paper are given in Carey and Carville [1] and Carey [12]. Using the above input data the ATTPS model generates the following outputs:
(1) A scheduled arrival time, departure time and platform for each train. These scheduled times and platforms satisfy all of the data requirements and constraints in (i)–(vi) above.
(2) An analysis of the solution given in (1), including tables, graphs and distributions of:
(a) deviations of train arrival, dwell and departure times from their desired times;
(b) changes of trains from their most preferred platforms;
(c) times for which each platform is occupied.
The ATTPS algorithms are set out in Reference 1. A basic version of the algorithm operates as follows. Consider the trains one at a time in a prespecified order, for example, order of importance (business or revenue class) and/or chronological order of desired arrival times. For each train t, consider assigning the train to each feasible platform in turn. Assigning the train to a trial platform involves checking for all conflicts which this may incur with already scheduled trains, and finding all adjustments which would be needed to resolve these conflicts. For each trial platform for train t, the algorithm computes the set of train delays, platform preference costs, etc., which would be incurred if train t were sent to that trial platform. By comparing these delays, costs and penalties for each trial platform, the algorithm chooses a best platform for train t. Having assigned train t to a platform, the algorithm proceeds to the next train in the list, and so on. Other

features of the algorithm include: in which order to consider platforms, how to choose among subplatforms of the same main platform, how delay costs and platform costs are combined or traded off, looking ahead before confirming a platform choice, etc.
In one version of the ATTPS model we required that all trains go to their already scheduled platforms; we call this the fixed platform model. In another version we allow trains to be sent to different platforms if this would reduce lateness or other penalties; we call this the flexible platform model.

Schedule performance, with uniform distributions of exogenous delays
There is very little published concerning distributions of train arrival, dwell or departure delays at stations, or which forms of distributions fit best. Also, the parameters of these delay distributions may vary from station to station, vary over time, and depend on the train types. In view of this, for illustrative purposes, we here use two different types of simple distributions, namely uniform distributions in the present section and beta distributions in the following section. We choose parameters for these distributions so that the mean delays, and the percentages delayed more than 0, 5, 10 or 15 minutes, are consistent with delay patterns recorded at various train stations in Britain. We are therefore concerned here with illustrating what can be done, rather than with finding definitive numerical results for a particular station.
The experiments and simulations in this section and the next are different. For example, in this section we simulate how knock-on delays vary as the range of exogenous delays increases or decreases, while holding fixed the percentage of trains having exogenous delays. To vary the range of exogenous delays we simply vary the bounds of the uniform distribution.
In contrast, in the next section we simulate how knock-on delays vary as the percentage of trains having exogenous delays varies. In doing this we keep the distribution of exogenous delays for each train unchanged.
In this section we introduce exogenous delays drawn from a uniform distribution, by adding a sample of such delays to the arrival and dwell times of a sample of trains. In practice only a certain percentage of trains experience exogenous delays, and the distribution of these delays can vary between stations and train types. We experimented with various percentages and distributions. For the results below we assume mean delay values ranging from zero up to values at least as large as those which are typical in practice at, for example, various BR stations.
In the experiments reported below we used the following data and assumptions.
Arrival delays: 20% of trains, selected at random, experience exogenous delays to arrival times. (In the next section we experiment with different percentages.) These exogenous delays are uniformly randomly distributed, ranging from 2 to 20 minutes. (Since the range of delays varies widely in practice, in Experiments 3 and 4 below we consider various ranges, 2 to 2, 2 to 4, and so on up to 2 to 60 minutes.)
Dwell delays: half as many trains experience exogenous dwell time delays as arrival time delays, that is, dwell delays for a random 10% of trains. Also, exogenous dwell delays are uniformly randomly distributed from 0 to 10 minutes. (In Experiments 3 and 4 below we consider various ranges.)
Minimum dwell time: as well as a scheduled dwell time there may be an absolute minimum dwell time which can be used if the train is running late. We introduce this in Experiment 2 but not in Experiment 1. If a train would otherwise depart late we allow the scheduled dwell time to be reduced to not less than a minimum dwell time, which we assume is a fraction, for example, 0.8, 0.6, etc., of the scheduled dwell time.
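The sampling assumptions above can be illustrated with a minimal Python sketch. It is not the ATTPS code: the function name `sample_exogenous_delays`, its parameters, and the choice to select the arrival-delayed and dwell-delayed trains independently of each other are all assumptions made here for illustration.

```python
import random

def sample_exogenous_delays(n_trains, rng,
                            arr_share=0.20, arr_range=(2.0, 20.0),
                            dwell_share=0.10, dwell_range=(0.0, 10.0)):
    """Return (arrival_delays, dwell_delays) in minutes, one entry per train.

    Hypothetical helper: a fixed share of trains is picked at random and
    each picked train gets a delay drawn from a uniform distribution.
    """
    arrival = [0.0] * n_trains
    dwell = [0.0] * n_trains
    # Assumption: the two sets of delayed trains are chosen independently.
    for i in rng.sample(range(n_trains), round(arr_share * n_trains)):
        arrival[i] = rng.uniform(*arr_range)
    for i in rng.sample(range(n_trains), round(dwell_share * n_trains)):
        dwell[i] = rng.uniform(*dwell_range)
    return arrival, dwell

# Example: one day's exogenous delays for a hypothetical 500-train timetable.
rng = random.Random(42)
arrival, dwell = sample_exogenous_delays(500, rng)
```

Widening or narrowing `arr_range` and `dwell_range` corresponds to varying the bounds of the uniform distribution, while `arr_share` and `dwell_share` control the percentage of trains affected.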
The simulation experiments and their results are set out in more detail below.

Experiment 1. Distributions of knock-on delays and total delays
To estimate the distributions of knock-on delays caused by given distributions of exogenous delays we proceeded as follows.
(i) Choose the arrival time and dwell time delay distributions, and the percentages of trains to subject to these delays, as set out above.
(ii) Using the delay distributions stated above, (a) to the scheduled arrival time of each selected train add a delay drawn at random from the distribution of exogenous arrival time delays; (b) similarly, to the scheduled dwell time of each selected train add a delay drawn at random from the distribution of exogenous dwell time delays.
(iii) Simulate running this perturbed timetable for one day, and record all delays (exogenous and knock-on); see Figure 1(a).
(iv) Repeat steps (ii)–(iii) 1,000 times, to simulate 1,000 days; see Figure 1(b).
(v) Compute descriptive statistics for the distribution of delays obtained in (iv). Typical descriptive statistics used to measure transport reliability or punctuality are: (a) the mean, median, mode, standard deviation, etc., of delays; (b) the percentage of trains having knock-on delays of less (or more) than 0, 5, 10, etc., minutes.
In Step (iii), instead of listing all delays we can save storage space by recording only the numbers of delays between 0 and 5 mins, 5 and 10 mins, etc. Also, we may wish to record a separate distribution of delays for each type of train, for example, express, inter-city, local, freight.
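Steps (ii)–(v) can be sketched in Python as follows. The station simulation itself is replaced here by a deliberately crude stand-in, `toy_day`, which models a single line with a fixed minimum headway and first-come-first-served order; it is an assumption made only so that the loop runs, and it is not the ATTPS model. The timetable (a train every 4 minutes), the 3-minute headway and all function names are likewise hypothetical.

```python
import random

def toy_day(scheduled, headway, rng, share=0.20, lo=2.0, hi=20.0):
    """Knock-on arrival delays (minutes) for one simulated day.

    Toy stand-in for the station simulation: one line, fixed train order,
    and each train must follow the previous one by at least `headway`.
    """
    knock_on = []
    prev_actual = float("-inf")
    for t in scheduled:
        # Step (ii): each train is delayed with probability `share`.
        exo = rng.uniform(lo, hi) if rng.random() < share else 0.0
        earliest = t + exo                      # arrival if nothing blocks it
        actual = max(earliest, prev_actual + headway)
        knock_on.append(actual - earliest)      # extra delay caused by others
        prev_actual = actual
    return knock_on

def percent_exceeding(delays, thresholds=(0, 5, 10, 15)):
    """Step (v)(b): percentage of trains delayed more than x minutes."""
    n = len(delays)
    return {x: 100.0 * sum(d > x for d in delays) / n for x in thresholds}

def run(days=1000, seed=1):
    rng = random.Random(seed)
    scheduled = [4.0 * k for k in range(200)]   # hypothetical timetable
    all_delays = []
    for _ in range(days):                       # step (iv): repeat for many days
        all_delays.extend(toy_day(scheduled, headway=3.0, rng=rng))
    return percent_exceeding(all_delays)

if __name__ == "__main__":
    print(run())
```

To save storage, as noted for Step (iii), `all_delays` could be replaced by counters for the bins 0 to 5 minutes, 5 to 10 minutes, and so on, and one such set of counters could be kept per train type.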

Frequency distributions or pdfs of delays obtained in Step (iv) above are shown in Figures 1 and 2. We also computed the statistics in (v)(a)–(b) for knock-on delays and for total delays (exogenous delays + knock-on delays).

Reliability measures from cumulative delay distributions
Figures 1(a) and 1(b) show the cumulative distributions which our simulations yielded for five measures of delay. Figure 1(a) is obtained by running the simulation for a single day ((iii) above), and Figure 1(b) by running the simulation for 1,000 days ((iv) above). The shapes and relative positions of the lines in Figures 1(a) and 1(b) are similar, but the differences between them show that the averages contained in Figure 1(b) conceal significant daily differences.
The percentages of trains delayed more than 0, 5, 15, etc., minutes can be read directly from Figure 1. These are percentages used by train operators, planners, and the general public, as measures of reliability or performance for trains. Public transport operators are frequently required to publish a selection of these percentages, for example the percentages of trains arriving or departing more than 0, 5 or 15 minutes late. In Figure 1 the percentages of trains delayed more than 0, 5, 15, etc., minutes are consistent with those recorded in practice for BR stations of this size and complexity.

Figure 1 (a) Cumulative distributions of delays from a single day simulation. (b) Cumulative distributions of delays from a 1,000-day simulation.

We assumed only 20% of trains experience exogenous arrival delays (U(2 to 20)) and 20% experience exogenous dwell delays (U(0–10)). To illustrate the effects of this in the simulation consider train delays of 5 minutes or more, as shown by the ordinates in Figure 1(b) corresponding to 5 on the horizontal axis. We find 13.1% with exogenous arrival delays (≥ 5 mins) and 7.2% with knock-on arrival delays (≥ 5 mins), which yields 20.3% in total with arrival delays ≥ 5 minutes. These 20.3% with arrival delays, plus 9.1% with exogenous dwell delays (≥ 5 mins), caused a total of 25.3% to have departure delays ≥ 5 minutes. Recall that in the simulation we did not allow scheduled dwell times to be reduced, hence every train arriving late automatically departed late. These knock-on departure delays could be reduced by allowing dwell times of late trains to be reduced, as discussed later. Also, the percentage of trains with knock-on delays would be significantly reduced if we counted only delays greater than, say, 10 minutes.
We have also generated graphs (not shown here) similar to Figure 1 for all trains using a particular platform at the station, or all trains of a particular type, or all trains arriving or departing on a particular line. This helps identify the problem platforms, trains or lines, and focuses train planners' attention on these. For example, if a particular train type is less punctual than others perhaps the minimum headways for this train type should be increased. Also, there may be quite different punctuality targets for different train types, for example, for intercity express trains, local stopping trains and freight trains.

Typical patterns of relative frequency distributions (or pdfs) of delay
The data in the cumulative distributions in Figure 1(b) can instead be presented as the relative frequency distributions in Figure 2, the former being the integral of the latter.
If we divide the vertical scale in Figure 2 by 100 the curves in the figure become pdfs (probability density functions), and for brevity we may refer to them as pdfs. One difference between Figures 1(b) and 2 is that in the former we can see the percentage of trains which have zero delays, which is a majority of the trains. This percentage is not shown in Figure 2, but could be shown, for each of the relative frequency curves, as a point mass at delay = 0. The curves in Figure 2 represent only the tail of the pdf corresponding to trains that are actually delayed, and the area under each curve is the percentage of trains actually delayed.
Figure 2 shows the pdfs for five types of delays. By definition, (exogenous delays of arrivals) + (knock-on delays of arrivals) = (total delay of arrivals). (The pdf of the sum of two random variables is the convolution of the component pdfs; see, for example, Reference 13 for convolutions.) There are no other such direct relationships between the five pdfs. Their relative shapes are consistent with a large busy station such as we simulated but can be quite different for other types of stations. For example, if the station had very little traffic then knock-on delays of arrivals could be negligible even if the exogenous delays of arrivals were large. Conversely, if the station was very congested then knock-on delays of arrivals could be very large even if the exogenous delays of arrivals were negligible.
Knock-on delays of arrivals or departures can be caused by any or all of the five types of delay shown in Figure 1 (note that knock-on delays can be caused by other knock-on delays). To see this, recall that exogenous delays of arrivals or of dwell times can cause train time conflicts which then delay (knock on to) the arrival and/or departure of that or other trains. And these arrival or departure delays can cause further conflicts which delay the arrival or departure of later trains.
Figure 2 Relative frequency distributions (or pdfs) of delays from a 1,000-day simulation.

Hence the curves (in each of Figures 1 and 2) are all

interdependent in a very complex way which is captured only by the full scale simulation. Each curve (except the exogenous delay curves) depends on the other four.
Slight fluctuations in curves in Figures 1(b) and 2. We observe that the curves in Figures 1(b) and 2 are fairly smooth but they have some bumpiness or unevenness, except for the exogenous delay curves. This unevenness is not caused by the inherent randomness in the sample of days in the simulation, and hence does not go away if we take a larger sample. The simulation covered 1,000 days, and even when we simulated far more (10,000) or fewer days the shapes of the curves, including the bumps, remained almost exactly the same. The slight bumpiness in the curves is due to the fact that (before we add random disturbances) the daily timetable, like all timetables, is a set of fixed times, which inevitably have certain patterns. For example, many scheduled dwell times tend to be 2, 4, 10, etc., minutes, and many scheduled arrival and departure times tend to be on the hour or half hour, or 10, 20, etc., minutes after. This can cause some patterns in the lengths of knock-on delays. If a train misses a time slot the alternative slots occur after certain fixed intervals, not randomly. This makes some knock-on delay durations more likely than others. It is perhaps surprising that the curves in Figures 1(b) and 2 are as smooth as they are.
Confidence intervals for the mean of the delays occurring on any one day. For each of the 1,000 simulated days we computed the mean delay $\bar{d}$ for arrival, dwell and departure knock-on delays. We found that the distributions of these daily means are approximately normal. Assuming the mean daily delay $\bar{d}$ is normally distributed, an estimate of the 95% confidence interval for $\bar{d}$ is $(\bar{\bar{d}} - 1.96s)$ to $(\bar{\bar{d}} + 1.96s)$, where $\bar{\bar{d}}$ and $s$ are the mean and standard deviation respectively of the 1,000 values of $\bar{d}$ from the 1,000 day simulation.
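As a small illustration of this interval, the following Python sketch computes the mean, the standard deviation and the resulting 95% interval from a list of simulated daily mean delays. The function name `daily_mean_interval` and the three sample values are hypothetical; only the interval formula (mean minus and plus 1.96 standard deviations, under the normality assumption) is taken from the text.

```python
import statistics

def daily_mean_interval(daily_means, z=1.96):
    """Interval expected to contain about 95% of daily mean delays,
    assuming the daily means are approximately normally distributed."""
    centre = statistics.mean(daily_means)   # mean of the daily means
    s = statistics.stdev(daily_means)       # sample standard deviation
    return centre - z * s, centre + z * s

# Hypothetical daily mean knock-on delays (minutes) for three simulated days.
low, high = daily_mean_interval([1.0, 2.0, 3.0])
```

In a real run, `daily_means` would hold one value per simulated day, for example 1,000 values for a 1,000-day simulation.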
Example. Assume 20% of trains experience exogenous arrival delays distributed U(2 to 30) and exogenous dwell delays are distributed U(0 to 15). From the 1,000 day simulation we obtained the standard deviations and hence confidence intervals shown in the following table.

Random variable (daily mean)                        s.d. of daily mean, s    95% confidence interval (± 1.96s)
Mean over a day of knock-on delays to arrivals      0.36                     ± 0.702 mins
Mean over a day of knock-on delays to dwells        0.15                     ± 0.294 mins
Mean over a day of knock-on delays to departures    0.39                     ± 0.764 mins

Confidence intervals for the mean delay (over all days). Above we computed the confidence interval for the 'mean daily delay' $\bar{d}$. This spread of mean daily delays is not reduced by taking a larger sample of days, since the mean daily delay will vary from day to day and will keep varying no matter how many days we consider. However, train operators are also interested in $\bar{\bar{d}}$, the mean of the 'mean daily delays'. The latter ($\bar{\bar{d}}$) will converge as we increase the number of days in the sample, and the confidence intervals will become narrower (the estimates more accurate) the larger the sample of days in the simulation. Since the days in the sample are independent of each other, the central limit theorem applies, hence we expect the standard deviation of the mean of the 'mean daily delays' to be $\sqrt{n-1}$ times smaller than the standard deviation of the 'mean daily delay' given above, where $n$ is the sample size. Therefore, with a sample size of 1,000 days, the 95% confidence intervals for the mean of the 'mean daily delays' are $\sqrt{1000-1} = 31.6$ times smaller than those in the table above. That is, they are $\pm 1.96s/\sqrt{n-1}$, hence ± 0.0222, ± 0.0093 and ± 0.0242 minutes respectively, each of which is less than 2 seconds.

Experiment 2. Punctuality improvement or deterioration at a station: getting back on schedule
A question of interest to transport operators and users is: 'Will the delays encountered at a station mean that trains are even further behind schedule when they depart from the station than when they arrived?' At first sight it may seem that the distribution of delays must be worse on departure than on arrival. However, there is a mechanism for helping get trains back on time if they arrive late. If a train is late it can depart after a minimum required dwell time rather than adhering to the original scheduled dwell time. (Suppose a train has a scheduled dwell time of, say, 10 minutes and a minimum required dwell time of 7 minutes. If it arrives 4 minutes late it is ready to leave in the 4 + 7 = 11th minute, that is, only 1 minute late instead of 4 minutes late. On the other hand, if the minimum required dwell time is 4 minutes, then it is ready to depart in the 4 + 4 = 8th minute, but of course it is not allowed to leave until its scheduled time, hence it leaves on time in the 10th minute.) To investigate this, let

$$r = \frac{\text{scheduled dwell time} - \text{minimum required dwell time}}{\text{scheduled dwell time}}$$

and refer to this as the 'maximum dwell reduction ratio' or simply the 'dwell reduction ratio'. This dwell reduction ratio may be different for different trains: for some trains the scheduled dwell time may already be close to its minimum and for others it may not. However, for simplicity we will assume here that the maximum dwell reduction ratio r is the same for all trains. In the rest of this section, that is, in Experiments 1, 3 and 4, we assume that r is zero.
In the following table and in Figure 3 we show the delay distributions which result from letting the maximum dwell reduction ratio r be 0.0, 0.2, 0.4 and 0.8 respectively. To generate

these results we ran a separate 1,000 day simulation for each value of r.

Maximum dwell reduction ratio, r                          0.0     0.2     0.4     0.8
% of trains with exog. arr. delay less than 5 mins        85.7    85.7    85.7    85.7
% of trains departing less than 5 mins late               72.0    78.0    80.6    82.7
Difference                                                −13.4   −7.7    −5.1    −3.0
% of trains with exog. arr. delay less than 15 mins       91.2    91.2    91.2    91.2
% of trains departing less than 15 mins late              89.4    92.2    93.7    95.4
Difference                                                −1.8    1.0     2.5     4.2

The `difference' rows in the table show the differences between the percentage of trains arriving late and the percentage departing late. In Figure 3 the vertical axis gives the percentage of trains less than x minutes late, where x is the corresponding lateness value on the horizontal axis. Hence, the higher the distribution curve in Figure 3, the lower the percentage of late trains, or the higher the train punctuality or performance. We see that, as expected, the larger the dwell reduction ratio r the higher the distribution curve, hence the higher the percentage of trains departing on time.

As discussed above, we wish to see if train punctuality is better or worse when trains are departing from the station than when they arrived. For this we compare the percentages of trains departing late with the percentages having exogenous arrival lateness: since we are here considering only one station, the exogenous arrival lateness implicitly includes all delays incurred at all previous stations or on the lines between them. The graph of exogenous arrival lateness or punctuality is the straight line in Figure 3. Comparing this straight line (arrival punctuality) with the curves (departure punctuality) in Figure 3 shows that for larger values of r the deterioration in punctuality is less, or the improvement in punctuality is greater.
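The dwell reduction mechanism behind these curves can be illustrated with a minimal sketch. It is not the simulation model itself: it assumes a single train with no exogenous dwell delay and no knock-on delays, and the function name `departure_lateness` is hypothetical.

```python
def departure_lateness(arrival_lateness, scheduled_dwell, r):
    """Minutes late on departure for one train (illustrative only).

    Assumes no exogenous dwell delay and no knock-on delay; r is the
    maximum dwell reduction ratio defined in the text.
    """
    # Minimum required dwell implied by r = (scheduled - minimum) / scheduled.
    min_dwell = (1.0 - r) * scheduled_dwell
    # Time, measured from the scheduled arrival, at which the train is ready.
    ready = arrival_lateness + min_dwell
    # A train may not leave before its scheduled departure time.
    return max(0.0, ready - scheduled_dwell)


# The two cases from the worked example: scheduled dwell 10 minutes,
# arrival 4 minutes late, minimum dwell 7 minutes (r = 0.3) or 4 minutes (r = 0.6).
print(departure_lateness(4, 10, 0.3))
print(departure_lateness(4, 10, 0.6))
```

Under these assumptions the departure lateness reduces to max(0, arrival lateness − r × scheduled dwell), which is why a larger r shifts the departure curves upwards.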
However, we note that in this example, for all levels of r, the percentage of trains having zero arrival delays is always greater than the percentage having zero departure delays. This could be different in another example. For example, if the exogenous dwell delays were smaller that would decrease the departure delays more than the arrival delays.

Comparing the straight line (arrival punctuality) with the lowest of the four curves in Figure 3 shows that if the dwell reduction ratio is r = 0 then train punctuality on departure is always less than on arrival. Now consider say r = 0.4. For this compare the straight line (the arrival punctuality) with the third curve up (r = 0.4) in Figure 3. This shows that if we are concerned with lateness up to about 11 minutes, then departure punctuality is worse than arrival punctuality. However, if we are concerned with lateness greater than about 11 minutes, then departure punctuality is better than arrival punctuality. Similar remarks apply to the curves for other values of r.

In view of the above, train operators can use the ratio r as a policy instrument in designing more reliable schedules. By better management of resources at stations, operators may be able to cut minimum dwell times, hence increase r, and substantially improve punctuality of departures.

Figure 3 Effect of dwell reduction ratios on total departure delays.

Of course it is not only at stations that trains can get back on schedule. In practice, the scheduled trip times between stations are sometimes set slightly larger than the minimum time needed. This extra time or `recovery' time allows late trains to reduce their lateness. The distribution of arrival

lateness at the next station may then be `better' than the distribution of departure lateness at the present station. From a passenger's perspective arrival punctuality matters more than departure punctuality.

Experiment 3 How knock-on delays vary with size of exogenous delays.

In Experiment 3 we simulated 1,000 days to obtain the distribution of knock-on delays (and distribution of total delays) when the exogenous delays are from a uniform distribution with bounds 0 and UL. We repeated this simulation experiment for 25 different values of UL, starting at UL = 0 minutes and increasing in steps of 2 minutes up to UL = 48. Also, we assumed exogenous dwell delays are on average about half as long as exogenous arrival delays, i.e., if the exogenous arrival delay is −2 to 20 the exogenous dwell delay is 0 to 10, since negative dwell delays are not allowed. For each of these 25 different simulation experiments we computed various statistics, for example, the percentage of trains more than 0, 5, 10, etc., minutes late.

Here we graph some of the results (Figures 4(a) and 4(b)), to show how the knock-on delays (and total delays) increase as UL, the maximum exogenous delay, increases. Since the exogenous delay is uniformly distributed from −2 to UL, the expected exogenous delay is (UL − 2)/2, which increases as UL increases. We illustrate the results mainly for departure delays but the results for arrival and dwell time delays are similar. It can be seen that the knock-on delays increase fairly smoothly as the exogenous delays UL increase. For example, from the lowest curve in Figure 4(a) we see that if UL is say 10 minutes then 65% of trains experience no knock-on delay, and if UL is say 20 minutes then 55% of trains experience no knock-on delay.

The characteristic shapes of the curves in Figures 4(a) and 4(b) can be explained as follows. Consider the lowest curve in Figure 4(a).
In this curve the number of knock-on delays increases sharply at first and then much more slowly. Each increase in the exogenous delays UL increases the likelihood that trains will lose their scheduled time slot and/or platforms. That is, if a train is late another train may have arrived at the platform, so that the late train has to wait or go to a different platform. However, if a train is so late that it has already lost its scheduled time slot then any further lateness may have less effect on how soon it can find a new slot, and on whether this causes further knock-on delays. This causes the curve in Figure 4(a) to flatten out. Somewhat similar remarks apply to the next curve in Figure 4(a), but less so to the other curves. The reason is that the latter are caused by larger exogenous delays. With larger exogenous delays the trains have already lost their initial scheduled time slot, hence any further increases in the size of the exogenous delay will simply cause a proportionate increase in knock-on delays.

An interesting feature of the curves is that, except for the lowest curve, they start off flat or nearly flat. For example, the curve showing the percentage of trains with knock-on delays of `>30 minutes' is flat up until the maximum exogenous delay UL is 30 minutes. This indicates there are no knock-on delays greater than 30 minutes unless there are some exogenous delays greater than 30 minutes. Similarly, there are almost no knock-on delays greater than 10, 20, etc. minutes unless there are some exogenous delays greater than 10, 20, etc. minutes respectively.

Experiment 4 Effect of allowing platform changes on-the-day.

The most dramatic aspect of the simulation results is the effect of allowing or not allowing trains to change from their scheduled platforms. If a train arrives later than scheduled its scheduled platform may be already taken by a later train.
In that case we could hold the late train until its scheduled platform is free, or send it to some other platform if one will be free sooner. Similarly, if a train departs later than scheduled, the next train scheduled to go to that platform may either wait until the platform is free, or go to another platform if one is free sooner. Note that any of these on-the-day changes of train times or platforms may cause yet further knock-on changes to following trains.

To explore the effect of allowing on-the-day changes of platforms we ran the experiments twice. Firstly we required that all trains go to their scheduled platforms. We refer to this as the `fixed platform model'. Secondly we ran all the experiments while allowing trains to be sent to a different platform if this would reduce lateness or other penalties (platform desirability). We refer to this as the `flexible platform model'. The results are illustrated in Figure 5. We found that allowing platforms to be changed in response to on-the-day lateness caused a dramatic reduction in knock-on delays. When exogenous delays are large, for example, up to 60 minutes, the mean size of knock-on arrival delays is reduced by about 90% and the mean size of knock-on departure delays is reduced by about 40%.

This large reduction in the size and number of knock-on delays is perhaps larger than rail operators appear to expect. It has relevance for the design and operation of train stations. It suggests it is important that platforms be feasible for as many of the various train types as possible. This involves the layout of lines and signals, but it also involves traveller information systems, and ensuring that travellers can easily walk from one platform to another. Some stations are designed so that changing trains from their scheduled platform is very inconvenient for passengers, involving long walks up and down stairs perhaps with luggage.
On the other hand, some stations are designed so that all platforms are quickly accessible from a convenient central waiting area. In that case the platform

Figure 4 (a) Train performance decreases with the size of exogenous delays: knock-on departure delays. (b) Train performance decreases with the size of exogenous delays: total departure delays.

allocation for each train need not be announced until shortly before its arrival or departure. Indeed, the platform schedule need not be published in advance. This is the custom for the main multi-platform terminal stations around London. However, even in this case it is generally desirable to have an unannounced planned platform for each train so that on-the-day train controllers and operators need worry only about deviations from this schedule.

Figure 5 Effect on mean size of delays of allowing platform changes on the day.

Fitting equations to graphs of simulation results

The graphs of the simulation results in Figures 1 to 5 are all fairly smooth. When we initially used small samples of days, say 10 or 20, the curves were much more jagged and fluctuated randomly about the curves which were eventually obtained. We increased the sample sizes until we obtained curves which are almost identical in repeated simulations, and hence have narrow confidence intervals. The smoothness of the curves is of course not only due to averaging over large numbers of days, but also to the inherent regularity in the system being simulated.

Schedule performance, with beta distributions of exogenous delays

In the previous Section we used uniform distributions for a set of simulations and experiments. Here we use different distributions and perform different simulation experiments. The pdfs of arrival, dwell and departure delays for scheduled transport services typically have a finite range, are unimodal and skewed bell-shaped with a longer tail of lateness than earliness. This is typically true for train delays including those on the BR network. One reason for this skewness is that if a train is running earlier than scheduled it can get back on schedule by slowing down, whereas if it is running late it may not get back on schedule as it has to respect prespecified maximum speeds and accelerations.
The beta distribution pdf has all the above characteristics of a typical pdf of delays, hence it is often appropriate for modelling transportation delays and we use it here. Also, it has four parameters ($a$, $b$, $T_{\min}$ and $T_{\max}$, defined below), which gives it more flexibility in fitting empirical data than a pdf such as the normal or exponential which have fewer parameters. The beta distribution has a pdf $f$ defined on the interval [0, 1] by

$$ f(x) = \frac{x^{a-1}(1-x)^{b-1}}{B(a,b)}, \qquad (1) $$

where $a, b > 0$ and $B(a,b)$ is the beta function (hence the name of the distribution). An example of a beta density is given in Figure 6. The beta density can be rescaled and

shifted to be defined on any finite interval, say $[T_{\min}, T_{\max}]$. The pdf (1) then becomes

$$ f(x) = \frac{(x - T_{\min})^{a-1}(T_{\max} - x)^{b-1}}{(T_{\max} - T_{\min})^{a+b-1} B(a,b)}. \qquad (2) $$

We used shape parameters $a = 2$, $b = 4$, minimum delay $T_{\min} = 2$, and maximum delay $T_{\max} = 20$ minutes, which gives the beta distribution in Figure 6. This implies 63% of delayed trains are more than 5 minutes late, 19% are more than 10 minutes late, and 2% are more than 15 minutes late, which is typical of the pattern of exogenous delay for delayed trains in Britain. If we apply this distribution of exogenous delays to all BR trains then, when we add in the resulting knock-on delays, the total delays would be far in excess of the typical pattern of delays for BR. However, in practice only a percentage p of trains experience exogenous delay and this percentage is often different for different train types and parts of the rail network.

In Experiment 5 we assumed p = 20% of trains chosen at random experience exogenous delays, with a beta distribution, and we simulate this occurring every day for 1,000 days. In Experiment 6 we experiment with different percentages p, starting at p = 0% and increasing in steps of 2% to 50%. We choose a cut-off of 50% as it is unlikely that on any one day more than 50% of trains arriving would be subject to exogenous delay (as opposed to a knock-on delay). We conducted these experiments to show how the number of exogenous delays affects the number and size of knock-on delays. This is important to train operators seeking to reduce delays. In particular, we can find the level of exogenous delays at which the total delays (exogenous plus knock-on) will exceed the punctuality targets set for train operators.

For each day simulated in each of the above experiments we applied the beta distribution of delays to the train timetable for a busy station (Leeds).
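The rescaled beta density (2) and the random selection of a percentage p of trains can be sketched as follows. This is only an illustration and not the ATTPS implementation: the function names are hypothetical, the closed-form tail probability is valid only for integer shape parameters, and the interval bounds used in the usage lines (0 and 20 minutes) are assumptions chosen for illustration.

```python
import math
import random


def beta_tail(x, a, b, t_min, t_max):
    """P(delay > x) for a beta(a, b) delay rescaled to [t_min, t_max].

    Uses the binomial-sum form of the beta cdf, so a and b must be
    positive integers (true for a = 2, b = 4).
    """
    u = min(1.0, max(0.0, (x - t_min) / (t_max - t_min)))
    n = a + b - 1
    cdf = sum(math.comb(n, j) * u**j * (1.0 - u) ** (n - j) for j in range(a, n + 1))
    return 1.0 - cdf


def sample_exogenous_delays(n_trains, p, a, b, t_min, t_max, rng):
    """One day's exogenous delays: a fraction p of trains, chosen at random,
    receive a rescaled beta delay; all other trains receive zero."""
    delayed = set(rng.sample(range(n_trains), round(p * n_trains)))
    return [
        t_min + (t_max - t_min) * rng.betavariate(a, b) if i in delayed else 0.0
        for i in range(n_trains)
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
delays = sample_exogenous_delays(200, 0.20, 2, 4, 0.0, 20.0, rng)
for threshold in (5, 10, 15):
    print(threshold, round(beta_tail(threshold, 2, 4, 0.0, 20.0), 3))
```

With the illustrative bounds of 0 and 20 minutes, the tail probabilities for thresholds of 5, 10 and 15 minutes come to about 63%, 19% and 2%. The knock-on delays themselves cannot be reproduced this way, since they depend on the full timetable and platform logic of the simulation.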
To train arrival times and station dwell times we added an exogenous delay drawn at random from the beta distribution. We then simulated all train arrivals and departures for thousands of days using the ATTPS package,¹ and kept a record of all knock-on delays and changes of platforms caused by the exogenous beta distribution delays. These experiments and their results are set out below.

Experiment 5 Distribution of knock-on delays.

To estimate the distributions of knock-on delays caused by given distributions of exogenous delays we proceeded as follows. Steps (iii) to (v) are the same as in Experiment 1, and the comments made there following Steps (iii) to (v) also apply here.

(i) Choose percentages of trains to experience exogenous delays of arrival times and dwell times. We initially chose $p_1 = 20\%$ for arrivals and $p_2 = 10\%$ for dwell times.
(ii) Select $p_1$% of trains at random and to the arrival time of each of these trains add a delay drawn at random from the above beta distribution of delays. Similarly for train dwell times. For dwell time delays we used a beta distribution with parameters $T_{\min} = 2$ to $T_{\max} = 20$ minutes.
(iii) Simulate running this perturbed timetable for one day, and record all train delays (exogenous and knock-on) for the day (see Figure 7(a)).
(iv) Repeat steps (ii) and (iii) 1,000 times, to simulate 1,000 days (see Figure 7(b)).
(v) Compute descriptive statistics (mean, median, standard deviation, etc.) for the distribution of delays in (iv).

An example of the frequency distribution (or pdf) of delays over 1,000 days, as obtained from (iv), is given in Figure 7. For comparison, the Figure also shows the pdf of exogenous delays and the pdf of total delays (exogenous delays + knock-on delays).

Confidence intervals for (parameters of) the distribution of delays

These can be computed in exactly the same way as in Experiment 1 above.
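The confidence interval calculation referred to here can be sketched as follows, following the Experiment 1 formulas (±1.96s for a single day's mean delay, and ±1.96s/√(n − 1) for the mean over n days). The function names are hypothetical, and the use of the sample standard deviation in the second helper is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def ci_halfwidths(sd, n_days, z=1.96):
    """Half-widths of the 95% intervals for (a) the mean delay on a single
    day and (b) the mean of the daily means over n_days independent days."""
    per_day = z * sd
    of_mean = per_day / math.sqrt(n_days - 1)
    return per_day, of_mean


def ci_halfwidths_from_sample(daily_means, z=1.96):
    """Same calculation starting from a list of simulated daily mean delays.

    Assumes the sample standard deviation (n - 1 denominator) is wanted.
    """
    return ci_halfwidths(statistics.stdev(daily_means), len(daily_means), z)


# Standard deviation of mean daily knock-on departure delay from Experiment 1.
per_day, of_mean = ci_halfwidths(0.39, 1000)
print(round(per_day, 3), round(of_mean, 4))
```

The second helper would be applied to the list of 1,000 daily means recorded in step (iv); that list is not reproduced here.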
Again, as in Experiment 1, the widths of the confidence intervals are all so small as to be negligible, indicating that the statistics obtained from the 1,000 day simulation, for example, the mean, median, percentiles, etc. of delays, are accurate simulation estimates.

Figure 6 Beta probability density on interval [0, 1] with shape parameters a = 2, b = 4.

Experiment 6 How knock-on delays vary with % of trains subject to exogenous delays.

In the above experiment we simulated 1,000 days to obtain the distribution of knock-on delays (and distribution of total delays) when a fixed percentage (p) of trains are subject to exogenous beta distribution delays. We repeated this simulation experiment for 25 different values of p, starting at