Vincent L. Bernardin (corresponding), Resource Systems Group

Paper Author(s)
Vincent L. Bernardin (corresponding), Resource Systems Group
Steven Trevino, Resource Systems Group, Inc.
John P. Gliebe, Resource Systems Group, Inc.

Paper Title & Number
Unbiased Estimation Of Destination Choice Models with Attraction Constraints [ITM # 12]

Abstract
Availability, capacity or attraction constraints are common in the application of destination choice models, especially for work location choice models. However, while common in application, shadow prices (Lagrangian or similar penalty terms) are rarely included in parameter estimation. It is known that this can lead to biased parameter estimates, but this is commonly ignored. This paper presents an empirical study of destination choice models developed for Iowa, using a genetic algorithm to develop parameter estimates with and without shadow prices to determine the significance of parameter bias in a practical application.

Statement of Financial Interest
The authors have no direct financial interest in this research to the extent that this research does not present proprietary methods or information to be offered for sale; however, the authors do have an indirect financial interest in the success of this research in so far as it promotes similar consulting engagements.

Statement of Innovation
Availability, capacity or attraction constraints are common in the application of destination choice models, especially for work location choice models. However, while common in application, shadow prices (Lagrangian or similar penalty terms) are rarely included in parameter estimation. It is known that this can lead to biased parameter estimates, but this is commonly ignored. This paper presents an empirical study of destination choice models developed for Iowa, using a genetic algorithm to develop parameter estimates with and without shadow prices to determine the significance of parameter bias in a practical application.

UNBIASED ESTIMATION OF DESTINATION CHOICE MODELS WITH ATTRACTION CONSTRAINTS

Vincent L. Bernardin, Jr., Ph.D.
Senior Consultant, RSG
2709 Washington Ave, Ste. 9
Evansville, IN
Vince.Bernardin@RSGinc.com
Ph:

Steven Trevino
Analyst, RSG
2709 Washington Ave, Ste. 9
Evansville, IN
Steven.Trevino@RSGinc.com
Ph:

John Gliebe, Ph.D.
Senior Consultant, RSG
2200 Wilson Blvd., Suite 205
Arlington, VA
John.Gliebe@RSGinc.com
Ph:

Submitted: December 1, ,254 Equivalent Words

OBJECTIVES, MOTIVATION AND INNOVATION

Destination choice models, as they are commonly called, or perhaps more properly multinomial logit location choice models, are used in a significant and increasing number of travel forecasting models. As of 2005, they were used by 5% of metropolitan planning organizations in the United States (1), and anecdotal evidence suggests that the number is higher today and still rising, with destination choice models slowly replacing the gravity models used for distributing trips in traditional travel models.

The gravity model is commonly found in both singly and doubly constrained forms, depending on whether the model is constrained to reproduce the exogenously estimated number of trips produced by a zone, attracted by a zone, or both. In particular, doubly constrained models are nearly ubiquitous for distributing work trips, given the relatively high confidence in both the estimated number of work trips produced by households (given their number of workers) and the estimated number of work trips attracted by employment in a zone.

Destination choice models have long been recognized as an extension or generalization of gravity models (2). However, the traditional or simple multinomial logit model is actually a generalization of the singly constrained gravity model only. The doubly constrained gravity model corresponds to a more complex logit model with an availability or capacity constraint. Destination choice models with constraints do not exhibit the independence of irrelevant alternatives (IIA) property of simple multinomial logit models but rather belong to the broad class of universal or mother logit models defined by McFadden et al. (3).

Location choice models with constraints have been explored and applied in both research and practice but with little connection.
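The correspondence between the doubly constrained gravity model and a logit model with an attraction constraint can be written out explicitly. The notation below is an assumption introduced here for illustration and does not come from the paper: $O_i$ is the number of trips produced in zone $i$, $D_j$ the number of trips attracted to zone $j$ (with $\sum_i O_i = \sum_j D_j$), $S_j$ a size term, $V_{ij}$ the systematic utility, and $\lambda_j$ the shadow price of zone $j$.

```latex
% Simple multinomial logit (singly constrained): rows sum to O_i by construction
P(j \mid i) = \frac{S_j \exp(V_{ij})}{\sum_k S_k \exp(V_{ik})},
\qquad T_{ij} = O_i \, P(j \mid i)

% Logit with an attraction constraint: a shadow price enters each utility
P(j \mid i) = \frac{S_j \exp(V_{ij} + \lambda_j)}{\sum_k S_k \exp(V_{ik} + \lambda_k)},
\qquad \text{with } \lambda_j \text{ chosen so that } \sum_i O_i \, P(j \mid i) = D_j

% One common iterative balancing update for the shadow prices
\lambda_j \leftarrow \lambda_j + \ln \frac{D_j}{\sum_i O_i \, P(j \mid i)}
```

Under this sketch, if $V_{ij}$ is a function of travel impedance alone, $\exp(\lambda_j)$ plays the role of the attraction balancing factor in the doubly constrained gravity model. Because every $\lambda_j$ is solved jointly from the attributes of all zones, the odds between two destinations are no longer independent of the other alternatives, which is one way to see why the constrained model falls outside the simple multinomial logit family.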
Research has produced several model formulations and has primarily focused on modeling choice set formation, (4) (5) (6) (7) but these formulations have seen no significant use in practice. In contrast, the practice of applying destination choice models with shadow prices solved by iterative proportional fitting to enforce an attraction constraint has become commonplace. The common practice has been to include these shadow prices only in model application, and to estimate the model parameters ignoring the constraint. The reason for this is understandable, as standard logit model estimation software packages do not allow the estimation of models with constraints of this sort. However, this is problematic: it has been demonstrated in general (8), and can be clearly seen upon reflection, that failure to incorporate constraints when they exist leads to biased parameter estimates. A paper by de Palma et al. (9) examining residential location choice in Paris provides a lone example in the literature that adopts the constraint formulation common in practice and estimates parameters in this framework. The paper finds that the constrained model is both distinctly different from its unconstrained counterpart and superior at explaining observed choices.

This brief presents the development of an application-based approach to parameter estimation capable of incorporating attraction constraints and avoiding specification bias. If accepted, the presentation will also report the empirical results of its application to develop destination choice models for the Iowa statewide model, iTRAM, comparing constrained and unconstrained parameter estimates and reporting on the statistical significance of their differences. If these differences are generally small, it might suggest that the common approach, while theoretically improper, is good enough. On the other hand, if the differences are significant and meaningful, it suggests that estimation algorithms capable of incorporating constraints, such as the application-based one presented here, are necessary to produce realistic models.

Figure 1 Iowa NHTS Destinations

METHODOLOGY

The iTRAM model includes a total of 3,314 zones (1,866 in Iowa). The Iowa DOT purchased an add-on sample to the National Household Travel Survey (NHTS), which provides the observations for destination choice model estimation. The sample included a total of 2,439 households, of which 1,745 had weekday travel diaries, of which 1,591 were 100% complete. For purposes of destination choice model estimation, all weekday person diaries and trips were considered, regardless of whether they came from a complete or partially complete household. The resulting dataset included 14,009 trips by 3,586 individuals. Initially, 11,973 of these trips had a trip end geocoded to the street address or nearest intersection. However, with some additional geocoding work, the number of usable trip ends was increased to 12,337.

The approach adopted for the estimation of destination choice model parameters for iTRAM is an application-based genetic algorithm. The code that implements the estimation routine is entirely native to the model's application software, developed in its scripting language, and calls the actual model application code during estimation. In addition to being able to accommodate and incorporate attraction constraints, this approach also has the practical advantage of greatly reducing the opportunity for inconsistencies between the estimation procedure and data sets and the final forecasting tool.

The genetic algorithm begins by generating an initial population or set of solutions, each solution being a complete set of parameters. The algorithm then uses the application code to apply each candidate solution. Another module evaluates the fitness of each solution, calculating its log likelihood by comparing the predicted probabilities from the model application to the observed choices from Iowa's NHTS add-on sample. Once the fitness of all the candidate solutions in the population has been evaluated, the least fit solutions die, that is, are removed from the population, and the best solution is cloned to start the next generation's population. The remaining members of the new population are created either by mating, that is, randomly recombining two solutions from the parent population (whose probability of reproducing is a function of their fitness), or by mutating a single solution from the parent population (whose probability of being selected for mutation is likewise a function of its fitness). The process iterates until no further progress can be made and a maximum likelihood solution is produced.
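The loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's code (which is written in the model software's scripting language): `apply_model` is a hypothetical stand-in for the model application, here a one-parameter unconstrained logit; selection probabilities are rank-based; mating is a random blend of two parents; and the loop runs for a fixed number of generations rather than until progress stops.

```python
import numpy as np

def apply_model(params, cost, size, productions):
    """Hypothetical stand-in for the model application code.

    Returns a trip table (rows = production zones, columns = attraction
    zones) from a simple logit with utility = beta * cost + ln(size).
    """
    beta = params[0]
    util = beta * cost + np.log(size)[None, :]
    expu = np.exp(util - util.max(axis=1, keepdims=True))  # numerically stable
    prob = expu / expu.sum(axis=1, keepdims=True)
    return productions[:, None] * prob

def log_likelihood(trips, observed):
    """Fitness: weighted observations times log predicted probabilities."""
    prob = trips / trips.sum(axis=1, keepdims=True)  # row-normalize trip table
    return float((observed * np.log(prob)).sum())

def estimate(cost, size, productions, observed,
             pop_size=30, generations=60, mutation_sd=0.1, seed=0):
    """Genetic search for the parameter set with the highest log likelihood."""
    rng = np.random.default_rng(seed)
    n_params = 1  # assumption: a single impedance coefficient
    pop = rng.uniform(-2.0, 0.0, size=(pop_size, n_params))
    for _ in range(generations):
        # Apply every candidate solution and evaluate its fitness.
        fitness = np.array([
            log_likelihood(apply_model(p, cost, size, productions), observed)
            for p in pop
        ])
        order = np.argsort(fitness)[::-1]
        pop, fitness = pop[order], fitness[order]
        best, best_ll = pop[0].copy(), float(fitness[0])
        # The least fit half is removed; the rest may reproduce.
        parents = pop[: pop_size // 2]
        # Assumption: rank-based selection weights (fitter = more likely).
        weights = np.arange(len(parents), 0, -1, dtype=float)
        weights /= weights.sum()
        children = [best.copy()]  # the best solution is cloned
        while len(children) < pop_size:
            if rng.random() < 0.5:
                # Mating: random blend of two distinct parents.
                i, j = rng.choice(len(parents), size=2, replace=False, p=weights)
                t = rng.random()
                child = t * parents[i] + (1.0 - t) * parents[j]
            else:
                # Mutation: perturb a single parent.
                i = rng.choice(len(parents), p=weights)
                child = parents[i] + rng.normal(0.0, mutation_sd, size=n_params)
            children.append(child)
        pop = np.array(children)
    return best, best_ll

# Synthetic example (all values hypothetical): recover beta = -0.5.
rng = np.random.default_rng(1)
cost = rng.uniform(1.0, 30.0, size=(8, 8))
size = rng.uniform(50.0, 500.0, size=8)
productions = rng.uniform(100.0, 1000.0, size=8)
observed = apply_model(np.array([-0.5]), cost, size, productions)
beta_hat, ll = estimate(cost, size, productions, observed)
print(beta_hat, ll)
```

Because the search only needs a fitness value for each candidate, replacing `apply_model` with an application routine that also solves for shadow prices would, in principle, leave the rest of the loop unchanged; that substitution is what makes an application-based estimator attractive for constrained models.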
In this context, the likelihood is calculated using a slightly different procedure than in most estimation software, using the model's aggregate predictions. The model is applied given a set of parameters and produces a trip table matrix with rows representing production or residence zones and columns representing attraction zones. If partial or full market segmentation is used, the model is applied for each segment to produce a separate trip table for each market segment. Each matrix is then normalized by dividing by the marginal row sums, so that each row vector becomes a probability distribution over all zones, conditional on the home zone (and market segment). The log of the probability matrix is then multiplied, cell by cell, by a matrix of weighted observations from the survey data. The grand sum of the resulting matrix is the log likelihood (for the market segment). If market segments are used, the log likelihood for the whole model is obtained by summing the log likelihoods of the individual market segments.

While this approach is partially aggregate, it does not result in any information loss. Moreover, it allows for the use of inequality constraints on parameters and the use of efficient, optimized aggregate application routines with minimal runtime, and as noted previously, obviates the need for the sampling of alternatives common in destination choice model estimation, simplifying the estimation and potentially improving the statistical efficiency of the estimator. While the overall genetic algorithm approach is more computationally intensive than traditional estimation methods, it is robust to the multiple optima that are possible for constrained models. The genetic algorithm code used for this project has been used successfully in the past to estimate truck model parameters (10), generalized cost parameters for static equilibrium assignment (11) and regional growth allocation models (12).

MAJOR RESULTS

As of the writing of this brief, Iowa's NHTS data had been processed and the genetic algorithm code was being integrated with the model application code. Parameter estimation will begin shortly and will be complete in January or February of 2014 at the latest. Results will be ready for presentation well in advance of the 5th TRB Innovations in Travel Demand Forecasting Conference in late April. At minimum, both constrained and unconstrained versions of the home-based work model will be estimated. Parameter estimates for these two models will be presented together with the results of t-tests to evaluate the significance of their difference.

IMPLICATIONS FOR TRAVEL MODELING

Given the increasing prevalence of destination choice models in travel modeling, this study is important to ensure that these new models are developed properly and provide the more realistic results they promise. Poorly estimated, biased destination choice models could lead agencies to abandon them in favor of simpler gravity models that they believe to be more reliable. The iTRAM project demonstrates that it is generally feasible to estimate destination choice model parameters with constraints and presents one attractive method for doing so. Further implications will depend on its results, but the findings should provide some preliminary indication of the urgency of shifting practice away from the common approach of estimating destination choice model parameters without accounting for constraints. However, this study will provide evidence from only one dataset and one state, so further investigation will be necessary in any case to confirm its preliminary findings.

REFERENCES

1. TRB Special Report 288 Metropolitan Travel Forecasting: Current Practice and Future Direction. Transportation Research Board of the National Academies, Washington, D.C.
2. Daly, A. Estimating Choice Models Containing Attraction Variables. Transportation Research, Part B: Methodological, Vol. 16, No. 1, 1982, pp.
3. McFadden, D., K. Train and W. Tye. An Application of Diagnostic Tests for the Independence of Irrelevant Alternatives Property of the Multinomial Logit Model. Transportation Research Record, No. 637, 1977, pp.
4. Zheng, J. and J. Guo. Destination choice model incorporating choice set formation. TRB 87th Annual Meeting Compendium of Papers.
5. Pagliara, F. and H. Timmermans. Choice set generation in spatial contexts: a review. Transportation Letters: the International Journal of Transportation Research, Issue 3, 2009, pp.
6. Martinez, F., F. Aguila and R. Hurtubia. The constrained multinomial logit: A semi-compensatory choice model. Transportation Research Part B, Vol. 43, 2009, pp.
7. Auld, J. and A. Mohammadian. Planning Constrained Destination Choice in the ADAPTS Activity-Based Model. TRR
8. Satomura, T., J. Kim and G. Allenby. Multiple-Constraint Choice Models with Corner and Interior Solutions. Marketing Science, Vol. 30, No. 3, 2011, pp.
9. De Palma, A., N. Picard and P. Waddell. Discrete choice models with capacity constraints: an empirical analysis of the housing market of the greater Paris region. Journal of Urban Economics, Vol. 62, No. 2, 2007, pp.
10. Bernardin, V., S. Shokouhzadeh, L. Klieman and V. Lingala. A Genetic Algorithm to Develop Truck Model Parameters from Local Truck Count Data. Presented at the 13th TRB National Transportation Planning Applications Conference, Reno, NV, May
11. Bernardin, V., S. Trevino, S. Shokouhzadeh and M. Conger. Improving Static Assignments Using Genetic Algorithms to Estimate Parameters for Complex Generalized Costs. Presented at the 4th TRB Conference on Innovations in Travel Modeling, Tampa, Florida, May
12. Bernardin, V., J. Gliebe and S. Shokouhzadeh. An Integrated Two-Tiered System for Land Use Modeling, Planning and Visualization for a Mid-Sized Midwest City. Presented at the 14th TRB National Transportation Planning Applications Conference, Columbus, OH, May 2013.