Transportation Research Forum

Comparison of Alternative Methods for Estimating Household Trip Rates of Cross-Classification Cells with Inadequate Data
Author(s): Judith L. Mwakalonge and Daniel A. Badoe
Source: Journal of the Transportation Research Forum, Vol. 51, No. 2 (Summer 2012), pp. 5-24
Published by: Transportation Research Forum
Stable URL: http://www.trforum.org/journal

The Transportation Research Forum, founded in 1958, is an independent, nonprofit organization of transportation professionals who conduct, use, and benefit from research. Its purpose is to provide an impartial meeting ground for carriers, shippers, government officials, consultants, university researchers, suppliers, and others seeking exchange of information and ideas related to both passenger and freight transportation. More information on the Transportation Research Forum can be found on the Web at www.trforum.org.

JTRF Volume 51 No. 2, Summer 2012

Comparison of Alternative Methods for Estimating Household Trip Rates of Cross-Classification Cells With Inadequate Data

by Judith L. Mwakalonge and Daniel A. Badoe

This paper investigates the forecast performance of a traditional cross-classification model and alternative models that seek to address the shortcomings of traditional cross-classification analysis, specifically when it has cells with inadequate data. The study uses five cross-sectional datasets collected in the San Francisco Bay Area in 1965, 1981, 1990, 1996, and 2000. Alternative models, estimated with travel data collected in the base year, were assessed for their ability to replicate the number of trips made by households in each cell of a cross-classification matrix and at the traffic zone level, respectively, in each of the five years. The results showed that the traditional cross-classification analysis (CCA) model, notwithstanding having a few unreliable cells, provided more consistent predictions of travel than any of the alternative methods. They also show that it is better to synthesize trip rates for only those cells of the cross-classification matrix with inadequate data rather than to adjust the entire trip-rate matrix, as is currently the practice.

INTRODUCTION

The four-step Urban Transportation Modeling System (UTMS) continues to be the method adopted by the majority of metropolitan planning organizations for simulating traffic volumes using the links of urban transportation networks (TRB 2007). This paper focuses on trip generation, the first step of the four-step UTMS. Given the sequential nature of UTMS, improved forecast accuracy at the trip generation stage is important to reducing errors in the forecasts emanating from the final step of the process.

A number of methods for accomplishing trip generation are documented in the travel demand modeling literature. These include multiple linear regression (Cotrus et al. 2003, Ewing et al. 1996), cross-classification analysis (Walker and Peng 1991, Rengaraju and Satyakumar 1995), discrete choice models (Zhao 2000), fuzzy logic models, and artificial neural networks (Huisken 2000). However, of these methods, cross-classification analysis (CCA) is the most widely used in practice (Rengaraju and Satyakumar 1994).

Cross-classification analysis involves the use of trip rates (i.e., trips per person or trips per household) to compute regional travel demand. Recognizing the heterogeneity in regional populations, the approach first divides the population into relatively homogeneous groups or categories based on two or three household attributes. Thereafter, a trip rate is calculated for each relatively homogeneous group. The technique is non-parametric in that it does not assume any probabilistic distributional relationship between the dependent and explanatory variables. Furthermore, the method makes use of the raw data obtained from a household travel behavior survey directly, and its simplicity has made it attractive to practitioners (Rengaraju and Satyakumar 1994).

The method, however, has its shortcomings. First, given the typical size of travel survey samples that most planning agencies have available for travel demand model development, cross classifying the sample into a large number of relatively homogeneous categories leaves some cells with few or no observations for the computation of trip rates. These problematic cells typically exist at the extreme ends of the cross-classification matrix. As an example, the proportion of households in an urban area with a single person and owning three or more vehicles is likely to be very small. A simply drawn random sample of households from the regional population may include few or no households with such characteristics. Therefore, cross classifying the travel data could result in such a cell being empty, making it impossible to estimate directly a trip rate for it. Second, the estimated trip rates of the cross-classification matrix suffer from differential reliability resulting from the differences in the numbers of households in each cell for trip-rate computation. Trip rate is the expected number of trips a household makes per day. This difference in reliability could result in counterintuitive trip-rate progressions in the trip-rate matrix.

These two shortcomings, among others documented in the literature, have spurred researchers to investigate new techniques for improving upon the basic model. Examples of these studies include those by Rengaraju and Satyakumar (1994), Kikuchi and Rhee (2003), and Stopher and McDonald (1983). The best known of these methods, proposed by Stopher and McDonald (1983), makes use of multiple classification analysis (MCA). However, its implementation also raises concerns. First, it modifies all the trip rates obtained using the CCA procedure, notwithstanding several cells in the matrix having adequate data for computation of reliable trip rates. Second, implementation of the MCA procedure sometimes results in the computed trip rate for some cells of the classification matrix having a negative sign, which is not meaningful. The analyst addresses the resultant negative trip-rates problem by assigning a zero trip rate to such a cell (Ortuzar and Willumsen 2001). Assigning zeros to cells that either had values earlier or were empty in CCA is unrealistic. Kikuchi and Rhee (2003) applied a fuzzy optimization method to synthesize missing cell values and adjust cell values with abnormal behavior when compared to neighboring cells.
However, the fuzzy optimization method, like the MCA, changes the cell values of the entire classification matrix instead of only the cells with inadequate data. Additionally, the fuzzy optimization technique requires knowledge of a programming language and is therefore not readily accessible to transportation planners, which limits its use by practitioners. Thus, while these attempts to remedy the weaknesses of CCA are recognized, the problem of adjusting the trip rates that are derived from the observed sample persists. Additionally, it appears that no study has investigated both the short-term and long-term forecast performance of the methods proposed to remedy the shortcomings of CCA. Guevara and Thomas (2007) recommended not using the MCA method proposed by Stopher and McDonald (1983). However, their recommendation was based in part on analysis done using data from a single origin-destination survey. Further, they conducted their model evaluation using forecasted land use scenarios and not observed land use and travel characteristics.

The above discussion motivates an investigation into alternative methods, or modifications of existing methods, for synthesizing trip rates for cross-classification cells with no data that do not require the modification of trip-rate values for cells with adequate data. Specific objectives of the paper are, first, to develop trip generation models using CCA and MCA, respectively, and to compare how the models perform in the prediction of travel in the base year. The second objective is to compare the performance of both CCA and MCA models in short-term and long-term forecast applications. The third is to present alternative methods for addressing the shortcomings of CCA and to compare the forecast performance of these alternative methods against the models developed using CCA and MCA, respectively.

The rest of the paper is organized as follows. In the second section, the theory underlying the existing and proposed methods for estimating a trip rate for a cross-classification cell with no data is presented.
The third section presents the descriptive analysis of the travel data used in the research. The fourth section presents the model estimation results and results from applying the alternative methods in predicting travel. Finally, the last section presents a summary and conclusions drawn from the study.

ALTERNATIVE MODELS FOR SYNTHESIZING TRIP RATES FOR CROSS-CLASSIFICATION CELLS WITH NO DATA

This section presents a brief description of the theory underlying the alternative models investigated in this study. The existing models considered in this study include CCA and MCA models. The current practice is to employ the MCA technique that modifies the whole cross-classification trip-rate matrix. However, MCA can also be used to estimate trip rates for empty cells and unreliable cells. Therefore, this study makes use of MCA models and techniques employed in estimating missing values to compute trip rates for empty and less reliable cells. The techniques for estimating missing values investigated in this research are Multiple Imputation (MI) and K-Nearest Neighbor (KNN). The theory of each of these methods is discussed in turn below.

Cross-Classification Analysis

As discussed in the introduction, CCA involves the computation of trip rates, typically at the household level. However, recognizing the heterogeneity in travel behavior that exists among households in an urban region, households are grouped according to two or more characteristics that are strongly associated with trip-making behavior. Households belonging to each defined group are therefore assumed relatively similar in trip-making behavior. The model's basic assumption is that household trip rates remain stable over time for defined household stratifications. It should be noted that the model could be developed for each trip purpose. However, in this research, we consider trips made across all trip purposes by a household and two household attributes for defining groups of similar travel behavior.
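The grouping just described can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the record layout `(size, vehicles, trips)`, the function name `cca_trip_rates`, and the sample values are all assumptions made for the example. Each cell's rate is the mean number of trips of the households that fall in it.

```python
from collections import defaultdict

def cca_trip_rates(households):
    """Mean trips per household for each (household size, vehicles) cell.

    `households` is an iterable of (size, vehicles, trips) tuples; this
    record layout is a hypothetical one chosen for the sketch.
    """
    totals = defaultdict(float)   # total trips observed in each cell
    counts = defaultdict(int)     # number of households in each cell
    for size, vehicles, trips in households:
        totals[(size, vehicles)] += trips
        counts[(size, vehicles)] += 1
    # Cells with no households never appear, so they receive no rate at all.
    rates = {cell: totals[cell] / counts[cell] for cell in counts}
    return rates, dict(counts)

# Hypothetical survey records, for illustration only.
sample = [(1, 0, 2), (1, 0, 3), (1, 1, 4), (2, 1, 6), (2, 1, 5), (2, 2, 7)]
rates, cell_counts = cca_trip_rates(sample)
```

In this toy sample the cell for single-person households with two vehicles is absent from `rates`, which mirrors the empty-cell problem raised in the introduction, and `cell_counts` exposes how unevenly the remaining cells are populated.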
The household trip rate for each defined group is calculated as:

(1)  $$\bar{y}_{mn} = \frac{\sum_{h=1}^{H_{mn}} y^{h}_{mn}}{H_{mn}}$$

Where
m, n = values of the two household attributes used in defining homogeneous groups (cells)
$\bar{y}_{mn}$ = trip rate for the cell of the cross-classification matrix with household attribute values mn
$y^{h}_{mn}$ = trips made by household h in cell mn
$H_{mn}$ = total number of households in cell mn

Multiple Classification Analysis

MCA is similar to multiple regression analysis with dummy variables. The approach is applicable where the dependent variable is quantitative and the explanatory variables are categorical, represented by dummy variables. Therefore, MCA with one categorical variable is equivalent to one-way Analysis of Variance (ANOVA); similarly, MCA with two categorical variables corresponds to two-way ANOVA (Retherford and Choe 1993). Stopher and McDonald (1983), as a remedy to the shortcomings of CCA, were the first to apply the technique in trip generation analysis. Thereafter, several researchers (Ortuzar and Willumsen 2001, Wardman and Preston 2001, Abdel-Aal 2004) applied the method. However, none of the mentioned studies used MCA to estimate trip rates for empty and/or unreliable cells only. Rather, they employed it to modify the whole trip-rate matrix. The general mathematical form of the MCA model is expressed as:

(2)  $$\bar{y}_{mn} = G_{\mu} + \alpha_m + \beta_n + \varepsilon_{mn}$$

Where
$\bar{y}_{mn}$ = the trip rate for a cell in a cross-classification matrix with household attribute values mn
$G_{\mu}$ = the grand mean of trips made by the households in the dataset
$\alpha_m$ = the column effect for column m of a cross-classification matrix
$\beta_n$ = the row effect for row n of a cross-classification matrix
$\varepsilon_{mn}$ = error term

For comparison purposes, this study reviews and investigates three MCA models designated as MCA1, MCA2, and MCA3. The first, MCA1, takes the following form (Guevara and Thomas 2007).

(3)  $$\bar{y}_{mn} = G_{\mu} + \alpha_m + \beta_n, \quad m \in M,\ n \in N$$

Where

(4)  $$G_{\mu} = \frac{\sum_{h=1}^{H} y^{h}}{H}$$

(5)  $$\alpha_m = \frac{\sum_{n \in N} y_{mn}}{\sum_{n \in N} H_{mn}} - G_{\mu}$$

(6)  $$\beta_n = \frac{\sum_{m \in M} y_{mn}}{\sum_{m \in M} H_{mn}} - G_{\mu}$$

N, M = the respective number of classes for the two stratification variables
n, m = the values of two household attributes used in defining homogeneous groups (cells)
H = the total number of households
$y^{h}$ = the trips made by household h
$G_{\mu}$ = the grand mean of trips made by the households in the dataset
$\alpha_m$ = column effect for column m of a cross-classification matrix
$\beta_n$ = row effect for row n of a cross-classification matrix
$\varepsilon_{mn}$ = error term

The second MCA model, MCA2, takes the same mathematical form as the first one except that the row and column effects are calculated as weighted means, which therefore take into consideration the unequal number of observations in the cells of the cross-classification matrix (Stopher and McDonald 1983, Guevara and Thomas 2007).

(7)  $$\alpha_m = \frac{\sum_{n \in N} w_{mn}\,\bar{y}_{mn}}{\sum_{n \in N} w_{mn}} - G_{\mu}$$

(8)  $$\beta_n = \frac{\sum_{m \in M} w_{mn}\,\bar{y}_{mn}}{\sum_{m \in M} w_{mn}} - G_{\mu}$$

Where
$w_{mn}$ = weighting factor for cell mn
$\bar{y}_{mn}$ = trip rate for a cell in a cross-classification matrix with household attribute values mn
$G_{\mu}$ = overall mean, that is, the average number of trips per household
$\beta_n$ = row effect for row n of a cross-classification matrix
$\alpha_m$ = column effect for column m of a cross-classification matrix

The third, MCA3, is from an MCA regression of household trips on all classification variables. However, the model is slightly different from ordinary least squares in that when calculating the marginal effect of an explanatory variable, the other explanatory variables are held constant at their mean values in the entire sample (Retherford and Choe 1993). The model's mathematical form is

(9)  $$\bar{y}_{mn} = a + \sum_{n \in N} \beta_n X_n + \sum_{m \in M} \alpha_m X_m$$

Then the trip rates for the categories of variable $X_n$ are calculated as:

(10)  $$\bar{y}_{mn} = a + \sum_{n \in N} \beta_n X_n + \sum_{m \in M} \alpha_m \bar{X}_m$$

Where
$X_n$, $X_m$ = 1 if the nth or mth element of X is observed, and zero otherwise
$\bar{y}_{mn}$ = trip rate for a cell in a cross-classification matrix with household attribute values mn
$\beta_n$ = row effect for row n of a cross-classification matrix
$\alpha_m$ = column effect for column m of a cross-classification matrix

The initial classes of n and m are treated as reference classes, hence a constant a, to be estimated, is added.

Multiple Imputations (MI)

MI is a three-step approach that employs regression analysis to impute missing values (Rubin 1976). The first step is to estimate a model using observations with complete data and, thereafter, use the estimated model to fill in the missing values. The second step is to estimate a model using a complete dataset with both observed and imputed values. For this case, the analyst substitutes predicted values for the missing values to create imputed datasets. The procedure is repeated until the analyst has the desired number of imputed datasets. Usually, three to ten imputed datasets are desirable (Wayman 2003).
Finally, the estimates from steps one and two are combined to account for the uncertainty regarding the imputation. In mathematical form, the joint distribution is a function of the marginal and conditional distribution and it is represented as (Horton and Kleinman 2007):

(11)  $$f(Y, X) = f(Y^{miss}, Y^{obs} \mid X, \beta)\, P(X)$$

Where
$Y^{obs}$ = observed dependent variable (trip rates)
$Y^{miss}$ = missing dependent variable
X = vector of explanatory variables (two household attributes used in defining homogeneous groups (cells))
$\beta$ = vector of parameters
$f(Y^{miss}, Y^{obs} \mid X, \beta)$ = conditional probability distribution
$P(X)$ = marginal probability distribution
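The three MI steps can be illustrated with a deliberately simplified Python sketch. It is not the procedure used in the study: it assumes a single explanatory variable instead of two, fits an ordinary least-squares line to the complete observations, fills each missing value with the prediction plus a randomly drawn residual, and averages the K resulting estimates. A full MI procedure would also draw the regression parameters themselves; the names `fit_line` and `pooled_mean` and the sample data are hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns a, b and residuals."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, residuals

def pooled_mean(xs, ys, k=5, seed=0):
    """Impute missing ys (None) k times and average the k dataset means."""
    rng = random.Random(seed)
    # Step 1: estimate the model on complete observations only.
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, residuals = fit_line([x for x, _ in observed],
                               [y for _, y in observed])
    estimates = []
    for _ in range(k):
        # Step 2: build one completed dataset (prediction plus residual draw).
        completed = [y if y is not None else a + b * x + rng.choice(residuals)
                     for x, y in zip(xs, ys)]
        estimates.append(statistics.fmean(completed))
    # Step 3: pool the k estimates by simple averaging.
    return statistics.fmean(estimates), estimates

# Hypothetical data: trip rate observed for sizes 1 to 3, missing for size 4.
sizes = [1, 2, 3, 4]
trip_rates = [2.0, 4.0, 6.0, None]
pooled, per_dataset = pooled_mean(sizes, trip_rates, k=5)
```

The default of five imputed datasets sits inside the three-to-ten range cited from Wayman (2003); the seed is fixed only so that repeated runs of the sketch give the same draws.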

The final imputed estimate is the combined estimate that follows Rubin's procedure (Rubin 1976), which is a simple average of individual estimates from the observed and imputed datasets. Mathematically this is,

(12)  $$\bar{y} = \frac{1}{K} \sum_{i=1}^{K} \bar{y}_i$$

(13)  $$\bar{y}_{mn} = \frac{\sum_{h=1}^{H_{mn}} y^{h}_{mn}}{H_{mn}}$$

Where
K = number of imputed full datasets
All other variables are as defined earlier.

K-Nearest Neighbor (KNN)

KNN is a technique for estimating unobserved data based on the characteristics and values of the nearest observed data. The KNN technique has been widely applied in medical research and geosciences (Muhammad et al. 2004) but less so in transportation. The simplicity of the KNN method motivated its application in estimating empty cells in the trip-rate matrix. Selection of nearest cells is determined based on similarity in characteristics between the filled nearest cells and the empty cell. For example, a missing trip rate for a single-person household with four or more vehicles may have similar characteristics to a single-person household with three vehicles, since both households have surplus vehicle supply. Therefore, a missing cell value is computed by weighting the predetermined nearest cell values as follows,

(14)  $$\bar{y}_{mn} = \frac{\sum_{m'n' \in NM} w_{m'n'}\,\hat{y}_{m'n'}}{\sum_{m'n' \in NM} w_{m'n'}}$$

(15)  $$w_{m'n'} = \frac{o_{m'n'}}{\sigma^{2}_{m'n'}}$$

Where
$\sigma^{2}_{m'n'}$ = variance estimate for the m'n'-th nearest cell
$o_{m'n'}$ = number of observations in the m'n'-th nearest cell
All other variables are as defined earlier.

DATA

The research uses five cross-sectional datasets collected in different years (1965, 1981, 1990, 1996, and 2000) in the San Francisco Bay Area. The 1965 dataset has information on more than 20,000 households, while the 1981 dataset has information on more than 7,000 households. The 1990 dataset has information on more than 9,000 households, while the 1996 dataset is the smallest sample with information on a little more than 3,600 households. Finally, the 2000 dataset has information on more than 15,000 households. The analysis presented below uses the sample data and unlinked trips.
Information on linked trips and the trip-linking procedure are in MTC (2003). The five datasets are comparable since the region has remained relatively stable in terms of geographic area. However, the survey instrument changed from home interview to telephone interview (1981 onward), and trip recall to activity diary (1996 and 2000 are activity-based surveys). In the context of how the alternative modeling methods are to be assessed in this study, the differences in instruments are unlikely to pose any problems.
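As a concrete illustration of the KNN estimator in Equations (14) and (15), the short Python sketch below computes a missing cell value as a weighted mean of its nearest cells, with each weight equal to the number of observations divided by the variance estimate. The function name `knn_cell_rate` and the neighbor values are hypothetical; how the nearest cells are chosen is left to the analyst, as described in the KNN discussion.

```python
def knn_cell_rate(neighbors):
    """Weighted mean of neighboring cell rates with weight o / sigma^2.

    `neighbors` is a list of (trip_rate, n_observations, variance) tuples,
    one per predetermined nearest cell.
    """
    weights = [n_obs / variance for _, n_obs, variance in neighbors]
    weighted_sum = sum(w * rate for w, (rate, _, _) in zip(weights, neighbors))
    return weighted_sum / sum(weights)

# Hypothetical neighbors: (trip rate, number of observations, variance).
neighbors = [(4.0, 80, 4.0), (6.0, 20, 2.0)]
estimate = knn_cell_rate(neighbors)  # weights are 20 and 10
```

A neighbor with many observations and a small variance dominates the estimate, so a sparsely populated neighboring cell contributes little even when it is the closest match in household characteristics.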

Trip Rate Distribution

With the exception of 1981, Figure 1 shows that the household trip rate in the Bay Area remained relatively stable. There is a noticeable decrease in household trip rates in 1965 compared with 1981. Purvis (1994) reported that other major cities, namely Dallas and Denver, exhibited the same pattern in trip-making behavior and noted this decrease in trip rate. However, at the individual level, there is a progressive increase in trip rate from 1965 to 1996, and thereafter it remained stable.

Figure 1: Household and Person Trip Rate by Survey Year (vertical axis: trip rate; horizontal axis: year, 1965 to 2000; series: Person, Household)

Household Size

Household size affects travel demand; on average, the larger a household, the greater its activity needs and the number of trips made. Figure 2(a) shows household size distribution across the analysis years. Generally, there is an increase in single-person households and a decrease in households of four or more persons from 1965 to 2000. Although the trip rate increases with household size, it increases at different rates over the analysis years across different household groups. For example, there is a dramatic increase in travel demand from 1981 to 1990 for households with three to four persons. This increase is partly explained by a more than 10% increase in the working age group (age 36 to 55), a small trip-rate increase of 0.43 trips for single-person households, and an increase of 1.30 trips for two-person households. All else being equal, travel behavior was stable from 1965 to 1981 and from 1996 to 2000, as shown in Figure 2(b).

Vehicle Ownership

People purchase vehicles with the aim of increasing their mobility and activity participation. On average, the greater the number of vehicles owned by a household, the greater the number of trips they are likely to make by vehicle. As observed, the percentage of households with no vehicle was higher in 1981 than in 1965.
With the exception of the 1990 household trip rates, there is a consistent, although minor, increase in trip rate for zero-vehicle households from 1965 to 2000, and a stable trip rate for households with one or more vehicles. Households with three or more vehicles had a much higher trip rate in 1990 than in any of the other years. Figure 3 is a graphical summary of these details.

Figure 2: (a) Household Size Distribution by Survey Year (b) Trip Rate by Household Size for Each Survey Year

Variable Selection

The 1965 dataset has eight potential explanatory variables. The objective was to select two or three that could capture most of the variation in household trips. In accomplishing this objective, the study uses the analysis of variance (ANOVA) procedure, and the results are in Appendix A. At the 5% level of significance, the results show that house tenure (own or rent) and dwelling type, with respective probabilities of 0.2942 and 0.2556, were not statistically significant. Variables that were statistically significant are household size, number of household members with driver's licenses, household income, the number of motorcycles owned by a household, and the number of vehicles owned by a household. Appendix A shows that the number of households with driver's licenses correlates moderately with the number of vehicles owned by a household. Consequently, the analysis uses the number of vehicles owned by a household and household size as the stratification variables.

EMPIRICAL TEST

Test Procedure

The assessment of the performance of the alternative methods for developing a cross-classification model for trip generation involved five steps. In the first step, the study estimates the CCA model and the three MCA models with the 1965 data using the household as the modeling unit. The second step uses each of the four models to predict travel collectively made by households in each classification cell and by households in each traffic analysis zone in 1965, respectively. The latter assessment of model performance at the traffic zone level is important because trip distribution, a step in the four-step UTMS, requires as input trip productions and trip attractions at the traffic zone level. In the third step, the study uses the four models in step one to predict household travel in each cross-classification cell in 1981, 1990, 1996, and 2000, respectively. In the fourth step, each of the methods proposed for synthesizing trip rates for cross-classification cells with little or no data was applied to predict the trip rate for only those cells of the traditional cross-classification matrix considered unreliable. The household trip rates from the traditional CCA were preserved for the cells with enough observations. Finally, the fifth step uses the cross-classification matrices from the fourth step to predict household travel in all the years for which data were available.

Figure 3: (a) Vehicle Ownership Distribution by Survey Year (b) Trip Rate by Vehicle Ownership for Each Survey Year

RESULTS AND DISCUSSION

Estimated Models

The results of the model estimation in the first step are in Table 1. They show that each cell has a different number of observations. For example, for single-person households, the sample size for those with no vehicle is 1,062, whereas for those with four or more vehicles it is 11. Given that the reliability of the trip rate for each cell is a function of the number of observations in the cell, it is apparent that there are differences in cell reliability in the CCA model due to each cell having a different number of observations.

In the descriptive analysis presented earlier, there was a monotonically increasing relationship between trip rate and household size (Figure 2b). A similar relationship was observed between trip rate and vehicle ownership (Figure 3b). However, this increasing trend is not consistently observed when one examines how trip rates that are conditional on a specific household size vary with vehicle ownership, or how trip rates that are conditional on a specific vehicle ownership level vary with household size in the CCA matrix. As an example, Figure 2b shows that for a household size of one (single-person household), the trip rate increases with increasing vehicle ownership until a household vehicle ownership level of three, when it drops, and then increases thereafter for those single-person households that have four or more vehicles. A similar observation in Figure 3b regards the relationship between trip rate and household vehicle ownership for two-person households. As expected, Table 1 shows a counterintuitive progression in trip rates in the less reliable cells (e.g., single-person household with three vehicles).

To address this problem, the practice is to employ MCA, and the results of doing so for the different methods are in Table 1. MCA2 yielded a trip rate for single-person households with no vehicle that has a negative sign. Since a negative trip rate is unrealistic, the practice is to set it to zero (Ortuzar and Willumsen 2001). However, for illustration, Table 1 preserves this negative-valued trip rate, although in the forecasting analysis done later it is set to zero.
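The cell-replacement rule from step four of the test procedure, together with the zero floor applied to negative synthesized rates, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `hybrid_matrix` and the reliability threshold `min_obs` are hypothetical, since the text here does not give the cutoff used to call a cell unreliable. The sample row uses the single-person household values for CCA, MCA1, and cell counts reported in Table 1.

```python
def hybrid_matrix(cca_rates, synthesized, cell_counts, min_obs):
    """Keep CCA rates for well-observed cells; substitute elsewhere.

    Cells with fewer than `min_obs` households take the synthesized rate,
    floored at zero because a negative trip rate is not meaningful.
    """
    result = []
    for cca_row, syn_row, n_row in zip(cca_rates, synthesized, cell_counts):
        row = []
        for cca, syn, n in zip(cca_row, syn_row, n_row):
            row.append(cca if n >= min_obs else max(syn, 0.0))
        result.append(row)
    return result

# Single-person households, vehicles 0, 1, 2, 3, 4+ (values from Table 1).
cca = [[2.201, 3.644, 4.756, 3.941, 5.000]]
mca1 = [[1.091, 3.108, 4.567, 5.290, 5.486]]
counts = [[1062, 1367, 82, 17, 11]]

# min_obs = 30 is an assumed threshold, chosen only for the example.
mixed = hybrid_matrix(cca, mca1, counts, min_obs=30)
```

With this assumed threshold only the two sparsest cells (17 and 11 households) are replaced, so the well-observed CCA rates are left untouched, which is the property the fourth step is designed to preserve.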
MCA1 and MCA3 yielded household trip rates with trends that are consistent with expectation; that is, higher household trip rates for higher values of vehicle ownership and household size, respectively.

Prediction of Travel at the Household and Traffic Zone Level in 1965

Travel demand models need to provide accurate predictions to guide decision makers in infrastructure investment decisions. Therefore, the four models were applied in turn to predict travel in the base year at the disaggregate household level, and their performance was judged based on the coefficient of determination ($R^2$) and the percent mean absolute error (PMAE), calculated as

\[
\mathrm{PMAE} = \frac{100}{N \times M} \sum_{m \in M} \sum_{n \in N} \frac{\left| y_{mn}^{pred} - y_{mn}^{obs} \right|}{y_{mn}^{obs}} \tag{16}
\]

where
$y_{mn}^{pred}$ = predicted number of trips made by households in cell mn
$y_{mn}^{obs}$ = observed number of trips made by households in cell mn
$N, M$ = the respective number of classes for the two stratification variables

Table 2 shows an assessment of the accuracy of each of the four models in predicting the trips made by each household in 1965. Of the three MCA models, MCA1 has the smallest PMAE while MCA2 has the largest. Contributing to the high PMAE value of MCA2 was its complete failure to predict any trips made by single-person households with no vehicles. (Column three of row four in Table 1 has a negative trip rate that is set to zero in forecasting.) Also shown in columns two and three are the results of regressing the observed number of daily trips made by the households in each cross-classification cell against the number of daily trips predicted for those households by each model (CCA, MCA1, MCA2, or MCA3). They indicate that for CCA, MCA1, and MCA3, the estimated slope coefficients are almost one, while the slope coefficient of MCA2 is about 0.92. Based on the coefficients of determination, MCA1, MCA2, and MCA3 explain the variation in household trips in the 1965 data very well; MCA1 and MCA3

explain more than 99% of the variation in household trips in the 1965 dataset, while MCA2 explains about 97.64% of the variation.

Table 1: Estimated CCA and MCA Models Using 1965 Data

Household                       Number of Vehicles
Size       Model                        0        1        2        3       4+    Total
1          CCA                      2.201    3.644    4.756    3.941    5.000    3.084
           MCA1                     1.091    3.108    4.567    5.290    5.486    3.908
           MCA2                    -1.675    1.827    4.757    6.586    7.314    3.084
           MCA3                     2.028    3.710    5.182    6.404    6.571    4.376
           No. of obs. in cell       1062     1367       82       17       11     2539
2          CCA                      3.658    5.417    6.216    7.161    6.889    5.603
           MCA1                     3.051    5.068    6.527    7.250    7.445    5.868
           MCA2                     0.844    4.346    7.276    9.105    9.833    5.603
           MCA3                     3.446    5.128    6.601    7.823    7.989    5.794
           No. of obs. in cell        556     3016     2081      193       54     5900
3          CCA                      5.406    7.240    8.159    8.940    8.978    7.773
           MCA1                     4.927    6.944    8.403    9.126    9.322    7.744
           MCA2                     3.014    6.516    9.446   11.275   12.003    7.773
           MCA3                     5.181    6.863    8.335    9.558    9.724    7.529
           No. of obs. in cell        175     1543     1575      448       89     3830
4          CCA                      6.774    9.111   10.999   11.797   12.046   10.362
           MCA1                     7.328    9.345   10.804   11.527   11.723   10.145
           MCA2                     5.603    9.105   12.034   13.864   14.592   10.362
           MCA3                     7.613    9.295   10.768    11.99   12.156    9.961
           No. of obs. in cell        106     1283     1818      424      130     3761
5+         CCA                      9.173   11.883    14.46   16.367   16.270   13.769
           MCA1                    10.813   12.830   14.289   15.012   15.208   13.630
           MCA2                     9.010   12.512   15.441   17.271   17.999   13.769
           MCA3                    10.981   12.663   14.135   15.358   15.524   13.329
           No. of obs. in cell        139     1444     2137      523      211     4454
Total      CCA                      3.587    7.089   10.018   11.848   12.576    8.346
           MCA1                     5.442    7.459    8.918    9.641    9.837    8.259
           MCA2                     3.587    7.089   10.018   11.848   12.576    8.346
           MCA3                     5.998    7.680    9.153   10.375   10.541    8.346
           No. of obs. in cell       2038     8653     7693     1605      495    20484

Note: CCA-Cross Classification Analysis; MCA1-Multiple Classification Analysis Model 1; MCA2-Multiple Classification Analysis Model 2; MCA3-Multiple Classification Analysis Model 3.

In the conventional four-step UTMS modeling approach, regardless of the unit employed in trip generation, the predicted trips by households are aggregated to traffic zone levels for input into the trip distribution or modal choice step.
Consistent with this procedure, the study aggregates the trips predicted for households by each of the four models to the traffic zone level. Afterwards, the observed zonal trips were regressed against the predicted zonal trip productions yielded by each of the four models. A summary of the results is in the bottom half of Table 2. Based on the coefficient of

determination ($R^2$), CCA, MCA1, and MCA3 explain over 96% of the variance in the observed trips at the traffic zone level, while MCA2 explains slightly over 95% of this variance. Using the percent mean absolute error measure (PMAE), the CCA model yields the lowest error measure of 12.373. The MCA models, which were supposed to address the shortcomings of CCA, yield zonal predictions of trips that have greater error compared with the CCA model. The results from regressing the observed zonal trips against the predicted zonal trips using the four models are presented in columns two and three of the bottom half of Table 2.

Table 2: Performance of 1965 CCA and MCA Models in Predicting Trips at Household and Traffic Zone Levels in 1965

Household Level
Model                 Intercept      Slope         R2       PMAE
CCA                        0.00     1.0000     1.0000      0.000
MCA1                     140.68     0.9939     0.9957      9.264
  Standard Error         140.00     0.0136
  t-value                  1.00    73.0700
MCA2                     466.63     0.9222     0.9764     26.887
  Standard Error            323     0.0299
  t-value                  1.44    30.8400
MCA3                     -61.52     1.0090     0.9972      9.605
  Standard Error         114.00     0.0111
  t-value                 -0.54    90.6100

Traffic Zone Level
Model                 Intercept      Slope         R2       PMAE
CCA                      -15.97     1.0266     0.9661     12.373
  Standard Error           8.39     0.0115
  t-value                 -1.90    89.5900
MCA1                     -10.07     1.0318     0.9633     12.895
  Standard Error           8.68     0.0120
  t-value                 -1.16    86.0600
MCA2                       8.04     0.9968     0.9556     14.730
  Standard Error           9.38     0.0128
  t-value                  0.86    77.9100
MCA3                     -16.80     1.0280     0.9654     12.758
  Standard Error           8.48     0.0116
  t-value                 -1.98    88.7100

Note: CCA-Cross-Classification Analysis; MCA1-Multiple Classification Analysis Model 1; MCA2-Multiple Classification Analysis Model 2; MCA3-Multiple Classification Analysis Model 3.
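The PMAE measure of Equation 16 can be illustrated with the cell values of Table 1. The following is a minimal sketch, not the authors' code. It assumes that the observed trips in a cell equal the CCA trip rate (the cell mean) times the number of households in the cell, and that negative trip rates are set to zero before prediction, as the text describes for MCA2. The function names `cell_trips` and `pmae` are hypothetical.

```python
# Values transcribed from Table 1. Rows: household size 1, 2, 3, 4, 5+;
# columns: 0, 1, 2, 3, 4+ vehicles.
COUNTS = [
    [1062, 1367, 82, 17, 11],
    [556, 3016, 2081, 193, 54],
    [175, 1543, 1575, 448, 89],
    [106, 1283, 1818, 424, 130],
    [139, 1444, 2137, 523, 211],
]
CCA = [
    [2.201, 3.644, 4.756, 3.941, 5.000],
    [3.658, 5.417, 6.216, 7.161, 6.889],
    [5.406, 7.240, 8.159, 8.940, 8.978],
    [6.774, 9.111, 10.999, 11.797, 12.046],
    [9.173, 11.883, 14.460, 16.367, 16.270],
]
MCA1 = [
    [1.091, 3.108, 4.567, 5.290, 5.486],
    [3.051, 5.068, 6.527, 7.250, 7.445],
    [4.927, 6.944, 8.403, 9.126, 9.322],
    [7.328, 9.345, 10.804, 11.527, 11.723],
    [10.813, 12.830, 14.289, 15.012, 15.208],
]
MCA2 = [
    [-1.675, 1.827, 4.757, 6.586, 7.314],
    [0.844, 4.346, 7.276, 9.105, 9.833],
    [3.014, 6.516, 9.446, 11.275, 12.003],
    [5.603, 9.105, 12.034, 13.864, 14.592],
    [9.010, 12.512, 15.441, 17.271, 17.999],
]


def cell_trips(rates, counts):
    """Trips per cell = trip rate x households; negative rates are set to zero."""
    return [
        [max(rate, 0.0) * n for rate, n in zip(rate_row, count_row)]
        for rate_row, count_row in zip(rates, counts)
    ]


def pmae(predicted, observed):
    """Percent mean absolute error over all cells, following Equation 16."""
    pairs = [
        (p, o)
        for pred_row, obs_row in zip(predicted, observed)
        for p, o in zip(pred_row, obs_row)
    ]
    return 100.0 * sum(abs(p - o) / o for p, o in pairs) / len(pairs)


# Assumption: observed trips in a cell = CCA rate (cell mean) x households.
observed = cell_trips(CCA, COUNTS)
pmae_mca1 = pmae(cell_trips(MCA1, COUNTS), observed)
pmae_mca2 = pmae(cell_trips(MCA2, COUNTS), observed)
print(f"MCA1 PMAE: {pmae_mca1:.3f}")
print(f"MCA2 PMAE: {pmae_mca2:.3f}")
```

Under these assumptions the sketch gives roughly 9.26 for MCA1 and 26.89 for MCA2, in line with the household-level PMAE column of Table 2. Because the household count appears in both the numerator and the denominator of each term, the same values follow from the trip rates alone.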

Forecast Performance of Alternative Models

The models estimated in the first step were then used to forecast travel in the years for which data were available. The time lag between 1965 and 1981, 1990, 1996, and 2000 provided for an assessment of the medium- to long-term forecast performance of these models. The results of the analyses are in Table 3. The results of regressing the observed number of daily trips made by households in each cross-classification cell against the corresponding number of daily trips predicted by the models (CCA, MCA1, MCA2, or MCA3) are in columns three to nine of this table.

Examining the 1981 results, with the exception of MCA2, all the models explained in excess of 98% of the variation in the observed trips made by households. Values of the percent mean absolute error measure in column 10 of Table 3 were smallest for 1981 compared with those in any of the other years, for all the models. Focusing on 1981, CCA had the smallest percent mean absolute error value, 11.69%, as shown in the tenth column of the second row of Table 3.

For 1990, the models explain slightly more than 91% of the variation in household trips, which is about 7% less than the explained variation using the 1981 dataset for the CCA, MCA1, and MCA3 models. Additionally, the error measures for CCA, MCA1, and MCA3 in 1990 are approximately double their corresponding values in 1981, while that for the MCA2 model declines. The CCA model performs better than the MCA models for the 1990 application.

The three models CCA, MCA1, and MCA3 explain in excess of 96% of the trip variation at the household level using the household trip data in 1996. MCA2, on the other hand, explains 85% of the variation in the trip data, which is about 11% less than that for the other three models. In terms of the error measure, the CCA model ranks first because it has the lowest percent average error, followed by MCA1 and then MCA3. The MCA2 model yields the highest percent average error and therefore ranks fourth.
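The regressions of observed on predicted cell trips described above can be sketched with ordinary least squares. The example below is an illustrative sketch rather than the authors' procedure. It uses the 1965 values of Table 1 for MCA1, assumes that observed cell trips equal the CCA rate times the number of households in the cell, and relies on a hypothetical helper `ols` written with the standard simple-regression formulas.

```python
import math

# Table 1 values flattened row by row
# (household size 1, 2, 3, 4, 5+; within each row, 0, 1, 2, 3, 4+ vehicles).
COUNTS = [1062, 1367, 82, 17, 11, 556, 3016, 2081, 193, 54,
          175, 1543, 1575, 448, 89, 106, 1283, 1818, 424, 130,
          139, 1444, 2137, 523, 211]
CCA = [2.201, 3.644, 4.756, 3.941, 5.000, 3.658, 5.417, 6.216, 7.161, 6.889,
       5.406, 7.240, 8.159, 8.940, 8.978, 6.774, 9.111, 10.999, 11.797, 12.046,
       9.173, 11.883, 14.460, 16.367, 16.270]
MCA1 = [1.091, 3.108, 4.567, 5.290, 5.486, 3.051, 5.068, 6.527, 7.250, 7.445,
        4.927, 6.944, 8.403, 9.126, 9.322, 7.328, 9.345, 10.804, 11.527, 11.723,
        10.813, 12.830, 14.289, 15.012, 15.208]


def ols(x, y):
    """Simple linear regression of y on x with standard errors and R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    s2 = sse / (n - 2)  # residual variance
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return {
        "intercept": intercept,
        "se_intercept": se_intercept,
        "t_intercept": intercept / se_intercept,
        "slope": slope,
        "se_slope": se_slope,
        "t_slope": slope / se_slope,
        "r2": 1.0 - sse / sst,
    }


# Assumption: observed trips in a cell = CCA rate (cell mean) x households.
predicted = [rate * n for rate, n in zip(MCA1, COUNTS)]
observed = [rate * n for rate, n in zip(CCA, COUNTS)]
fit = ols(predicted, observed)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

If this reading of the tables is right, the fit should land close to the MCA1 row of the household-level panel in Table 2. A slope near one and an intercept that is small relative to its standard error indicate that the predictions track the observations without systematic scaling or offset.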
The application of the models to generate long-term forecasts for 2000 yielded results similar to those obtained for 1990 and 1996. In terms of explaining trip variation at the household level, all the models performed well, explaining more than 96% of the total variation in the trips made by households. In general, the CCA model yields travel forecasts with lower error values than the forecasts of the MCA models in all the applications.

Prediction of Household Trip Rates for Cross-Classification Cells with Inadequate Data

As discussed earlier, a challenge in the use of CCA is the possibility that a number of cells of the cross-classification matrix have few or no observations. The primary concern under such circumstances should be with the problematic cells only, that is, those with little or no data, and not the entire trip-rate matrix. However, current planning practice calls for modifying the entire household trip-rate matrix obtained by CCA (Ortuzar and Willumsen 2001) rather than just the problematic cells. This study applies the same MCA models to estimate a household trip rate for each of the problematic cells only, while preserving the household trip rates obtained from ordinary CCA for the remaining cells. In addition to the MCA models, two other techniques for estimating missing cell values, namely KNN and MI, are employed for predicting household trip rates for only those empty and/or unreliable cells, and afterwards the forecast performance of all these models is assessed. It is noted that no study was found in the literature that employed MI or KNN to synthesize household trip rates for cross-classification cells with inadequate data.

In the ordinary cross-classification analysis results using the 1965 data presented in Table 1, each cell of the cross-classification matrix had observations. However, based on the threshold number of observations required for statistical reliability reported in Ortuzar and Willumsen (2001), the number of observations for three of the cells was low.
The defining characteristics of these cells are: (1) single-person households owning three vehicles, (2) single-person households owning four or more vehicles, and (3) two-person households owning four or more vehicles. Therefore, the three

MCA models, and KNN and MI in turn, were used to synthesize household trip rates for just these three cells that would otherwise have unreliable household trip rates. The remaining cells of the matrix retained their household trip rates obtained from the ordinary cross-classification analysis (CCA). The estimated household trip rates for these cells obtained by CCA, MCA, KNN, and MI are in Table 4. For each of the three cells, the estimated household trip rate by MCA1, MCA2, MCA3, KNN, or MI exceeds the corresponding household trip rate obtained by CCA.

Table 3: Performance of 1965 Models in Predicting Trips by Households in Each Cross-Classification Cell in Years 1981, 1990, 1996, and 2000

             Intercept                     Slope
Year  Model       Coef. Std. Err.   t-value     Coef. Std. Err.   t-value        R2      PMAE
1981  CCA         39.89        37      1.09    1.0188    0.0138     73.84    0.9960     11.69
      MCA1       112.47        78      1.43    1.0156    0.0304     33.37    0.9800     15.98
      MCA2       365.71       181      2.02    0.8775    0.0668     13.14    0.8820     33.18
      MCA3        21.98        52      0.42    1.0247    0.0200      51.3    0.9910     16.00
1990  CCA         18.15       265      0.07    1.2393    0.0719     17.24    0.9280     27.12
      MCA1       240.30       287      0.83    1.2377    0.0811     15.26    0.9100     30.14
      MCA2       287.76       268      1.07    1.0909    0.0670     16.27    0.9200     32.40
      MCA3        42.40       249      0.17    1.2059    0.0659     18.31    0.9360     31.17
1996  CCA         32.93        53      0.62    1.3164    0.0385     34.23    0.9807     27.62
      MCA1        58.80        68      0.86    1.3191    0.0503     26.21    0.9676     29.87
      MCA2       145.59       145      1.00    1.1724    0.1015     11.55    0.8529     34.54
      MCA3        46.13        70      0.66    1.2681    0.0490     25.88    0.9668     28.96
2000  CCA        173.75       221      0.79    1.2776    0.0337     37.95    0.9840     28.76
      MCA1       354.12       247      1.43    1.2588    0.0377     33.39    0.9800     31.47
      MCA2       467.17       334      1.40    1.1201    0.0460     24.36    0.9630     31.91
      MCA3       285.78       281      1.02    1.2229    1.2229     29.41    0.9740     31.45

Note: CCA-Cross Classification Analysis; MCA1-Multiple Classification Analysis Model 1; MCA2-Multiple Classification Analysis Model 2; MCA3-Multiple Classification Analysis Model 3.
Further, replacing the household trip rates obtained by CCA for the three problematic cells with those yielded by any of the models results in the expected increasing relationship between household trip rates and household size or vehicle ownership.

Forecast Performance of Household Trip-Rate Matrices Developed

Table 5 presents the values of the measures for evaluating the accuracy of household trip predictions given by the five alternative models. The measures are evaluated using the observed trips and the predicted trips made by households in each cross-classification cell.

Evaluation of the accuracy of predictions of travel in 1965. In the base year (1965), MCA1 had the lowest PMAE value and therefore the best performance in predicting travel based on this measure. It is followed by MI. MCA2 had the worst performance in predicting travel, reflected by its having the highest PMAE value. The coefficient of determination is one for all the models, indicating that each explained all the variation in household trips in the base year.
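The cell-replacement step described above, in which only the unreliable cells receive a synthesized trip rate, can be sketched as follows. This is a minimal illustration using the CCA and MCA1 values of Table 1. The threshold `MIN_OBS = 60` is an assumption chosen only because it separates the three cells named in the text from the rest; the actual threshold from Ortuzar and Willumsen (2001) is not stated in this section. The function name `hybrid_matrix` is hypothetical.

```python
# Values transcribed from Table 1. Rows: household size 1, 2, 3, 4, 5+;
# columns: 0, 1, 2, 3, 4+ vehicles.
HH_SIZES = ["1", "2", "3", "4", "5+"]
VEHICLES = ["0", "1", "2", "3", "4+"]
COUNTS = [
    [1062, 1367, 82, 17, 11],
    [556, 3016, 2081, 193, 54],
    [175, 1543, 1575, 448, 89],
    [106, 1283, 1818, 424, 130],
    [139, 1444, 2137, 523, 211],
]
CCA = [
    [2.201, 3.644, 4.756, 3.941, 5.000],
    [3.658, 5.417, 6.216, 7.161, 6.889],
    [5.406, 7.240, 8.159, 8.940, 8.978],
    [6.774, 9.111, 10.999, 11.797, 12.046],
    [9.173, 11.883, 14.460, 16.367, 16.270],
]
MCA1 = [
    [1.091, 3.108, 4.567, 5.290, 5.486],
    [3.051, 5.068, 6.527, 7.250, 7.445],
    [4.927, 6.944, 8.403, 9.126, 9.322],
    [7.328, 9.345, 10.804, 11.527, 11.723],
    [10.813, 12.830, 14.289, 15.012, 15.208],
]

# Hypothetical reliability threshold (an assumption, not the source's value).
MIN_OBS = 60


def hybrid_matrix(cca, donor, counts, min_obs):
    """Keep CCA rates, but take the donor model's rate in cells with few observations."""
    flagged = [
        (i, j)
        for i, row in enumerate(counts)
        for j, n in enumerate(row)
        if n < min_obs
    ]
    hybrid = [row[:] for row in cca]  # copy so the CCA matrix is left untouched
    for i, j in flagged:
        hybrid[i][j] = donor[i][j]
    return hybrid, flagged


hybrid, flagged = hybrid_matrix(CCA, MCA1, COUNTS, MIN_OBS)
for i, j in flagged:
    print(f"H={HH_SIZES[i]}, V={VEHICLES[j]}: "
          f"CCA {CCA[i][j]:.3f} -> MCA1 {hybrid[i][j]:.3f}")
```

Any of the other methods (MCA2, MCA3, KNN, or MI) could serve as the donor in the same way by passing a different matrix of estimated rates; the design point is that all cells with enough observations keep their ordinary CCA rates.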

Table 4: Predicted Household Trip Rates for Cells of 1965 Trip-Rate Matrix with Inadequate Data Given by Alternative Models

Model                                       H = 1, V = 3   H = 1, V = 4+   H = 2, V = 4+
Cross Classification Analysis                      3.941           5.000           6.889
Multiple Classification Analysis Model 1           5.290           5.486           7.445
Multiple Classification Analysis Model 2           6.586           7.314           9.833
Multiple Classification Analysis Model 3           6.404           6.571           7.989
Multiple Imputation                                5.351           6.585           8.912
K-Nearest Neighbor                                 6.186           7.139           8.344

Note: H = size of the household; V = number of vehicles available to the household.

Evaluation of the accuracy of predictions of travel in 1981. Based on the coefficient of determination, all the models explain in excess of 99% of the variation in household trips in 1981. CCA yielded the lowest PMAE value of 11.689, indicating it had a travel forecast accuracy superior to that of the other models. MCA1 had the next lowest PMAE value, followed by MI, then KNN, and then MCA3. MCA2 had the highest PMAE value due to the rather large household trip rate estimates it gives for the three cells with inadequate data (see Table 4). For each model, the PMAE value in 1981 is higher than the corresponding value in 1965.

Evaluation of the accuracy of predictions of travel in 1990. The coefficient of determination using the predictions of household travel by each of the models ranges from 0.924 to 0.928, indicating that the models are able to explain in excess of 92.4% of the variation in household trips. The values of PMAE range from 27.117 for CCA to 29.392 for KNN. MCA1 has the second lowest PMAE value (27.260). This indicates that, based on this measure (PMAE), the CCA model of household trip rates, notwithstanding three of the cells having inadequate data, gives more accurate household travel forecasts than those given by the other models. Immediately following this is MCA1. Again, for each model, the PMAE value in 1990 is higher than the corresponding value in 1981.

Evaluation of the accuracy of predictions of travel in 1996.
The coefficient of determination evaluated using the predictions of household travel by the six models ranges from 0.975 for MCA2 to 0.981 for CCA. This indicates that the models explain in excess of 97.5% of the variation in household trips. PMAE is highest for MCA2 (31.972), indicating the worst forecast performance of household travel based on this measure. CCA has the lowest PMAE value of 27.619, indicating the best forecast performance of travel based on this measure. Immediately following it is MCA1, which has a PMAE value of 28.215. MI, with a PMAE value of 29.453, has the third best forecast performance of travel. Again, for each model, the PMAE value in 1996 is higher than the corresponding value in 1990.

Evaluation of the accuracy of predictions of travel in 2000. The coefficient of determination based on the predictions of household travel by each of the models is 0.983. This indicates that all the models are able to explain 98.3% of the variation in household trips. KNN has the highest PMAE value of 32.686, indicating the worst forecast performance of household travel based on this measure, while CCA, with a PMAE value of 28.756, has the best forecast performance of household travel. MCA1, with a PMAE value of 30.147, has the next best forecast performance of household travel.

Table 5: Performance of 1965 Alternative Models in Predicting Trips by Households in Each Cross-Classification Cell in Years 1965, 1981, 1990, 1996, and 2000

             Intercept                     Slope
Year  Model       Coef. Std. Err.   t-value     Coef. Std. Err.   t-value        R2      PMAE
1965  MCA1        -4.46         2     -2.14     1.000    0.0002      4868     1.000     2.287
      MCA2       -15.90         9     -1.83     1.001    0.0008      1193     1.000     6.245
      MCA3        -8.28         4     -2.19     1.001    0.0004      2742     1.000     4.395
      KNN         -9.78         5     -2.13     1.001    0.0004      2253     1.000     4.834
      MI          -9.64         5     -1.79     1.001    0.0005      1919     1.000     3.624
1981  CCA         39.89        37      1.09    1.0188    0.0138        74     0.996    11.689
      MCA1        35.97        36      0.98     1.020    0.0139        73     0.996    13.521
      MCA2        25.57        38      0.67     1.023    0.0145        71     0.995    16.554
      MCA3        32.38        37      0.87     1.021    72.690         0     0.996    15.279
      KNN         31.35        37      0.84     1.021    72.470         0     0.996    15.194
      MI          31.28        37      0.84     1.021    0.0141        72     0.995    14.208
1990  CCA         18.15       265      0.07    1.2393    0.0719        17     0.928    27.117
      MCA1         1.09       265      0.00     1.242    0.0720        17     0.925    27.260
      MCA2       -39.36       268     -0.15     1.250    0.0727        17     0.924    29.421
      MCA3       -15.43       266     -0.06     1.245    0.0723        17     0.925    29.126
      KNN        -19.74       267     -0.00     1.246    0.0723        17     0.925    29.392
      MI         -16.78       266     -0.06     1.245    0.0722        17     0.925    27.787
1996  CCA         32.93        53      0.62    1.3164    0.0385        34     0.981    27.619
      MCA1        23.17        54      0.42     1.320    0.0395        33     0.979    28.215
      MCA2         4.37        59      0.07     1.325    0.0432        31     0.975    31.972
      MCA3        14.02        56      0.25     1.323    0.0408        32     0.978    30.165
      KNN         12.39        56      0.22     1.323    0.0410        32     0.977    30.679
      MI          15.77        56      0.28     1.322    0.0408        32     0.978    29.453
2000  CCA        173.75       221      0.79    1.2776    0.0337        38     0.984    28.756
      MCA1       156.89       222      0.71     1.279    0.0340        37     0.983    30.147
      MCA2       127.22       227      0.56     1.281    0.0350        36     0.983    30.852
      MCA3       140.90       223      0.63     1.280    0.0342        37     0.983    32.142
      KNN        136.78       224      0.61     1.281    0.0343        37     0.983    32.686
      MI         138.87       224      0.62     1.280    0.0343        37     0.983    31.326

Note: CCA-Cross-Classification Analysis; MCA1-Multiple Classification Analysis Model 1; MCA2-Multiple Classification Analysis Model 2; MCA3-Multiple Classification Analysis Model 3; KNN-K-Nearest Neighbor; MI-Multiple Imputation.

SUMMARY AND CONCLUSIONS

This paper investigated the forecast performance of trip generation models based on cross-classification analysis (CCA) and multiple classification analysis (MCA). In addition, it examined the replacement of household trip rates in unreliable cross-classification cells with values estimated by three MCA models and two methods for estimating missing values, namely Multiple Imputation (MI) and K-Nearest Neighborhood (KNN). The results of the study lead to the following conclusions.

First, the methods that call for modifying the entire household trip rate matrix obtained from ordinary cross-classification analysis predict household travel less accurately than the methods that synthesize household trip rates only for cells with inadequate data while preserving the other household trip rates obtained from ordinary cross-classification analysis. This result is evident from comparing the upper part of Table 2 with the upper part of Table 5. Thus, it is concluded that adjusting all the trip rates of a CCA matrix using MCA, the current industry standard, results in a forecasting model that is inferior to CCA and hence should be avoided by practitioners. Whenever cells with inadequate data exist in a CCA matrix, substituting the trip rates of only these unreliable cells with trip rates obtained from the MCA models, the MI method, or KNN results in more accurate forecasts compared with adjusting the trip rates for all the cells.

Second, even though three of the cells of the ordinary cross-classification matrix had inadequate data, the ordinary CCA model surprisingly and consistently gave the best performance in the prediction of household travel in both the medium and the long term (see column 10 of Table 3). Thus, the basic CCA model is robust, and practitioners can use it to provide credible forecasts of travel if only a few of the cells of the CCA matrix are unreliable.
It may also indicate that the recommended minimum number of observations for a cell can perhaps be reduced and still lead to the development of reliable cross-classification models. It is for future research to determine the appropriate minimum number of observations for a cell.

Third, replacing the unreliable household trip rates of an ordinary CCA matrix with household trip rates estimated using the MCA models, KNN, and MI did improve upon the performance of the cross-classification model compared with adjusting all the trip rates of the CCA matrix. Among these methods for synthesizing a household trip rate, on average, MCA1 and MI have the lowest error values (column 10 of Table 5). However, since MCA1 is subject to biases (Guevara and Thomas 2007), the MI model may be preferred over MCA1 even though MCA1 may be a simpler model than the MI model.

Finally, the forecast performance of cross-sectional models declines with time. For example, the PMAE values associated with each model increased with the time interval between the base and application years (see column 10 of Table 5). This is logical because of the greater changes expected to occur in land use patterns, socio-demographic characteristics and attitudes of the population, transportation system characteristics, and technology with time elapsed from the base year. Thus, irrespective of the method used to synthesize household trip rates for unreliable cells, the further out the application year, the greater the inaccuracy of travel forecasts.

The prime limitation of this study is the single-region source of the dataset used. Clearly, to generalize the conclusions, tests have to be done on data from several other regions.
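The third and fourth conclusions both rest on column 10 of Table 5, and the underlying arithmetic can be checked directly. The sketch below is an illustration under stated assumptions: the PMAE values are transcribed from Table 5, the 1965 value for CCA (0.000) is taken from the household-level panel of Table 2 because Table 5 has no CCA row for 1965, and the "average" is assumed to be a simple unweighted mean over the five years. The variable names are hypothetical.

```python
YEARS = [1965, 1981, 1990, 1996, 2000]

# PMAE by model and year, transcribed from column 10 of Table 5.
# Assumption: the 1965 CCA entry (0.000) comes from Table 2.
PMAE = {
    "CCA":  [0.000, 11.689, 27.117, 27.619, 28.756],
    "MCA1": [2.287, 13.521, 27.260, 28.215, 30.147],
    "MCA2": [6.245, 16.554, 29.421, 31.972, 30.852],
    "MCA3": [4.395, 15.279, 29.126, 30.165, 32.142],
    "KNN":  [4.834, 15.194, 29.392, 30.679, 32.686],
    "MI":   [3.624, 14.208, 27.787, 29.453, 31.326],
}
SYNTHESIS_METHODS = ["MCA1", "MCA2", "MCA3", "KNN", "MI"]

# Unweighted mean PMAE over the five years for each synthesizing method.
mean_pmae = {m: sum(PMAE[m]) / len(PMAE[m]) for m in SYNTHESIS_METHODS}
ranking = sorted(SYNTHESIS_METHODS, key=mean_pmae.get)

# Models whose PMAE fails to rise between at least one pair of consecutive years.
not_monotone = [
    model
    for model, values in PMAE.items()
    if any(later <= earlier for earlier, later in zip(values, values[1:]))
]

for model in ranking:
    print(f"{model}: mean PMAE {mean_pmae[model]:.3f}")
print("PMAE does not rise at every step for:", not_monotone)
```

With these values, MCA1 and MI come out with the lowest mean PMAE among the synthesizing methods, consistent with the third conclusion. The check on consecutive years flags only MCA2, whose Table 5 value for 2000 (30.852) is slightly below its value for 1996 (31.972); for every other model the error rises with each later application year.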

APPENDIX A: Results of Analysis of Variance and Correlation Analysis Respectively

Analysis of Variance

Source                   Partial Sum   Degrees of       Mean
                          of Squares      Freedom     Square    F value    P value
Model                         309730           82       3777     104.73     0.0000
Household Size                127687           18       7094     196.68     0.0000
Number of Motorcycles            623            3        208       5.76     0.0006
Number of Drivers               9537            8       1192      33.05     0.0000
Tenure                           346            8         43       1.20     0.2942
Household Income               11669           14        833      23.11     0.0000
Dwell Type                       491           11         45       1.24     0.2556
Number of Vehicles              1551           20         77       2.15     0.0021
Residual                      682363        18919         36
Total                         992093        19001         52

Number of Observations: 19002
R-squared: 0.3122

Correlation Matrix

                         Household   Number of    Number of   Number of   Household
                              Size   Motorcycles    Drivers    Vehicles      Income
Household Size                   1
Number of Motorcycles       0.0466             1
Number of Drivers           0.4717        0.0955          1
Number of Vehicles          0.2898        0.0460     0.5294           1
Household Income            0.1426        0.0169     0.2920      0.2805           1

Acknowledgements

The authors would like to thank Charles Purvis of the San Francisco Bay Area Metropolitan Transportation Commission for providing data used in this study. Additionally, the first author conducted this research while at Tennessee Technological University.

References

Abdel-Aal, M.M. Cross Classification Trip Production Model for the City of Alexandria. Alexandria Engineering Journal 43 (2), (2004): 177-189.

Cotrus, A., J.N. Prashker, and Y. Shiftan. Spatial and Temporal Transferability of Trip Generation Demand Models in Israel. Journal of Transportation and Statistics 8 (1), (2003): 37-56.

Ewing, R., M. Deanna, and S. Li. Land Use Impacts on Trip Generation Rates. Journal of the Transportation Research Board 1518, (1996): 1-6.

Guevara, C.A. and A. Thomas. Multiple Classification Analysis in Trip Production Models. Transport Policy 14, (2007): 514-522.

Horton, N.J. and K.P. Kleinman. Much Ado About Nothing: A Comparison of Missing Data Methods and Software to Fit Incomplete Data Regression Models. American Statistical Association 61 (1), (2007): 79-90.

Huisken, G. Neural Networks and Fuzzy Logic to Improve Trip Generation Modeling. Paper presented at the 79th Annual Meeting of the Transportation Research Board, Washington, D.C., 2000.

Kikuchi, S. and J. Rhee. Adjusting Trip Rate in the Cross-Classification Table by Using the Fuzzy Optimization Method. Journal of the Transportation Research Board 1836, (2003): 76-82.

Metropolitan Transportation Commission (MTC). Trip Linking Procedures: 1990 Bay Area Household Travel Survey. Working Paper 2. Oakland, CA, Revised June 2003.

Muhammad, S.B., I.G. Sehgal, and D. Laurence. K-Ranked Covariance Based Missing Values Estimation for Microarray Data Classification. Proceedings of the 4th International Conference on Hybrid Intelligent Systems, Kitakyushu, Japan, December 5-8, 2004.

Ortuzar, J.D. and L.G. Willumsen. Modelling Transport. 3rd edition. John Wiley and Sons, Inc., Chichester, England, 2001.

Purvis, C.L. Changes in Regional Travel Characteristics and Travel Time Expenditures in the San Francisco Bay Area: 1960-1990. Journal of the Transportation Research Board 1466, (1994): 99-109.

Rengaraju, V. and M. Satyakumar. Structuring Category Analysis Using Statistical Technique. Journal of Transportation Engineering 20 (6), (1994): 931-939.

Rengaraju, V. and M. Satyakumar. Three-Dimensional Category Analysis Using Probabilistic Approach. Journal of Transportation Engineering 121 (6), (1995): 538-543.

Retherford, R.D. and M.K. Choe. Statistical Models for Causal Analysis. John Wiley & Sons, Inc., New York, NY, 1993.

Rubin, D.B. Inference and Missing Data. Biometrika 63, (1976): 581-590.

Stopher, P.R. and K.G. McDonald. Trip Generation by Cross-Classification: An Alternative Methodology. Journal of the Transportation Research Board 944, (1983): 84-91.
Transportation Research Board (TRB). Metropolitan Travel Forecasting: Current Practice and Future Direction. Special Report 288, Washington, DC, 2007.

Walker, T. and H. Peng. Long-Range Temporal Stability of Trip Generation Rates Based on Selected Cross-Classification Models in the Delaware Valley Region. Journal of the Transportation Research Board 1305, (1991): 61-71.

Wardman, M.R. and J.M. Preston. Developing National Multi-modal Travel Models: A Case Study of the Journey to Work. Paper presented at the 9th World Conference on Transport Research, Seoul, South Korea, 2001.

Wayman, J.C. Multiple Imputation For Missing Data: What Is It And How Can I Use It? Paper presented at the 2003 Annual Meeting of the American Educational Research Association, Chicago, Illinois, 2003.