Demand forecasting with discrete choice models

Size: px
Start display at page:

Download "Demand forecasting with discrete choice models"

Transcription

1 Demand forecasting with discrete choice models Michel Bierlaire Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering Ecole Polytechnique Fédérale de Lausanne January 3, 2017 M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

2 Outline Outline 5 Confidence intervals 1 Introduction 2 Aggregation 3 Forecasting 4 Price optimization 6 Willingness to pay 7 Elasticities 8 Consumer surplus 9 Summary M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

3 Introduction Introduction Demand = choices Travel or not? Destination Mode Route Departure time M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

4 Introduction Introduction Choice model P(i x n,c n ;θ) What do we do with it? Note It is always possible to characterize the choice set using availability variables, included into x n. So the model can be written Example: logit P(i x n,c;θ) = P(i x n ;θ) P(i x n,c n ;θ) = e V in(x n;θ) j C n e V jn(x n;θ) P(i x n ;θ) = y in e V in(x n;θ) j C y jne V jn(x n;θ) M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

5 Aggregation Outline 1 Introduction 2 Aggregation Sample enumeration 3 Forecasting 4 Price optimization 5 Confidence intervals 6 Willingness to pay Value of time 7 Elasticities 8 Consumer surplus 9 Summary M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

6 Aggregation Aggregation Aggregate shares Prediction about a single individual is of little use in practice. Need for indicators about aggregate demand. Typical application: aggregate market shares. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

7 Aggregation Aggregation Population Identify the population T of interest (in general, already done during the phase of the model specification and estimation). Obtain x n and C n for each individual n in the population. The number of individuals choosing alternative i is N T N T (i) = P n (i x n ;θ). n=1 The share of the population choosing alternative i is W(i) = 1 N T N T P(i x n ;θ) = E[P(i x n ;θ)]. n=1 M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

8 Aggregation Aggregation Population Alternatives 1 2 J Total 1 P(1 x 1 ;θ) P(2 x 1 ;θ) P(J x 1 ;θ) 1 2 P(1 x 2 ;θ) P(2 x 2 ;θ) P(J x 2 ;θ) N T P(1 x NT ;θ) P(2 x NT ;θ) P(J x NT ;θ) 1 Total N T (1) N T (2) N T (J) N T M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

9 Aggregation Distribution Data Assume the distribution of x n is available. x n = (x C n,x D n ) is composed of discrete and continuous variables. x C n distributed with pdf p C (x). x D n distributed with pmf p D (x). Market shares W(i) = x D x C P n (i x C,x D )p C (x C )p D (x d )dx C = E [P n (i x n ;θ)], M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

10 Aggregation Sample enumeration Aggregation methods Issues None of the above formulas can be applied in practice. No full access to each x n, or to their distribution. Practical methods are needed. Practical methods Use a sample. It must be revealed preference data. x n must reflect a specific, well identified, real scenario Observed market shares are used to calibrate the constants It may be the same sample as for estimation. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

11 Aggregation Sample enumeration Sample enumeration Stratified sample Population is partitioned into homogenous segments. Each segment is randomly sampled. Let n be an observation in the sample belonging to segment g Let ω g be the weight of segment g, that is ω g = N g # persons in segment g in population = S g # persons in segment g in sample The number of persons choosing alt. i is estimated by N(i) = ω g I ng = ω g(n) P(i x n ;θ) n n samplep(i x n ;θ) g where I ng = 1 if individual n belongs to segment g, 0 otherwise, and g(n) is the segment containing n. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

12 Aggregation Sample enumeration Sample enumeration Predicted shares Ŵ(i) = n samplep(i x n ;θ) g N g 1 I ng = 1 ω N T S g N g(n) P(i x n ;θ) T n Comments Consistent estimate. Estimate subject to sampling errors. Policy analysis: change the values of the explanatory variables, and apply the same procedure. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

13 Aggregation Sample enumeration Market shares per market segment Let h be a segment of the population. Let I nh = 1 if individual n belongs to this segment, 0 otherwise. Number of persons of segment h choosing alternative i N h (i) = n ω g(n) P(i x n ;θ)i nh Market share of alternative i in segment h n Ŵ h (i) = ω g(n)p(i x n ;θ)i nh n ω. g(n)i nh M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

14 Aggregation Sample enumeration Example: interurban mode choice in Switzerland Sample Revealed preference data Survey conducted between 2009 and 2010 for PostBus Questionnaires sent to people living in rural areas Each observation corresponds to a sequence of trips from home to home. Sample size: 1723 Model: 3 alternatives Car Public transportation (PT) Slow mode M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

15 Aggregation Sample enumeration Example: interurban mode choice in Switzerland Robust Parameter Coeff. Asympt. number Description estimate std. error t-stat p-value 1 Cte. (PT) Income 4-6 KCHF (PT) Income 8-10 KCHF (PT) Age 0-45 (PT) Age (PT) Male dummy (PT) Marginal cost [CHF] (PT) Waiting time [min], if full time job (PT) Waiting time [min], if part time job or other occupation (PT) Travel time [min] log(1+ distance[km]) / 1000, if full time job Travel time [min] log(1+ distance[km]) / 1000, if part time job Season ticket dummy (PT) Half fare travelcard dummy (PT) Line related travelcard dummy (PT) Area related travelcard (PT) Other travel cards dummy (PT) M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

16 Aggregation Sample enumeration Example: interurban mode choice in Switzerland Robust Parameter Coeff. Asympt. number Description estimate std. error t-stat p-value 17 Cte. (Car) Income 4-6 KCHF (Car) Income 8-10 KCHF (Car) Income 10 KCHF and more (Car) Male dummy (Car) Number of cars in household (Car) Gasoline cost [CHF], if trip purpose HWH (Car) Gasoline cost [CHF], if trip purpose other (Car) Gasoline cost [CHF], if male (Car) French speaking (Car) Distance [km] (Slow modes) Summary statistics Number of observations = 1723 Number of estimated parameters = 27 L(β 0 ) = L( ˆβ) = [L(β 0 ) L(ˆβ)] = ρ 2 = ρ 2 = M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

17 Aggregation Sample enumeration Example: interurban mode choice in Switzerland Male Female Unknown gender Population Car 64.96% 60.51% 70.88% 62.8% PT 30.20% 32.52% 25.59% 31.3% Slow modes 4.83% 6.96% 3.53% 5.88% M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

18 Forecasting Outline 1 Introduction 2 Aggregation Sample enumeration 3 Forecasting 4 Price optimization 5 Confidence intervals 6 Willingness to pay Value of time 7 Elasticities 8 Consumer surplus 9 Summary M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

19 Forecasting Forecasting Procedure Scenarios: specify future values of the variables of the model. Recalculate the market shares. Market shares Increase of the cost of gasoline Now 5% 10% 15% 20% 25% 30% Car 62.8% 62.5% 62.2% 61.8% 61.5% 61.2% 60.8% PT 31.3% 31.6% 31.9% 32.2% 32.5% 32.8% 33.1% Slow modes 5.88% 5.90% 5.92% 5.95% 5.97% 6.00% 6.02% M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

20 Forecasting Forecasting Market share Base model 5% 10% 15% 20% 25% 30% Scenario: increase of the cost of gasoline Car Slow modes Public transportation M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

21 Price optimization Outline 1 Introduction 2 Aggregation Sample enumeration 3 Forecasting 4 Price optimization 5 Confidence intervals 6 Willingness to pay Value of time 7 Elasticities 8 Consumer surplus 9 Summary M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

22 Price optimization Price optimization Optimizing the price of product i is solving the problem maxp i ω g(n) P(i x n,p i ;θ) p i Notes: n sample It assumes that everything else is equal In practice, it is likely that the competition will also adjust the prices M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

23 Price optimization Illustrative example A binary logit model with so that Two groups in the population: Group 1: β p = 2, N s = 600 Group 2: β p = 0.1, N s = 400 Assume that p 2 = 2. V 1 = β p p V 2 = β p p 2 e βpp P(1 p) = e βpp e βpp 2 M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

24 Price optimization Illustrative example 100% % 800 Share 60% 40% Revenues 20% % Price of product 1 Share Revenues 500 M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

25 Price optimization Case study: interurban mode choice in Switzerland Scenario A uniform adjustment of the marginal cost of public transportation is investigated. The analysis ranges from 0% to 700%. What is the impact on the market share of public transportation? What is the impact of the revenues for public transportation operators? M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

26 Price optimization Case study: interurban mode choice in Switzerland 100% Share of public transportation 80% 60% 40% 20% Revenues for public transportation 0% 0 0% 100% 200% 300% 400% 500% 600% 700% Adjustment of the marginal cost Revenues Market shares M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

27 Price optimization Case study: interurban mode choice in Switzerland Comments Typical non concavity of the revenue function due to taste heterogeneity. In general, decision making is more complex than optimizing revenues. Applying the model with values of x very different from estimation data may be highly unreliable. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

28 Confidence intervals Outline 1 Introduction 2 Aggregation Sample enumeration 3 Forecasting 4 Price optimization 5 Confidence intervals 6 Willingness to pay Value of time 7 Elasticities 8 Consumer surplus 9 Summary M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

29 Confidence intervals Confidence intervals Model P(i x n,p i ;θ) In reality, we use θ, the maximum likelihood estimate of θ Property: the estimator is normally distributed N( θ, Σ) Calculating the confidence interval by simulation Draw R times θ from N( θ, Σ). For each θ, calculate the requested quantity (e.g. market share, revenue, etc.) using P(i x n,p i ; θ) Calculate the 5% and the 95% quantiles of the generated quantities. They define the 90% confidence interval. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

30 Confidence intervals Case study: confidence intervals (500 draws) 100% 80% Share of public transportation 60% 40% 20% 0% 0% 100% 200% 300% 400% 500% 600% 700% Adjustment of the marginal cost M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

31 Confidence intervals Case study: confidence intervals (500 draws) Revenues for public transportation % 100% 200% 300% 400% 500% 600% 700% Adjustment of the marginal cost Revenues 90% confidence interval M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

32 Confidence intervals Confidence interval Model P(i x n,p i ; θ) There are also errors in the x n. If the distribution of x n is known, draw from both x n and θ. Apply the same procedure. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

33 Willingness to pay Outline 1 Introduction 2 Aggregation Sample enumeration 3 Forecasting 4 Price optimization 5 Confidence intervals 6 Willingness to pay Value of time 7 Elasticities 8 Consumer surplus 9 Summary M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

34 Willingness to pay Willingness to pay Context If the model contains a cost or price variable, it is possible to analyze the trade-off between any variable and money. It reflects the willingness of the decision maker to pay for a modification of another variable of the model. Typical example in transportation: value of time Value of time Price that travelers are willing to pay to decrease the travel time. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

35 Willingness to pay Willingness to pay Definition Let c in be the cost of alternative i for individual n. Let x in be the value of another variable of the model (travel time, say). Let V in (c in,x in ) be the value of the utility function. Consider a scenario where the variable under interest takes the value x in = x in +δ x in. We denote by δin c the additional cost that would achieve the same utility, that is V in (c in +δ c in,x in +δ x in) = V in (c in,x in ). The willingness to pay is the additional cost per unit of x, that is δ c in/δ x in M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

36 Willingness to pay Willingness to pay Continuous variable If x in is continuous, if V in is differentiable in x in and c in, invoke Taylor s theorem: V in (c in,x in ) = V in (c in +δ c in,x in +δ x in) V in (c in,x in )+δin c V in (c in,x in )+δ x V in in (c in,x in ) c in x in δin c δin x = ( V in/ x in )(c in,x in ) ( V in / c in )(c in,x in ) M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

37 Willingness to pay Willingness to pay Linear utility function If x in and c in appear linearly in the utility function, that is V in (c in,x in ) = β c c in +β x x in + then the willingness to pay is δin c δin x = ( V in/ x in )(c in,x in ) ( V in / c in )(c in,x in ) = β x β c M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

38 Willingness to pay Value of time The value of time is defined as the amount of money that an individual is willing to pay to save one unit of time δin c δin t = δc in 1 = ( V in/ t in )(c in,t in ) ( V in / c in )(c in,t in ) = β t β c Therefore VOT in = δ c in/( δ t in) = ( V in/ t in )(c in,t in ) ( V in / c in )(c in,t in ) If V is linear in these variables, we have VOT in = δ c in/( δ t in) = β t β c. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

39 Willingness to pay Case study: value of time for car drivers 45% Value of time (Car) 40% 35% Share of the population (%) 30% 25% 20% 15% 10% 5% 0% Value of time (CHF/hour) M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

40 Willingness to pay Case study: value of time for car drivers (nonzero) 16% Value of time (Car) 14% 12% Share of the population (%) 10% 8% 6% 4% 2% 0% Value of time (CHF/hour) M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

41 Willingness to pay Case study: value of time for public transportation 40% Value of time (PT) 35% 30% Share of the population (%) 25% 20% 15% 10% 5% 0% Value of time (CHF/hour) M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

42 Willingness to pay Case study: value of time for public transportation (nonzero) 10% Value of time (PT) 9% 8% Share of the population (%) 7% 6% 5% 4% 3% 2% 1% 0% Value of time (CHF/hour) M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

43 Elasticities Outline 1 Introduction 2 Aggregation Sample enumeration 3 Forecasting 4 Price optimization 5 Confidence intervals 6 Willingness to pay Value of time 7 Elasticities 8 Consumer surplus 9 Summary M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

44 Elasticities Disaggregate elasticities Point vs. arc Point: marginal rate Arc: between two values Point Direct vs. cross Direct: wrt attribute of the same alternative Cross: wrt attribute of another alternative Arc Direct E Pn(i) x ink = P n(i) x ink x ink P n (i) P n (i) x ink x ink P n (i) Cross E Pn(i) x jnk = P n(i) x jnk x jnk P n (i) P n (i) x jnk x jnk P n (i) M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

45 Elasticities Aggregate elasticities Population share W(i) = 1 N T N T P(i x n ) n=1 Aggregate elasticity E W(i) x jk = W(i) x jk x jk W(i) = N T n=1 P n (i) P n (i) P n (i) x jk x jk NT n=1 P n(i) = N T n=1 P n (i) NT n=1 P n(i) EPn(i) x jnk. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

46 Elasticities Case study: elasticity of travel time (PT) 40% Elasticity of travel time (PT) 35% 30% Share of the population (%) 25% 20% 15% 10% 5% 0% Elasticity of travel time (PT) M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

47 Elasticities Case study: elasticity of travel time (PT, non zero) 14% Elasticity of travel time (PT) 12% Share of the population (%) 10% 8% 6% 4% 2% 0% Elasticity of travel time (PT) M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

48 Consumer surplus Outline 1 Introduction 2 Aggregation Sample enumeration 3 Forecasting 4 Price optimization 5 Confidence intervals 6 Willingness to pay Value of time 7 Elasticities 8 Consumer surplus 9 Summary M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

49 Consumer surplus Consumer surplus Concept Difference between what a consumer is willing to pay for a good and what she actually pays for the good Area under the demand curve and above the market price Discrete choice demand characterized by the choice probability role of price taken by the utility utility can always be transformed into monetary units M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

50 Consumer surplus Consumer surplus Demand curve Consumer surplus at current situation Additional consumer surplus Current Vi Future V 2 i V 1 i P(i V i,v j )dv i = 0 1 P n (i) V 2 i V 1 i e µv i e µv i +e µv j dv i M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

51 Consumer surplus Consumer surplus Binary logit V 2 i V 1 i P(i V i,v j )dv i = V 2 i V 1 i e µv i e µv i +e µv j dv i = 1 µ ln(eµv2 i +e µv j ) 1 µ ln(eµv1 i +e µv j ). M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

52 Consumer surplus Consumer surplus Generalization i C V 2 V 1 P(i V)dV i. If the choice model has equal cross derivatives, that is P(i V,C) V j = P(j V,C) V i, i,j C, the integral is path independent. Logit i C V 2 V 1 P(i V)dV i = 1 µ ln j C 2 e µv2 j 1 µ ln j C 1 e µv1 j. M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53

53 Summary Summary Aggregation Sample enumeration Precision Confidence intervals Calculated by simulation Indicators Market shares Revenues Willingness to pay Elasticities Consumer surplus M. Bierlaire (TRANSP-OR ENAC EPFL) Forecasting January 3, / 53