Appendix A Mixed-Effects Models

1. LONGITUDINAL HIERARCHICAL LINEAR MODELS


Hierarchical Linear Models (HLM) provide a flexible and powerful approach for studying response effects that vary across groups. HLMs are useful when the data are measured at more than one level (e.g., brands nested within product categories; product categories nested within markets). When the measures are repeated over time, the model is called a longitudinal HLM (e.g., weekly scores within brands; Rabe-Hesketh and Skrondal 2005). Unlike the traditional OLS regression approach, longitudinal HLM allows us to treat the coefficients of the model as random effects drawn from a normal distribution of possible estimates. A modeler can therefore detect the extent to which brands vary in the coefficients of interest, i.e., the intercept and/or slope parameters. In other words, a random effect in the intercept shifts the regression line up or down by brand, and random effects in the slopes let its steepness differ by brand. Additionally, in the longitudinal HLM, the variance of the outcome variable is split into between-brand and within-brand components, which increases the precision of the estimates.

Model: In our two-level HLM, time series observations within brands constitute the first level, and the brands form the second level. We fit the hierarchical linear model to our data, thus combining fixed and random effects. The model is described as follows:

$$ y_{it} = \beta_0 + \zeta_i + \boldsymbol{\beta}_i' \mathbf{X}_{it} + \varepsilon_{it} $$ (Eq.A1)

where the index t is for units (time series observations) and i for brands. $\zeta_i$ stands for the random intercept for brand i. $\varepsilon_{it}$ is the residual error term for brand i at time t. $\beta_0$ is the intercept. $\boldsymbol{\beta}_i$ indicates that the coefficients of the independent variables $\mathbf{X}_{it}$ vary across brands. $y_{it}$ denotes the dependent variable. Furthermore, we make the following assumptions: $\zeta_i \sim N(0, \psi)$; $\varepsilon_{it} \sim N(0, \theta)$,

and the random intercepts and the residual error terms are independent. This model is the varying-intercept and varying-coefficient model. Throughout the paper, we opt for this model formulation since the log likelihood of this specification is always higher than that of the varying-intercept-only model. The likelihood ratio (LR) test leads to the same conclusion. We also use the LR test to compare the one-level ordinary linear regression with the two-level model.

Estimation: There are two alternative methods to estimate the parameters of the above model: (i) Maximum Likelihood (MLE) and (ii) Restricted Maximum Likelihood (REML). Both methods produce similar regression coefficients. They differ in how they estimate the variance components: the latter takes into account the loss of degrees of freedom resulting from the fixed effects (Snijders and Bosker 1999). Which method to use remains a matter of personal taste (StataCorp 2005, p. 188). Thus, we fit the model via MLE, which is the default in Stata. The estimation technique is iterative and relies on the Expectation-Maximization (EM) algorithm. Convergence is achieved when the error tolerance is met.

Intraclass Correlation: The percentage of observed variation in the dependent variable that can be attributed to brand-level characteristics is computed by dividing $\psi$, the variance of the random intercept, by the total variance:

$$ \rho = \frac{\psi}{\psi + \theta} $$ (Eq.A2)

where $\theta$ is the residual variance and $\rho$ represents the within-brand correlation, usually referred to as the intraclass correlation coefficient (Hox 2010). The percentage of variance that can be attributed to time-series traits is then $1 - \rho$. For instance, suppose we allow for random effects in the stickiness models for the intercept as well as the AR(1) and AR(2) coefficients. The brand-level variance then becomes: (Eq.A3)
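The intraclass correlation in Eq.A2 and the LR statistic used to compare nested specifications are simple enough to sketch in a few lines. The following Python snippet is only an illustration: the function names and all numeric inputs are hypothetical and are not estimates from the paper, and `psi` and `theta` stand for the random-intercept and residual variances.

```python
def intraclass_correlation(psi: float, theta: float) -> float:
    """Share of total variance attributable to brands: psi / (psi + theta)."""
    if psi < 0 or theta < 0 or psi + theta == 0:
        raise ValueError("variances must be non-negative and not both zero")
    return psi / (psi + theta)


def likelihood_ratio_statistic(loglik_restricted: float, loglik_full: float) -> float:
    """LR statistic 2 * (ll_full - ll_restricted) for two nested models."""
    return 2.0 * (loglik_full - loglik_restricted)


# Hypothetical variance components (not taken from the paper's tables).
psi, theta = 1.0, 3.0
rho = intraclass_correlation(psi, theta)
print(f"brand-level share of variance: {rho:.2f}")
print(f"time-series share of variance: {1 - rho:.2f}")

# Hypothetical log likelihoods: varying-intercept model vs. the model
# that also lets the slopes vary across brands.
lr = likelihood_ratio_statistic(loglik_restricted=-520.0, loglik_full=-512.5)
print(f"LR statistic: {lr:.1f}")
```

Note that when the variance components are tested on the boundary of the parameter space (a variance equal to zero), the usual chi-square reference distribution for the LR statistic is conservative, so the statistic should be interpreted with that caveat in mind.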

2. CROSSED RANDOM EFFECTS MODELING

In our longitudinal HLM, we treat time as nested within brands. Crossed Random Effects (CRE) modeling assumes that all brands are affected similarly by some events or characteristics associated with time. A typical example is panel data where the factor individual (brand, market, etc.) is crossed with another factor, time (for a review see Rabe-Hesketh and Skrondal 2005, p. 249). Therefore, it is reasonable to deem time as crossed with brands. As with HLM, CRE is a mixed-effects modeling approach, i.e. it yields both fixed and random effects parameters. In CRE modeling, the effects of both brands and time vary (Baltagi 2005). Hence, by employing CRE, a researcher is able to break down the random effects into two components: across brands and over time.

Model: Specifically, the following equation shows our CRE model:

$$ y_{it} = \beta_0 + \boldsymbol{\beta}' \mathbf{X}_{it} + \zeta_{1i} + \zeta_{2t} + \varepsilon_{it} $$ (Eq.A4)

where $\zeta_{1i}$ and $\zeta_{2t}$ are random intercepts for brands i and time t, respectively, and $\varepsilon_{it}$ is a residual error term. $y_{it}$ represents the dependent variable, $\beta_0$ is the intercept term, and $\boldsymbol{\beta}$ denotes the estimated fixed effect parameters for the independent variables $\mathbf{X}_{it}$. We make the following assumptions about the random intercepts: $\zeta_{1i} \sim N(0, \psi_1)$, $\zeta_{2t} \sim N(0, \psi_2)$. These random intercepts are not correlated with each other. Furthermore, they are not correlated with the residual error term. Regarding the residual error term, we assume that $\varepsilon_{it} \sim N(0, \theta)$.

In this model, the random intercept for brand is shared across all time periods for a given brand i, whereas the random intercept for time period is shared by all brands in a given period t. The residual error comprises both the interaction between time and brand and any other effect specific to brand i in period t. An interaction between brand and

time might occur since some events in some periods could be more beneficial to some brands than others.

Estimation: As with longitudinal HLM, we use an iterative MLE method that makes use of the EM algorithm.

Intraclass Correlations: Let $\psi_1$ and $\psi_2$ denote the variances of the brand and time random intercepts, and $\theta$ the residual variance. We define two intraclass correlations:

(i) one for correlations of observations for the same period across brands:

$$ \rho_{\text{time}} = \frac{\psi_2}{\psi_1 + \psi_2 + \theta} $$ (Eq.A5)

(ii) and one for correlations of observations on the same brand over time:

$$ \rho_{\text{brand}} = \frac{\psi_1}{\psi_1 + \psi_2 + \theta} $$ (Eq.A6)

As a diagnostic check, for both models we use normal Q-Q plots to determine whether or not there is a violation of the normality assumption.

References

Baltagi, Badi H. (2005), Econometric Analysis of Panel Data, 3rd Edition, London: Wiley.
Hox, Joop J. (2010), Multilevel Analysis: Techniques and Applications, 2nd Edition, New York: Routledge.
Rabe-Hesketh, Sophia and Anders Skrondal (2005), Multilevel and Longitudinal Modeling Using Stata, College Station, TX: Stata Press.
Snijders, Tom A. B. and Roel J. Bosker (1999), Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling, Thousand Oaks, CA: Sage.
StataCorp. (2005), Stata Longitudinal/Panel Data Reference Manual, Release 9, College Station, TX: StataCorp LP.
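As a small worked illustration of the two intraclass correlations of the crossed random effects model in Section 2 (Eq.A5 and Eq.A6), the sketch below computes both from three variance components. The function name and the numbers are hypothetical; `psi_brand`, `psi_time` and `theta` stand for the brand random-intercept variance, the time random-intercept variance and the residual variance.

```python
def crossed_intraclass_correlations(psi_brand: float, psi_time: float, theta: float) -> dict:
    """Return the two intraclass correlations of a crossed random effects model."""
    total = psi_brand + psi_time + theta
    if min(psi_brand, psi_time, theta) < 0 or total == 0:
        raise ValueError("variances must be non-negative and not all zero")
    return {
        # Correlation between two brands observed in the same period.
        "same_period_across_brands": psi_time / total,
        # Correlation between two periods observed for the same brand.
        "same_brand_over_time": psi_brand / total,
    }


# Hypothetical variance components (not taken from the paper).
iccs = crossed_intraclass_correlations(psi_brand=2.0, psi_time=1.0, theta=5.0)
for name, value in iccs.items():
    print(f"{name}: {value:.3f}")
```

The remaining share of the total variance, one minus the sum of the two correlations, belongs to the residual term, which in this model absorbs the brand-by-time interaction.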

Appendix B Dynamic Programming Specifics

Step 1: We assume that a brand manager observes the state variables, sales and awareness, in every period t, makes marketing mix decisions, i.e. price and advertising, and earns a reward that depends on both the state level and the actions taken. As such, we assume that the state variables evolve according to the following joint controlled exogenous linear Markov process:

$$ \begin{aligned} A_{t+1} &= \alpha_A + \phi_{AA} A_t + \phi_{AS} S_t + \gamma_{AP} P_t + \gamma_{AD} \mathit{Ad}_t + \varepsilon_{A,t+1} \\ S_{t+1} &= \alpha_S + \phi_{SA} A_t + \phi_{SS} S_t + \gamma_{SP} P_t + \gamma_{SD} \mathit{Ad}_t + \varepsilon_{S,t+1} \end{aligned} $$ (Eq.B1)

where t is the time index, and $A$ and $S$ represent awareness and sales, respectively. $P_t$ and $\mathit{Ad}_t$ denote the price and the advertising actions, respectively. $\alpha_A$ and $\alpha_S$ are the constant terms for the awareness and sales equations, respectively, the $\phi$ coefficients are for the lagged state variables, and the $\gamma$ coefficients are for the marketing mix actions, price and advertising. $\varepsilon_{t+1}$ is a random shock term with mean zero. The random shocks are assumed to be independently and identically distributed over time, and independent of past states and actions. Eq.B1 is also called the state transition equation. It implies that the state variables (awareness and sales) in period t+1 depend on the states and actions (price and advertising) in period t and an exogenous random shock. We estimate this transition equation by time series econometrics and use the estimated parameters in the optimization part in Step 2.

Step 2: Although most marketers assume profit maximization when developing their marketing mix decisions, it appears that most company/brand managers set their marketing mix policy to achieve a pre-specified sales target (Saghafi 1988). We set our objective function in the same spirit as Roy, Hanssens and Raju (1994). We assume that the brand manager wishes to make price and advertising decisions in order to minimize the deviation of sales (S) and awareness (A) from pre-specified targets $S^*$ and $A^*$, respectively.
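To make the state transition of Step 1 concrete, the following Python sketch simulates a linear controlled process of the kind described by Eq.B1. Everything numeric here is hypothetical: the coefficient values, the starting state and the action path are made up for illustration, and drawing the shocks from a normal distribution is an assumption (the text only requires mean-zero i.i.d. shocks).

```python
import random

# Hypothetical coefficients for the transition equation (not estimates from the paper).
PARAMS = {
    "const": {"A": 1.0, "S": 2.0},                       # constant terms
    "lag": {"A": {"A": 0.5, "S": 0.1},                   # lagged-state coefficients
            "S": {"A": 0.2, "S": 0.6}},
    "mix": {"A": {"P": -0.2, "Ad": 0.3},                 # marketing mix coefficients
            "S": {"P": -1.0, "Ad": 0.4}},
    "shock_sd": {"A": 0.5, "S": 1.0},                    # assumed normal shock scale
}


def next_state(state, action, params, rng):
    """One step of the transition: states and actions in t give states in t+1."""
    out = {}
    for k in ("A", "S"):
        out[k] = (params["const"][k]
                  + sum(params["lag"][k][j] * state[j] for j in ("A", "S"))
                  + sum(params["mix"][k][j] * action[j] for j in ("P", "Ad"))
                  + rng.gauss(0.0, params["shock_sd"][k]))
    return out


def simulate(initial, actions, params, seed=0):
    """Simulate a state path for a given sequence of price/advertising actions."""
    rng = random.Random(seed)
    path = [dict(initial)]
    for action in actions:
        path.append(next_state(path[-1], action, params, rng))
    return path


plan = [{"P": 2.0, "Ad": 3.0}] * 4  # hypothetical constant marketing plan
for t, st in enumerate(simulate({"A": 10.0, "S": 20.0}, plan, PARAMS), start=1):
    print(f"t={t}: awareness={st['A']:.2f}, sales={st['S']:.2f}")
```

In the procedure described in the text, the coefficients would come from the time series estimation of the transition equation rather than from hand-picked values.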

More specifically, the brand manager wants to minimize the expected discounted stream of weighted squared deviations:

$$ L(A_t, S_t) = \begin{pmatrix} A_t - A^* \\ S_t - S^* \end{pmatrix}' \Omega \begin{pmatrix} A_t - A^* \\ S_t - S^* \end{pmatrix} $$ (Eq.B2)

where $A_t$ and $S_t$ are awareness and sales, and $A^*$ and $S^*$ are the target values for awareness and sales, respectively. $\Omega$ is a 2x2 constant positive definite matrix of preference weights. We consider a finite time horizon for the above loss function, i.e.

$$ \min_{\{P_t, \mathit{Ad}_t\}} E\left[ \sum_{t=1}^{T} \delta^{t-1} L(A_t, S_t) \right] $$ (Eq.B3)

where t is the time index and $\delta \in (0,1)$ is the discount factor. This problem is called a Stochastic Dynamic Programming (SDP) problem in discrete time (Stokey, Lucas and Prescott 1989). In order to formulate this problem as a maximization problem, we posit that the reward function is equal to the negative of the loss function in Eq.B2, i.e., $f(A_t, S_t, P_t, \mathit{Ad}_t) = -L(A_t, S_t)$. We represent the above problem by the so-called Bellman equation (Bellman 1957):

$$ V_t(A_t, S_t) = \max_{P_t, \mathit{Ad}_t} \left\{ f(A_t, S_t, P_t, \mathit{Ad}_t) + \delta \, E\left[ V_{t+1}(A_{t+1}, S_{t+1}) \right] \right\}, \quad t = 1, 2, \dots, T. $$ (Eq.B4)

Eq.B4 maximizes the sum of current and expected future rewards. The resulting solution to Eq.B4 is known as the value function in the DP. It is the optimal value attained when the optimal actions are taken given the level of the state. Since the problem has a finite horizon, the Bellman equation in Eq.B4 is solved by backward recursion. As the brand manager faces no decision after the terminal decision period T, the terminal value function, $V_{T+1}$, is fixed. The procedure works in the following way: given $V_{T+1}$, we find the optimal marketing actions for the entire state space for period T, and get $V_T$. Using $V_T$, we compute the optimal marketing actions for period T-1 and obtain $V_{T-1}$. This procedure continues until $V_1$ is derived for the

entire state space. The price and advertising decisions in each period are defined on a feasible set such that each decision variable has lower and upper bounds. For instance, price and advertising decisions cannot be negative, i.e., they are bounded below by 0. The state space is also a bounded interval of the real line, with A and S bounded below by 0. In order to solve Eq.B4, we consider a state discretization with 10 scenarios for each state variable. In addition, we use 30 equidistant nodes for each decision variable. Utilizing the CompEcon Matlab Toolbox¹, we apply the backward recursion method and find the solution to Eq.B4 (see Miranda and Fackler 2002). As an example, in Figure B1, the value functions for the final period are shown with respect to the different sales-awareness levels. Both brands' value functions increase as the state level increases, but brand SC has a higher value function than brand SD.

Figure B1: Value Functions

¹ The toolbox can be downloaded from

References

Miranda MJ, Fackler PL (2002) Applied Computational Economics and Finance (MIT Press, Cambridge, MA).
Roy A, Hanssens DM, Raju JS (1994) Competitive pricing by a price leader. Management Science. 40(7):

Saghafi MM (1987) Market share stability and marketing policy: An axiomatic approach. Research in Marketing. 9:
Stokey NL, Lucas RE, Prescott EC (1989) Recursive Methods in Economic Dynamics (Harvard University Press, Cambridge, MA).
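The backward recursion described in Step 2 of Appendix B can be sketched on a small grid. This is a minimal Python illustration and not the CompEcon implementation used in the paper: the grids are much coarser than the 10 state scenarios and 30 action nodes mentioned in the text, the next state is snapped to the nearest grid node instead of being interpolated, the shock distribution is a three-point approximation, the terminal value is assumed to be zero, and all coefficients, targets and weights are hypothetical.

```python
from itertools import product


def nearest(grid, x):
    """Index of the grid node closest to x (a crude stand-in for interpolation)."""
    return min(range(len(grid)), key=lambda k: abs(grid[k] - x))


def backward_recursion(a_grid, s_grid, p_grid, ad_grid,
                       transition, loss, shocks, delta, T):
    """Solve V_t = max { -loss + delta * E[V_{t+1}] } for t = T, T-1, ..., 1."""
    states = list(product(range(len(a_grid)), range(len(s_grid))))
    actions = list(product(p_grid, ad_grid))
    v_next = {st: 0.0 for st in states}  # assumed terminal value V_{T+1} = 0
    policies = []
    for _ in range(T):  # each pass moves one period back in time
        v_now, policy = {}, {}
        for i, j in states:
            a, s = a_grid[i], s_grid[j]
            best_value, best_action = float("-inf"), None
            for p, ad in actions:
                a1, s1 = transition(a, s, p, ad)  # deterministic part of the transition
                expected = sum(
                    prob * v_next[(nearest(a_grid, a1 + e_a),
                                   nearest(s_grid, s1 + e_s))]
                    for e_a, e_s, prob in shocks)
                value = -loss(a, s) + delta * expected
                if value > best_value:
                    best_value, best_action = value, (p, ad)
            v_now[(i, j)], policy[(i, j)] = best_value, best_action
        policies.insert(0, policy)
        v_next = v_now
    return v_next, policies  # V_1 and the optimal policies for t = 1..T


# Hypothetical targets, weights (identity matrix) and transition coefficients.
A_STAR, S_STAR = 60.0, 80.0


def loss(a, s):
    return (a - A_STAR) ** 2 + (s - S_STAR) ** 2


def transition(a, s, p, ad):
    return (5.0 + 0.6 * a + 0.1 * s - 1.0 * p + 2.0 * ad,
            10.0 + 0.2 * a + 0.5 * s - 3.0 * p + 1.5 * ad)


grid = [0.0, 25.0, 50.0, 75.0, 100.0]         # state nodes for A and S
mix_grid = [0.0, 2.5, 5.0, 7.5, 10.0]         # equidistant nodes for price and advertising
shocks = [(-5.0, -5.0, 0.25), (0.0, 0.0, 0.5), (5.0, 5.0, 0.25)]  # mean-zero shocks

v1, policies = backward_recursion(grid, grid, mix_grid, mix_grid,
                                  transition, loss, shocks, delta=0.95, T=3)
print("V_1 at A=50, S=75:", round(v1[(2, 3)], 2))
print("optimal (price, advertising) in t=1 at that state:", policies[0][(2, 3)])
```

Because the reward in this formulation depends only on the state, actions matter solely through their effect on the next period's state, so in the terminal period every action yields the same value and the sketch simply keeps the first one it evaluates.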

Appendix C

Table C1: Relative Contributions of This Study

Features Srinivasan et al Fisher et al. 2011b This Study

A. Research Objective Descriptive: attitude/transactions decomposition Normative: marketing resource allocation Descriptive: attitude/transactions decomposition Normative: marketing resource allocation Optimal vs. actual behavior

B. Research Setting B2C context B2B context B2C context

C. Empirical Modeling Dependent variables Multivariate vector - brand sales and mind-set metrics of awareness, consideration and liking Sequence of conditional probabilities for awareness, consideration and usage and liking Marketing mix decisions Cross-sectional metrics data Time-series data Incorporate Funnel Hierarchy Incorporates competition Incorporates dynamics Marketing mix sales Empirical Application Modeling Approach Formal mediation analysis Accounts for Endogeneity Estimation Advertising, promotion, price and distribution Random sample of 8000 customers in France Yes with 96 four weekly observations Marketing budget only Random sample of 800 customers per country Limited with 2 survey waves in 2004 and 2006 Agnostic Hierarchical funnel Agnostic Yes Only for items No Yes No Yes Bottled juice, bottled water, cereals, and shampoos Aggregate Vector Autoregressive model Package delivery service Multivariate vector - brand sales and mind-set metrics of awareness, consideration Advertising, promotion and prices Random sample of 8000 customers in France Yes with 96 four weekly observations Bottled juice, bottled water, cereals, and shampoos Individual Choice Hierarchical Linear Model / model Cross-effects model No No Yes Yes No Yes Standard econometric methods No closed-form solution; approximation methods and search Holdout/Prediction test No No Yes

D. Marketing Decision Modeling Prescriptive action Based on impulse response functions to exogenous shocks Based on dynamic programming model Marketing resource No Yes Yes Allocation theory Criteria for Marketing Mix Decisions Potential No No Yes Stickiness Only for items No Yes Responsiveness Only for items Yes Yes Conversion Yes Yes Yes Simulation with Profit/Sales Objectives No Yes Yes MLE that relies on the EM algorithm Based on dynamic programming model

Table C2: Mediated Effects of Marketing Mix Actions*

Shampoo Awareness Consideration Liking Total Indirect Direct effect Mediated Effect Price % Promotion % Advertising %
Bottled Water Awareness Consideration Liking Total Indirect Direct effect Mediated Effect Price % Promotion % Advertising %
Juice Awareness Consideration Liking Total Indirect Direct effect Mediated Effect Price % Promotion % Advertising %
Cereals Awareness Consideration Liking Total Indirect Direct effect Mediated Effect Price % Promotion % Advertising %

*Read as: In the shampoo category, the indirect effect of advertising on sales via awareness is The total indirect effect of advertising via all mindset variables is The proportion of the total effect that is mediated due to the mindset variables is 24%.

Table C3: Maximum Likelihood Random Effects Estimates of Sales Conversion in Longitudinal HLM*

Random Effects Shampoo Bottled Water Juice Cereals
DV=Sales DV=Sales DV=Sales DV=Sales

* is the standard deviation of the intercept at the brand level, is the standard deviation of the residuals. is the standard deviation of the slope parameter for consideration, is the standard deviation of the slope parameter for liking, is the standard deviation of the slope parameter for awareness.

Table C4: Maximum Likelihood Random Effects Estimates of Attitude Responsiveness in Longitudinal HLM*

Random Effects Shampoo Bottled Water
DV=Awareness DV=Consideration DV=Liking DV=Awareness DV=Consideration DV=Liking

Random Effects Juice Cereals
DV=Awareness DV=Consideration DV=Liking DV=Awareness DV=Consideration DV=Liking

* is the standard deviation of the intercept at the brand level, is the standard deviation of the residuals. is the standard deviation of the slope parameter for price, is the standard deviation of the slope parameter for promotion, is the standard deviation of the slope parameter for advertising.

Table C5: Maximum Likelihood Random Effects Estimates of Marketing Mix Models (Transactions Route) in Longitudinal HLM*

Random Effects Shampoo Bottled Water Juice Cereals
DV=Sales DV=Sales DV=Sales DV=Sales

* is the standard deviation of the intercept at the brand level, is the standard deviation of the residuals. is the standard deviation of the slope parameter for Price, is the standard deviation of the slope parameter for Promotion, is the standard deviation of the slope parameter for Advertising.

Table C6: Maximum Likelihood Random Effects Estimates of Transactions + Consumer Attitude Models in Longitudinal HLM*

Random Effects Shampoo Bottled Water Juice Cereals
DV=Sales DV=Sales DV=Sales DV=Sales

* is the standard deviation of the intercept at the brand level, is the standard deviation of the residuals. is the standard deviation of the slope parameter for Price, is the standard deviation of the slope parameter for Promotion, is the standard deviation of the slope parameter for Advertising, is the standard deviation of the slope parameter for Awareness, is the standard deviation of the slope parameter for Consideration, is the standard deviation of the slope parameter for Liking.