A Statistical Analysis of Water Conservation Policies


POLS 500 Independent Study
Advisor: Dr. Christopher Den Hartog
California Polytechnic State University, San Luis Obispo
Cynthia Allen
June 7, 2010

INTRODUCTION

As part of the author's Master of Public Policy graduate research project, an Ordinary Least Squares (OLS) multivariate regression model was used to predict the impact that price, number of customers, rainfall, income, water conservation programs, fines, and media reports have on water consumption. Particular attention was placed on rates designed to encourage a reduction in water consumption. The analysis was performed with data from participating water purveyors for the period from January 1996 through December [year missing in source]. On November 7, 2009, the principal water purveyors in San Luis Obispo and Santa Barbara Counties were asked to provide water rates, water delivery statistics, number of residential customers, and water conservation policies, and 23 agencies responded (see Table 1 for a list of participating water purveyors).

OLS regression analysis was performed on the data for the region as a whole and also filtered by county. OLS regression estimates of coefficients for the three models are reported in Table 2, Table 3, and Table 4. For the region, and when filtered for Santa Barbara County, five of the seven coefficients had the expected signs, and all five had p-values below .05, which means they were statistically significant at the five percent level on a one-tailed test with a critical t-value of [value missing in source]. When filtered for San Luis Obispo County, only four of the seven coefficients exhibited the expected signs, and all four had p-values below .05. For all three models, the F-statistic was greater than 2.21, which means that the models, as a whole, were statistically significant. The coefficients for media, price, fine, customers, rain, and water conservation exhibited the same results in all three models, with only income exhibiting inconsistent results. This, however, could be explained by examining the raw data, which reveal that San Luis Obispo County's median income went down during the research period.

The negative coefficient on price indicated that as the price of water increased, the amount of water consumed typically decreased. As expected, as the number of customers increased, the amount of water consumed increased, and as the amount of rain increased, the amount of water consumed decreased. Finally, water consumption also increased as the median income increased. Unexpectedly, media and water conservation did not have the expected effect on water consumption.

The consistent results exhibited by price led the author to reject the null hypothesis and conclude that punitive water rates have an effect on water consumption. However, because the regression results for media influence and water conservation policies did not have the expected effect on water consumption, and because of the use of time-series, cross-section data, it was prudent to check for issues with multicollinearity, heteroskedasticity, and autocorrelation.
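The regression described above can be written compactly. The following is a sketch only: the subscripts i (agency) and t (month), the coefficient symbols, and the use of the natural logarithm are assumptions made for illustration. The one-month lags on price and fine follow the variable definitions in the DATA section, and the logarithmic dependent variable follows the captions of Tables 2 through 4.

```latex
\begin{equation*}
\begin{aligned}
\ln(\mathrm{Usage}_{it}) ={}& \beta_0 + \beta_1\,\mathrm{Media}_{t}
  + \beta_2\,\mathrm{Price}_{i,t-1} + \beta_3\,\mathrm{Fine}_{i,t-1} \\
&+ \beta_4\,\mathrm{Customers}_{it} + \beta_5\,\mathrm{Income}_{it}
  + \beta_6\,\mathrm{Rain}_{it} \\
&+ \beta_7\,\mathrm{Conservation}_{it} + \varepsilon_{it}
\end{aligned}
\end{equation*}
```

In this notation, the signs reported in the text correspond to a negative estimate for the price and rain coefficients and a positive estimate for the customers and income coefficients.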

DATA

For this analysis, the data set used in the original research was utilized. It includes the following variables:

Amount of annual water delivered per customer class. The dependent variable is used to evaluate total water consumption and refers to the water withdrawn through a meter attached to the service line for residential use. Units of water are reflected in gallons.

Number of water connections per customer class. This independent variable is used to determine the impact of additional customers on water consumption and refers to the number of customers connected to the water system.

Date of water conservation program inception. This independent variable is used to determine the availability of water conservation program elements to the customers of the purveyor and refers to individual devices that make up a water conservation program and/or drought policy. This is a dummy variable that focuses on the availability of water conservation program elements. A 1 has been used for the months in which the elements were available and a 0 for those months when the water conservation program was unavailable.

Median income. This independent variable is used to determine the impact of income on water consumption and refers to the median household income as reported by the United States Census Bureau.

Price per unit of water. This independent variable is used to determine the impact of price on water consumption. This variable has been lagged one month to account for the delay between the month the price goes into effect and the month in which consumers change their behavior in response. The price of water refers to the per-unit rate paid by the consumer for water used, where one unit is one CCF (1 CCF = one hundred cubic feet = 748 gallons). For those agencies with tiered rates, this represents the price per unit at the highest tier.

Fines. This independent variable is used to determine the impact of fines for excessive water consumption on water consumption. This variable refers to the punitive amount paid by the consumer for excess water consumption and has been lagged one month to account for the delay between the month the fine goes into effect and the month in which consumers change their behavior in response. This is a dummy variable that focuses on the presence of fines. A 1 has been used for the months in which the fines were in effect and a 0 for those months without fines.

Rainfall. This independent variable is used to determine the impact of rainfall on the amount of water consumed and refers to the monthly total precipitation.

Media. This independent variable is used to determine the impact of media reports of drought on the amount of water consumed and refers to media attention regarding California drought conditions, including the declaration of drought by the Office of the Governor. This is a dummy variable that focuses on the public outreach by the State or the media to reduce water consumption. A 1 has been used for the months in which drought was mentioned in the Los Angeles Times and a 0 for those months without drought coverage.
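The variable construction described above (unit conversion, one-month lags, and 1/0 dummy variables) can be illustrated with a minimal Python sketch. The function names, the four-month sample record, and the inception month are hypothetical and are not taken from the original data set. The sketch also assumes that a conservation program remains available in every month after its inception.

```python
GALLONS_PER_CCF = 748  # 1 CCF = one hundred cubic feet = 748 gallons


def ccf_to_gallons(ccf):
    """Convert billing units (CCF) to gallons."""
    return ccf * GALLONS_PER_CCF


def lag_one_month(series):
    """Pair each month with the previous month's value.

    The first month has no prior observation, so it is marked None.
    """
    return [None] + list(series[:-1])


def availability_dummy(months, start_month):
    """Return 1 for months on or after start_month, else 0.

    Months are 'YYYY-MM' strings, which sort chronologically.
    """
    return [1 if m >= start_month else 0 for m in months]


# Hypothetical four-month record for a single agency.
months = ["2009-01", "2009-02", "2009-03", "2009-04"]
price_per_ccf = [2.50, 2.50, 3.10, 3.10]

lagged_price = lag_one_month(price_per_ccf)
conservation = availability_dummy(months, "2009-03")
```

Because the first month of each agency's series has no lagged value, that observation would have to be dropped or otherwise handled before estimation.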

REGRESSION ANALYSIS

Regression analysis is a statistical technique used to describe the relationships between two variables based on the principle of minimizing errors in prediction (Meier, Brudney, & Bohte, 2009, p. 542). For regression analysis to be accurate, several assumptions must be met: (1) errors are normally distributed; (2) variance of the error term is constant (homoskedastic); (3) errors are independent of each other; and (4) the relationship between the independent variable and dependent variable is linear. The goal of this research is to test the robustness of the original research findings by rigorously testing the data for issues with heteroskedasticity, autocorrelation, and multicollinearity. Figure 1 presents a graphical representation of the change in R² as detailed in the text below.

Logarithmically Transformed Dependent Variable

For the original research project, units of water were logarithmically transformed, as the literature recommends in the case of highly skewed distributions (Leydesdorff & Bensman, 2009). For this subsequent research, the transformation was removed. OLS regression estimates of coefficients for the non-transformed model are reported in Table 6. When comparing the new model with the original, the change in adjusted R² is minimal, from 77.9% to 78.4%. However, although not statistically significant, the coefficient for the water conservation variable changed sign from positive to negative.

Heteroskedasticity

The assumption of homoskedasticity states that the variance of the regression errors is constant: the variance is σ² regardless of which set of values of the p predictor variables is used to generate those errors.

When this assumption is violated, we say that the errors are heteroskedastic, a condition known as heteroskedasticity. The homoskedasticity assumption implies that the variance of the errors is unrelated to any predictor or any linear combination of the predictor variables (Hayes & Cai, 2007, p. 710).

Using the non-transformed dependent variable, two methods were applied to correct for heteroskedasticity: a fixed effects model and a White (1980) robust standard error model clustered on agency. OLS regression estimates of coefficients for the models are reported in Table 5 and Table 8. The fixed effects model increased the R² to 82.5%. The White model exhibited a minimal change in R² from 78.0% to 78.5% and, again, although not statistically significant, the coefficient for the water conservation variable changed sign from positive to negative.

Autocorrelation

Kan and Wang (2010) note that sample autocorrelation coefficients are widely used to test the randomness of a time series (p. 101). To control for autocorrelation, two models were used. The first model used the non-transformed dependent variable, with the dependent variable lagged one month and added to the equation as an independent variable. The second model used a lagged dependent variable and the White (1980) robust standard error. OLS regression estimates of coefficients for the lagged model are reported in Table 7. The lagged model exhibited a change in R² from 77.9% to 89.8%. OLS regression estimates of coefficients for the lagged White model are reported in Table 9. When the lagged White model was used, the R² decreased to 77.0%. Although not statistically significant, for both models, the coefficient for the water conservation variable changed sign from positive to negative.

Multicollinearity

According to Alabi, Ayinde, and Olatayo (2008), the problem of multicollinearity arises when the assumption of independent regressors fails. To check for multicollinearity, using the previously discussed models and the non-transformed dependent variable, independent variables were systematically removed from the regressions:

Media Influence: when the variable representing media influence was removed from the equation for each model, the coefficient for the water conservation variable changed sign from positive to negative. However, none of the coefficients showed statistical significance. OLS regression estimates of coefficients are reported as follows:
  o The non-transformed model is reported in Table 10. Compared with the original equation, the model exhibited a minimal change in the adjusted R² from 77.9% to 78.5%.
  o The lagged model is reported in Table 11. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 89.8%.
  o The White model is reported in Table 12. Compared with the original equation, the model exhibited a minimal change in R² from 78.0% to 78.5%.
  o The lagged White model is reported in Table 13. Compared with the original equation, the model exhibited a decrease in R² from 78.0% to 77.0%.

Price: when the variable representing price was removed from the equation for each model, the coefficient for the water conservation variable changed sign from positive to negative. OLS regression estimates of coefficients are reported as follows:
  o The non-transformed model is reported in Table 14. Compared with the original equation, the model exhibited a minimal reduction in the adjusted R² from 77.9% to 77.8%.
  o The lagged model is reported in Table 15. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 89.7%.

  o The White model is reported in Table 16. Compared with the original equation, the model exhibited a minimal change in R² from 78.0% to 77.8%.
  o The lagged White model is reported in Table 17. Compared with the original equation, the model exhibited a reduction in R² from 78.0% to 76.4%.

Fine: when the variable representing fine was removed from the equation for each model, the coefficient for the water conservation variable changed sign from positive to negative. OLS regression estimates of coefficients are reported as follows:
  o The non-transformed model is reported in Table 18. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 78.3%.
  o The lagged model is reported in Table 19. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 89.7%.
  o The White model is reported in Table 20. Compared with the original equation, the model exhibited a minimal change in R² from 78.0% to 78.3%.
  o The lagged White model is reported in Table 21. Compared with the original equation, the model exhibited a change in R² from 78.0% to 76.9%.

Number of Customers: when the variable representing number of customers was removed from the equation for each model, the sign of the coefficient for the water conservation variable did not change. OLS regression estimates of coefficients are reported as follows:
  o The non-transformed model is reported in Table 22. Compared with the original equation, the model exhibited a major change in the adjusted R² from 77.9% to 23.6%.
  o The lagged model is reported in Table 23. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 88.3%.

  o The White model is reported in Table 24. Compared with the original equation, the model exhibited a substantial change in R² from 78.0% to 23.7%.
  o The lagged White model is reported in Table 25. Compared with the original equation, the model exhibited a substantial change in R² from 78.0% to 22.4%.

Income: when the variable representing income was removed from the equation for each model, the sign of the coefficient for the water conservation variable did not change. OLS regression estimates of coefficients are reported as follows:
  o The non-transformed model is reported in Table 26. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 71.8%.
  o The lagged model is reported in Table 27. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 89.3%.
  o The White model is reported in Table 28. Compared with the original equation, the model exhibited a change in R² from 78.0% to 71.8%.
  o The lagged White model is reported in Table 29. Compared with the original equation, the model exhibited a change in R² from 78.0% to 70.3%.

Rain: when the variable representing rain was removed from the equation for each model, the coefficient for the water conservation variable changed sign from positive to negative. OLS regression estimates of coefficients are reported as follows:
  o The non-transformed model is reported in Table 30. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 75.7%.
  o The lagged model is reported in Table 31. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 89.1%.
  o The White model is reported in Table 32. Compared with the original equation, the model exhibited a change in R² from 78.0% to 75.8%.

  o The lagged White model is reported in Table 33. Compared with the original equation, the model exhibited a change in R² from 78.0% to 75.8%.

Water Conservation: OLS regression estimates of coefficients with the water conservation variable removed are reported as follows:
  o The non-transformed model is reported in Table 34. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 78.4%.
  o The lagged model is reported in Table 35. Compared with the original equation, the model exhibited a change in the adjusted R² from 77.9% to 89.8%.
  o The White model is reported in Table 36. Compared with the original equation, the model exhibited a change in R² from 78.0% to 78.4%.
  o The lagged White model is reported in Table 37. Compared with the original equation, the model exhibited a change in R² from 78.0% to 77.0%.

Conclusion

This paper explored the robustness of the results presented in Allen (2010). In the original research, the consistent results exhibited by price led the author to reject the null hypothesis and conclude that punitive water rates have an effect on water consumption. With a few exceptions, subsequent testing to check for issues with multicollinearity, heteroskedasticity, and autocorrelation resulted in findings that were fairly consistent with the original tests. These results lead the author to believe that the original results were fairly robust.

Although not statistically significant, the coefficient for the water conservation variable changed sign from positive to negative when the non-transformed model, the White model, the lagged White model, and the multicollinearity models for media influence, price, fine, and rain were used. Finally, when the number of customers was removed from the equation in the multicollinearity models, the R² decreased substantially for the non-transformed model, the White model, and the lagged White model. This can be explained by the fact that as the number of customers increases, the amount of water consumed will increase proportionately.
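The variable-removal check described under Multicollinearity (refit the model without one regressor and compare R² and adjusted R²) can be sketched in plain Python. This is an illustration under stated assumptions, not the software or procedure used in the original research: the helper names `solve` and `ols_r2`, the toy data, and the coefficients used to generate it are all hypothetical. Adjusted R² is computed with the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of regressors.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_r2(xs, y):
    """Fit y on the columns in xs plus an intercept.

    Returns (R^2, adjusted R^2) from the normal equations.
    """
    n, k = len(y), len(xs)
    rows = [[1.0] + [col[i] for col in xs] for i in range(n)]
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k + 1)]
           for p in range(k + 1)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, adj


# Hypothetical toy data: usage driven mostly by the number of customers.
customers = [10, 12, 15, 18, 20, 25, 30, 34]
rain = [3.0, 0.5, 2.0, 0.0, 4.0, 1.0, 0.2, 2.5]
noise = [5, -3, 2, -4, 1, 3, -2, -1]
usage = [100 * c - 20 * r + e for c, r, e in zip(customers, rain, noise)]

full_r2, full_adj = ols_r2([customers, rain], usage)
reduced_r2, reduced_adj = ols_r2([rain], usage)  # customers removed
```

In this toy setting, dropping the customers column removes most of the explained variation, which mirrors the pattern reported above when the number of customers was removed from the equation.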

FIGURES

Figure 1. Graphical Representation of R² Change
[Figure: R² for four model series (Non-Transformed, Lagged, White, Lagged White) across the specifications Original, All Variables, Remove Media, Remove Price, Remove Fine, Remove Customers, Remove Income, Remove Rain, and Remove Water Conservation.]

TABLES

Table 1. Participating Purveyors

Santa Barbara County
  Buellton, City of
  Carpinteria Valley Water District
  Goleta Water District
  Lompoc, City of
  Los Alamos Community Services District
  Mission Hills Community Services District
  Montecito Water District
  Santa Barbara, City of
  Santa Maria, City of
  Santa Ynez River Water Conservation District ID #1
  Solvang, City of

San Luis Obispo County
  Arroyo Grande, City of
  Atascadero Mutual Water Company
  Avila Beach Community Services District
  Cambria Community Services District
  Grover Beach, City of
  Heritage Ranch Community Services District
  Morro Bay, City of
  Nipomo Community Services District
  Oceano Community Services District
  Paso Robles, City of
  Pismo Beach, City of
  Vandenberg Village Community Services District

Table 2. Linear Regression Results, All Variables (Dependent Variable: Logarithmic Usage)
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Adjusted R²; F Value

Table 3. Linear Regression Results, Santa Barbara County (Dependent Variable: Logarithmic Usage)
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Adjusted R²; F Value
N = 1781

Table 4. Linear Regression Results, San Luis Obispo County (Dependent Variable: Logarithmic Usage)
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Adjusted R²; F Value
N = 1584

Table 5. Linear Regression Results, Fixed Effects (Eliminated Dummy Variable: Vandenberg Village)
Rows (values not recovered; surviving fragments shown as extracted): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); Buellton; Carpinteria; Goleta; Lompoc; Los Alamos; Mission Hills; Montecito 6.346E; Santa Barbara 2.236E; Santa Maria 1.832E; Santa Ynez 4.363E; Solvang; Arroyo Grande 1.100E; Atascadero 3.609E; Avila Beach; Cambria; Grover Beach; Heritage Ranch; Morro Bay; Nipomo 1.769E; Oceano; Paso Robles 2.684E; Pismo Beach; R²; Adjusted R²; F Value

Table 6. Linear Regression Results, All Variables
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Adjusted R²; F Value

Table 7. Linear Regression Results, All Variables
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); Lagged Usage (gallons); R²; Adjusted R²; F Value
N = 3335

Table 8. Linear Regression Results, Robust Standard Errors, All Variables
Rows (values not recovered): Intercept; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Wald F

Table 9. Linear Regression Results, Robust Standard Errors, All Variables (Dependent Variable: Lagged Usage)
Rows (values not recovered): Intercept; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Wald F
N = 3336

Table 10. Linear Regression Results, Remove Media Variable
Rows (values not recovered): Constant; Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Adjusted R²; F Value

Table 11. Linear Regression Results, Remove Media Variable
Rows (values not recovered): Constant; Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); Lagged Usage (gallons); R²; Adjusted R²; F Value
N = 3335

Table 12. Linear Regression Results, Robust Standard Errors, Remove Media Variable
Rows (values not recovered): Intercept; Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Wald F

Table 13. Linear Regression Results, Robust Standard Errors, Remove Media Variable (Dependent Variable: Lagged Usage)
Rows (values not recovered): Intercept; Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Wald F
N = 3336

Table 14. Linear Regression Results, Remove Price Variable
Rows (values not recovered): Constant; Media (1, 0); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Adjusted R²; F Value

Table 15. Linear Regression Results, Remove Price Variable
Rows (values not recovered): Constant; Media (1, 0); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); Lagged Usage (gallons); R²; Adjusted R²; F Value
N = 3335

Table 16. Linear Regression Results, Robust Standard Errors, Remove Price Variable
Rows (values not recovered): Intercept; Media (1, 0); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Wald F

Table 17. Linear Regression Results, Robust Standard Errors, Remove Price Variable (Dependent Variable: Lagged Usage)
Rows (values not recovered): Intercept; Media (1, 0); Fine (1, 0); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Wald F
N = 3336

Table 18. Linear Regression Results, Remove Fine Variable
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Adjusted R²; F Value

Table 19. Linear Regression Results, Remove Fine Variable
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); Lagged Usage (gallons); R²; Adjusted R²; F Value
N = 3335

Table 20. Linear Regression Results, Robust Standard Errors, Remove Fine Variable
Rows (values not recovered): Intercept; Media (1, 0); Price ($/CCF); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Wald F

Table 21. Linear Regression Results, Robust Standard Errors, Remove Fine Variable (Dependent Variable: Lagged Usage)
Rows (values not recovered): Intercept; Media (1, 0); Price ($/CCF); Customers (N); Income ($); Rain (inches); Water Conservation (1, 0); R²; Wald F
N = 3336

Table 22. Linear Regression Results, Remove Customers Variable
Rows (values not recovered; surviving fragments shown as extracted): Constant 2.147E; Media (1, 0); Price ($/CCF); Fine (1, 0) 3.464E; Income ($); Rain (inches); Water Conservation (1, 0) 4.151E; R²; Adjusted R²; F Value

Table 23. Linear Regression Results, Remove Customers Variable
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Income ($); Rain (inches); Water Conservation (1, 0); Lagged Usage (gallons); R²; Adjusted R²; F Value
N = 3335

Table 24. Linear Regression Results, Robust Standard Errors, Remove Customers Variable
Rows (values not recovered; surviving fragments shown as extracted): Intercept 2.147E; Media (1, 0); Price ($/CCF); Fine (1, 0) 3.464E; Income ($); Rain (inches); Water Conservation (1, 0) 4.151E; R²; Wald F

Table 25. Linear Regression Results, Robust Standard Errors, Remove Customers Variable (Dependent Variable: Lagged Usage)
Rows (values not recovered; surviving fragments shown as extracted): Intercept 1.884E; Media (1, 0); Price ($/CCF); Fine (1, 0) 3.420E; Income ($); Rain (inches); Water Conservation (1, 0) 4.170E; R²; Wald F
N = 3343

Table 26. Linear Regression Results, Remove Income Variable
Rows (values not recovered; surviving fragments shown as extracted): Constant 1.718E; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Rain (inches); Water Conservation (1, 0) 1.023E; R²; Adjusted R²; F Value

Table 27. Linear Regression Results, Remove Income Variable
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Rain (inches); Water Conservation (1, 0); Lagged Usage (gallons); R²; Adjusted R²; F Value
N = 3335

Table 28. Linear Regression Results, Robust Standard Errors, Remove Income Variable
Rows (values not recovered; surviving fragments shown as extracted): Intercept 1.718E; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Rain (inches); Water Conservation (1, 0) 1.023E; R²; Wald F

Table 29. Linear Regression Results, Robust Standard Errors, Remove Income Variable (Dependent Variable: Lagged Usage)
Rows (values not recovered; surviving fragments shown as extracted): Intercept 1.489E; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Rain (inches); Water Conservation (1, 0) 1.034E; R²; Wald F
N = 3336

Table 30. Linear Regression Results, Remove Rain Variable
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Water Conservation (1, 0); R²; Adjusted R²; F Value

Table 31. Linear Regression Results, Remove Rain Variable
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Water Conservation (1, 0); Lagged Usage (gallons); R²; Adjusted R²; F Value
N = 3335

Table 32. Linear Regression Results, Robust Standard Errors, Remove Rain Variable
Rows (values not recovered): Intercept; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Water Conservation (1, 0); R²; Wald F

Table 33. Linear Regression Results, Robust Standard Errors, Remove Rain Variable (Dependent Variable: Lagged Usage)
Rows (values not recovered): Intercept; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Water Conservation (1, 0); R²; Wald F
N = 3336

Table 34. Linear Regression Results, Remove Water Conservation Variable
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); R²; Adjusted R²; F Value

Table 35. Linear Regression Results, Remove Water Conservation Variable
Rows (values not recovered): Constant; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); Lagged Usage (gallons); R²; Adjusted R²; F Value
N = 3335

Table 36. Linear Regression Results, Robust Standard Errors, Remove Water Conservation Variable
Rows (values not recovered): Intercept; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); R²; Wald F

Table 37. Linear Regression Results, Robust Standard Errors, Remove Water Conservation Variable (Dependent Variable: Lagged Usage)
Rows (values not recovered): Intercept; Media (1, 0); Price ($/CCF); Fine (1, 0); Customers (N); Income ($); Rain (inches); R²; Wald F
N = 3336

REFERENCES

Alabi, O. O., Ayinde, K., & Olatayo, T. O. (2008). Effect of Multicollinearity on Power Rates of the Ordinary Least Squares Estimators. Journal of Mathematics and Statistics, 4(2).

Allen, C. (2010). A Study of Water Rates in the Counties of San Luis Obispo and Santa Barbara, California. California Polytechnic State University, San Luis Obispo.

Hayes, A. F., & Cai, L. (2007). Using heteroskedasticity-consistent standard error estimators in OLS regression: An introduction and software implementation. Behavior Research Methods, 39(4).

Kan, R., & Wang, X. (2010). On the distribution of the sample autocorrelation coefficients. Journal of Econometrics, 154(2).

Leydesdorff, L., & Bensman, S. (2009). Classification and Powerlaws: The Logarithmic Transformation. Journal of the American Society for Information Science and Technology (forthcoming).

Meier, K. J., Brudney, J. L., & Bohte, J. (2009). Applied statistics for public and nonprofit administration (7th ed.). Belmont, CA: Wadsworth.

White, H. (1980). A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica, 48(4).