Has Water Quality Improved?:


Use of a Spreadsheet for Statistical Analysis of Paired Watershed, Upstream/Downstream and Before/After Monitoring Designs [1]

Garry L. Grabow, Jean Spooner, Laura A. Lombardo, Daniel E. Line, and Kevin L. Tweedy
NCSU Water Quality Group, Biological and Agricultural Engineering Department, NC State University

Detection of differences or trends in water quality data requires organization and analysis of data collected from the field. Such analysis may be done on a computer using statistical packages such as SAS, or by using spreadsheets. Recent versions of spreadsheet software include several statistical functions and, with minimal data setup, may be used to analyze and interpret data [2]. This article illustrates the use of a spreadsheet to organize and analyze data for detecting a water quality change due to changes in land management. The spreadsheet used here is Excel version 7.0 [3]. Two parametric [4] statistical tests, the t-test and regression analysis, are presented for use with paired watershed, upstream/downstream, and before/after water quality monitoring designs. These analyses are also presented using SAS in a companion fact sheet (Grabow et al., 1998).

[1] Prepared for the Sixth National Nonpoint-Source Monitoring Workshop, September 21-24, 1998, Cedar Rapids, Iowa.
[2] It is assumed that the reader has had some experience with a spreadsheet, and an introductory course in statistics or some statistical experience.
[3] While directions refer specifically to Excel version 7.0 commands, other spreadsheets have similar capabilities.
[4] Excel does not have non-parametric data analysis tools.

Monitoring Designs

Three monitoring designs common to water quality studies are paired watershed, upstream/downstream, and before/after. A paired watershed design (Clausen and Spooner, 1993) comprises two watersheds of similar location and land use (control and treatment) and two time periods of study (calibration and treatment).
Typically, one sampling station is positioned at the outlet of each watershed. During the calibration period (typically at least two years), land use at both control and treatment sites should remain the same. The goal is to establish a relationship between the watersheds. At the end of the calibration period, best management practices (BMPs) are implemented at the treatment site. The project then proceeds into the treatment period (usually at least two years). Again, the goal is to establish a relationship between control and treatment watersheds. The two relationships are compared to see if a change has occurred due to BMP implementation.

An upstream/downstream (before/after) design (Spooner et al., 1985) also requires calibration and treatment periods (before and after BMP implementation); however, unlike the paired watershed design, only one watershed is monitored, with sampling stations positioned upstream and downstream of the treatment area.

With a before/after monitoring design (Spooner et al., 1985), water quality data from one downstream station are collected for a period of time before and after BMP implementation.

Statistical Tests: T-test and Regression Analysis

Two statistical tests are presented which may be used to detect water quality changes: a t-test and regression analysis. A t-test is used to detect a statistically significant change in the mean of the water quality parameter of interest before and after BMP implementation. The test compares the mean values of the parameter calculated for each period (pre- and post-BMP) and considers the variability of the data. The advantages of a t-test are that it is easily interpreted and widely used.

Regression analysis, however, usually presents a more powerful way of detecting and evaluating change, particularly when covariates (e.g., the control and treatment water quality parameter of interest in a paired watershed design), or the explanatory variable and water quality variable (in a before/after design), exhibit a strong relationship. Regression analysis, as presented here, is actually an analysis of covariance (ANCOVA) that allows for the comparison of groups or classes of data (e.g., before and after) while adjusting for changes in explanatory variables (e.g., streamflow). It is strongly suggested that a regression analysis be done in all cases.

Example

As an example, data obtained from the Morro Bay, California EPA 319 National Monitoring Program Project will be used to demonstrate how to detect whether a change in average storm turbidity has occurred due to BMP installation. A t-test and regression analysis will be performed using a spreadsheet. The sampling design is a paired watershed on two sub-basins within the Morro Bay watershed. Chumash Creek is the treatment watershed, while Walters Creek is the control.

Raw Data Organization and Exploratory Data Analysis

With most spreadsheet operations, analysis is easiest when data are placed in columns with no blank rows, and labels only at the top. Different worksheets (pages) within a workbook may be used to logically separate data, such as data from different sampling stations. Benefits of this format include easy generation of intermediate variables which may be required for the analysis (e.g., log transformed values) and greater ease of specifying data ranges for the statistical functions found in the spreadsheet.

With all three monitoring designs, the water quality data are paired by date and should be placed on the same spreadsheet row (see Figure 1). For a paired watershed design, the paired data are the control and treatment watershed data collected from the same time period or sampling event. In an upstream/downstream design, the pairing is between data collected from upstream and downstream stations for the same time period or sampling event. A lag may be required if the travel time between the two stations is significant relative to the sampling frequency. For a before/after design, the explanatory variable (see below) is paired with the water quality parameter of interest.

Paired Watershed
Date/Time    Treatment Watershed    Control Watershed    Period (Before/After)
t1           y1                     x1                   Before
t2           y2                     x2                   Before
t3           y3                     x3                   Before
...          ...                    ...                  After

Upstream/Downstream
Date/Time    Downstream             Upstream             Period
t1           y1                     x1                   Before
t2           y2                     x2                   Before
t3           y3                     x3                   Before
...          ...                    ...                  After

Before/After
Date/Time    Water Quality Parameter    Explanatory Variable    Period
t1           y1                         x1                      Before
t2           y2                         x2                      Before
t3           y3                         x3                      Before
...          ...                        ...                     After

Figure 1. Data Setup in Spreadsheet for 3 Monitoring Designs

Before performing any statistical analyses to detect differences or change, some exploratory data analysis should be done to see if the data are in the proper form for analysis. The statistical tests discussed here are parametric and require that the data be approximately normally distributed and independent (not autocorrelated).

Check for Autocorrelation

If the raw data are collected at short time intervals (e.g., sub-hourly, hourly, daily, or even weekly), there is a good chance they will be autocorrelated. Checking the data for autocorrelation, and adjusting them if necessary, ensures that the data are independent. Autocorrelation, sometimes referred to as serial correlation, is present when each observation is related or correlated to previous and subsequent samples. In regression analysis, the residuals are checked for autocorrelation. Autocorrelation occurs quite often in water quality and hydrologic data, as evidenced by hydrographs, sedigraphs, and chemographs, which exhibit an autocorrelated trend in the data.

Both parametric and non-parametric statistical tests are based upon the assumption of independent data. Autocorrelated data contain redundant information, and therefore less information than the same number of independent observations. Statistical tests performed on non-independent or autocorrelated data may therefore not be as conclusive as the tests imply. As a result, adjustment for autocorrelation must be made to ensure proper interpretation.

Perhaps the easiest way to check for autocorrelation is to plot the data over time. Repeated trends or patterns are an indication of autocorrelation. Autocorrelation can be seen in Figure 2 for the raw 30-minute paired turbidity data collected on Walters Creek. More formal tests for autocorrelation can be performed by statistical packages such as SAS (PROC AUTOREG) [5].

[Figure 2. Autocorrelation exhibited in Walters Creek Turbidity data. Plot of turbidity (NTU) against date/time.]

Correction for Autocorrelation

The easiest way to eliminate autocorrelation, and thus simplify the subsequent analysis, is to aggregate or average the data into larger time steps; for instance, from hourly to daily, or from daily to storm period.
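The check and the correction described above can be sketched outside the spreadsheet as well. The following is a minimal Python illustration, not the authors' procedure: the lag-1 serial correlation coefficient is used here as one simple numeric companion to the time plot, and the function names, event labels, and turbidity readings are all hypothetical.

```python
from statistics import mean


def lag1_autocorrelation(values):
    """Lag-1 serial correlation coefficient of a time-ordered series."""
    m = mean(values)
    num = sum((a - m) * (b - m) for a, b in zip(values[:-1], values[1:]))
    den = sum((v - m) ** 2 for v in values)
    return num / den


def average_by_event(records):
    """Average paired readings within each event (data reduction).

    records: iterable of (event_id, treatment_value, control_value).
    Returns {event_id: (mean_treatment, mean_control)}.
    """
    grouped = {}
    for event_id, y, x in records:
        grouped.setdefault(event_id, []).append((y, x))
    return {
        event_id: (mean(y for y, _ in pairs), mean(x for _, x in pairs))
        for event_id, pairs in grouped.items()
    }


# Hypothetical 30-minute turbidity readings (NTU): (storm, treatment, control).
readings = [
    ("storm1", 120.0, 80.0), ("storm1", 300.0, 210.0), ("storm1", 180.0, 130.0),
    ("storm2", 60.0, 40.0), ("storm2", 90.0, 70.0),
]

# A coefficient well away from zero suggests serially correlated data.
control_r1 = lag1_autocorrelation([x for _, _, x in readings])

# One row per storm, still paired, ready for the layout in Figure 1.
storm_means = average_by_event(readings)
```

Keeping the treatment and control values together while averaging preserves the pairing that all three monitoring designs rely on.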
Averaging data into larger time steps is often termed data reduction. In the example, since the 30-minute turbidity data collected from Chumash and Walters Creek were autocorrelated, averages were calculated from paired data for each storm event lasting one to three days. If it is not possible to take an average (perhaps because this results in too few data points), there are other ways of accounting for autocorrelation. This may be done through adjustment of the variance [6] using SAS [5]. Once correction for autocorrelation has been made, organization of the reduced data in the spreadsheet should follow the same layout presented earlier in Figure 1.

Explanatory Variables

With a before/after single downstream station design, it is necessary to factor out or block variables other than land treatment that may have changed over time and caused a change in the water quality parameter of interest. Accounting for these variables allows for better documentation of the water quality change due to land treatment. These variables are called explanatory variables since they explain much of the variation in the data, and may include rainfall, streamflow, or other hydrologic factors. Another name for these variables is covariates.

Explanatory variables are useful in all monitoring designs to document the relationship between water quality and hydrologic and/or meteorologic changes. For trend detection in a paired watershed or upstream/downstream design, including explanatory variables in the statistical analysis is less important, since concurrent or paired data are collected under similar hydrologic conditions. With a before/after design, explanatory variables must be integrated into the analysis. For example, suspended sediment yield, which is highly dependent upon flow, may be regressed against streamflow, and a t-test may then be performed on the mean of the pre- and post-BMP residuals. In regression analysis, calibration and treatment period relationships between sediment yield and flow may be compared.

Once the data are corrected for autocorrelation and organized in the spreadsheet, further evaluation is required to see if the data are normally distributed.

Check for Normality

Parametric statistical tests, or those tests based upon estimates of statistical parameters such as mean and standard deviation, require that the data be normally (or nearly normally) distributed. A normal distribution is indicated by a bell-shaped, symmetrical curve. Water quality and hydrologic data are often not normally distributed, but skewed. The skewness indicates the departure of the distribution from the normal curve. Data are positively skewed if the curve has a longer right tail and negatively skewed if it has a longer left tail (see Figure 3). In most event-related water quality data, the skew will be positive, reflecting many relatively low values and a few very high values.

[Figure 3. Normal and Skewed Distributions. Relative frequency plotted against parameter value for a normal distribution and a skewed distribution (positive skew).]

There are several ways to check for skewness in data using a spreadsheet. One way is to use the descriptive statistics [7] function available in Excel. The Data Analysis [8] command under Tools allows access to this feature (see Figure 4). Once in the descriptive statistics dialog box, select the input range (your data) and a place in the spreadsheet to put the results (output range), and check the Summary Statistics box. Among the results will be Skewness (see Figure 5). If the data are perfectly normal, the coefficient of skew will be zero. As a rule of thumb, an absolute value of skewness above 1.0 indicates a degree of skewness in the data that should be addressed.

[Figure 4. Excel Analysis Tools]

[5] See Grabow et al. (1998).
[6] The reader is referred to Gilbert (1987) and Sanders et al. (1990).
[7] This and other statistical functions may be found in the Analysis ToolPak add-in. This will need to be installed if not available under the Tools command.
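For readers who want to see what the Skewness entry represents, here is a minimal Python sketch of a sample skewness coefficient together with the 1.0 rule of thumb given above. It uses the bias-adjusted formula documented for Excel's SKEW worksheet function; that this matches the Descriptive Statistics output is an assumption of the sketch, and the function names and sample values are hypothetical.

```python
import math


def sample_skewness(values):
    """Bias-adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)^3)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    m = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    s = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)


def needs_attention(values, threshold=1.0):
    """Apply the rule of thumb: |skewness| above the threshold should be addressed."""
    return abs(sample_skewness(values)) > threshold


# Hypothetical storm-average turbidity values (NTU): many low, one very high.
turbidity = [20.0, 25.0, 30.0, 35.0, 40.0, 900.0]
print(round(sample_skewness(turbidity), 2), needs_attention(turbidity))
```

A symmetric sample returns zero, while a single very high value among many low ones pushes the coefficient well above 1.0, which is the pattern described for event-related water quality data.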

[Figure 5. Summary Statistics for Chumash Turbidity Data. Excel Descriptive Statistics output listing Mean, Standard Error, Median, Mode (#N/A), Standard Deviation, Sample Variance, Kurtosis, Skewness (3.02), Range, Minimum, Maximum, Sum, and Count (35).]

Figure 5 shows a high degree of skewness (skewness = 3.02) for Chumash storm average turbidity.

A more visual way of checking for normality is to develop a histogram of the data. This is also accessed through the Analysis ToolPak via the Tools/Data Analysis menu commands. In the histogram dialog box, enter the data range and check the box for chart output. A graph will be generated, and the degree of normality, or conversely skewness, can be seen from the symmetry of the histogram. A bell-shaped, symmetrical histogram indicates normality. If you check the cumulative probability box, a symmetrical sigmoidal or S-shaped curve would indicate normally distributed data. Figure 6 shows that the average storm turbidity data from Chumash Creek are skewed.

[8] Italics indicate a menu command in Excel.

[Figure 6. Histogram of Chumash Turbidity Data. Frequency and cumulative % plotted against Chumash turbidity (NTU).]

Correction for Skewness

If the data are skewed, the variable(s) should be transformed. The most common transformation in water quality and hydrology is a logarithmic transformation. This is done in Excel by using the LOG function, =LOG(x), to return the base 10 logarithm of the value x in a new column. After doing this transformation, repeat the diagnostic steps given previously to check for normality.

The example data were log transformed. This reduced the skewness of the Chumash average storm turbidity data to [...]. The reduced example data set, including log transformation, is shown in Figure 7. Even if the data are only moderately skewed, in many cases log transformation will improve the distribution and therefore improve the subsequent analysis. If the data remain highly skewed after transformation, non-parametric data analysis may be required [9].
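The transformation and the histogram check can be illustrated together. The sketch below, in Python, log transforms a set of hypothetical turbidity values and tabulates equal-width bin counts with cumulative percentages, loosely mirroring what the histogram tool reports. The bin count, function names, and data are assumptions made for illustration only.

```python
import math


def log10_transform(values):
    """Base 10 logarithm of each value, as =LOG(x) does in a new column."""
    if any(v <= 0 for v in values):
        raise ValueError("log transformation requires positive values")
    return [math.log10(v) for v in values]


def histogram(values, n_bins):
    """Equal-width bin counts and cumulative percentages."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; nothing to bin")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so the maximum value falls in the last bin.
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    cumulative, running = [], 0
    for c in counts:
        running += c
        cumulative.append(100.0 * running / len(values))
    return counts, cumulative


# Hypothetical storm-average turbidity values (NTU).
turbidity = [12.0, 15.0, 18.0, 22.0, 30.0, 45.0, 80.0, 150.0, 400.0, 1200.0]

raw_counts, raw_cum = histogram(turbidity, 4)
log_counts, log_cum = histogram(log10_transform(turbidity), 4)
print(raw_counts)  # most values pile into the first bin
print(log_counts)  # counts are spread more evenly after transformation
```

With these made-up values, the raw data crowd into the lowest bin while the transformed data spread across the bins, which is the kind of improvement the diagnostic steps are meant to reveal.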
Typically, if water quality data have been log transformed, the residuals from a regression analysis are normally distributed and have a constant variance. A non-constant variance will show up in the residual plot as a "funnel" or mounding rather than an even distribution. The residual plot in Figure 8 (using the residuals from Equation 2 in "Regression Analysis") shows that the residuals have a constant variance. Since these data were aggregated to a storm basis, autocorrelation is not a concern. If autocorrelation were a concern, the residuals would have to be plotted over time. This may be done by changing the "x values" of the data series to the column labeled observation number in the residual output. This assumes that the data from which the residuals are generated are entered in chronological order. Evidence of autocorrelation in the residuals plotted over time may indicate that the trend is non-linear. Normality of the residuals can be checked by subjecting them to the analysis detailed in the section "Check for Normality". Adjustment of the regression analysis for autocorrelated residuals is too complex to be easily done in a spreadsheet and is not discussed here.

[Figure 7. Example Data Set (Partial) after Storm Averaging and Log Transformation. Columns: Observation (storm); Control, Walters avg turb (ntu); Treatment, Chumash avg turb (ntu); (1) log Chumash; (2) log Walters; (1)-(2) Difference.]

[Figure 8. Residual Plot. Residuals plotted against Variable 1 (log Walters TSS).]

[9] Non-parametric analysis is not treated in this fact sheet since it is not directly available in Excel.

Statistical Tests to Detect Changes

This discussion is limited to detecting discrete as opposed to gradual changes, e.g., before and after BMP implementation. The most common ways of detecting a discrete water quality change due to land treatment changes are t-tests (analysis of variance, or ANOVA) and regression analysis (analysis of covariance, or ANCOVA). The tests presented here require that at least one year of post-BMP data have been collected. The tests may be repeated as more data are collected.

ANOVA on Means

A comparison of the means between paired watershed data, upstream/downstream data, or single station before/after data may be done using analysis of variance (ANOVA). For paired watershed and upstream/downstream monitoring designs, the analysis is a two-way ANOVA on the water quality variable, with the two factors being PERIOD (calibration or treatment) and LOCATION (control or treatment watershed, or upstream or downstream station, respectively). For a before/after monitoring design, the analysis is a one-way ANOVA on the residuals of the water quality variable regressed against the explanatory variable, and the factor is the period (before or after). See Figure 9 for an ANOVA diagram for the three monitoring designs.

Paired Watershed
Period    Location: Control    Location: Treatment    1-way Difference
Before    Y_bc                 Y_bt                   D_b
After     Y_ac                 Y_at                   D_a

Upstream/Downstream
Period    Location: Upstream   Location: Downstream   1-way Difference
Before    Y_bu                 Y_bd                   D_b
After     Y_au                 Y_ad                   D_a

Before/After
Period    Residuals from regression of water quality vs. explanatory variable
Before    r_b
After     r_a

Figure 9. ANOVA Table

Excel cannot perform a two-way ANOVA unless the data are balanced, which in the case of a paired watershed or upstream/downstream design requires that an equal number of observations be collected during the calibration and treatment periods. This is unlikely to be the case with event data, since the number of storm events within the period of study is not controllable. Conversion to a one-way ANOVA may be done by calculating differences on paired data between the paired watersheds or upstream/downstream stations.

Using the example data from the paired watershed, Figure 7 shows a new variable, which is the difference in the log transformed turbidity values of Chumash and Walters Creek. This eliminates the factor of location and reduces the analysis to a one-way ANOVA based on the period (calibration and treatment).

Is There a Difference?

A one-way ANOVA with only 2 levels of the factor (before and after) is equivalent to a t-test. In Excel, select Tools/Data Analysis/t-test: Two-Sample Assuming Unequal Variances [11]. In the dialog box, select the calibration or before data for the variable 1 range, and the treatment or after data for the variable 2 range. The hypothesized difference is 0 (default). The default probability value of α=0.05 is usually appropriate.
The output will give both one- and two-tailed results. The two-tail results are for the null hypothesis that the means are equal. If the absolute value of the computed t (t Stat) is greater than the critical value (t Critical two-tail), then the null hypothesis is rejected and it is concluded that the means are different; in other words, a statistically significant change has been detected. Another way to interpret the results is by using the p value. If P(T<=t) two-tail is less than the selected α value, the null hypothesis stated above is rejected, and it is concluded that the means are different.

For the example data set, the mean of the differences in log transformed storm turbidity values (Chumash minus Walters) was significantly different for the pre- and post-BMP periods at the α=0.05 level, since the p value of [...] is less than α (see Figure 10).

[11] The example data were found to have unequal variances between pre- and post-BMP data, as determined by the F-test available in Excel. The t-test assuming equal variances should be selected if the results of the F-test indicate that the variances are equal.
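The calculation behind the unequal-variance t-test on the difference variable can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reproduction of the Excel tool: it computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom, leaves the critical value to be supplied from a t table, and uses hypothetical function names and log turbidity values.

```python
import math
from statistics import mean, variance


def paired_differences(log_treatment, log_control):
    """Difference variable: log treatment minus log control, row by row."""
    return [t - c for t, c in zip(log_treatment, log_control)]


def welch_t(before, after):
    """t statistic and Welch-Satterthwaite df for two samples with unequal variances."""
    n1, n2 = len(before), len(after)
    v1, v2 = variance(before) / n1, variance(after) / n2
    t_stat = (mean(before) - mean(after)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


def reject_null(t_stat, t_critical_two_tail):
    """Two-tailed decision: reject equal means when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical_two_tail


# Hypothetical log10 storm-average turbidity (treatment, control) by period.
d_before = paired_differences([2.9, 3.1, 2.7, 3.0], [2.5, 2.8, 2.4, 2.6])
d_after = paired_differences([2.6, 2.4, 2.8, 2.5], [2.6, 2.3, 2.9, 2.5])

t_stat, df = welch_t(d_before, d_after)
# The critical value for the computed df and chosen alpha comes from a t table.
print(round(t_stat, 3), round(df, 1))
```

Working on the differences rather than the raw values is what removes the location factor, so only the period (before or after) remains to be tested.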

[Figure 10. Excel output for a t-test for difference of pre- and post-BMP differences in log transformed turbidity values (Chumash minus Walters). The output lists, for the Pre and Post groups, the Mean Difference (b and c), Variance, and Observations, followed by Hypothesized Mean Difference (0), df (16), t Stat, P(T<=t) one-tail, t Critical one-tail, P(T<=t) two-tail, and t Critical two-tail. b = avg. (log Chumash minus log Walters), calibration period; c = avg. (log Chumash minus log Walters), treatment period.]

How Much is the Difference?

If the conclusion is that there has been a statistically significant change in the water quality parameter, the next step is to quantify that change. For a paired watershed or upstream/downstream design, using a difference variable to convert from a two-way to a one-way analysis, the percent reduction in the original (untransformed) units is

$$\left[\frac{10^{a} - 10^{\,a-(b-c)}}{10^{a}}\right] \times 100 \qquad (1)$$

where a is the average log transformed pre-BMP value for the treatment or downstream station, b is the average of the differences for the calibration period, and c is the average of the differences for the treatment period.

In our example, the mean log transformed value of pre-BMP turbidity for Chumash Creek is 2.85 (calculated in Excel), the mean difference in Chumash and Walters pre-BMP log transformed turbidity is 0.345, and the mean difference in post-BMP turbidity is 0.0168 (see Figure 10). Using equation (1), the mean percent reduction in turbidity for Chumash relative to its pre-BMP average turbidity, as adjusted by Walters Creek (control), is 53% and is statistically significant.

Regression Analysis

Regression analysis usually presents a much more powerful way to detect change. The analysis as presented here is actually an analysis of covariance (ANCOVA), which combines an ANOVA for the comparison of means between two periods with an adjustment by regression on the covariates. The advantage of this analysis over an ANOVA or t-test is greatest for covariates that have a strong linear relationship.
In our paired watershed example, the covariates are log transformed turbidity at the control (Walters) and treatment (Chumash) watersheds. In a paired watershed design, the independent variable (X) is the water quality parameter of the control watershed, while the dependent variable (Y) is the same parameter for the treatment watershed. In an upstream/downstream monitoring design, the independent variable (X) is the upstream station and the dependent variable (Y) is the downstream station. For a before/after design, the independent variable (X) is the explanatory variable, such as water discharge.

The regression analysis will detect whether or not the lines of best fit from the calibration and treatment periods differ in a paired watershed or upstream/downstream design, or whether the lines of best fit between the water quality variable and explanatory variable differ in a before/after design. A difference in the regression lines indicates a change due to land treatment.

To prepare data for a regression analysis in Excel, it is first necessary to generate two variables in addition to the two paired water quality variables. The first is an indicator variable, which assumes a value of 0 or 1 to differentiate the data between calibration and treatment periods (X2 in Figure 11). The indicator variable allows the regression lines to have different intercepts. The second variable is created by multiplying the indicator variable (0 or 1) by the independent variable (X1*X2 in Figure 11). The X1*X2 term is called an interaction term and allows the two regression lines to have different slopes. Addition of these two independent variables is required to develop and compare the two separate regression lines.

In the example data set, the control watershed variable (log Walters TSS) and the two created variables are the independent or X variables (X1, X2, and X1*X2 in Figure 11), while the treatment watershed variable is the dependent or Y variable. The independent variables need to be in contiguous columns when using regression in Excel.

Y              X1             X2                  X1*X2
log Chumash    log Walters    Pre (0) Post (1)    log Walters

Figure 11. Variables Required for Before/After Regression Analysis

The regression analysis in Excel is also accessed through the Analysis ToolPak under Tools/Data Analysis/Regression. Mark the check boxes for standardized residuals and residual plots for residual diagnostics. Residuals are the differences between the data points and the regression line. In regression analysis, the residuals should be normally distributed and uncorrelated, and have a constant variance.

The resulting regression equation is

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 \qquad (2)$$

where Y is the estimate of the dependent variable, X1 is the independent variable, X2 is the indicator variable (0 for the calibration period in paired watershed or upstream/downstream designs, and for before in before/after designs; 1 for the treatment period or after), and β_i, i = 0, ..., 3, are the regression coefficients.
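To make the data preparation concrete, the following Python sketch builds the indicator and interaction columns and fits Equation (2) by ordinary least squares using the normal equations. It is a minimal illustration with hypothetical function names and made-up data; it returns the coefficients only and does not produce the standard errors or p values that the Excel regression output provides.

```python
def design_rows(x1, period):
    """Rows of [1, X1, X2, X1*X2]; period is 0 (calibration) or 1 (treatment)."""
    return [[1.0, x, float(p), x * p] for x, p in zip(x1, period)]


def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical log10 turbidity: control (X1), period indicator (X2), treatment (Y).
log_control = [2.1, 2.6, 3.0, 2.3, 2.8, 3.2]
period = [0, 0, 0, 1, 1, 1]
log_treatment = [2.4, 2.9, 3.2, 2.2, 2.8, 3.3]

b0, b1, b2, b3 = fit_ols(design_rows(log_control, period), log_treatment)
# b2 is the change in intercept and b3 the change in slope between periods.
```

Setting the indicator to 0 or 1 in the fitted equation gives the separate calibration and treatment lines.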
Equation (2) reduces to

$$Y = \beta_0 + \beta_1 X_1 \qquad (3)$$

for the calibration period and

$$Y = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) X_1 \qquad (4)$$

for the treatment period.

Is There a Difference?

If β3 is statistically significant (p<=α), then the calibration and treatment period slopes are different (by a magnitude of β3). In the example, the difference in slopes is significant at α=0.07 (the p value for β3), or we can say the slope is different with 93% confidence. If β2 is statistically significant, then the y intercepts of the lines are different (by a magnitude of β2). In this example, the intercepts are statistically different at α=0.04.

If statistical significance is set at the α=0.05 level, then the regression should be re-run without the interaction (X1*X2) term, since the p value for β3 (0.07) exceeds α. In this case, the calibration period equation would be the same as Equation 3 and the treatment period equation would be

$$Y = (\beta_0 + \beta_2) + \beta_1 X_1 \qquad (5)$$

This would result in regression lines with the same slope (parallel), separated by a magnitude of β2.

Figure 12 shows the results for the example data set using the full model with interaction term. The results indicate that the turbidity for the treatment watershed was reduced relative to the control watershed (β2, the change in intercept, is negative), but that the reduction is smaller at higher levels of turbidity (β3, the change in slope, is positive). This can be seen in the calibration and treatment period regression lines shown in Figure 14: the greatest distance between the two lines is at the lower levels of turbidity, and the difference lessens at higher levels.

[Figure 12. Excel Regression Analysis Output for Equation with Interaction Term. Summary output for 35 observations listing the regression statistics (Multiple R, R Square, Adjusted R Square, Standard Error), the ANOVA table, and the coefficient, standard error, t Stat, P-value, and lower 95% limit for the Intercept (β0) and Variables 1 to 3 (β1 to β3).]

When the interaction term is left out, β2, the change in intercept term, becomes more statistically significant (p=0.01), and again it is shown that the turbidity of the treatment watershed has been reduced relative to the control watershed, as the change in intercept is negative (see Figure 13 for the Excel output). The graph in Figure 15 shows this difference visually.

[Figure 13. Excel Regression Analysis Output for Equation without Interaction Term. Summary output for 35 observations listing the regression statistics, the ANOVA table, and the coefficient, standard error, t Stat, P-value, and lower 95% limit for the Intercept (β0) and Variables 1 and 2 (β1 and β2).]

How Much is the Difference?

The average difference for the full model with interaction term may be obtained by setting X1 in equations (3) and (4) to the average log transformed value of turbidity for the control watershed (Walters) for the entire period (both calibration and treatment periods, to adjust for different conditions in these periods). For the example data set, this was calculated to be 2.573, and the predicted log transformed values for Chumash are Y_C = 2.878 using the calibration period regression equation and Y_T = [...] using the treatment period regression equation, or a percent decrease in the original scale of

$$\left[\frac{10^{2.878} - 10^{Y_T}}{10^{2.878}}\right] \times 100 \qquad (6)$$

If the reduced model without the interaction term is used, with the same method to obtain the difference, the average reduction would be

$$\left[\frac{10^{2.894} - 10^{Y_T}}{10^{2.894}}\right] \times 100 \qquad (7)$$

or 49%.

[Figure 14. Calibration and Treatment Period Regression Lines for Full Model. log Chumash (Treatment Watershed) turbidity plotted against log Walters (Control Watershed) turbidity, showing the calibration and treatment data and regression lines, with the average difference between YC and YT marked.]

[Figure 15. Calibration and Treatment Period Regression Lines for Reduced Model. log Chumash (Treatment Watershed) turbidity plotted against log Walters (Control Watershed) turbidity, showing the calibration and treatment data and regression lines, with the average difference between YC and YT marked.]
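The back-transformation used in Equations (1), (6), and (7) follows one pattern: convert two log10 values to the original scale and express their difference as a percentage of the first. A short Python sketch is given below; the function names are hypothetical, and the only numbers used are the values reported in the text for Equation (1).

```python
def percent_reduction(log_before, log_after):
    """Percent decrease in original units between two log10 values."""
    return (10 ** log_before - 10 ** log_after) / 10 ** log_before * 100.0


def percent_reduction_from_differences(a, b, c):
    """Equation (1): a is the mean log pre-BMP treatment value, b and c are the
    mean differences for the calibration and treatment periods."""
    return percent_reduction(a, a - (b - c))


# Values reported in the text for Equation (1): a = 2.85, b = 0.345, c = 0.0168.
reduction = percent_reduction_from_differences(2.85, 0.345, 0.0168)
print(round(reduction))  # the text reports 53%
```

Because the calculation depends only on the difference between the two log values, the same percentage results whatever the value of a; a mainly anchors the result to the treatment watershed's pre-BMP level.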

References

Clausen, J.C. and J. Spooner. 1993. Paired Watershed Study Design. Biological and Agricultural Engineering Department, North Carolina State University, Raleigh, NC. EPA-841-F[...]. [...] p.

Gilbert, R.O. 1987. Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold.

Grabow, G.L., J. Spooner, L.A. Lombardo, D.E. Line, and K.L. Tweedy. 1998. Has Water Quality Improved?: Use of SAS for Statistical Analysis of Paired Watershed, Upstream/Downstream and Before/After Monitoring Designs. U.S. EPA-NCSU-CES Grant No. [...].

Sanders, T. C., R.C. Ward, J.C. Loftis, T.D. Steele, D.D. Adrian, and V. Yevjevich. 1990. Design of Networks for Monitoring Water Quality, 3rd edition. Water Resources Publications, Littleton, Colorado.

Spooner, J., R.P. Maas, S.A. Dressing, M.D. Smolen, and F.J. Humenik. 1985. Appropriate Designs for Documenting Water Quality Improvements from Agricultural NPS Control Programs. In: Perspectives on Nonpoint Source Pollution. EPA 440/[...]. Pp. [...].

Acknowledgements

This paper was supported by U.S. EPA-NCSU-CES Grant No. [...], National Nonpoint Source Watershed Project Studies. The authors would like to express their thanks to Dave Paradies and Karen Worcester, Morro Bay 319 National Monitoring Program Project, for their collaboration. The authors also express their thanks to Dr. Francis Giesbrecht, Professor of Statistics, North Carolina State University, for his review and contributions.