ABSTRACT. by Sarah Jonell Peterson


ABSTRACT

THE ROLE OF SAMPLING DENSITY IN THE ACCURACY OF WATER QUALITY ASSESSMENT: A CASE STUDY OF 9 OHIO WATERSHEDS AND THE WADEABLE STREAMS ASSESSMENT

by Sarah Jonell Peterson

State and federal agencies measure water quality in streams using different sampling regimes. The Ohio EPA (OEPA) uses geometric sampling of 80+ sites in 11-digit HUC watersheds, whereas the US EPA uses probabilistic sampling of 25 sites in Ohio. I analyzed OEPA fish (IBI) and macroinvertebrate (ICI) indices from 9 watersheds (3 agricultural, 3 urban, and 3 with little disturbance) along with US EPA data. Results suggest that the number of sites sampled by OEPA is insufficient to reproduce statistically similar IBI or ICI scores. The most sites should be sampled on third-order streams and the fewest on first- and fifth-order streams. Similarly, more sites should be sampled in agricultural watersheds than in least disturbed or urban watersheds. IBI scores from OEPA and US EPA were positively correlated, but ICI scores were negatively correlated. Both OEPA and US EPA need to collect more data to manage and improve watersheds.

THE ROLE OF SAMPLING DENSITY IN THE ACCURACY OF WATER QUALITY ASSESSMENT: A CASE STUDY OF 9 OHIO WATERSHEDS AND THE WADEABLE STREAMS ASSESSMENT

A Practicum Report
Submitted to the faculty of Miami University in partial fulfillment of the requirements for the degree of Master of Environmental Science
Institute for the Environment and Sustainability

by Sarah Jonell Peterson
Miami University
Oxford, Ohio
2011

Advisor: Dr. Donna McCollum
Reader: Dr. Robert Schaefer
Reader: Dr. Thomas Crist

Table of Contents

Introduction and Background
  Water Quality Monitoring Establishment
  Differences in Sampling Site Density; Evolution of Research Questions
  Goals and Application of Water Monitoring Regimes
  Water Quality Standards and Components of Water Quality Sampling
Study Design and Methodology
  Study area selection criteria and data acquisition
  Statistical Analysis
    Ohio geometric sampling
Results
  Ohio geometric sampling
  Wadeable streams assessment
Discussion and Implications
  Disturbed agricultural watersheds
  Disturbed urban watersheds
  Little disturbed watersheds
  Current sampling trends
IES Requirements
References
Appendices
  Appendix A

List of Tables

Table 1. Sampling strata based on drainage area
Table 2. Summary of two main state sampling methods. Statewide has a larger spatial extent and lower density than does watershed-level sampling
Table 3. Funding for water quality in EPA Region V states
Table 4. Summary of assessment in US EPA Region V states. For states with a rotational assessment design and no time frame specified, there is no specific rotation of sampling. Fixed-station: target one set of sampling sites in a watershed. Targeted-synoptic: repeat some sampling sites and randomly select others. Targeted-intensive: select sampling sites in areas with historical impairments
Table 5. Biocriteria and aquatic ecosystems sampled in US EPA Region V states
Table 6. Characteristics of little disturbance (LD), disturbed agriculture (DA), and disturbed urban (DU) watersheds
Table 7. IBI score ranges and rankings for streams used by OEPA and US EPA
Table 8. Percentage allocation of proposed sampling sites for both IBI and QHEI
Table 9. Percentage of streams within Ohio ecoregions with good, fair and poor IBI and ICI scores from WSA and from calculated from. TPL: Temperate Plains, SAP: Southern Appalachian Plains, NAP: Northern Appalachian Plains. Avg. OH TPL: average for Maumee, Mad, Stillwater, Wabash; Avg. OH SAP: average for Kokosing, Wakatomika, Muskingum; Avg. OH NAP: average for Olentangy and Chagrin
Table 10. Causes of impairment identified by OEPA and US EPA. The numbers indicate the frequency with which the impairments appear, and "Yes" indicates that the US EPA also identified the given cause of impairment
Table 11. Data sources for the study areas

List of Figures

Figure 1. WSA sampling site allocation within US ecoregions
Figure 2. Theoretical geometric sampling regime for a given drainage area
Figure 3. Average water quality of each of the study areas
Figure 4a. Average distribution of all sampling sites for Ohio watersheds, classified by land use
Figure 4b. Current and optimal proposed number of sampling sites based on both current IBI and QHEI scores for 9 Ohio watersheds
Figure 5. Relationship between proposed number of IBI and QHEI sites for all watersheds
Figure 6. Allocation of proposed number of sites based on IBI by strata (stream order) for all watersheds. The horizontal reference line indicates the average number of current sampling sites in a study area
Figure 7. Allocation of proposed number of sites based on QHEI by strata (stream order) for all watersheds. The horizontal reference line indicates the average number of current sampling sites in a study area
Figure 8. Distribution of WSA sampling sites within Ohio
Figure 9 (a, b). Relationship between IBI and ICI ratings for entire ecoregions and portions of ecoregions within Ohio
Figure 10. WSA and OEPA percentages of good, fair and poor IBI and ICI scores in 9 Ohio watersheds and 3 ecoregions
Figure 11. Relationship between IBI and ICI ratings for WSA and OEPA assessment

Acknowledgements

I would like to thank my committee members, Dr. Donna McCollum, Dr. Robert Schaefer and Dr. Tom Crist, for all of their help and support on this project. I also thank Chris Yoder and Ed Rankin of the Midwest Biodiversity Institute for helping me develop this research project. All raw data were obtained from Ohio EPA and US EPA.

List of Acronyms

CWA: Clean Water Act
CWH: Coldwater habitat
DA: Disturbed agricultural
DU: Disturbed urban
EPA: Environmental Protection Agency
EWH: Exceptional warmwater habitat
HUC: Hydrologic unit code
IBI: Index of biotic integrity
ICI: Invertebrate community index
LD: Little disturbed
LRW: Limited resource habitat
MBI: Midwest Biodiversity Institute
MiWB: Modified index of wellbeing
NAP: Northern Appalachian Plain
QHEI: Qualitative habitat evaluation index
SAP: Southern Appalachian Plain
STORET: Storage and retrieval warehouse
TMDL: Total maximum daily load
TPL: Temperate Plain
WQ: Water quality
WSA: Wadeable streams assessment
WWH: Warmwater habitat

Introduction and Background

Water Quality Monitoring Establishment

The condition of lakes and streams has both public health and ecological significance. From the perspective of human health, degraded surface waters can lead to water that is not potable or has increased risk of bacterial infection. Ecologically, degradation of surface water can lead to decreases in fish abundance, trophic condition, and overall biodiversity (Whittier and Paulsen 1992), as well as similar changes to other biotic communities such as macroinvertebrates and algae. Monitoring and assessment of surface waters have provided and continue to provide insight about how to protect ecological systems, as well as the resources and services they provide to humans.

The Clean Water Act 305(b) mandates the US EPA to "maintain and restore the chemical, physical, and biological integrity of our Nation's waters" (US EPA 2011). In order to evaluate and ensure progress toward that goal, the law requires that states produce water quality inventories and report water quality data with a known significance. In addition, the Clean Water Act (CWA) mandates monitoring with diverse metrics, including designated uses, causes of impairments, permit compliance of point source discharges, sources of pollutants and total maximum daily loads (TMDL) (Detenbeck et al. 2005). These requirements have led states to devise monitoring programs that focus on the chemical, physical and biological components of aquatic ecosystems and to develop and assess remediation actions that address areas of concern.

Water quality assessment differs at national and state levels in terms of density and frequency of sampling, particular characteristics measured and overall goals. The federal EPA places emphasis primarily on taking an inventory of water quality, whereas the state EPA places emphasis on both assessing and, when necessary, requiring improvement of water quality.
The federal EPA collects biological (benthic macroinvertebrates), chemical (nitrogen, phosphorus, salinity and acidity) and physical (sediments, fish habitat, riparian vegetation, and riparian disturbance) water quality data (WSA 2005). Individual state EPAs or other environmental agencies have greater on-the-ground responsibility and may undertake more intensive water quality monitoring and assessment than does the federal EPA. The role of the US EPA is to set parameters and guidelines for water sampling protocols and expectations that govern state-level monitoring and assessment programs. The federal EPA determines quality assurance protocol for all data collection and assessment techniques. The US EPA also allocates grant monies for state monitoring programs and provides technical guidance on field work and technical reports (US EPA 2009). Because states collect diverse types and amounts of data, the federal EPA dictates how water quality data are stored (STORET), synthesizes all of the states' data, and makes the data available to the public.

Core state monitoring requirements, determined by the federal EPA and the CWA, are designed to evaluate the overall water quality within the state, using multiple core indicators and additional supplemental indicators. Core state requirements include dissolved oxygen (DO), temperature, pH, conductivity, nutrients, land use, and habitat assessment. Supplemental indicators may include toxicity levels and biological community (US EPA 2010). The status and trends in water quality of statistically selected streams (through simple random, stratified, or nested designs) also are determined by the federal EPA through the Office of Research and Development.

Differences in Sampling Site Density; Evolution of Research Questions

The spatial density at which water sampling is completed may vary greatly between state and national agencies. The US EPA samples 1392 sites nationwide every five years through a probabilistic sampling regime, with no provision that sites are repeatedly sampled over subsequent years. The Ohio EPA (OEPA) samples sites per study area, usually an 11-digit hydrologic unit (HUC), every 5-10 years, with sites generally the same in subsequent studies. Intensive sampling of study areas generates more precise and comprehensive data than less intensive sampling, but increases time and expense.

The US EPA currently conducts the Wadeable Stream Assessment (WSA) every two years to determine how important a variety of stressors are to streams and the proportion of streams in good, fair, and poor condition. In 2004, when this method was first used, the WSA employed a probability-based survey design, defining streams as an infinite set of points within a linear network and randomly selecting sampling sites within this linear framework.
A blocked random sampling design was used, with the blocks being the 9 US ecoregions and stream order. Ecoregions were determined by similarities in climate, vegetation, soil, and geology (US EPA 2005). All sites were in wadeable streams, so most of the streams, 91%, were perennial 1st-4th order (WSA 2005). The 2004 assessment, then, included a random sample of a minimum of 50 wadeable sites in each ecoregion. The actual number of sampling sites in the WSA varied among ecoregions, from 85 to 185 (Figure 1), for two reasons. Some states increased the density of sites to support state-scale monitoring of stream condition. Other regions had either fewer or more sites because of the WSA's concentration on wadeable streams (US EPA 2005); sampling sites were clustered in areas with an abundance of 1st-5th order streams and more spread out in areas with fewer. The number of sites also varied among states. Parts of Ohio fall within three different ecoregions: the temperate plains (TPL, western Ohio), northern Appalachian (NAP, northeastern Ohio), and southern Appalachian (SAP, southeastern Ohio) regions (Figure 1). Each of these ecoregions had a different number of sampling sites within Ohio, but a total of 25 Ohio stream sites were sampled via WSA in An important consideration of this sampling regime is that data from a small number of sampled watersheds are used to infer impairment of stream reaches in watersheds that were not sampled (Detenbeck and Cincotta 2008).

Figure 1. WSA sampling site allocation within US ecoregions (US EPA 2005).

The Ohio EPA protocol uses geometric, rather than probabilistic, sampling regimes to assess water quality. An OEPA study area is most often a group of subwatersheds (identified by 11-digit HUCs). Theoretically, in Ohio's geometric sampling regime the first site is at the mouth of the hydrologic unit, and the number of sampling sites is then doubled each time the size of the watershed area is halved. For example, within a 200 mi² study area, there will be one sample site with a watershed of 200 mi², two with watersheds of 100 mi², etc. (Figure 2).
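The doubling rule described above can be sketched in a few lines of Python. This is only an idealized illustration: the function name and the `min_area_sq_mi` cutoff are assumptions made here, and the report does not state where the halving stops.

```python
def geometric_sampling_levels(mouth_area_sq_mi, min_area_sq_mi):
    """Return (drainage_area, site_count) pairs for an idealized geometric design.

    One site sits at the mouth of the study area; each time the drainage
    area is halved, the number of sites at that level doubles.
    min_area_sq_mi is a hypothetical cutoff, not a value from the report.
    """
    if mouth_area_sq_mi <= 0 or min_area_sq_mi <= 0:
        raise ValueError("areas must be positive")
    levels = []
    area = float(mouth_area_sq_mi)
    sites = 1
    while area >= min_area_sq_mi:
        levels.append((area, sites))
        area /= 2.0   # next level drains half the area
        sites *= 2    # and has twice as many sites
    return levels


# The 200 sq mi example from the text, halved down to an assumed 25 sq mi.
levels = geometric_sampling_levels(200, 25)
for area, sites in levels:
    print(f"{sites:>3} site(s) draining about {area:g} sq mi each")
total_sites = sum(sites for _, sites in levels)
print("total sites:", total_sites)
```

Under these assumptions the design has 1, 2, 4 and 8 sites at 200, 100, 50 and 25 sq mi, which shows how quickly the site count grows as smaller drainage areas are included.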

Figure 2. Theoretical geometric sampling regime for a given drainage area.

Although the location of sampling sites in a geometric sampling regime is related to stream order and drainage area, sampling is not limited to wadeable streams. Based on the drainage area at a given sampling site, the site can be placed into one of 5 strata (Table 1). Sampling protocol, especially for invertebrates and fish, varies for primary headwater, headwater, wading and boat sites (OEPA 1999).

Table 1. Sampling strata based on drainage area (Ohio EPA 2008).
Columns: Stream Order Strata | Drainage Area (sq mi) | Stream Category
Stream categories, in order: Primary headwater; Headwater; Headwater or Wading; Wading or Boat; Boat

The number of sampling sites for a given Ohio study area, say a drainage area of about 200 mi², is therefore significantly greater than the number of sample sites under the national probabilistic regime for the same area. In Ohio there are 25 such study areas, each with between 80 and 140 sites; with approximately 5 study areas monitored per year, that is about 1000 sites sampled every 2 years compared to the WSA sampling intensity of only 25 sites in all of Ohio in This disparity in sampling site density has important implications for data analysis and assessment and management of surface waters. In Ohio, data collected in a designated study area/watershed are applied only within that study area, rather than across watersheds as WSA data are. For the 11-digit hydrologic unit (HUC) study areas that Ohio uses, one may still argue that many watersheds are encompassed, but the data are confined at least within the same larger watershed. Watersheds in a single ecoregion, however, may not be within the same even very large watershed; for example, part of the NAP ecoregion drains into Lake Erie while part drains to the Ohio River (WSA 2005). Thus, there may be little similarity between sites, but data from those sites are combined to determine impairments and stream condition in the whole ecoregion.

The difference in sample density between state and federal regimes raises questions about how many samples are needed in water quality assessment. If some sites are dropped from Ohio's geometric sampling regime due to budget restrictions, at what level of deletion is information lost about the quality of streams and their impairments? Also, should these deletions occur at particular stream orders, or can sites be deleted randomly and still provide the same amount of information?
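The "about 1000 sites" comparison earlier in this passage can be checked from the quoted figures. The arithmetic below is only a rough reading, assuming that both the 5 study areas per year and the 80 to 140 sites per study area apply across a two-year window:

```latex
\[
5\ \tfrac{\text{study areas}}{\text{year}} \times 2\ \text{years} \times (80\ \text{to}\ 140)\ \tfrac{\text{sites}}{\text{study area}}
  = 800\ \text{to}\ 1400\ \text{sites}
\]
\[
\frac{\approx 1000\ \text{OEPA sites}}{25\ \text{WSA sites}} \approx 40
\]
```

On that reading, the state regime samples roughly 40 times as many Ohio sites as the WSA over a comparable period.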

This study will explore data from nine Ohio EPA watershed study areas and the WSA sites within Ohio. The first topic, on Ohio watersheds, focuses on the geometric sampling design followed in the state of Ohio, asking at what level of sampling density information is lost about the quality of streams and their impairments. The level of sampling is defined as both the density of sampling sites and the location of those sites within a study area (i.e., the stream order and abundance of sites on each tributary). This question has significant implications for watershed management and how, with limited resources, states can get the most accurate data with the least sampling effort. The second part of this project compares the OEPA state-wide geometric sampling regime to the US EPA national probabilistic sampling regime. Specifically, are the results from the WSA comparable to state-wide watershed geometric sampling, both in overall water quality metrics and in sources and causes of non-attainment?

Goals and Application of Water Monitoring Regimes

The Environmental Monitoring and Assessment Program (EMAP), established and developed by the US EPA, is a research program devised to develop methods to monitor and assess ecological resources. It advanced the science of ecological monitoring and risk assessment by incorporating sampling standards and inclusion of chemical, physical and biological data. The goals of the EMAP include estimating: 1) the current status and distribution of ecological resources, 2) which of these resources have been degraded, 3) changes in resource condition and 4) causes of the changes (Whittier and Paulsen 1992). This assessment program incorporates stressor indicators, ones that can cause changes in exposure to pollutants or adverse environmental conditions (usually exerted by humans on the landscape), and response indicators, those that serve as direct measures of biological condition for a given stream.
Although these goals also are addressed by OEPA geometric sampling, there are differences. The evaluation of ecological resources in both EMAP and OEPA includes measures of habitat disturbance, nutrient concentrations and changes in pH; the effects of these stressor indicators are assessed by both agencies using macroinvertebrate and fish community response indicators. Both EMAP and OEPA emphasize the use of biota as one of the most important indicators of water quality. The main difference is scale: since EMAP sampling in watersheds within each state is limited, the results from a single sampling point may be extrapolated to evaluate water quality and identify causes or sources of impairment for its entire watershed or even streams outside its watershed. The EMAP protocol provides general information on the status of streams on a regional basis, but it is very important for states to determine specific causes of impairment in order to provide plans for watershed management, such as TMDLs. Yoder and Barbour (2008) assert that the EMAP does not provide adequate information to meet individual states' requirements for water quality monitoring and assessment, specifically in terms of numeric biocriteria.

Ohio's biological monitoring program is one of the most progressive in the U.S. in that it uses numeric biocriteria, not narrative criteria. Narrative criteria are based on the presence or absence of particular biota in a given designated use water, rather than relative abundances of organisms identified to the lowest taxonomic level, as are used in numeric biocriteria. Numeric data can therefore provide more detailed information about types of impairment and are supported by a high level of statistical confidence due to specificity of the data. At the most rigorous level, numeric biocriteria define conditions for aquatic designated uses (US EPA 2011).

Ohio is located in US EPA Region V, along with Minnesota, Wisconsin, Michigan, Illinois, and Indiana. Although the six states have different water quality monitoring approaches, they predominantly use one of two sampling and monitoring designs: 1) a focus on rapid assessment of statewide water resources every five years or 2) a focus on intensive sampling within individual watersheds on a five-year rotation (Yoder 2004). The former method provides information about water across an entire state, but does not have the sampling density or detail of the latter method. Conversely, the high sampling density of the latter method provides very detailed information, but in localized watersheds rather than the entire state. Both methods offer specific benefits as well as drawbacks (Table 2).

Table 2. Summary of two main state sampling methods. Statewide has a larger spatial extent and lower density than does watershed-level sampling (Yoder 2004).
State Sampling Methods

Statewide
  Pros: overview of water quality across the entire state
  Cons: overlook site-specific problems; inaccurate extent of indicators; incomplete stressor gradients
Watershed
  Pros: monitor stressor gradients; address site-specific concerns
  Cons: time intensive; bias; lacks statistical confidence

The density of sampling conducted by a state is a function of the number of sites that can be sampled per unit effort and budget. Each of the states in Region V has a different amount of funding allocated to water quality monitoring and assessment (Table 3). This variation in available funding for water quality is an important factor in sampling regime design. For sampling regimes that emphasize a large quantity of sites over the entire state, rapid assessment methods are used. For sampling regimes that require high density of sites within a watershed, quantitative protocols are used (Yoder 2004).

Table 3. Funding for water quality in EPA Region V states.
State | Funding (millions $)
Ohio | 29.1
Indiana | 53.9
Illinois | 54.7
Michigan | 39.5
Minnesota | 36.7
Wisconsin | 34.9

One situation all states share is that they do not have enough resources (personnel, finances, technologies, and facilities) to meet the demands of monitoring and assessment (Yoder 2004). Monitoring and assessment are used to support data-driven, rather than issue-driven, water quality (WQ) management. Of all of the states in Region V, Ohio has the most complete and rigorous water quality monitoring and assessment protocol (Yoder 2004, Tables 4, 5). Ohio is also the only Region V state that uses numeric biocriteria in its assessment, yet Ohio budgets less for monitoring than the other five states. Ohio is able to implement numeric criteria because of how it selects sampling points within a watershed: one or two sampling sites may be located on a stream, whereas in other states the same reach of the stream may have upwards of 20 sites, yet the results are the same. Ohio efficiently allocates its personnel in order to conserve resources for biocriteria analysis (OEPA 2011).

Table 4. Summary of assessment in US EPA Region V states. For states with a rotational assessment design and no time frame specified, there is no specific rotation of sampling. Fixed-station: target one set of sampling sites in a watershed. Targeted-synoptic: repeat some sampling sites and randomly select others. Targeted-intensive: select sampling sites in areas with historical impairments.
Columns: State | Assessment design (Temporal) | Assessment design (Spatial) | Spatial sampling design (Targeted-synoptic, Fixed-station, Targeted-intensive, Probabilistic, Geometric) | HUC (digit)
Ohio | 5 year rotation | Intensive priority subbasins | x x x | 11 or 14
Indiana | 5 year rotation | statewide/5 years | x x x | 8
Illinois | 5 year rotation | statewide/5 years | x x x | 8
Michigan | Rotation | 80% wadeable | x x | 11
Minnesota | Rotation | statewide | x x x | 8
Wisconsin | 5 year rotation | Intensive priority subbasins | x x | 11

Table 5. Biocriteria and aquatic ecosystems sampled in US EPA Region V states.
Columns: State | Biological criteria (Fish, Macroinvertebrate) and aquatic ecosystem (Primary headwater, Headwater, Wadeable, Large River, Great River, Wetlands, Lakes, Great Lakes)
Ohio | x x x x x x x x x
Indiana | x x x x x
Illinois | x x x x
Michigan | x x x
Minnesota | x x x x x x
Wisconsin | x x x x

Water Quality Standards and Components of Water Quality Sampling

Use designations are employed as a measure of water quality throughout Region V, but with varying specificities in different states; some states use general aquatic life designated uses while other states have refined use designations. Use designations are quantified via use attainability analyses (UAA) in five of the six states in Region V, including Ohio. For Ohio, 5 broad categories are designated based on biocriteria (MEC 2008):
a) Exceptional warmwater habitat (EWH) - characterized by high species diversity, especially of pollutant-intolerant species
b) Warmwater habitat (WWH) - the most common restoration target
c) Modified warmwater habitat (MWH) - for streams with a degree of modification that has made WWH designations unattainable
d) Limited resource water (LRW) - for small streams that have undergone alterations which reduce or prevent aquatic life
e) Coldwater habitat (CWH) - inhabited by a particular suite of colder water organisms

EWH is the designation for the best quality warmwater habitat and LRW is the designation for the poorest warmwater habitat. CWH is a different category than warmwater and designates good water quality with little disturbance.

There are three broad categories of water quality sampling used to assess whether surface waters are meeting their designated use: physical, chemical, and biological. Physical components include stream morphology, flow, and substrate. Chemical components may include turbidity, total dissolved solids (TDS), nitrates, phosphates, pH, conductivity, dissolved oxygen, ammonia, metals, and organic enrichment (OEPA 2006). Biological components include macroinvertebrates, fish, and algal samples. This study is based on biological assessments of streams, using data from the following indices:
a) Index of Biotic Integrity (IBI) - A multimetric index used to quantify water quality based on twelve characteristics of the fish community. Values for this index range from 0-60, with higher values indicating greater fish diversity and better biotic integrity (US EPA).
b) Invertebrate Community Index (ICI) - A multimetric index used to quantify water quality based on ten characteristics of the macroinvertebrate community. Values for this index range from 0-60, with higher values indicating greater biotic integrity and better water quality.
c) Modified Index of Wellbeing (MiWB) - This index is based on fish mass and density. It factors out 13 pollutant-tolerant fish species so that polluted streams receive low scores. Low MiWB values indicate poorer water quality than do higher values. The maximum MiWB score is >9.4.
d) Qualitative Habitat Evaluation Index (QHEI) - This index quantifies the quality of physical characteristics of a stream and its watershed. Substrate, in-stream cover, stream channel morphology, riparian zone and bank erosion, pool/glide and riffle/run quality, and gradient are all included in this index. QHEI scores range from 0-100, with higher scores correlated with better water quality.

Study Design and Methodology

Study area selection criteria and data acquisition

The WSA study from was used for the probabilistic study, while nine study areas within Ohio were chosen to represent the geometric sampling regime. The selection criteria for Ohio watersheds included the size, dominant land use, average basin water quality, and geographic location of each watershed, as well as the availability of a recent OEPA report ( ; Appendix A). All Ohio EPA watershed data were acquired from Ohio EPA Division of Surface Water Biological and Water Quality reports of the 9 watershed study areas. From each study the following variables were compiled for each sample site: river name, designated use, river mile, drainage area, IBI, MiWB, ICI, QHEI, attainment status, causes, sources, and coordinates. The reports also provided summaries of land use within each watershed. Information about the national WSA was acquired through the US EPA Wadeable Streams Assessment from field data was collected in The stressors and causes of impairment in the WSA are general per region, not site-specific as they are in the Ohio EPA Biological and Water Quality studies. OEPA reports from the years 1999 to 2010 were examined to narrow the study areas to watersheds with drainage areas of approximately mi², the average size of Ohio study areas.
From these, three watersheds were chosen from each of three categories based on predominant land use: Disturbed Urban (DU), Disturbed Agriculture (DA), and Little Disturbance (LD). Overall water quality differed within each category: DU water quality ranged from fair to good, DA water quality ranged from poor to good, and LD water quality ranged from fair to excellent (Figure 3).
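The per-site variables compiled from the OEPA reports could be held in a simple record type. The sketch below is a hypothetical layout, not the structure used in the study: the class name, field names, optional fields and example values are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SiteRecord:
    """One sampling site, with the variables listed in the text (hypothetical layout)."""
    river_name: str
    designated_use: str                    # e.g. "WWH" or "EWH"
    river_mile: float
    drainage_area_sq_mi: float
    ibi: Optional[float] = None            # assumed optional: not every site has every index
    miwb: Optional[float] = None
    ici: Optional[float] = None
    qhei: Optional[float] = None
    attainment_status: Optional[str] = None
    causes: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)
    coordinates: Optional[Tuple[float, float]] = None  # (latitude, longitude)


def mean_score(records, attribute):
    """Mean of one index over the sites that report it, or None if no site does."""
    values = [getattr(r, attribute) for r in records if getattr(r, attribute) is not None]
    if not values:
        return None
    return sum(values) / len(values)


# Made-up example sites, not data from the report.
sites = [
    SiteRecord("Example Creek", "WWH", 12.3, 45.0, ibi=40, qhei=62.5),
    SiteRecord("Example Creek", "WWH", 3.1, 88.0, ibi=36, qhei=70.0),
    SiteRecord("Example Run", "EWH", 0.8, 9.5, qhei=81.0),
]
mean_ibi = mean_score(sites, "ibi")
print("mean IBI over sites that report it:", mean_ibi)
```

Keeping the indices optional makes it explicit which sites contribute to each summary statistic.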

Figure 3. Average water quality of each of the study areas. (Rating scale: 4 Excellent, 3 Good, 2 Fair, 1 Poor. Study areas: 1-Kokosing, 2-Wakatomika, 3-Muskingum, 4-Wabash, 5-Stillwater, 6-Portage/Maumee, 7-Olentangy, 8-Chagrin, 9-Mad.)

The 9 selected study areas were also distributed as evenly as possible throughout Ohio to account for spatial or geographic variability, specifically the topography within watersheds (Figure 3). Each of the study areas was also approximately equal in drainage area, ranging from 153 to 674 square miles (Table 6). Land uses were designated based on the highest percentages of the land use of interest among the possible watersheds. In DU watersheds, the average percentage of urban land use was 16.5%, compared to only 5.2% urban in DA and 6.3% in LD. In DA watersheds, the average percentage of agricultural land use was 84.7%, versus 47.7% in DU and 45.3% in LD. Finally, in LD watersheds, the percentage of forest was 44.9%, compared to only 10.9% in DU and 6.0% in DA watersheds. For each Ohio study area, the number of sampling sites was determined and the location of each site was mapped using ArcMap.

Table 6. Characteristics of little disturbance (LD), disturbed agriculture (DA), and disturbed urban (DU) watersheds.
Watershed Study Area | Overall Water Quality | Geographic Location
Kokosing (LD) | Excellent | Central-East
Wakatomika (LD) | Good | Central-East
Muskingum (LD) | Fair | Southeast
Wabash (DA) | Poor | West
Stillwater (DA) | Good | West
Portage/Maumee (DA) | Poor | Northwest
Olentangy (DU) | Good | Central
Chagrin (DU) | Fair | Northeast
Mad (DU) | Good | Southwest
Other columns: Agriculture/Pasture (%) Urban (%) 10.6 <1 8.3 < Forested (%) < 10 9 Drainage Area (sq mi) 11 or 12 Digit HUCs , 020, 030, , , 0802, 0803, 0804, , 020, 030, ,02,03,04,05, , 100, 110, , , 160, 170,

Statistical Analysis

All statistical analyses were conducted using the base stats package in the R programming language (R Development Core Team, 2009) and SAS JMP 9.0 software.

Ohio geometric sampling

Two questions were explored relating to the number of sites sampled by Ohio EPA: 1) whether these sampling sites could be located randomly within a watershed, independent of stream order, and 2) optimally, how many sites should be sampled per study area. For each of the stream order strata (Table 1) in each watershed, the sample mean, adjusted sample mean, standard deviation (SD) and variance of the IBI and QHEI scores from the sampling sites were calculated. ANOVA and multiple comparisons tests (α = 0.05) of IBI and QHEI scores were conducted to determine whether the mean scores of the strata within each study watershed differed. Differences in mean IBI and QHEI scores among the strata would suggest that sampling points should be allocated to specific stream order strata.

To determine an ideal number of sites, given no limitation on time or money, an optimal number of sampling sites was calculated for each watershed. This proposed optimal number was defined as the number of sampling sites needed to consistently obtain scores within a margin of error (ME) of ±4% of current IBI and QHEI scores. A proposed optimal number was calculated for each watershed using the equation:

\[ ME = z_{0.025} \, \frac{\sigma}{\sqrt{n}} \]

where
ME = margin of error = ±4
z = z score where α = 0.05 (1.96)
σ = standard deviation of IBI or QHEI scores in the watershed
n = sample size in the watershed

Optimal allocation of sampling sites to strata depended on two factors: the total number of sites sampled and the variability of scores within each stratum. These variables were incorporated into the following equation to determine the optimal sample size for each stratum in a watershed.
This equation was used because it takes into account the current number of sampling sites in each stratum and incorporates those values into the optimal number of sampling sites.

$$N = \frac{n_k \sigma_k}{\sum_{k=1}^{5} n_k \sigma_k}, \qquad \sum_{k=1}^{5} n_k \sigma_k = n_1 \sigma_1 + n_2 \sigma_2 + \dots + n_5 \sigma_5$$

N = optimal sample size for a given stratum
n = number of current samples within a stratum
σ = standard deviation of a given stratum
k = stream order strata

After the optimal number of sampling sites in each stratum was determined, the allocation of sampling sites within and between watershed land use categories (LD, DA and DU) was compared using the program SAS JMP 9. The number of projected optimal sampling sites based on IBI scores was regressed against the number based on QHEI scores to determine whether both indices were required to determine sampling intensity and allocation or if one of the two would suffice. An ANOVA was used to determine whether the number of optimal sites differed among land use categories and among watersheds within the same land use category.

Comparison of Ohio EPA and Wadeable Streams Assessment sampling

The WSA reports the quality of streams using good, fair or poor ratings for the ICI and IBI, rather than reporting the numerical score. Therefore, to determine if the results from WSA and OEPA sampling differ, WSA results for Ohio ecoregions were compared to results from OEPA using the percentages of streams that received good, fair or poor IBI and ICI scores. For OEPA data, IBI scores were converted from numeric values to categories (Table 7) and condensed into three categories to mimic the WSA ratings. Sites rated as exceptional by OEPA were included in the good category and sites rated as very poor by OEPA were included with poor. The range of the fair category was expanded to include scores of 27 and 35, since those are not explicitly categorized by OEPA. The percentages of WSA good, fair and poor streams were compared to the percentages of OEPA good, fair and poor streams to determine if there was a difference between the two sampling methods.
The TPL, SAP and NAP ecoregions encompass an area much greater than the part that lies within Ohio. To account for this, the average IBI and ICI ratings were calculated for the entire ecoregion as well as for the WSA sites that fell within each ecoregion in Ohio (Average OH).
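The condensing of OEPA ratings into the three WSA categories described above can be sketched in Python. This is a minimal illustration, not the procedure actually run for this study: the function name `rating_percentages` and the example site ratings are hypothetical, and the numeric score thresholds of Table 7 are deliberately not encoded.

```python
from collections import Counter

# Mapping described in the text: exceptional is merged into good,
# and very poor is merged into poor.
COLLAPSE = {
    "exceptional": "good",
    "good": "good",
    "fair": "fair",
    "poor": "poor",
    "very poor": "poor",
}


def rating_percentages(ratings):
    """Return the percentage of sites rated good, fair and poor
    after collapsing the five OEPA ratings into three."""
    counts = Counter(COLLAPSE[r.lower()] for r in ratings)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no ratings supplied")
    return {cat: 100.0 * counts[cat] / total for cat in ("good", "fair", "poor")}


# Hypothetical site ratings for one watershed (not data from this study).
example = ["Exceptional", "Good", "Good", "Fair",
           "Poor", "Very Poor", "Fair", "Good"]
print(rating_percentages(example))
```

Percentages computed this way for each watershed could then be averaged by ecoregion and set beside the WSA percentages.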

Table 7. IBI score ranges and rankings for streams used by OEPA and US EPA. Source: Ohio EPA

Very Poor Poor Fair Good Exceptional
OH EPA <
WSA < >35

The locations of WSA sites within the Ohio watersheds studied here were determined based on their latitude/longitude. A comparison of OEPA and WSA sampling site locations was made, as was a qualitative comparison of causes of impairment for surface waters within the state of Ohio.

Results

Ohio geometric sampling

Due to these differences in variability among IBI and QHEI scores, ANOVA showed that the proposed optimal number of sampling sites differed among both land use categories (Figure 4a) and watersheds within each category (p_QHEI < , p_IBI < ). Overall, for all three sampling site numbers (current, proposed IBI, and proposed QHEI), disturbed agricultural watersheds required the greatest sampling effort and disturbed urban areas required the least (Figure 4). The number of proposed sampling sites ranged from in DA watersheds, in LD and in DU. There was no consistent trend in whether more sampling sites were proposed by IBI or QHEI scores; DU watersheds required more sampling sites based on IBI scores whereas in DA and LD watersheds, more sampling sites were required based on QHEI scores.

Figure 4a. Average distribution of all sampling sites for Ohio watersheds, classified by land use. (DA: disturbed agriculture, LD: little disturbance, DU: disturbed urban)

The proposed number of sampling sites in the study watersheds, based on both IBI and QHEI scores, was greater than the number of sites currently sampled in six of the nine watersheds. Only in the Kokosing, Olentangy and Chagrin watersheds was the projection lower, and then only for one of the two scores, not both IBI and QHEI. Based on IBI scores, the Wabash watershed (DA) required the densest sampling, with 221 proposed sites, and the Kokosing (LD) watershed required the least dense sampling, with 50 proposed sites. Based on QHEI scores, the Wabash watershed again required the densest sampling while the Olentangy (DU) required the fewest sites.
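As a rough illustration of how proposed site counts of this kind arise, the margin-of-error relation from the Statistical Analysis section, ME = z σ/√n, can be solved for n to give n = (z σ/ME)². The Python sketch below assumes ME = 4 in score units and z = 1.96, as in the equation's definitions. The function name `required_sites` and the standard deviations are hypothetical and are not values from this study.

```python
import math

Z_ALPHA_05 = 1.96  # z score for alpha = 0.05, as stated in the methods


def required_sites(sd, margin=4.0, z=Z_ALPHA_05):
    """Smallest whole number of sites n such that z * sd / sqrt(n) <= margin.

    Obtained by rearranging ME = z * sd / sqrt(n) to n = (z * sd / ME)^2
    and rounding up.
    """
    if sd < 0 or margin <= 0:
        raise ValueError("sd must be non-negative and margin positive")
    return math.ceil((z * sd / margin) ** 2)


# Hypothetical standard deviations of index scores within a watershed.
for sd in (8.0, 12.0, 16.0):
    print(f"SD = {sd:4.1f} -> proposed sites = {required_sites(sd)}")
```

Because n grows with the square of σ, doubling the variability of scores in a watershed roughly quadruples the proposed number of sites, which is one reason watersheds with highly variable scores call for much denser sampling.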

Figure 4b. Current and optimal proposed number of sampling sites based on both current IBI and QHEI scores for 9 Ohio watersheds

Overall, the number of proposed sampling sites based on current IBI and QHEI scores differed among land use categories and for watersheds within each land use category. For all watersheds and both IBI and QHEI, the proposed number of sampling sites was statistically higher than the current number of sites (p_IBI = 0.01, p_QHEI < 0.01). The proposed number of sites was higher than the current number sampled in the least developed and agricultural land use categories for both IBI and QHEI (LD: p_IBI < 0.05, p_QHEI < 0.01; DA: p_IBI < 0.01, p_QHEI < 0.01). For urban watersheds, the proposed number of sites based on QHEI scores also differed (p=0.004) while the proposed number based on IBI scores did not differ from the current number of sites.

There was no significant difference between the number of sites proposed by IBI and QHEI scores when all watersheds were compared (p=0.9325); in fact, these numbers were strongly positively related (r² = 0.76, p=0.001; Figure 5). There was also no significant difference when the number of proposed sites predicted by IBI scores was compared to the number predicted by QHEI scores within land use categories. The strongest relationship between the number of sites proposed by the IBI and QHEI was in disturbed agricultural watersheds (r² = 0.93, p=0.0001; Figure 5).

Figure 5. Relationship between proposed number of IBI and QHEI sites for all watersheds. Blue symbols are LD, green symbols are DA and red symbols are DU.

Although the number of proposed sampling sites differed for each land use category, the trend in how many were allocated per stream order stratum was similar among all watersheds and all land use categories. Very few or no sampling sites were required in first order streams (stratum 1). Second, fourth, and fifth order streams required approximately equal numbers of sampling sites (p=0.75). Third order streams required the most sampling sites (Figure 6).

Figure 6. Allocation of proposed number of sites based on IBI by strata (stream order) for all watersheds. The horizontal line is a reference line which indicates the average number of current sampling sites in a study area.

Although the numbers of proposed sampling sites for each stratum based on QHEI versus IBI data were different, the distribution of both was similar. Few sites are proposed on first and fifth order streams, with a majority of sampling to be conducted on third order streams (stratum 3). The overall number of sampling sites for each watershed category varied greatly, with watersheds in disturbed agricultural areas requiring a greater sampling density than watersheds in little disturbed or disturbed urban areas (Figure 7).

Figure 7. Allocation of proposed number of sites based on QHEI by strata (stream order) for all watersheds. The horizontal line is a reference line which indicates the average number of current sampling sites in a study area.

Based on the proposed number of IBI sampling sites, the average allocation of sites within each stratum for a given watershed was 0.90% in first order, 19.5% in second order, 47.2% in third order, 18.2% in fourth order and 14.2% in fifth order streams. Based on the proposed number of QHEI sampling sites, the average allocation of sites within each stratum was 0.44% in first order, 16.02% in second order, 44.11% in third order, 18.33% in fourth order and 20.6% in fifth order streams. These values are not significantly different from the proposed allocation for IBI (α=0.05, p=0.9577, Table 8).

Table 8. Percentage allocation of proposed sampling sites for both IBI and QHEI

Stream order strata   IBI allocation (%)   QHEI allocation (%)
1                     0.90                 0.44
2                     19.5                 16.02
3                     47.2                 44.11
4                     18.2                 18.33
5                     14.2                 20.6
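The stratum allocation equation in the Statistical Analysis section weights each stream order by n_k σ_k, its current number of sites times the standard deviation of its scores. The Python sketch below shows one way such shares could be computed. As transcribed, the equation yields a proportion for each stratum, so multiplying it by a watershed total (the `allocate` helper) is an assumption of this sketch. All names and input values are hypothetical.

```python
def stratum_shares(current_sites, std_devs):
    """Share of sampling effort per stratum: n_k * sd_k / sum(n_j * sd_j)."""
    if len(current_sites) != len(std_devs):
        raise ValueError("one standard deviation is needed per stratum")
    weights = [n * s for n, s in zip(current_sites, std_devs)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one stratum needs sites and nonzero SD")
    return [w / total for w in weights]


def allocate(total_sites, shares):
    """ASSUMPTION: scale the shares by a watershed-wide total and round."""
    return [round(total_sites * p) for p in shares]


# Hypothetical inputs for stream orders 1 to 5 (not data from this study).
current = [2, 10, 20, 10, 8]       # sites currently sampled per stratum
sds = [0.0, 8.0, 10.0, 8.0, 5.0]   # SD of index scores per stratum

shares = stratum_shares(current, sds)
print([round(100 * p, 1) for p in shares])  # percentage allocation
print(allocate(100, shares))                # sites out of a total of 100
```

A stratum with no variability in its scores receives no additional sites under this weighting, which matches the pattern of very few proposed sites on first order streams.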

Wadeable streams assessment

The Wadeable Streams Assessment (WSA) sampled 25 sites within the state of Ohio, 6 of which were located in the nine watersheds used in this study. The Kokosing and Chagrin contained two WSA sampling sites each while the Mad and Wakatomika contained one each (Figure 8).

Figure 8. Distribution of WSA sampling sites within Ohio

The US EPA quantified percentages of streams with good, fair, and poor IBI and ICI scores in each of the three ecoregions (Table 9). An ANOVA of these percentages compared to average percentages of WSA streams in Ohio that the OEPA had rated good, fair and poor showed no significant variation (Table 9, Figure 9).

Table 9. Percentage of streams within Ohio ecoregions with good, fair and poor IBI and ICI scores from WSA and calculated from OEPA data. TPL: Temperate Plains, SAP: Southern Appalachian Plains, NAP: Northern Appalachian Plains. Avg. OH TPL: average for Maumee, Mad, Stillwater, Wabash; Avg. OH SAP: average for Kokosing, Wakatomika, Muskingum; Avg. OH NAP: average for Olentangy and Chagrin.

Average US EPA TPL   Average OH TPL   Average US EPA SAP   Average OH SAP   Average US EPA NAP   Average OH NAP
IBI- Good %
IBI- Fair %
IBI- Poor %
ICI- Good %
ICI- Fair %
ICI- Poor %

The regressions between the WSA ecoregions and OEPA were not significant; however, there were trends between the highest and lowest IBI and ICI percentages within each ecoregion. Based on Pearson product-moment correlation, there is a moderate positive correlation between WSA and OEPA percentages for IBI (r² TPL = 0.56, r² SAP = 0.87, r² NAP = 0.98, Figure 9a). Conversely, for ICI there is a strong negative correlation between WSA and OEPA percentages in all ecoregions (r² TPL = 0.88, r² SAP = 0.733, r² NAP = 0.61, Figures 9b, 10).

Figure 9 (a, b). Relationship between IBI and ICI ratings for entire ecoregions and portions of ecoregions within Ohio
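Because r² is never negative, the direction of each relationship reported above has to be read from the sign of the correlation coefficient r itself. The Python sketch below is a minimal illustration of that distinction. The percentages are hypothetical and are not values from Table 9, and `pearson_r` is an illustrative helper, not the routine used in R or JMP.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical percentages of streams rated good, fair and poor.
wsa = [20.0, 30.0, 50.0]
oepa_similar = [25.0, 30.0, 45.0]   # same ordering as WSA
oepa_reversed = [45.0, 30.0, 25.0]  # opposite ordering

for label, oepa in (("similar", oepa_similar), ("reversed", oepa_reversed)):
    r = pearson_r(wsa, oepa)
    print(f"{label}: r = {r:+.3f}, r^2 = {r * r:.3f}")
```

Both comparisons can give a large r², yet only the sign of r shows whether the two agencies rank the categories in the same or the opposite order.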

Figure 10. WSA and OEPA percentages of good, fair and poor IBI and ICI scores in 9 Ohio watersheds and 3 ecoregions. Panels: Southern Appalachian Plains (SAP), Northern Appalachian Plains (NAP), Temperate Plains (TPL).

Figure 11. Relationship between IBI and ICI ratings for WSA and OEPA assessment

There was no difference between the scores for the entire ecoregions and the portions within Ohio. The WSA results for IBI and ICI in the TPL ecoregion did not differ significantly from the scores of the four OEPA watersheds located in the TPL ecoregion, the Wabash, Stillwater, Portage/Maumee and Mad River watersheds (p=0.9746). Similarly, the WSA results for IBI and ICI scores in the SAP ecoregion did not differ significantly from the IBI and ICI scores of the Kokosing, Wakatomika and Muskingum watersheds, which are located in the SAP ecoregion (p=0.989). The WSA results for IBI and ICI scores for the NAP ecoregion and the IBI and ICI scores for the Chagrin and Olentangy did not differ significantly (α=0.05, p=0.5255). There is a weak positive relationship between WSA and OEPA IBI and ICI ratings (r² = 0.053, Figure 11).

The causes of impairment identified by both the WSA and the Ohio EPA are similar, but the OEPA identifies more causes of impairment than does the WSA (Table 10). The OH EPA identified 21 causes of impairment while the WSA identified five. When causes of impairment were ranked by order of importance based on the frequency at which they occurred in Ohio watersheds, the three most important causes were not monitored by US EPA (Table 10). Only one of Ohio's five most important causes of impairment was recorded by US EPA.

Table 10. Causes of impairment identified by OEPA and US EPA. The numbers indicate the frequency with which the impairments appear, and YES indicates that the US EPA also identified the given cause of impairment.

Causes                 Ohio EPA   US EPA
Siltation              60
Organic enrichment     55
Habitat alteration     29
Nutrients/P            29         YES
Hydromodification      25
Sedimentation          16         YES
Flow alteration        11
Low DO                 11
TDS                    11
Toxics                 11
Ammonia/Nitrate        10         YES
Metals                 10
Conductivity           8          YES
Bacteria               7
Channelization         6
Eutrophication         6
Impoundment            4
Thermal modification   3
Riparian Removal       2          YES
Low Flow               1
Organics               0

Discussion and Implications

The level of sampling density required to assure accurate IBI and QHEI data varied within each land use category. Overall, disturbed agricultural watersheds required the largest number of samples (densest sampling) and disturbed urban watersheds required the least.

Disturbed agricultural watersheds

DA watersheds require more sampling for a multitude of reasons, possibly attributable to best management practices (BMPs), topography of the watershed and proximity of livestock and cropland to streams. Water quality has been impaired in agricultural areas by reduced flow, increased variability of sedimentation due to storage dams, fertilizer use, manure application, livestock in or very near streams, pesticides/herbicides, and reduced riparian habitat. These agricultural impacts occur at a very fine scale, since each may occur at the level of individual farms or fields. At such a fine scale, dense water quality sampling identifies more impacts of agriculture on a given watershed than does less dense sampling. The results reported from less dense sampling may lead to inaccurate assessment of agricultural impact on a watershed (Smith et al. 2010).

The adoption of agricultural BMPs addressed several of the identified impairments. Agricultural BMPs include riparian setbacks/buffer strips, precision agriculture, conservation tillage, wetlands and vegetated swales. Wetlands are one of the most effective treatment methods for agricultural runoff, improving water quality by removing nutrients prior to water entering streams (Day et al. 2003). Despite incentives from USDA, however, BMPs are not mandated by OEPA and, therefore, are not implemented across all watersheds. As the number of BMPs within a watershed increases, so does their overall impact on water quality. As stream conditions improve or become less variable, fewer sampling sites are required.
One example of where BMPs might help is nitrogen and phosphorus enrichment from fertilizer, which was identified by both OEPA and WSA as one of the top five causes of impairment. Because both US EPA and OEPA recognize a national regulation for the amount of fertilizer use, fertilizer application and waste (manure) treatment could be managed to reduce N and P concentrations in streams. Generally, management strategies for agricultural watersheds include a hybridization of ecological engineering and restoration to simulate heterogeneity of the environment via nutrient cycling, species diversity, and colonization.

The management of agricultural watersheds comprises several different components, each addressing a different cause of impairment. Nitrogen losses through runoff can be minimized through a multi-tiered approach of reducing nitrate fertilizers, managing manure, and applying appropriate quantities of nitrogen based on soil chemistry and crop selection. Alternating crops on fields from corn-soybean to alfalfa-grass crops can also reduce nitrogen losses (Day et al. 2003). Riparian buffers and wetlands offer two benefits: denitrification and nitrogen uptake by plants. For confined animal feeding operations (CAFOs) and other livestock agriculture, animal waste slurry shows very high biological oxygen demand and contains high levels of nitrogen, phosphorus, minerals and bacteria. Anaerobic irrigation is one method of managing waste by decomposing organic material in the waste slurry and spraying the remaining slurry on fields as fertilizer (Cheng 2003). A comprehensive waste management plan for CAFOs mandated by US EPA and USDA regulates feed and manure management as well as manure application; however, enforcement of these regulations is lacking. Close inspection for compliance and enforcement of penalties for permit violations would help reduce the impact CAFOs can have on a watershed. Cost effectiveness is the biggest obstacle facing treatment of animal waste slurry.

Currently more sites are sampled in agricultural areas than in other land use categories. However, the results of this study show that the current sampling used by OEPA may not be sufficient to obtain reproducible water quality data. Of all of the states in US EPA Region 5, Ohio has the most comprehensive stream sampling protocol; this has implications for the methods that other states are using. If Ohio is not sampling sufficient sites to obtain reproducible data, states that sample fewer sites in watersheds are not likely to be collecting reproducible and precise data.
In addition to the density of sampling within Region 5 states, the types of variables and data collected also differ. In order to collect reproducible and accurate data, identical sampling protocols should be used by all states. Qualitative and numeric biological indicators and indices (IBI, ICI) should continue to be used because aquatic organisms often show signs of stress before there are noticeable changes in physical and chemical components of a habitat.

Disturbed urban watersheds

Disturbed urban watersheds required the least sampling to obtain reproducible data, probably for several reasons. Large urban areas are usually either on larger streams or rivers, or the small streams have been obliterated; therefore, stream conditions do not vary as much within urban areas as in other land use categories. Stream conditions are often of middle quality, poorer than less developed