EPRI COMMENTS ON PROPOSED HAPS MACT RULE


U.S. Environmental Protection Agency
National Emissions Standards for Hazardous Air Pollutants from Coal and Oil-Fired Electric Utility Steam Generating Units and Standards of Performance for Fossil-Fuel-Fired Electric Utility, Industrial-Commercial-Institutional, and Small Industrial-Commercial-Institutional Steam Generating Units
Docket ID: EPA-HQ-OAR
40 CFR Parts 60 and 63
(Federal Register, Volume 76, Number 85, Tuesday, May 3, 2011; [EPA-HQ-OAR; EPA-HQ-OAR, FRL]; RIN 2060-AP52)

Submitted by:
ELECTRIC POWER RESEARCH INSTITUTE
3420 Hillview Avenue
Palo Alto, CA
August 4, 2011

For more information, please contact: Paul Chu


CONTRIBUTORS

Contributors to this document include: S. Campleman, R. Chang, P. Chu, C. Dene, N. Goodman, E. Knipping, B. Nott, G. Offen, A. ter Schure, N. Shick, R. Wyzga


EXECUTIVE SUMMARY

The Electric Power Research Institute (EPRI) is providing comments to the U.S. Environmental Protection Agency (EPA) on the National Emission Standards for Hazardous Air Pollutants from Coal and Oil-Fired Electric Utility Steam Generating Units and Standards of Performance for Fossil-Fuel-Fired Electric Utility, Industrial-Commercial-Institutional, and Small Industrial-Commercial-Institutional Steam Generating Units (referred to here as the "proposed MACT rule") and on supporting information in docket EPA-HQ-OAR. The comments are divided into two sections: the first relating to the proposed maximum achievable control technology (MACT) limits for hazardous air pollutants (HAPs) and the underlying information on power plant fuels, emissions and controls; and the second relating to the environmental fate and transport, human health effects, and risk analysis of the HAPs.

EPRI was established in 1973 as an independent, nonprofit center for public interest energy and environmental research. EPRI brings together member organizations, the Institute's scientists and engineers, and other leading experts to work collaboratively on solutions to the challenges of electric power. These solutions span nearly every area of power generation, delivery, and use, including health, safety, and environment. EPRI's members represent over 90% of the electricity generated in the United States. International participation represents nearly 15% of EPRI's total R&D program.

EPRI has been active in conducting stack tests on fossil fuel-fired power plants and evaluating the results since In 2010, EPRI conducted data quality reviews of power plant stack monitoring data submitted to EPA in response to the Information Collection Request (ICR) for National Emission Standards for Hazardous Air Pollutants (NESHAP) From Coal- and Oil-Fired Electric Utility Steam Generating Units (the HAPs ICR).
Over 250 reviews were performed on ICR stack tests from coal-, oil-, and petroleum coke-fired electricity generating units, representing about half of all the units required to test. As a result of these reviews, EPRI has significant data and experience as a basis for constructive comment on the ICR data as they have been used to develop MACT limits. EPRI conducted as thorough a review of the proposed rule, the underlying data, and technical support documents as the comment period allowed.

Key Comment on Proposed MACT Limits; Fuels, Emissions, and Controls

Significant errors in the underlying emissions data complicate review of the proposed rule.

The most significant challenge that EPRI faced in reviewing the proposed rule resulted from errors in the underlying emissions data and discrepancies in the MACT floor calculations that were not addressed by EPA's revision of the coal mercury MACT limits on May 18, 2011. These problems make it very difficult to comment on the substance of the proposal. For example, mercury measurement conversion errors led EPA to calculate a MACT limit for mercury emitted from coal-fired electric utility steam generating units (EGUs) rated greater than or equal to 8,300 British thermal units per pound (Btu/lb) based on emissions from only 40 EGUs. Due to measurement conversion errors that resulted in emissions 1,000 times too low, EPA concluded that the ICR Part II mercury data had significantly lower concentrations than the Part III data.

Therefore, EPA selected many of the best performing EGUs from the Part II tests. However, when the errors were corrected, most of the Part II EGUs were no longer among the lowest 40 in the combined data set. Instead, Part III contained most of the best performing EGUs. While EPA corrected the initial data conversion errors, the Agency did not adequately address other parts of the rule that were impacted by the original error, as shown by this example.

EPRI has identified many other errors in the data used to calculate MACT limits, as well as inconsistencies in the MACT calculation procedures. These issues, if corrected, are likely to result in changes to the proposed limits. Thus, EPRI requests that the Institute and all stakeholders be given an additional opportunity to properly review and prepare comments on the revised rule, after these errors are properly addressed by the Agency.

EPRI's detailed comments are presented in Sections 2 and 3. In the following, we summarize specific technical points and recommendations that we offer to support EPA and all stakeholders in this rulemaking.

Specific Comments on Proposed MACT Limits; Fuels, Emissions, and Controls

Specific comments on the proposed MACT Limits; Fuels, Emissions, and Controls are presented below. The more detailed comments are in Section 2 and in Appendices A through D.

Technical shortcomings and inconsistencies in EPA's MACT floor calculation procedure should be resolved prior to issuance of a final rule.

Many of the proposed MACT limits for new generating units are below the measurement capabilities of the test methods used in the ICR and the continuous emission monitors that are proposed for compliance monitoring. The principal reason is that EPA's procedure for determining the MACT floor for new units selected outlier and erroneous emissions values, and did not take into account method performance with actual stack gas samples.
The new unit limits adopted by EPA do not account for emissions variability due to changes in operating conditions that can be expected over the life of a unit. New unit limits should not be based on flue gas measurements that are below the detection limit, as those values do not account for measurement imprecision.

No coal-fired EGU tested in the ICR would likely meet the new unit MACT limits for all three regulated HAPs: total particulate matter, mercury, and hydrogen chloride (or the alternative acid gas surrogate, sulfur dioxide). The new unit limits are very challenging to achieve, as few EGUs have multiple ICR measurements that are consistently below the proposed new unit limits.

The use of the lowest test series average introduces biases, and EPA should use the average of all ICR data for setting the HAPs standards for both new and existing EGUs.

EPRI's evaluation of the ICR test data supports EPA's conclusion that organic HAPs (including dioxin/furan/polychlorinated biphenyl, and non-dioxin organics) are predominantly below detection limits.

The proposed use of total particulate matter (TPM) as a surrogate parameter for HAPs metals is problematic due to shortcomings in the test methods for the two components of TPM:

filterable and condensable particulate matter (PM). Particulate monitors measure only the filterable portion of TPM, and the manual stack test methods for both filterable and condensable PM are of uncertain accuracy. In addition, EPA's rationale for including condensable PM in the MACT limit to represent selenium is not supported, as the available ICR data indicate that condensable PM emissions typically do not correlate with selenium emissions.

Emissions of mercury from coal-fired EGUs differ significantly between fluidized bed combustion (FBC) units and conventional pulverized coal combustion (PC) units. These differences are associated with fundamentally different combustion technologies.

EPA's decision to calculate the MACT limit for mercury from coal-fired units greater than or equal to 8,300 Btu/lb based on only 40 EGUs was an artifact of the mercury errors that EPA later corrected in the supporting documentation on May 18, 2011. The appropriate procedure is to calculate the MACT floor based on 12% of all units in this category, as EPA did for the HAPs limits for hydrochloric acid and PM. Thus, the number of EGUs in the mercury MACT floor pool should be 127 rather than 40.

Additional data are required to evaluate the use of dry sorbent injection as a control for removing hydrochloric acid (HCl) and hydrofluoric acid (HF). Based on the limited available data, there are concerns about whether EGUs firing medium- to high-chloride coals can achieve the HCl standard using dry sorbent injection, and whether there would be impacts to balance-of-plant operations.

About 12% of the coal-fired facilities that submitted HAPs data to EPA in response to the ICR may qualify as area sources.

The ICR did not require EGUs to test over the full range of operating conditions, and therefore the ICR data do not represent the entire range of emissions variability from power plants.
Additional measurements are needed to adequately characterize the variability of HAPs and surrogate emissions during normal plant operations. Sources of emissions variability include fuels burned, startup and shutdown conditions, partial load operation, and other reasonably foreseeable changes to operating conditions. Limited measurements at one facility indicated that trace metal variability was comparable to the variability of filterable PM measurements.

Distillate (No. 2) oil and residual (No. 6) oil differ significantly in their emissions of trace metals.

Particulate-phase metals correlate with filterable PM (as well as total PM) emissions for oil-fired units.

Based on recently published research, EPA's assumption that 65% of nickel emissions from liquid oil-fired EGUs is in the form of insoluble, crystalline nickel with carcinogenicity equal to nickel subsulfide is overly conservative. Recent measurements determined that nickel emissions from residual oil combustion were primarily soluble nickel sulfate, with lesser amounts of nickel/magnesium oxide and nickel ferrite. The concentration of nickel sulfide compounds emitted is below the detection limit (± 3% of total nickel) of the methods used. No regulatory agency, including EPA, provides a cancer unit risk for nickel sulfate or soluble nickel.

Fuel oil metals analysis methods are not sensitive enough to support compliance monitoring for limited-use oil EGUs.

The Electronic Reporting Tool (ERT) has severe limitations for reporting compliance test results and needs extensive revision before it is used for that purpose.

Key Comment on Environmental Fate and Transport, Exposure and Human Health Issues, and Risk Analyses

EPA has failed to conduct scientifically rigorous analyses of the health risks associated with exposure to mercury, other hazardous air pollutants and particulate matter by not documenting its assumptions, using substandard methods, and excluding the highest quality/most recently available information.

As in the case of Proposed MACT Limits and Fuels, Emissions and Controls, EPRI's ability to comment was challenged by the lack of documentation for data and underlying assumptions in EPA's analyses of the fate, transport, and risk associated with exposure to mercury and other HAPs. Further, the Agency's methods, tools, and input assumptions lacked scientific quality in addressing such an important societal issue. As the risk analyses provide the basis for requiring controls on HAPs emissions and the benefits derived from those controls, it is of critical importance that the most scientifically defensible methods and data be used. For example, as pointed out below, the Community Multiscale Air Quality (CMAQ) and Mercury Maps (MMaps) models have limited capabilities to adequately determine mercury deposition and bioaccumulation in fish. EPA's risk assessment methodology contains numerous errors leading to overestimates. Despite these flaws and errors, the primary health benefit is stated to be fine particulate matter (PM2.5) concentration reductions in areas which already meet the PM2.5 standard.

Specific Comments on the Environmental Fate and Transport, Exposure and Human Health Issues, and Risk Analyses

Specific comments on Environmental Fate and Transport, Exposure and Human Health Issues, and Risk Analyses are presented below. The more detailed comments are in Section 3 and in Appendices E through J.
There is no evidence of mercury hot spots due to local deposition associated with coal-fired power plants. In fact, EPA does not adequately define a hot spot and its definition has changed over time. Data indicate that 91 to 96% of mercury emitted from power plants travels more than 50 kilometers (km) from the plant site, thus making hot spots unlikely.

EPA did not provide a rigorous scientific validation of the mercury deposition estimated by its CMAQ model. For example, EPA did not employ the Advanced Plume Treatment module that more accurately simulates wet and dry deposition. Nor did EPA include new findings on in-plume conversion of reactive gaseous mercury to elemental mercury. The latter findings were part of an EPA study but were not incorporated in the risk assessment. In some cases, CMAQ predicted negative values for wet deposition, which is physically impossible.

EPA employs MMaps to establish the relationship between mercury deposition and fish tissue methylmercury (MeHg) levels. While acknowledging its limitations, the Agency states that it is unaware of any other tool to do such analyses. The Mercury Cycling Model (MCM) developed by EPRI is a more rigorous tool that has been used for this purpose.

EPA assumes a steady-state linear relationship between changes in mercury deposition and fish MeHg tissue levels. Data that demonstrate a steady-state linear reduction in fish tissue MeHg in response to a reduction in atmospheric mercury deposition within watersheds do not exist. EPA needs to provide scientific evidence for such linear relationships within watersheds and an assessment of other models to allow comparison with MMaps.

EPA uses poorly documented population and exposure assumptions in deriving estimates of people at risk from exposure to mercury. For example, the estimate of the amount of fish consumed by various sensitive population groups is not supported or documented sufficiently to conduct sensitivity analyses, and is drawn from small populations. Further, EPA assumes that poverty is a direct indication of subsistence fishing and high end fish consumption, but there is no documentation supporting these assumptions. EPA provides no clear definition of subsistence, near subsistence, or high end fish consumption.

EPA does not justify selective cooking loss concentration factors for estimating fish MeHg levels. Studies vary on this factor, which is critical in determining overall risk. EPA's choice of a 1.5 concentration factor increases daily MeHg intake by 33%.

In conducting its health risk assessment, EPA continues to rely on a reference dose (RfD) based primarily on data from the Faroe Islands that claims to have demonstrated no threshold for mercury health effects, despite the fact that neither EPA nor others have been able to acquire and model the Faroe data to confirm this conclusion. Iraqi and Seychelles data, on the other hand, do indicate a threshold effect. Selection of a threshold versus no threshold effect is critical since mercury exposure levels in U.S.
women are lower than those observed in the Faroe Islands study, which was used to derive EPA's mercury RfD and to help derive the intelligence quotient (IQ) response estimates in EPA's Regulatory Impact Analysis. Further complicating this issue is the fact that the endpoint of the Faroe Islands study was not IQ impact, but neurobehavioral changes. Changes in IQ are not a well defined health consequence of methylmercury exposure, and those IQ relationships that have been developed were done for mercury in marine fish, not freshwater fish.

EPA's risk assessment methodology contains flaws and errors in input assumptions that lead to an overestimate of risk, particularly at those power plants whose risks were calculated to be at or above 1 in a million. EPA's use of the arithmetic mean to determine emission factors for a lognormally distributed data set is inappropriate; a geometric mean should be used. EPRI identified suspected chromium contamination at three case study sites, as well as 14 EGUs; these suspect data should not be used in calculating risks. This raises questions regarding the implications of these flaws and errors for all other power plant emissions and for the overall MACT rule. Results of EPRI's previous risk analyses, using different emissions estimates and updated stack parameters, indicate that no coal-fired facilities exceed the 1 in a million cancer risk threshold.

Despite these shortcomings in EPA's risk analyses for mercury and other HAPs, the primary health benefit of the proposed rule comes from reductions in PM2.5 concentrations. However, over 99.5% of the health benefits would occur in regions of the United States where PM2.5 is currently in compliance with the ambient standard. Compliance with an EPA ambient standard, by definition, means that the air is healthy to breathe; thus, any further reduction in PM2.5 concentrations below the ambient standard should have minimal health benefits. In

fact, over 70% of the calculated health benefits are in areas where the PM2.5 concentration is less than 10 micrograms per cubic meter.

It appears that EPA may have double counted in assigning health benefits to the proposed HAPs MACT rule. EPA needs to clearly articulate the individual benefits assigned to all of its pending air rules, including the new PM ambient standard, the Cross State Air Pollution Rule, and the short-term sulfur dioxide (SO2) and nitrogen dioxide (NO2) primary ambient standards.

EPA's derivation of the dose-response function for PM2.5 relied on only two studies, not a balanced review of the published literature. Many studies were dismissed as having populations of a selective nature, but this is true of all studies, including those used by EPA. Thus, EPA should use the entirety of the results from the literature, not a selected subset.

The methodology used by EPA for benefits assessment appears to be flawed, based on recent analyses. A reanalysis of data from one study suggested that there is not any change in life expectancy for a reduction of PM2.5. EPA also ignores any differential toxicity of PM species, which needs to be discussed in the context of potential health benefits of SO2 reductions (and thus sulfate reductions) as a result of compliance with the HAPs MACT rule.

EPA states that it is concerned about the impacts of HCl and other acid gas emissions on the environment. This statement is based on a study of waterbodies in the United Kingdom. However, the U.K. study results are inappropriate to apply in the United States because U.S. coals differ from U.K. coals in chloride content (U.S. coals have much lower chloride concentrations) and U.S. soils differ from U.K. soils (U.S. soils are limited in areas containing histosols).
Further, the study claims that chloride is not taken up by plants and soils; however, other researchers are finding that there is some retention that would lessen chloride impacts even further. Lastly, HCl emissions are negligible compared to other primary emissions (such as SO2) that can lead to potential acidification of ecosystems.
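The comment above on arithmetic versus geometric means for lognormally distributed emission factors can be illustrated with a minimal sketch. The values below are hypothetical and chosen only to show a right-skewed data set; they are not ICR data, and the function name `summarize` is an assumption introduced for this example.

```python
import statistics


def summarize(values):
    """Return (arithmetic mean, geometric mean) for a list of positive values."""
    return statistics.mean(values), statistics.geometric_mean(values)


# Hypothetical, right-skewed emission factors in arbitrary units.
# These numbers are invented for illustration and are not ICR data.
hypothetical_factors = [0.5, 0.8, 1.0, 1.2, 2.0, 12.0]

arithmetic, geometric = summarize(hypothetical_factors)
print(f"arithmetic mean: {arithmetic:.2f}")
print(f"geometric mean:  {geometric:.2f}")
```

In a skewed set like this one, a single high value pulls the arithmetic mean well above the bulk of the measurements, while the geometric mean stays closer to the typical value. That is the behavior underlying the recommendation to use a geometric mean for lognormal data.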

CONTENTS

1 BACKGROUND

2 PROPOSED MACT LIMITS; FUELS, EMISSIONS, AND CONTROLS
  2.1 Review of proposed limits is hindered by significant discrepancies in MACT floor calculations and data errors
  2.2 Technical shortcomings and inconsistencies in EPA's MACT floor calculation procedure should be resolved prior to issuing a final rule
  2.3 Emissions limits are not all measurable by existing stack test methods
  2.4 No coal-fired EGU tested in the ICR would likely meet the new unit MACT limits for all three regulated HAPs
  2.5 Use of total PM as a surrogate for HAPs metals is problematic
  2.6 EPRI's evaluation of ICR dioxin/furan/PCB and speciated organics test data confirms EPA's conclusions
  2.7 Mercury emissions differ between fluidized bed combustion (FBC) and pulverized coal (PC) boilers
  2.8 The number of EGUs that represent the Top 12% best performing units for the mercury MACT floor pool for existing coal-fired EGUs should be 127 rather than 40
  2.9 Based on limited available data, it is unclear whether plants firing medium- to high-chloride coals can achieve the HCl standard using dry sorbent injection
  2.10 Approximately 12% of the 439 coal-fired facilities listed in EPA's Part II and III ICR database are potential area sources
  2.11 Additional measurements are needed to adequately characterize the variability of trace metals with normal plant operations
  2.12 Trace metal emissions differ between distillate (No. 2) and residual (No. 6) oil combustion
  2.13 Particulate-phase metals correlate with filterable PM (as well as total PM) emissions for oil-fired units
  2.14 EPA's assumption that 65% of nickel emissions from liquid oil-fired EGUs are as carcinogenic as nickel subsulfide is overly conservative
  2.15 Fuel oil metals analysis methods are not sensitive enough to support compliance monitoring for limited-use oil EGUs
  2.16 The Electronic Reporting Tool is not ready for use in compliance test reporting

3 ENVIRONMENTAL FATE AND TRANSPORT, EXPOSURE AND HUMAN HEALTH ISSUES, AND RISK ANALYSES
  3.1 Findings do not corroborate hot spots due to excess local deposition of mercury near U.S. fossil-fired power plants
  3.2 Modeled mercury deposition is subject to critical uncertainties
  3.3 EPA's calculated contributions of U.S. EGU mercury emissions to deposition and fish tissue levels represent upper bounds of actual contributions
  3.4 Scientific evidence describes mercury reactions in power plant plumes that alter the relative contribution of U.S. coal-fired EGUs to local and regional mercury deposition
  3.5 EPRI's comprehensive sector-wide inhalation risk assessment on all 470 coal-fired generating facilities identified no cancer or non-cancer health risks above regulatory risk threshold, in contrast to EPA's 16 case studies assessment
  3.6 EPA's 16 case studies for inhalation risk assessment need to be re-evaluated due to erroneous data that affect EPA's final risk numbers
  3.7 EPA provides no clear definition of subsistence, near-subsistence, or high-end populations at risk
  3.8 EPA does not provide clear, documented support for selection and application of the cooking loss factor
  3.9 EPA does not clearly define criteria for assignment of census tracts to HUC12 watershed
  3.10 Using census tract assigned poverty as an indicator of subsistence fishing or high-end fish consumption lacks justification
  3.11 Selection of watershed-level risk metrics not adequately addressed
  3.12 Unexplained uncertainties remain in underlying methodology used to estimate individual IQ loss due to methylmercury exposure
  3.13 Underlying primary study for monetizing IQ (effect of IQ on lifetime earnings) of limited validity
  3.14 Potential cardiovascular effects due to methylmercury exposure appear overstated given equivocal nature of studies
  3.15 EPA appears to rely on an inappropriate apportionment of health benefits for the HAPs rule analysis with the majority due to co-benefits of reduced mortality resulting from reduced PM2.5 emissions and the double-counting of benefits, especially for short-term SO2 and NO2 NAAQS
  3.16 Utility acid gas emissions, which are predominantly HCl, are insignificant contributors to acidification of ecosystems in the United States

A APPENDIX: MACT FLOOR AND ICR DATA QUALITY ISSUES
  A.1 Incorrect Heat Rate
  A.2 MACT Spreadsheet Calculations
  A.3 Miscellaneous Data Errors
  A.4 Missing ICR Test Data
  A.5 Bibliography

B APPENDIX: METHOD SENSITIVITY FOR ICR DATA USAGE AND COMPLIANCE WITH PROPOSED MACT LIMITS
  B.1 Introduction
  B.2 Mercury and Non-mercury Metals
    B.2.1 Metals by EPA Methods 29 and 30B
    B.2.2 Continuous Mercury Monitor
  B.3 Acid Gases and Sulfur Dioxide
    B.3.1 HCl and HF by Methods 26/26A
    B.3.2 HCl Continuous Emissions Monitor
    B.3.3 Sulfur Dioxide by Continuous Emission Monitor
  B.4 Filterable and Condensable Particulate Test Methods
    B.4.1 FPM by Methods 5/29 and OTM
    B.4.2 CPM by Methods OTM-28 and
    B.4.3 TPM by Methods 5 and
    B.4.4 FPM by Continuous Particulate Monitor
  B.5 References

C APPENDIX: LIST OF POTENTIAL AREA SOURCE FACILITIES AND ASSOCIATED EMISSIONS ESTIMATES

D APPENDIX: INDEPENDENT REVIEW OF THE UPPER PREDICTION LIMIT (UPL) STATISTICAL METHODOLOGY

E APPENDIX: CASE STUDY RISK ASSESSMENT REVIEW OF EPA EMISSIONS CALCULATIONS AND ASSOCIATED DATA
  E.1 Review of Chromium Concentrations and Emissions Data
  E.2 Data Set Distribution Issues
  E.3 Review of Chromium Data
  E.4 Updated Sites Average Emission Values or Emission Factors
  E.5 References

F APPENDIX: CASE STUDY RISK ASSESSMENT EPRI INHALATION RISK ASSESSMENT FOR SELECTED COAL-FIRED POWER PLANTS

G APPENDIX: ITEMIZED COMMENTS ON TECHNICAL SUPPORT DOCUMENT FATE AND TRANSPORT, HEALTH, AND RISK
  G.1 Errors Identified
  G.2 Additional Comments
    G.2.1 Substantial uncertainty exists in characterizing fish tissue methylmercury concentrations used to assign exposure levels to watersheds
    G.2.2 EPA does not provide a clear, well supported description of assumptions about fish intake in population subgroups
    G.2.3 EPA provides no clear description of its watershed level risk modeling or the metrics used to quantify risk
    G.2.4 Assigning higher-than-usual EGU deposition values to watersheds and subsistence populations lacks quantitative support
    G.2.5 Given high uncertainty in the underlying data, EPA should provide more than a minimal description of risk results at the watershed level
    G.2.6 Alternative estimates of populations at risk due to methylmercury exposure using an RfD-based approach
    G.2.7 EPA provides no quantitative analyses describing the influence of variability and uncertainty on its final health risk estimates
    G.2.8 Potential overlap between subsistence fisher and recreational angler populations leads to IQ benefit overestimation
    G.2.9 EPA could improve clarity by quantifying and presenting uncertainties in underlying assumptions used to estimate aggregate and subpopulation IQ losses
    G.2.10 Improve clarity in presentation of national aggregate IQ losses and high risk subgroup distribution of IQ benefit estimates
    G.2.11 Memorandum of ENVIRON to EPRI regarding ENVIRON's review of the Utility NESHAPS water modeling approach

H APPENDIX: NATIONAL ATMOSPHERIC DEPOSITION PROGRAM MAPS OF NATIONAL TRENDS IN CHLORIDE AND SULFATE DEPOSITION,

I APPENDIX: LONG-TERM PARTICULATE MATTER STUDIES
  I.1 California Cohort Studies
  I.2 Harvard Six-Cities Cohort Study
  I.3 American Cancer Society Cohort Studies
  I.4 Veterans Cohort Study
  I.5 Worcester (MA) Cohort Study
  I.6 Canadian Cohort Studies
  I.7 Other Cohort Studies
  I.8 Occupational Exposure Studies
  I.9 Ecological Studies
  I.10 Asian Studies
  I.11 European Studies
  I.12 Studies in Other Locations
  I.13 Reviews (partial listing)

J APPENDIX: REPORT SUMMARY MULTI-PATHWAY HUMAN HEALTH AND ECOLOGICAL RISK ASSESSMENT FOR A MODEL COAL-FIRED POWER PLANT
  J.1 Methods and Research Approach
  J.2 Risk Assessment Results
  J.3 Conclusions
  J.4 References

1 BACKGROUND

On May 3, 2011, the U.S. Environmental Protection Agency (EPA) published a notice of proposed rulemaking (40 CFR Parts 60 and 63: "National Emission Standards for Hazardous Air Pollutants from Coal and Oil-Fired Electric Utility Steam Generating Units and Standards of Performance for Fossil-Fuel-Fired Electric Utility, Industrial-Commercial-Institutional, and Small Industrial-Commercial-Institutional Steam Generating Units") to reduce emissions of mercury, arsenic, acid gases, and other hazardous air pollutants (HAPs) from coal- and oil-fired electric utility steam generating units (EGUs).

In this set of comments, the Electric Power Research Institute (EPRI) addresses several fundamental scientific and technical questions raised by the proposed rulemaking and the various ancillary technical documents issued along with the rulemaking. The discussion is presented in three sections. Section 2 introduces key technical issues regarding the proposed numerical maximum achievable control technology (MACT) limits and the related analyses of fuels, emissions, and controls. Section 3 presents technical discussions of EPA's results regarding environmental fate and transport of HAPs emissions, consequent exposure and human health issues, and risk analyses carried out by EPA. Finally, Appendices A through J support the summary discussions in Sections 2 and 3.


2 PROPOSED MACT LIMITS; FUELS, EMISSIONS, AND CONTROLS

In this section, EPRI comments on EPA's proposed limits on emissions of HAPs from new/reconstructed and existing EGUs, and on the testing and reporting requirements for complying with those limits. The proposed limits were developed by EPA based on stack test results reported in response to Parts II and III of the Information Collection Request (ICR). Thus, to evaluate EPA's proposal, it is critical to understand the limitations of those test data, which fall into three groups:

The Part II ICR data: historical tests performed by power plants over the 5 years prior to the ICR, for various purposes. These data may not be representative of the population of U.S. power plants, as EPA made no attempt to ensure that the numbers of tests reported for each HAP were in proportion to the frequency of occurrence of various categories of power plant in the entire U.S. fleet.

The Part III ICR data: data from facilities that were required to perform stack tests in response to the ICR. These EGUs were selected by EPA as potentially best performing units in each family of HAPs, e.g., units with the newest particulate matter control for trace metals and wet FGDs with the highest sulfur dioxide (SO2) removal efficiencies for acid gases. The units are therefore not representative of the current population of U.S. coal-fired power plants.

The "random 50" plants: 50 coal EGUs selected by EPA as representative of U.S. coal-fired power plants not selected in Part III of the ICR.

The three categories of data were pooled to select best-performing units for calculation of MACT floors. The pooled data are likely not representative of all U.S. power plants. EPA's selection approach complicates efforts to determine the need for and value of subcategorizing sources, i.e., by coal rank, coal chloride concentration, etc.
The selection process also makes it difficult for the Agency and EPRI to calculate representative values for HAPs emissions across the industry, which can be accomplished only through an assessment of a broader and more representative population of existing units.

Another significant limitation of the collected ICR data is that they do not reflect the full extent of emissions variability over time. The field measurements conducted for the ICR were generally limited to three short-term snapshots of stack emissions, conducted over a period of several days, with the unit burning a single fuel type and operating at full load. The tests do not characterize stack emissions over an extended period, when burning different coals (facilities often burn coals from multiple sources), and during start-up and shutdown. Additionally, as a collection of short-term tests, the ICR data do not adequately reflect process variability associated with fuel composition, control device operation, and other process variables.

All of EPRI's following comments on the proposed MACT limits should be recognized as being constrained by the limitations of the available data to fully characterize power plant emissions and variability.

2.1 Review of proposed limits is hindered by significant discrepancies in MACT floor calculations and data errors

There remain significant discrepancies in the MACT floor calculations and errors in the underlying emissions data that were not addressed by EPA's revision of the mercury (Hg) MACT limits on May 18, These problems make it very difficult to comment on the substance of the proposal, because correcting these errors will likely change the MACT floors. A more detailed discussion and a list of errors identified to date are presented in Appendix A. Four types of errors were noted:

1. Incorrect heat rates: Heat rates were calculated incorrectly for 53 EGUs listed in the MACT floor spreadsheets. These errors resulted in emissions on a pound per megawatt-hour (lb/MWh) basis that are two to six times too low. EGUs affected by this error include several identified as lowest emitting in the MACT floor spreadsheets.

2. Inconsistent MACT floor selection and upper prediction limit (UPL) procedures: EPRI found discrepancies between EPA's description of the procedure used to derive MACT floor limits and the implementation of that procedure. EPA stated that the MACT floor for new or reconstructed EGUs is based on the lowest emitting unit for which test run data were available. However, that procedure was not followed consistently, and no explanation is provided for some of the discrepancies. The procedures used to calculate the UPL are also not consistent across all parameters and EGU categories, again without explanation.

3. ICR data errors: EPRI has identified many errors in Part II and Part III ICR data, some of which impact the calculation of MACT floors. It is critical that EPA implement quality control checks to identify and correct additional errors in the ICR data.

4. Missing test results: Results of ICR Part III tests from nine liquid-oil fired EGUs are missing from the MACT floor calculations. Adding these data will likely change the MACT floors for this category.

In addition, EPRI requests clarification on whether the proposed limit for mercury in new integrated gasification combined cycle (IGCC) units should have been revised, as that beyond-the-floor value was based on the limit for new coal EGUs greater than or equal to 8,300 British thermal units per pound (Btu/lb) that was reissued by EPA on May 18,

2.2 Technical shortcomings and inconsistencies in EPA's MACT floor calculation procedure should be resolved prior to issuing a final rule

EPRI reviewed in detail the upper prediction limit (UPL) statistical approach used by EPA to develop MACT limits, as well as the revised MACT floor memorandum and spreadsheets. From that evaluation, we have identified several aspects of the UPL approach that would benefit from reconsideration. We also requested an independent review of the statistical methodology by Dr. Paul Switzer, emeritus professor of statistics at Stanford University. Dr. Switzer has over 45 years of experience in statistical theory and practice, including the application of statistical methods to environmental data. His findings are provided in more detail in Appendix D.
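The heat-rate issue noted in item 1 of Section 2.1 can be illustrated with a minimal sketch. The conversion from lb/MMBtu to lb/MWh is standard unit arithmetic; the unit names, emission rates, and heat rates below are hypothetical values chosen only for illustration and are not taken from the MACT floor spreadsheets.

```python
def lb_per_mwh(lb_per_mmbtu, heat_rate_btu_per_kwh):
    """Convert an input-based rate (lb/MMBtu) to an output-based rate (lb/MWh)."""
    # Btu/kWh * 1,000 kWh/MWh / 1,000,000 Btu/MMBtu = MMBtu/MWh
    mmbtu_per_mwh = heat_rate_btu_per_kwh / 1000.0
    return lb_per_mmbtu * mmbtu_per_mwh


def rank_by_output_rate(units):
    """Return unit names sorted from lowest to highest lb/MWh."""
    ordered = sorted(units, key=lambda u: lb_per_mwh(u[1], u[2]))
    return [name for name, _, _ in ordered]


# Hypothetical units: (name, lb/MMBtu, heat rate in Btu/kWh)
correct_units = [("Unit A", 0.0020, 10000.0), ("Unit B", 0.0030, 10500.0)]

# Same units, but with Unit B's heat rate understated by a factor of four
erroneous_units = [("Unit A", 0.0020, 10000.0), ("Unit B", 0.0030, 10500.0 / 4)]

correct_order = rank_by_output_rate(correct_units)      # Unit A ranks lowest
erroneous_order = rank_by_output_rate(erroneous_units)  # Unit B now ranks lowest
```

In this sketch, an understated heat rate lowers the lb/MWh value in direct proportion and is enough to change which unit appears to be lowest emitting, even though the lb/MMBtu values are unchanged.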

2.2.1 EPA's UPL procedure and compliance approach do not account for many important sources of emission variability

EPA should revise its method of assigning emissions to EGUs lacking test data: A source of significant bias in the UPL calculations is EPA's practice of assigning emissions from one boiler to all other boilers at the same facility when no test data were available (i.e., when there is a common stack). This practice results in multiple units that are represented as having zero variability, which has the effect of biasing the pooled variability low. Facilities included in the MACT floor pools have up to six replicated emission values; thus, the impact on the pooled variability is significant. EPA's substitution approach is technically incorrect for the MACT floor calculations.

Tolerance limits may be more suitable than a UPL to set MACT limits: The goal of the UPL calculation is to find a number that would be exceeded only 1% of the time by an EGU whose concentration distribution mimicked that of the baseline EGU pool, i.e., to estimate the 99th percentile of the baseline pool based on a limited sample from that pool. Even with an unbiased estimate, substantial uncertainty would still remain; new data from the same pool would surely result in a different estimate of the 99th percentile. To be confident that the true baseline 99th percentile falls below a proposed emissions limit would require an appeal to statistical tolerance limits that explicitly recognize the sampling uncertainty of the 99th percentile estimate. This uncertainty may be better addressed by using tolerance limits rather than a UPL.
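For readers unfamiliar with the UPL discussed above, the following is a minimal sketch of one commonly used form of a prediction limit for the average of m future test runs, assuming independent, normally distributed data. It is not taken from EPA's spreadsheets; the function name, the sample values, and the choice to pass the Student's t critical value in as an argument are all assumptions made for illustration.

```python
import math
import statistics


def upper_prediction_limit(baseline, t_crit, m=3):
    """Sketch of a UPL for the mean of m future runs.

    baseline: emission values from the baseline pool (hypothetical here)
    t_crit:   one-sided Student's t critical value for the chosen
              confidence level with len(baseline) - 1 degrees of freedom
    m:        number of future runs averaged in a compliance test
    """
    n = len(baseline)
    mean = statistics.mean(baseline)
    var = statistics.variance(baseline)  # sample variance (n - 1 denominator)
    return mean + t_crit * math.sqrt(var * (1.0 / n + 1.0 / m))


# Hypothetical three-run test series; 6.965 is the one-sided 99% t value
# for 2 degrees of freedom.
example_upl = upper_prediction_limit([1.0, 2.0, 3.0], t_crit=6.965, m=3)
# example_upl is about 7.69
```

Because the variance term enters directly under the square root, anything that understates the sample variance (such as replicated values or below-detection-limit results reported at a single value) lowers the computed limit.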
There are many factors that contribute to emission measurement variability that may not be adequately represented in the ICR data used to calculate MACT limits: Measurement variability would include sampling variability associated with obtaining a representative sample, while analytical variability would include precision of the method and accuracy with the matrix and potential interferences. Process variability related to the power plant process would vary with each HAP (or surrogate) as to its formation, chemistry, and fate in combustion and air pollution control unit operations. For example, mercury emissions are believed to be affected by the concentrations (and variability) of mercury, chloride, bromide, and sulfur in the coal, as well as the level of unburned carbon, the flue gas temperature and time profile, and the air pollution control design and performance. Selective catalytic reduction (SCR) catalysts for control of nitrogen oxides (NOx) may oxidize mercury, enhancing subsequent removal in the downstream flue gas desulfurization (FGD) system, if available. The re-emission of mercury in FGD systems is also not well understood and is likely related to the chemistry within the FGD absorber.

The requirement for EGUs to comply with site-specific operating limits nullifies the rationale for using a UPL: EPA's stated rationale for using the UPL is to account for variability in future measurements on EGUs within the population of best-performing units. For existing EGUs, EPA attempts to account for within-unit and between-unit variability by including many EGUs (and, for some limits, multiple test series) in the UPL variance term. However, the compliance requirements for EGUs set out in Sec of the proposal [Federal Register (FR) 25103] indicate that the actual limit that EGUs must meet is not the MACT limit, but a site-specific operating limit based on initial performance testing. This initial test, which is generally performed only once every 5 years, does not include any of the sources of variability listed above. EPA cannot make any statement about the statistical probability of meeting site operating limits, as those limits cannot be determined until the performance test is completed.

2.2.2 Discrepancies and inconsistencies in application of the UPL should be eliminated and the procedure should be clarified

EPRI noted that the application of the UPL procedure is not consistent from one MACT floor spreadsheet to the next, and that the implementation of the procedure needs to be more clearly explained in the final rule. Issues that we noted are listed below.

Inconsistent procedures should be resolved: As noted in Appendix A, some MACT limits included all test series in the pooled mean (e.g., coal mercury ≥ 8,300 Btu/lb) while others included only the lowest test series per EGU (e.g., total particulate matter [TPM] for coal EGUs). The data included in the pooled variance term also are not consistent: the mercury limit for existing coal ≥ 8,300 Btu/lb was calculated using the pooled variance of all Part II and III test series, while the MACT floors for TPM and metals in existing coal EGUs were calculated from the variance of the lowest test series for each EGU. A consistent procedure that reflects both within- and between-source variability (i.e., includes all test series in both the mean and variance term) should be used in all cases.

Sorting of EGUs produced inconsistent results in lb/MMBtu and lb/MWh units: The proposal states that "The initial sort of the respective data to determine the MACT floor pool for analysis was made on the lb/MMBtu formatted data; this same pool of EGUs was then used for the lb/MWh analysis" [FR 25041]. However, it is not stated that EPA then sorted the initial list by lb/MWh values and used the lowest lb/MWh value as the basis of the new EGU limit (barring a data error or other issue).
The result can be seen in the coal particulate matter (PM) MACT spreadsheet, where the heat rate errors noted in Appendix A have resulted in two very different sorting orders for the MACT pool EGUs in pounds per million British thermal units (lb/MMBtu) and lb/MWh. To remedy these inconsistencies, the initial sort for limits expressed in units of lb/MWh should be on lb/MWh values.

The procedure described for limits based on an MDL requires clarification: The procedure for adjusting some MACT floors for new EGUs to be above a method detection limit (MDL) [FR 25044] does not clearly indicate that the MDL referred to is actually the lowest test run value in the 3-run test series of the lowest-emitting EGU. For example, in the coal hydrogen chloride (HCl) MACT floor worksheet (EPA-HQ-OAR ), cell C102 of the HCl_New_MW tab points to cell B7, a below detection limit (BDL) value that is more than two times lower than the other two BDL values in the same test series. The selected value cannot be called a representative MDL for the method. Multiplying that MDL by three, as EPA has done, results in a MACT limit that is exceeded by test series from the same EGU. The ICR database contains BDL-flagged HCl values 10,000 times higher than the one selected, indicating that it is highly unlikely that the BDL value selected is representative or achievable by all laboratories.

2.2.3 Accounting for measurement imprecision is critical in establishing emission limits

EPA has requested comment on how to incorporate below-detection-limit data into the calculation of MACT limits (FR 25044). EPRI has been conducting research into appropriate methods to represent detection and quantitation levels in statistical calculations for many years.

EPA has correctly acknowledged this challenge and EPRI agrees that resolution of this question is critical. However, with the limited time available to comment, EPRI cannot provide a recommendation on this complex and challenging issue. EPRI is open to working with EPA and other stakeholders in developing robust and scientifically sound approaches to address this fundamental issue.

One general recommendation that we can make based on our evaluation of the ICR data is that EPA should recognize that laboratory MDLs are often not an appropriate indicator of the capability of stack test methods. For some HAPs, below detection limit test results were reported at much higher concentrations than typical MDLs (e.g., 10,000 times higher for HCl). These differences are due to matrix effects in samples exposed to flue gas that are not present in the clean laboratory samples used for MDL studies. Thus, the question is not what substitution procedure should be used to represent BDL values in the UPL calculation, but rather, what is the lowest concentration of the HAP that a majority of competent labs measured accurately in the ICR? Answering that question from the current ICR data requires a detailed examination of the data, such as EPRI has provided in Appendix B.

For the new EGU limits, where the UPL is calculated from as few as three test runs conducted at a single facility by a single laboratory, it is incumbent on EPA to verify that those reported values are free of errors, analyzed according to standard laboratory procedures, and reported in accordance with the ICR requirements. An emission value below detection limits should not be used to calculate emissions for a lowest-emitting unit, because these values understate measurement variability and do not have acceptable accuracy and precision.
Any EGU that does not have at least one result at a concentration that can be measured accurately based on the laboratory's own MDL study should be eliminated from consideration as the lowest-emitting unit. We would also ask EPA to consider whether an EGU where all test runs were below detection qualifies as a lowest-emitting unit for setting a MACT limit, because these emissions are not quantifiable.

2.2.4 Key conclusions of Dr. Switzer's evaluation of the UPL

A more detailed presentation of this evaluation is given in Appendix D.

The ICR data used for estimation of the 99th percentile UPL are not consistent with statistical assumptions: The UPL estimation procedure treats the sampled data (baseline data) as if they consist of independent interchangeable measurements from a relevant population. However, these ICR data from the best-performing units selected for each MACT pool exhibit intra-unit correlations resulting from the heterogeneity of the selected units. Furthermore, the UPL calculation does not account for temporal autocorrelation inherent in the time series nature of the data.

The Central Limit Theorem does not apply to the UPL calculation: EPA relied on the Central Limit Theorem to support UPL calculations based on a two-sample t-test for large (>15) data sets. However, this application of the Central Limit Theorem requires both the compliance data set and the baseline data set to be based on large samples, with each of the two data sets being composed of independent measurements. The assumption of m = 3 independent compliance measurements does not meet the requirement of large data sets for the Central Limit Theorem.

Selecting the lowest test series from each EGU introduces bias: To calculate MACT limits for existing units for some HAPs (see Appendix A), EPA used the lowest test series average for each EGU to calculate the pooled mean. This approach would be justified only if the same selection of the lowest values from a series were anticipated for the compliance tests. However, that is not the way the proposal is written; it calculates the probability of a single 3-run test series average exceeding the UPL.

The UPL calculations fail to account for unequal numbers of test series among EGUs: According to the EPA calculations, an EGU with more test series will contribute proportionally more to the variability of the pooled data set than an EGU with only one test series. To avoid giving more weight to some units than to others in the data pool, it is necessary to apply data weighting properly to eliminate this bias.

Individual measurements that comprise the baseline data pool are incompatible with anticipated compliance measurements: The EPA approach [as stated on FR 25041] to calculation of a UPL is based on an average of three compliance test runs, each of which follows the underlying measurement protocol that was used for the component baseline data. However, compliance options in the proposal typically are at variance with this requirement. Most require compliance to be determined on a 30-day rolling average using a continuous emissions monitor (CEM). The statistical properties of these two monitoring approaches are different.

Variability may be substantially underestimated: Where the ICR baseline data pool is composed of brief snapshot measurements made over a short time interval, this data pool will not include the natural variability that might be expected over longer time periods.
In those cases where the longer historical time perspective is lacking in the baseline data pool, the baseline variability used for the UPL calculations will be incompletely represented, and will not reflect the additional variability that the baseline EGUs would experience over time.

2.3 Emissions limits are not all measurable by existing stack test methods

EPA requested comment (FR 25044) on the procedure used to calculate new MACT floors for parameters where the lowest emitting unit contained data flagged as below detection limit (BDL) or detection level limited (DLL). EPA also requested comment (FR 25044) on approaches to accounting for measurement imprecision in setting MACT limits. This comment addresses those requests, but also addresses the broader problem that many of the proposed MACT limits are set below the measurement capability of the test methods.

For power plant owners to comply with the emission limits in the proposed rule, there need to be test methods available that can measure the HAPs or surrogates accurately at those stack gas concentrations. For each of the proposed MACT limits for existing and new/reconstructed coal- and liquid oil-fired EGUs, EPRI evaluated whether an accurate measurement could be obtained by most of the ICR laboratories at those emission levels using the standard test methods required by the ICR. Table 2-1 summarizes the conclusions of EPRI's evaluation of method sensitivity. A detailed explanation of how these conclusions were reached is provided in Appendix B to these comments.

The principal reason that method sensitivity is not adequate to quantify HAPs or surrogates at many of the new unit limits is that EPA's procedure for determining the MACT floor for new units selects outlier emissions values, and does not take into account method performance with actual stack gas samples. EPA used one of two procedures to calculate the new unit limits.

1. A UPL was calculated from the lowest emitting test series (in units of lb/MWh) for which individual run data were available.

2. For HAPs with test run data flagged BDL or DLL, EPA compared the UPL against a value of three times the lowest emission value of the lowest test series. The greater of the UPL or 3 times the lowest emission value was used as the MACT floor. This floor value is referred to in the MACT floor spreadsheets as 3 times the method detection limit (MDL), or 3 x MDL.

There are several reasons why these procedures may produce a new unit limit below the actual capability of the test method.

The lowest emitting EGU is often one in which all runs are flagged BDL, indicating that emissions are below the detection limit. However, those detection limits often are low not because the laboratory used unusually sensitive techniques, but because the reporting requirements specified in the ICR were not followed. A very common error, observed by EPRI in many ICR Part III Electronic Reporting Tool (ERT) files, is failure to sum the fractions of a multi-fraction sample correctly. For example, if the emission was reported using the MDL of only one fraction, rather than the sum of the detection limits of the front half and back half fractions of a Method 29 sample as required for the ICR, the result would be a much lower MDL value. Blank-correcting metals results to below the MDL is another common error that results in an unrealistically low emission value.
Some of the lowest emitting test series are low not because they have low emissions, but because they contain errors such as those listed in Appendix A.

Using BDL- and DLL-flagged data in the new unit UPL calculation masks the actual variability of low-level measurements. A laboratory MDL is a single value that differs among the three emission measurements in a test series only to the extent that the sample volumes or sample dilutions differ. Thus, a UPL calculation based on flagged measurements will underestimate measurement variability.

Where procedure No. 2 was used, basing the MACT limit on 3 times the lowest test run of the lowest test series of the lowest emitting EGU resulted in a limit that could not be detected at that EGU on a consistent basis. That is the case for HCl for new coal units, where all measurements in all test series were non-detect and several of the detection limits exceeded the proposed limit.

MDLs are typically determined in clean samples (purified water or clean particulate filters) that have not been exposed to flue gas. Analytical methods can measure to much lower levels in clean samples than in stack gas samples.

It is important that EPA carefully review all emission data used in the MACT floor calculation to ensure that the data are free of errors and are reported according to ICR requirements. The resulting emission limit should be measurable with acceptable precision in actual field samples, using standard methods with a sampling duration that is practical for routine stack testing. Determining that emission value requires consideration of the precision of methods in the field, not simply taking a laboratory MDL and applying a multiplier. The most appropriate means to determine method precision is to conduct multi-train stack tests at full-scale power plants, at the approximate flue gas concentration considered for the MACT limit.

Recognizing that such test data generally do not exist and will take a considerable effort to collect, EPRI recommends setting the limits for existing EGUs such that most BDL-flagged ICR results are below the MACT limit. Any BDL-flagged values above the limit should be reviewed to determine whether the elevated detection limits are the result of unusual source characteristics, poor laboratory performance, or sampling, analytical, or reporting error. As discussed earlier, in setting limits for new/reconstructed EGUs, the MACT limit should not be based on BDL-flagged values.

EPRI's analysis of method sensitivity evaluated the ability of test methods to measure HAPs and surrogate parameters at the MACT limits. However, the proposal also requires facilities to establish operating limits for some HAPs based on initial performance tests, which would then become the effective emissions limits for that facility. As those operating limits would be lower than the MACT limits, there may be additional restrictions on the use of test methods beyond those indicated in Table

Table 2-1 Summary of Test Method Adequacy to Quantify Emissions at MACT Limits

Method Sensitivity is Adequate at (values listed in this order): Coal-Fired EGU New Unit Limit (1); Coal-Fired EGU Existing Unit Limit; Liquid Oil-Fired EGU New Unit Limit; Liquid Oil-Fired EGU Existing Unit Limit

Methods 29/30B
Antimony: No; Yes; Yes; Yes
Arsenic: No; Yes; No; Yes
Beryllium: No; Yes; No; No
Cadmium: No; Yes; No; Uncertain
Chromium: Yes (2); Yes; Yes; Yes
Cobalt: Uncertain (3); Yes; Yes; Yes
Lead: Uncertain; Yes; Yes; Yes
Manganese: Uncertain; Yes; Yes; Yes
Mercury (Method 30B): Uncertain; Yes (4); No; Uncertain
Mercury (Method 29): No; Yes (4); No; No
Nickel: Uncertain; Yes; Yes; Yes
Selenium: Yes (2); Yes; No; Yes
Total Metals: No
Total Non-mercury Metals: Yes; Yes; Yes

Continuous Mercury Monitor
Mercury: No; Yes (4); No; Yes

Methods 26/26A
Hydrogen chloride (HCl): No; Yes; Uncertain; Yes
Hydrogen fluoride (HF): Uncertain; Yes

Method 6C or CEM
Sulfur dioxide: Yes; Yes

Method 5 and Method 202
Total PM: Yes; Yes

(1) Also applies to IGCC units, where these are the same as coal. The mercury limit for new IGCCs is not achievable by any of the above methods.
(2) Based on proposed limits, which may have been determined incorrectly in the rule.
(3) Uncertain means that the method may be able to measure at the proposed limit, but additional research is needed to verify performance.
(4) Coal EGU ≥ 8,300 Btu/lb.

2.4 No coal-fired EGU tested in the ICR would likely meet the new unit MACT limits for all three regulated HAPs

Based on EPRI's review of the ICR data, none of the coal-fired EGUs that reported data to EPA for TPM, mercury, and HCl (or the alternative acid gas surrogate, SO2) would consistently meet all three new unit MACT limits. Only two EGUs had a lowest test series average below each of the new unit limits. However, both of these units reported multiple test series; when those test series are included, the average of all the data is greater than the new unit limit. In addition, neither of these two units is typical of the broader U.S. coal-fired power industry: one is a stoker boiler and the second fires a waste coal.

2.4.1 Evaluation of standard-setting EGUs for achieving all HAPs emission limits

EPA determined MACT limits from the lowest emitting EGU for each specific HAP or HAP surrogate. This approach does not consider whether a single facility is capable of meeting the other proposed standards. Table 2-2 compares the proposed MACT limits for new coal EGUs with the lowest test series average for the four sites that were used by EPA to produce the new unit standards. None of the four EGUs had emissions below all three limits; some did not test for all of the parameters. Values shaded in orange exceed the proposed new unit limit. NA indicates that measurements are not available in the Part II or III data series for that HAP in the current EPA MACT Floor spreadsheets.

As shown in Table 2-2, Logan Unit 1 (the site used for the HCl limit) would not meet the limits for TPM or the SO2 surrogate. Dunkirk Unit 1 (the site used for the TPM limit) would not meet the HCl or SO2 limits and was not measured for mercury. Nucla Unit 1 (the site used for the mercury limit) would not meet the TPM limit and was not measured for HCl or SO2. Port of Stockton, i.e., POSDEF Unit 1 (the site used for the SO2 limit), has no other HAPs measurements.
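The kind of comparison described above (testing each EGU against all of the limits at once, using either its lowest test series or the average of all its test series) can be sketched as follows. All limits, unit names, and test series averages in this example are hypothetical placeholders, not values from the proposal or the ICR database.

```python
from statistics import mean

# Hypothetical new unit limits in lb/MWh (not the proposed values)
limits = {"HCl": 0.30, "Hg": 0.002, "TPM": 0.05}

# Hypothetical test series averages for each EGU and pollutant
egus = {
    "EGU 1": {"HCl": [0.10, 0.60], "Hg": [0.001], "TPM": [0.04]},
    "EGU 2": {"HCl": [0.20], "Hg": [0.001, 0.0015], "TPM": [0.03]},
    "EGU 3": {"HCl": [0.25], "Hg": [0.001], "TPM": [0.08]},
}


def meets_all_limits(series_by_hap, limits, summarize):
    """True if the summarized value for every pollutant is at or below its limit."""
    return all(summarize(series_by_hap[hap]) <= limits[hap] for hap in limits)


# Screening on the lowest test series average for each pollutant
pass_lowest = [n for n, s in egus.items() if meets_all_limits(s, limits, min)]

# Screening on the average of all reported test series
pass_average = [n for n, s in egus.items() if meets_all_limits(s, limits, mean)]
```

In this made-up data set, "EGU 1" passes when only its lowest HCl test series is considered but fails once its second, higher series is averaged in, which mirrors the pattern described for units that reported multiple test series.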

Table 2-2 Comparison of New EGU MACT Limits with Lowest Test Series Average Emissions from Four Standard-Setting Coal Units

Columns: HCl; SO2; TPM; Mercury
Rows: New Unit MACT Limit (lb/MWh); Logan 1 (HCl limit); Dunkirk 1* (TPM limit); POSDEF 1 (SO2 limit); Nucla 1 (mercury limit)
E E NA** NA 0.09 NA NA NA NA E-8

*Dunkirk has the lowest TPM value after the AES Hawaii heat rate error is corrected and the lb/MWh values are recalculated. Note that AES Hawaii would likely not meet the HCl or SO2 limits, even with the error in the heat rate.
** Measurements are not available in the Part II or III data series for that HAP in the current EPA MACT Floor spreadsheets.
Shaded: Exceeds proposed new unit MACT limit

2.4.2 Evaluation of ICR EGUs tested for all HAPs for achieving all HAPs emission limits

EPRI's review of the ICR Part II and III test data indicated that 115 EGUs reported measurements for all three of the regulated parameters (mercury, HCl, and TPM). Using the lowest test series average for comparison, only 6 of the 115 EGUs would meet the new unit HCl limit, 10 would meet the mercury limit, and 46 would meet the TPM limit.

Only two of the 115 (Seward Unit 1 and Spruance Generator 2) would meet the new unit MACT limits for mercury, HCl, and TPM, using the lowest test series average. However, both Seward and Spruance reported additional mercury measurements; the average of all the measurements is above the new unit limit. Thus, both Seward and Spruance likely would not consistently meet the new unit limit for mercury. Spruance also reported an HCl value above the new unit limit.

Neither Seward nor Spruance is typical of U.S. power generating units. Seward Unit 1 is an underfed stoker boiler, with a lime spray dryer and fabric filter. Stokers are limited to smaller electrical output and have lower thermal efficiency than pulverized coal boilers.
Using the lowest test series average, Spruance Generator 2 met all three limits, but two identical sister units, Spruance Generators 3 and 4, would not meet the new unit limit for TPM. Spruance is also an atypical EGU, as it fires waste coal, which is a byproduct of coal cleaning in the Appalachian region.

2.4.3 Individual new unit limits are challenging

As noted above, using the lowest test series average for comparison, only 6 of the 115 EGUs would meet the new unit HCl limit, 10 would meet the mercury limit, and 46 would meet the TPM limit. EPRI further evaluated the additional measurements reported and whether these EGUs likely would consistently meet the new unit limits. This analysis was limited to HCl and mercury, as additional data for TPM were not available for evaluation.

Of the 6 EGUs that would meet the new unit HCl limit, 4 have additional measurements above the limit, and so would not consistently meet the new unit limit. The remaining 2 EGUs (Spruance Generators 2 and 3) would meet the new unit limit using the lowest test series average, but their sister unit, Spruance Generator 4, would not meet the limit.

For mercury, 3 of the 10 EGUs have additional measurements above the limit, and thus would not consistently meet the new unit limit. The remaining 7 EGUs do not have a second measurement for comparison.

2.4.4 Summary

The new unit limits for HCl and mercury are challenging to achieve on a continual basis, as few EGUs have multiple ICR measurements that are consistently below the individual new unit limits. As noted earlier in EPRI Comments, Section 2.2.4, since the use of the lowest test series average introduces biases, EPA should use the average of all ICR data for setting the HAPs standards for both new and existing EGUs. Further, by selecting the lowest emissions for each HAP (or its surrogate) from an extremely large pool of data sets that are dissimilar in design, there is a significant probability that all HAPs limits cannot be met simultaneously for any specific design. Consequently, EPA should develop an alternative approach to selecting the unit and data set for new unit limits, e.g., the emissions from a single best performing facility could be used to set all of the HAPs new unit standards.

2.5 Use of total PM as a surrogate for HAPs metals is problematic

EPA has proposed to use TPM as a surrogate parameter to regulate emissions of non-mercury metals from coal-fired EGUs.
TPM consists of filterable particulate matter (FPM) and condensable particulate matter (CPM). Past EPRI research has found that FPM is correlated to emissions of particulate-phase metals (e.g., chromium [Cr]). By proposing a MACT limit that includes CPM, EPA is apparently attempting to improve the surrogacy relationship between the relatively volatile element selenium and a PM surrogate. The extent to which selenium is captured in the CPM sampling apparatus (Method 202) is unknown.

The merits of including CPM in the TPM limit are difficult to evaluate, as no test data are available for the exact CPM method that is required for compliance with the proposed MACT limits. The ICR required CPM to be measured using OTM-28. That method was changed significantly before it was promulgated in December 2010 as revised Method 202. Most ICR test contractors used OTM-28; a few used the original Method 202. Because none of the ICR tests used revised Method 202, the conclusions of EPRI's evaluation of CPM method usefulness are uncertain.

Review of the ICR data shows that selenium emissions do not have a strong correlation with CPM emissions. To the contrary, as shown in Table 2-3, the correlation of CPM emissions with all of the HAPs metals is poorer than their correlation with FPM or TPM. This is not surprising, as ion chromatography analyses of the CPM catch from coal-fired power plants indicate that CPM is predominantly sulfuric acid [EPRI, 2000]. The selenium content of coal is typically less than 10 parts per million (ppm), while the sulfur content is usually in the range of tenths of a percent (1,000 ppm) to four percent (40,000 ppm). In all but a few ICR tests, Method 29 selenium emissions were less than one percent of OTM-28 CPM emissions. Thus, even if selenium is captured in the Method 202 apparatus, a CPM measurement will usually have no predictive value for selenium emissions.

As indicated in Table 2-3, the correlation coefficients (R²) of selenium with FPM and TPM are identical (0.17), indicating that the correlation between selenium and FPM is the largest contributor to the relationship between selenium and TPM. For the other metals, some correlate slightly better with TPM, others with FPM. The differences are minor.

Figure 2-1 plots the average CPM emissions from the ICR Part III coal EGUs against the average selenium emissions for the same units. The data are shown on a log-log scale to better display the lower end of the concentration range. It is clear from this plot that there is a large amount of scatter in the ratios of CPM to selenium at every point in the concentration distribution. Except possibly at the very high end of emissions, there is no apparent trend of increasing CPM with increasing selenium.

Table 2-3 Correlation of ICR Part III Coal EGU Non-mercury HAPs Metals with Particulate Matter

Columns: Element; CPM (Data Count, R²); FPM (Data Count, R²); TPM (Data Count, R²)
Rows: Antimony; Arsenic; Beryllium; Cadmium; Chromium; Cobalt; Lead; Manganese; Nickel; Selenium; Total Non-mercury HAPs Metals

CPM: condensable particulate matter; FPM: filterable particulate matter; TPM: total particulate matter. R²: correlation coefficient using a power regression. All correlations are significant at the 95% confidence level.
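As a rough illustration of what an R² "using a power regression" means, the sketch below fits y = a·x^b by ordinary least squares on log-transformed data and reports R² in log space, which is one common convention (spreadsheet power trendlines typically work this way). Whether the values in Table 2-3 were computed exactly this way is an assumption, and the function name and sample numbers are hypothetical.

```python
import math


def power_fit_r2(x, y):
    """Fit y = a * x**b on log-transformed data; return (a, b, r_squared).

    All x and y values must be positive. R-squared is computed on the
    log-transformed values, not on the original scale.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((v - mean_x) ** 2 for v in lx)
    syy = sum((v - mean_y) ** 2 for v in ly)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                       # exponent of the power law
    a = math.exp(mean_y - b * mean_x)   # multiplicative coefficient
    r_squared = (sxy * sxy) / (sxx * syy)
    return a, b, r_squared


# Hypothetical PM values (x) and metal emissions (y) following an exact
# power law, so the fit should recover a = 3, b = 0.5 with R-squared near 1.
pm = [1.0, 2.0, 4.0, 8.0]
metal = [3.0 * v ** 0.5 for v in pm]
a, b, r2 = power_fit_r2(pm, metal)
```

An R² of 0.17 on this basis would mean that only about 17% of the variance in the log of selenium emissions is explained by the log of the PM measurement.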

Figure 2-1 Selenium Correlates Poorly with CPM

Adding a CPM measurement to the FPM sampling train increases the total variability and reduces the sensitivity of the combined measurement system. A recent field study conducted by American Electric Power (AEP) and EPRI found that Method 202 (CPM) had much higher variability than Method 5 (FPM) for replicate test runs [EPRI, 2011]. In the ICR Part III tests at coal-fired EGUs, the relative standard deviations (RSDs) of CPM test series averaged slightly higher than the RSDs of FPM test series (24% compared to 20%). The detection limit of a combined Method 5/Method 202 sampling train is twice as high as that of a Method 5 train alone, as each gravimetric measurement adds uncertainty to the total. The accuracy of the new Method 202 has not been demonstrated adequately and requires additional research. Laboratory studies conducted by EPRI pointed to a potential negative bias in OTM-28, which did not effectively capture sulfuric acid aerosol [EPRI, 2009]. Only about half of the sulfur trioxide (SO3) introduced into the sampling train was captured. This finding indicates that Method 202 may not be accurate under flue gas conditions that are expected to be increasingly prevalent in power plants as wet FGD systems (which tend to produce sulfuric acid aerosols) are added. A TPM limit is problematic for compliance monitoring, as continuous PM monitors measure only FPM. Since CPM emissions are a product of the coal sulfur content and other factors (e.g., ash properties, SCR SO2 oxidation rate), it is not clear how a power plant would determine

31 compliance with the TPM limit without being able to monitor both FPM and CPM on a continuous basis. The difficulty of demonstrating and achieving compliance with TPM limits will depend on the percentage of TPM that is CPM and how variable the ratio of FPM to CPM is over time. As shown in Figure 2-2, the fraction of TPM that is CPM varies from under 1% to over 99% across the coal-fired EGUs tested in the ICR. The right-hand axis indicates that no more than 15% of ICR measurements fall within any one group shown in the histogram. The ICR database does not contain sufficient historical CPM data to determine how much CPM emissions will change at a unit over time. It is important to note that there are also uncertainties associated with measurement of FPM. EPRI [2011] found that the temperature of the particulate filter has a significant impact on the test results; a higher temperature (i.e., 320 F) results in lower FPM measurements compared to the standard Method 5 temperature of 250 F, presumably due to condensation of sulfuric acid on the filter in the latter case. Since many of the Part III ICR tests on dry stacks were conducted at the lower temperature due to lack of clarity on the ICR testing requirements [EPRI, 2010], this finding suggests that some of the ICR data are biased high. The type of filter used also affects the results: glass fiber filters (typically used in Method 5) collect more mass than do quartz fiber filters (typically used for Method 29 metals analysis). The higher amount is from absorption of acid gases on the alkaline glass fiber filter [EPRI, 2011]. In EPRI s study 6 out of 7 samples collected with a glass fiber filter had higher FPM loading than a sample collected simultaneously with a quartz fiber filter: the increase in FPM on the glass fiber filter ranged from 3 to 124 percent. EPRI s study has implications for how representative the ICR data are of industry emissions, as well as for future performance testing. 
EPA based the MACT floor for TPM on the lowest test series for each EGU, regardless of the method used to measure FPM (Method 5, Method 29, or OTM-27). As many of the lowest ICR emissions are from Method 29 tests run at 320 °F, those results will not be representative of tests made using Method 5 at 250 °F. As a result, there is considerable uncertainty regarding the true emissions of FPM from the power industry. The compliance method for FPM specified in the proposed rule is Method 5, while the ICR test data include FPM results from Methods 5, 29, and OTM-27. To eliminate the various method biases identified in the EPRI/AEP study, the requirement to run Method 5 tests at 320 °F should also apply to Method 29 metals tests. In addition, there should be a requirement to use nonreactive filters.

32 Figure 2-2 Histogram of CPM as a Percentage of TPM: Part III ICR Coal-fired EGUs References EPRI, Alternative Methods for Measurement of Condensible Particulate Matter: Field Test Report. Palo Alto, CA: EPRI, Evaluation of Alternative Condensible Particulate Matter Methods. Palo Alto, CA: EPRI, Impact of Sampling Procedures on Results of Filterable and Condensable Particulate Stack Test Methods. Palo Alto, CA: EPRI s evaluation of ICR dioxin/furan/pcb and speciated organics test data confirms EPA s conclusions EPA has proposed work practice standards for control of dioxins, furans, and polychlorinated biphenyls (PCBs), as well as speciated organic HAPs. The basis for this decision was that ICR results showed a high frequency of emissions below detection limits, even after an 8-hour monitoring period, making monitoring impractical. EPRI s evaluation of the ICR dioxin/furan/pcb HAPs emission tests confirms EPA s finding that most congeners in this group were not detected [EPRI, 2010]. In addition, our evaluation found that contamination of the samples from non-power plant sources biased emissions high in many samples. The chemicals in this HAPs group are ubiquitous in the environment and the test 2-16

33 method is so sensitive that it is very difficult to avoid contamination of the sample during sampling and analysis. EPRI s evaluation of the ICR organics HAPs emission tests confirms EPA s finding that most organic HAPs were not detected [EPRI, 2010]. Our evaluation also found that some of the chemicals that were detected were affected by contamination of the sample with non-flue gas sources of HAPs. Several of the chemicals noted by EPA as frequently detected in the ICR data (benzene and formaldehyde) were also frequently detected in the field blanks and/or method blanks and are known contaminants or breakdown products of the sorbent used in sample collection Reference EPRI, Data Quality Evaluation of Hazardous Air Pollutants Measurements for the U.S. Environmental Protection Agency s Electric Utility Steam Generating Units Information Collection Request. Palo Alto, CA: Mercury emissions differ between fluidized bed combustion (FBC) and pulverized coal (PC) boilers EPRI s review of the ICR data indicates that mercury emissions from fluidized bed combustion (FBC) and conventional pulverized coal (PC) combustion differ from one another. FBC and PC combustion are fundamentally different types of combustion processes. Fluidized bed boilers generally burn lower quality fuels, and the crushed coal particles are suspended on upwardblowing jets of air within the boiler. In a typical PC boiler, coal is pulverized into very fine particles, blown into the boiler, and ignited to form a long flame. FBC units typically operate at lower furnace temperatures (typically ºF), burn larger size coal particles to a lower degree of combustion efficiency, employ longer residence times in the combustion process, use in-bed SO 2 capture via limestone or dolomite addition directly to the furnace, and often recycle spent bed material back to the boiler. PC units pulverize the coal to a very fine particle size to maximize combustion efficiency and minimize unburned carbon. 
FBC units typically have higher levels of unburned carbon present in the ash, which generally assist in increasing mercury capture, especially as many FBCs employ fabric filters for PM control. Table 2-4 compares ICR mercury emissions from FBC and conventional PC boilers. The mean values and standard deviations are calculated from the lowest test series (in lb/mmbtu) for each EGU, as listed in EPA s revised mercury MACT floor spreadsheet, dated May 18, Of the 40 EGU test series used to calculate the MACT floor for existing EGUs burning coal with greater than 8,300 Btu/lb, 14 (35% of the entire MACT floor data pool) were FBC units. Only 6% of all coal-fired boilers in the U.S. fleet are FBC units, with the remaining units mostly PC boilers. Of the FBC units that reported ICR Part II or Part III mercury measurements, 45% (14 of 31) were included in the 40 lowest emitting units. By comparison, 9% (26 of 302) of the PC units were included in the 40 lowest emitting units. These statistics indicate that FBC units are greatly over-represented in the mercury MACT floor pool. Mean mercury emissions for all PC units are more than twice the mean emissions for all FBC units. The 31 FBC units include 5 that fire lignite; these lignite units exhibited some of the highest mercury emissions for FBCs. Lignite coals typically have low chloride content, which 2-17

34 generally leads to higher percentages of elemental mercury in flue gas and lower overall capture of mercury in the associated ash material. The lower chloride and typically higher levels of mercury in Texas lignite coals likely explains the higher mercury emissions from FBC units firing those coals. If the 5 lignite units are excluded from the data set, the resulting FBC mean of 3.8E-7 lb/mmbtu mercury is about 10 times lower than the mean for non-lignite PC units of 3.0E-6 lb/mmbtu mercury. Table 2-4 Comparison of ICR Mercury Emissions from Fluidized Bed Combustion (FBC) and Pulverized Coal (PC) Boilers Boiler Type Number in ICR Data Set Mean (lb/mmbtu) Standard Deviation (lb/mmbtu) All ICR Data FBC (all coals) E-6 2.9E-6 FBC (without lignite) E-7 6.4E-7 PC (all coals) E-6 3.5E-6 PC (without lignite) E-6 3.5E-6 Figure 2-3 shows cumulative frequency distributions of average mercury emissions for the 302 PC units and the 31 FBC units in the revised coal mercury MACT spreadsheet. The two distributions are distinctly different, with a median mercury emission of 7.2E-8 lb/mmbtu for FBC units and 2.2E-6 lb/mmbtu for PC units. Almost 80% of the FBC units had mercury emissions below 1.2 pounds per trillion British thermal units (lb/tbtu), whereas only 35% of PC units tested had emissions below 1.2 lb/tbtu. Aside from the FBC lignite units, the highest emitting FBC units have mercury emissions over an order of magnitude lower than those from the highest emitting PC units. These comparisons indicate that FBC and PC EGUs are distinct populations with respect to mercury emissions. The observed differences are associated with properties of the fluidized bed and pulverized coal boilers including lower fuel quality, the flue gas time and temperature profile, and the higher levels of unburned carbon in FBC fly ash that generally enhance overall mercury adsorption, especially in the downstream fabric filter which is often employed at FBC facilities. 2-18
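The representation percentages quoted in this section follow directly from the unit counts given in the text (14 of 31 FBC units, 26 of 302 PC units, and 14 FBC units in a 40-unit floor pool). The sketch below reproduces that arithmetic and the conversion between lb/mmbtu and lb/tbtu. The function names are hypothetical and chosen only for this illustration.

```python
def share_pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole

def lb_per_mmbtu_to_lb_per_tbtu(value):
    """Convert lb/mmbtu to lb/tbtu (1 tbtu = 1e12 Btu = 1e6 mmbtu)."""
    return value * 1.0e6

# Unit counts as stated in the text
fbc_in_floor, fbc_tested = 14, 31
pc_in_floor, pc_tested = 26, 302
floor_pool = 40

fbc_selected = share_pct(fbc_in_floor, fbc_tested)      # about 45% of tested FBC units
pc_selected = share_pct(pc_in_floor, pc_tested)         # about 9% of tested PC units
fbc_share_of_pool = share_pct(fbc_in_floor, floor_pool) # 35% of the floor pool

# The 1.2 lb/tbtu threshold used for Figure 2-3, expressed in lb/mmbtu
threshold_lb_per_tbtu = lb_per_mmbtu_to_lb_per_tbtu(1.2e-6)
```

Comparing the 35% share of the floor pool with the roughly 6% share of FBC units in the U.S. coal fleet is what supports the over-representation argument.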

Figure 2-3 Cumulative Frequency Distributions of Mercury Emissions for PC and FBC Units (mercury emissions in lb/mmbtu versus data percentile; series: PC, PC Lignite, FBC, FBC Lignite)

2.8 The number of EGUs that represent the Top 12% best performing units for the mercury MACT floor pool for existing coal-fired EGUs should be 127 rather than 40

EPRI's review of the Part II and III mercury emissions, as identified in the revised coal mercury MACT spreadsheet, dated May 18, 2011, indicates that the Part III units generally represent the best performing units for mercury emissions. For that reason, it makes sense to use the 127 best performing units, rather than 40, to represent the Top 12% best performing units. In the proposed rule, EPA concluded that the Part II mercury data had significantly lower concentrations than the Part III mercury data, and therefore that the best performing units were not those selected for metals and particulate stack sampling in Part III of the ICR. However, this conclusion was based on a Part II data set containing measurement unit conversion errors that produced emissions in lb/mmbtu that were 1,000 times too low. These errors were corrected by EPA in a revised MACT floor spreadsheet and memorandum dated May 18. After those errors are corrected, most of the Part II tests are no longer among the lowest 40 EGUs in the combined data set. The Part III data set contains most of the best performing units.

Thus, the coal mercury MACT floor data set should follow EPA's procedure for the MACT floor data sets for HCl and TPM. The overall industry pool consists of 1086 EGUs. After subtracting the 32 lignite EGUs (specifically, those burning coals with less than 8,300 Btu/lb) to obtain 1054 non-lignite EGUs, 12% of 1054 yields 127 EGUs, which would be the appropriate number of EGUs to represent the Top 12% for the coal mercury MACT floor pool for EGUs burning coals with greater than 8,300 Btu/lb.

Part II and III ICR data comparison

Table 2-5 lists the number of mercury emission values available for analysis in the May 18, 2011 MACT floor spreadsheet. In total, test results for 339 EGUs were reported for the Part II and III ICR. EPA assigned emissions results from tested EGUs to untested units at the same facility, resulting in emission values for 403 EGUs. There are approximately equal numbers of Part II (195) and Part III (208) EGUs included in the 403.

Table 2-5 Mercury Data Sources
Rows: Part II Data; Part III Method 29; Part III CMM; Part III Method 30B
CMM = continuous mercury monitor

The available mercury data can be grouped into four categories. Part II data are from tests conducted in the 5 years prior to the Part III ICR request and submitted to EPA in response to the ICR questionnaire. They represent tests performed with various test methods and for many reasons, including compliance testing and control technology evaluation. Part III results are from tests performed in 2010 in response to the ICR, and include measurements using three test methods: Method 29, Method 30B, and continuous mercury monitors (CMMs). Part III data were reported to EPA on a consistent lb/mmbtu basis using standardized EPA templates and reporting tools. Part II data were reported to EPA in the measurement units used in the utilities' test reports; thus, EPA had to convert many results to lb/mmbtu and lb/mwh.
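The pool-size arithmetic given for the non-lignite floor (1086 EGUs less 32 lignite EGUs, then 12%) can be written out as a short sketch. Since 12% of 1054 is 126.48, the stated result of 127 implies rounding up to a whole unit. That rounding rule is inferred from the numbers in the text and is an assumption here, as is the function name.

```python
import math

def floor_pool_size(total_units, excluded_units, fraction=0.12):
    """Return (eligible units, units in the top `fraction` of the eligible pool).

    Assumption: a fractional unit count is rounded up to the next whole unit.
    """
    eligible = total_units - excluded_units
    return eligible, math.ceil(eligible * fraction)

# Counts as stated in the text: 1086 EGUs overall, 32 of them lignite-fired
eligible, top_12_percent = floor_pool_size(1086, 32)
```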
Errors occurred in EPA's conversion of the Part II data for many EGUs, with the most significant being a 1000-fold (low) conversion error for measurements reported to EPA in units of pounds per gigawatt-hour (lb/gwh). Upon receiving notification of those errors, EPA corrected them and issued a revised spreadsheet and MACT floor memorandum on May 18. As discussed in Appendix A, numerous other errors remain in the ICR data pool, some affecting mercury emissions. We have not attempted to correct those errors in EPRI's evaluation, although, as noted earlier in our comments, EPRI recommends that EPA address all of them. Figure 2-4 compares the distribution of mercury emissions for the lowest emitting 40 EGUs identified by EPA in the MACT floor spreadsheet that supported the original proposed rule of May 5, 2011 with a similar distribution for the lowest emitting 40 EGUs in the revised MACT floor spreadsheet issued on May 18. About half of the data sets used by EPA to develop

the original proposed rule were affected by conversion errors. Using the corrected mercury data set, there are only 8 Part II data sets among the 40 best performing units for mercury.

Figure 2-4 Comparison of Mercury Emissions from Original and Revised Lowest Emitting 40 EGUs (mercury emissions in lb/mmbtu versus data percentile; series: Revised Coal MACT Spreadsheet - May 18, 2011, and Original Coal MACT Spreadsheet - March 16; based on the minimum test series average for each EGU)

Because these errors were not identified prior to the original proposed rule of May 5, many Part II test series appeared to be over an order of magnitude lower than the best-performing Part III test series. EPA therefore concluded that many of the best-performing units were from the Part II data, and used this assumption to support the rationale that only 12% of the ICR data should be analyzed for the MACT floor, contrary to the methodology employed for TPM and acid gases. When the 1000x conversion errors are corrected, there is no indication that the Part II data have lower mercury emissions than the Part III data. To the contrary, the average of the Part II data is 3.6 lb/tbtu, as compared with 2.2 lb/tbtu for the Part III best-performing units. Figure 2-5 compares the mercury emissions distributions of the Part II units, the Part III best-performing units, and the 50 units selected by EPA at random from the entire U.S. coal-fired fleet. The distribution of the Part II emissions is very similar to that of the random 50 EGUs.
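To illustrate how a 1000-fold error can arise when converting output-based results, the sketch below converts a value in lb/gwh to lb/mmbtu using a unit heat rate. The document does not describe the exact mistake in EPA's spreadsheet, so the "erroneous" path shown here (dividing by an extra factor of 1,000) is purely hypothetical. The heat rate of 10,000 Btu/kWh and the sample emission value are likewise assumed, illustrative numbers.

```python
def lb_per_gwh_to_lb_per_mmbtu(lb_per_gwh, heat_rate_btu_per_kwh):
    """Convert an output-based rate (lb/gwh) to an input-based rate (lb/mmbtu)."""
    lb_per_mwh = lb_per_gwh / 1000.0                       # 1 GWh = 1,000 MWh
    mmbtu_per_mwh = heat_rate_btu_per_kwh * 1000.0 / 1.0e6 # Btu/kWh -> mmbtu/MWh
    return lb_per_mwh / mmbtu_per_mwh

ASSUMED_HEAT_RATE = 10_000.0   # Btu/kWh, illustrative value only
sample_lb_per_gwh = 0.02       # illustrative value, not an ICR measurement

correct = lb_per_gwh_to_lb_per_mmbtu(sample_lb_per_gwh, ASSUMED_HEAT_RATE)
# Hypothetical mistake: applying the GWh-to-MWh factor twice
erroneous = correct / 1000.0
```

Under these assumptions the correct result is about 2E-6 lb/mmbtu (2 lb/tbtu), while the hypothetical erroneous value is about 2E-9 lb/mmbtu, which would place an ordinary unit among the apparent best performers.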

Figure 2-5 Mercury Emission Distribution by ICR Source (mercury emissions in lb/mmbtu versus data percentile; series: Part III Random 50, Part II (All Units), Part III (Excluding Random 50); based on the minimum test series average for each EGU, after correction of the erroneous Part II data)

It is clear that lower mercury emissions occur more frequently for Part III EGUs than for Part II EGUs. For the 127 lowest emitting test series, about 40% are from the Part II data and about 60% from the Part III data. Hence, the Part III EGUs selected by EPA are generally the best performing units for mercury emissions. This is not surprising, since the Part II tests are predominantly from research projects conducted in anticipation of the Clean Air Mercury Rule (CAMR) at sites with higher mercury emissions. EPA's selection rationale for Part III metals targeted units with the newest particulate control devices, which are generally newer units with fabric filters and FGD systems. Some of the selected units also have permit limits for mercury and even have mercury controls (i.e., activated carbon injection).

Comparison with national inventory estimates

To evaluate the appropriateness of using a mercury floor pool of 127 EGUs versus 40, two national inventory assessments were reviewed and compared to the Part II and III data sets. EPRI assessed mercury emissions for all coal-fired units using 2007 plant configurations [EPRI, 2009]. For that effort, EPRI estimated fuel mercury (and chloride) concentrations from EPA's 1999 ICR coal analysis database [EPRI, 2000]. Correlations from the EPRI Emission Factors Handbook [EPRI, 2002], modified to include newer sources of data, were used to estimate mercury emissions corresponding to the 1999 ICR fuel mercury inputs. Fuel consumption per EGU was

obtained from U.S. Department of Energy (DOE) records. Figure 2-6 shows the distribution of coal mercury and EPRI's estimate of mercury emission values from about 1,100 boilers. EPA also has produced a national mercury emission estimate that is available at Note: EPA did not include a reference or base year for its study, nor did it include coal mercury content. EPA's results were presented in tons of mercury emissions per year; EPRI converted the results to emission factors in lb/mmbtu for comparison with the other two inventories. As a point of comparison, EPRI [2000] estimated total U.S. mercury emissions from the power industry of about 47 tons per year (tpy) from coal analyses submitted to EPA in response to the 1999 CAMR ICR. The EPRI evaluation based on 2007 coal usage estimated 44 tpy, while the recent EPA estimate using the 2010 ICR data is 30 tpy, reflecting decreasing mercury emissions due to the increased use of SCR systems, FGD systems, and better particulate control devices. These estimates are based on extrapolations of test data from numerous sites, using fuel concentrations and control device performance. Note that for this large a number of facilities (900 to 1,100 depending on the data source), the 12th percentile is an emission factor of about 1E-6 lb/mmbtu.

Figure 2-6 Distribution of Coal Mercury and Emission Factor Estimates of Mercury Emissions for the U.S. Electric Power Industry (mercury emissions in lb/mmbtu versus data percentile, with 12% of total units marked; series: 1999 ICR Coal Mercury Concentrations, EPRI Emissions Estimate (2007 basis), EPA National Emissions Estimate (2010 basis))

Figure 2-7 shows the cumulative frequency distribution of the minimum average test series from the 339 unique EGUs for which Part II and III tests were reported, after correction of the

measurement unit conversion errors. The mercury emission of the 40th lowest unit is 5E-8 lb/mmbtu, which is at the very lowest point of EPA's national estimate. In contrast, the mercury emission of the 127th lowest unit is about 1E-6 lb/mmbtu, which corresponds very closely to both the EPA and EPRI assessments of emissions across the entire industry.

Figure 2-7 Distribution of Mercury Emissions from the Part II and III ICR Data (mercury emissions in lb/mmbtu for coal-fired EGUs numbered from lowest to highest mercury emissions, with the lowest 40 and lowest 127 EGUs marked; based on the minimum test series average for each EGU, after correction of the erroneous Part II data)

Summary

EPA chose to use the lowest 40 units to calculate the MACT floor based, in part, on an erroneous conclusion that the Part II data were significantly lower than the Part III data. The Part III data are generally lower than the corrected Part II data and are thus likely to represent the best performing units for mercury. Thus, 127 EGUs is the appropriate number of EGUs for setting the MACT floor for mercury.

References

EPRI, An Assessment of Mercury Emissions from U.S. Coal-Fired Power Plants. Palo Alto, CA:
EPRI, Emission Factors Handbook. Palo Alto, CA:

EPRI, Updated Hazardous Air Pollutants (HAPs) Emissions Estimates and Inhalation Human Health Risk Assessment for U.S. Coal-Fired Electric Generating Units. Palo Alto, CA:

Based on limited available data, it is unclear whether plants firing medium- to high-chloride coals can achieve the HCl standard using dry sorbent injection

The proposed MACT rule states that plants can control acid gases, as represented by HCl, by injecting a sodium-based powder (Trona or sodium bicarbonate) into the flue gas duct of a power plant, upstream of particulate control. The powder adsorbs the acid gases and the resulting product is then removed from the flue gas in the fabric filter or ESP. The following presents EPRI's observations and experience with this approach to reducing HCl at a site burning Powder River Basin (PRB) coal and another site burning an eastern bituminous coal. Both of these studies were short-term in nature, with test periods of several weeks in which a number of different parameters were evaluated each day. Thus, these results cannot provide any insight into the long-term performance and impacts of dry sorbent injection. EPRI is not aware of other demonstrations of this approach that have been conducted by independent third parties. However, because they are cited in the technical supporting documents for the proposed MACT rule, we also discuss tests conducted by a sodium sorbent supplier.

Coal chloride levels

Coal chloride concentrations vary by coal rank, with western bituminous and subbituminous coals generally having lower levels of chloride compared to bituminous and lignite coals. The fuel chloride data reported in the Part III ICR fuel analyses were reviewed to determine the range of chloride concentrations by coal type. These data, expressed on a lb/mmbtu basis in the fuel, are shown in Figure 2-8 for the 319 individual units that reported coal chloride data in the Part III ICR.
Sites reporting that they fired a blend of different coal ranks were excluded from these analyses. Figure 2-8 shows that bituminous, western bituminous, subbituminous, lignite, and waste coal fuels have distinctly different levels of chloride. Table 2-6 provides additional statistics for coal chloride by coal type. Each coal type is described in more detail below.

Figure 2-8 Part III ICR Coal Chloride Data (based on site average) (coal chloride in lb/mmbtu versus data percentile, with the MACT floor for existing coal units marked; series: Bituminous, Waste Coal, Lignite, Subbituminous, Western Bituminous)

Table 2-6 Coal Chloride Concentrations by Coal Rank for Part III ICR Coal Data
Bituminous Western Bituminous Lignite Subbituminous Waste Coal Count Mean (lb/mmbtu) Standard Deviation (lb/mmbtu) * 0.014* *
*These values are likely to be high due to coal chlorine analysis method limitations for some analytical methods used in the ICR.

Western coals (PRB subbituminous and western bituminous)

Subbituminous and western bituminous coals have the lowest measured mean chloride levels: lb/mmbtu and lb/mmbtu, respectively. Figure 2-8 shows that approximately 50% of the reported subbituminous coal chloride concentrations were less than the proposed lb/mmbtu HCl MACT floor value for existing units. The maximum coal chloride

concentration in the Part III data set for subbituminous coal was 0.02 lb/mmbtu. Thus, overall reductions of approximately 90% could be required for some subbituminous coals to achieve the proposed HCl MACT floor limit for existing coal units. Approximately 15% of subbituminous coals could require more than 80% reduction to meet the MACT floor (i.e., 0.01 lb/mmbtu inlet to lb/mmbtu outlet). All but three of the units firing western bituminous coals reported coal chloride levels below the proposed lb/mmbtu HCl stack limit. At the highest coal chloride concentration reported (0.018 lb/mmbtu), approximately 89% reduction would be required for those coals. EPRI participated as a technical advisor in the first commercial demonstration of the TOXECON process at We Energies' Presque Isle Power Plant. The project was co-funded by DOE under the Clean Coal Power Initiative. The primary objective of the project was to demonstrate the TOXECON configuration downstream of a hot-side electrostatic precipitator (ESP), in this case with activated carbon injection (ACI) for mercury control. A secondary goal was to determine the ability of powdered Trona injection to reduce SO2 by 70% and also to realize some NOx emission reductions. These pollutant reductions were to be obtained while simultaneously injecting activated carbon for mercury control [ADA-ES, 2008]. The tests showed that the Trona could reduce SO2, but that it also converted enough nitrogen oxide (NO) to nitrogen dioxide (NO2) to produce a visible brown plume. Trona injection also reduced mercury capture by ACI significantly. Without Trona injection, the Presque Isle unit was able to maintain 90% mercury reduction at ACI rates of 1 to 2 pounds per million actual cubic feet (lb/mmacf). During Trona testing, the unit was unable to achieve 90% mercury removal over a short 1-hour test, even when the ACI rate was increased to 4.6 lb/mmacf.
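The required HCl reductions quoted for western coals follow from a simple ratio of the proposed limit to the coal chloride level, assuming all fuel chloride is emitted as HCl absent controls. In the sketch below, the limit of 0.002 lb/mmbtu is an assumed value: the figure is missing from this copy of the text, and 0.002 was chosen only because it reproduces the stated pairs (0.02 lb/mmbtu requiring about 90% and 0.018 lb/mmbtu requiring about 89%). The function and constant names are hypothetical.

```python
# Assumed limit in lb/mmbtu, back-calculated from the reductions quoted in the text
ASSUMED_HCL_LIMIT = 0.002

def required_reduction_pct(inlet_lb_per_mmbtu, limit=ASSUMED_HCL_LIMIT):
    """Percent removal needed to bring an inlet chloride level down to the limit.

    Assumption: all coal chloride would otherwise be emitted as HCl.
    """
    if inlet_lb_per_mmbtu <= limit:
        return 0.0   # already at or below the limit; no removal needed
    return 100.0 * (1.0 - limit / inlet_lb_per_mmbtu)

max_subbituminous = required_reduction_pct(0.02)       # about 90%
max_western_bituminous = required_reduction_pct(0.018) # about 89%
ten_thousandths_inlet = required_reduction_pct(0.01)   # about 80%
```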
As these tests were conducted before EPA made known its intent to regulate acid gases from EGUs, the project did not measure HCl emissions, and especially not as a function of sorbent injection rate. Therefore, these tests do not inform us about the amount of Trona or sodium bicarbonate that would be needed to meet the proposed HCl limit, nor whether that amount would cause a brown plume and/or plugging or would impair mercury capture by ACI. Tests by Solvay Chemicals (a major supplier of Trona and sodium bicarbonate) at a low-sulfur Eastern bituminous-fired power plant (reporting ~0.5 to 1% sulfur in the coal and ppm HCl in the flue gas) may provide some clues. In this one test campaign, Solvay Chemicals reported 50% HCl removal at a sodium bicarbonate injection rate sufficient to capture 10% of the SO2, i.e., a relatively low injection rate [Davidson, 2010]. Since brown plumes have been observed at plants using sodium sorbents for SO2 control, but not at plants using these sorbents for SO3 control, it is possible that low sodium injection levels may not produce brown plumes or plugging; this may apply for western bituminous coals with coal chloride concentrations slightly greater than the proposed limit for HCl. However, this may not be possible for subbituminous coals with chloride concentrations greater than about twice the proposed limit, where HCl removals up to 90% may be required. The situation for eastern bituminous coals, with their higher chlorine content, is discussed below. To go beyond a reliance on engineering judgment, it is important to determine, via independent tests fully open to public scrutiny, whether these single results obtained on an eastern bituminous coal are representative of results for the fleet of PRB-fired power plants.
Only then will the public and the industry truly understand the ability of sodium sorbent injection combined with a fabric filter to reduce HCl emissions to the proposed limit at plants burning western coals, and to do so without impacting plume opacity and/or mercury emission reductions.

44 2.9.3 Eastern bituminous coal The mean chloride concentration in bituminous coals is lb/mmbtu, which is significantly higher than that of subbituminous coal. Approximately 1% of the reported average bituminous coal values are lower than the lb/mmbtu proposed MACT limit. Roughly 95% of bituminous coal-fired power plants would require greater than 80% chloride reduction to meet a lb/mmbtu limit (0.01 lb/mmbtu inlet to lb/mmbtu outlet) and 25% would require greater than 98% reduction. At the highest reported coal chloride level for bituminous coals of 0.25 lb/mmbtu, greater than 99.2% reduction would be required. In , EPRI conducted tests of both sodium bicarbonate and hydrated lime dry sorbent injection (DSI) upstream of a pilot baghouse at Hudson Generating Station owned by Public Service Electric and Gas [EPRI, 1998]. This plant burned a coal that typically contained less than 1.0% sulfur and less than 0.1% chlorine; the corresponding flue gas HCl concentration at the pilot s inlet duct was ppm. The results of these tests, presented in Figure 2-9, and show that HCl reductions up to 90% could be achieved, but only at sorbent-to-hcl injection ratios of about For a 500 MWe plant with 100 ppm HCl, an NSR of 12 is equivalent to increasing the ash loading by 50 60% or by tons/day (normal ash loading is about 500 tons/day). The Hudson site, with an HCl concentration in the mid-range for eastern bituminous coals, would have required 97 98% HCl removal to achieve the proposed MACT limit (equivalent to ~ 2 ppm). Plots of pollutant removal versus sorbent injection rate typically become very flat at high injection rates i.e., increase very slowly if at all with increasing sorbent injection so it is not clear from the available data if hydrated lime or sodium bicarbonate could routinely provide greater than 90% reductions at any injection rate. 
1 Calculated as the ratio of moles of sodium or calcium injected per mole of HCl relative to that required for complete reaction of the alkali with the HCl; this ratio is known as the normal stoichiometric ratio (NSR).
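The footnote's definition of NSR can be expressed as a short calculation. The sketch below assumes the sorbent is counted as moles of sodium or calcium and uses the textbook neutralization reactions for sodium bicarbonate and hydrated lime to set the stoichiometric requirement. The function and dictionary names are hypothetical, and the report does not state which reactions EPRI used.

```python
# Moles of alkali (as Na or Ca) required per mole of HCl for complete reaction
STOICH_MOL_ALKALI_PER_MOL_HCL = {
    "sodium": 1.0,   # NaHCO3 + HCl -> NaCl + H2O + CO2
    "calcium": 0.5,  # Ca(OH)2 + 2 HCl -> CaCl2 + 2 H2O
}

def normal_stoichiometric_ratio(mol_alkali, mol_hcl, alkali):
    """NSR = (moles alkali injected per mole HCl) / (moles required per mole HCl)."""
    injected_ratio = mol_alkali / mol_hcl
    return injected_ratio / STOICH_MOL_ALKALI_PER_MOL_HCL[alkali]

# Illustrative inputs: 12 mol Na or 6 mol Ca per mol HCl both correspond to NSR = 12
nsr_sodium = normal_stoichiometric_ratio(12.0, 1.0, "sodium")
nsr_calcium = normal_stoichiometric_ratio(6.0, 1.0, "calcium")
```

An NSR of 1 would mean exactly enough alkali to react with all of the HCl, so an NSR of 12, as cited for the 500 MWe example, implies a twelvefold excess of sorbent.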

Figure 2-9 HCl Control as a Function of Normal Stoichiometric Ratio (NSR)

Other balance of plant concerns with dry sorbent injection (DSI)

Medium-to-high levels of SO3 in the flue gas degrade the mercury removal effectiveness of ACI. While alkali injection may counter this effect by reducing SO3 concentrations, too much sodium-based alkali can potentially offset this benefit. As discussed above for the Presque Isle TOXECON demonstration, Trona injection can cause a visible brown plume and degrade mercury removal by ACI. The appearance of a brown plume from Trona injection is due to the formation of NO2 (a reddish brown gas). The presence of NO2 in the stack plume above a certain threshold concentration (depending on stack diameter; the larger the stack diameter, the more visible the plume) can tinge the plume reddish brown, increasing its opacity and visibility. This phenomenon was also observed in western subbituminous coal-fired units without SCR in quite a few tests of DSI in Colorado in the 1980s, including some fairly long-term (55 day) tests [EPRI, 1986] at City of Colorado Springs Nixon Power station. While hydrated lime injection would provide the benefits of removing SO3 without the adverse impacts of Trona, it is unknown whether hydrated lime could meet the proposed limit at any injection rate. It is also unknown whether hydrated lime would increase pressure drop across the baghouse to unacceptable levels in the long term if injected at rates needed to reduce HCl emissions to the proposed limit. Based on the one test campaign described in [Davidson, 2010], a sodium sorbent injection rate high enough to provide ~70% SO2 removal was needed to realize

90% HCl capture. These SO₂ removal rates are similar to those obtained at the Presque Isle demonstration where the brown plume was observed, so other plants may experience the same consequence if sodium sorbent injection is used to obtain high HCl removals. However, we have few data on the impact of Trona injection in bituminous coal applications when it is used for SO₂ and HCl capture.

EPRI understands that some vendors have announced the realization of high sulfur oxides (SOx) (and therefore high co-benefit HCl) removal rates by grinding or milling the sodium sorbent just before introducing it into the feed system, and then injecting this finer material upstream of the air preheater [Day, 2007]. The combined effect of fine grind and injection upstream of the air preheater is believed to improve mixing, residence time, and sorption reaction kinetics, potentially providing greater SOx and HCl reductions. However, this method has led to air preheater pluggage by sodium bisulfate deposits at plants with medium-high SO₃ concentrations in the flue gas, related to medium-high sulfur content in the coal and/or the presence of an SCR system [Campbell, 2008]. At a glass factory, sodium sorbent injection at 700°F upstream of the hot-side ESP has also caused deposit buildup on the ESP inlet's perforated plate [Kong et al., 2008].

A Solvay Chemical field test program different from the one cited under the PRB discussion reported 97+% HCl removal with either Trona or sodium bicarbonate at a site burning (typically low-sulfur) central Appalachian and Colombian coals. The high HCl removals were achieved when injecting enough sorbent to remove 80 to 90% of the SO₂ [Kong et al., 2008]. The site was atypical in that it was equipped with both a hot- and a cold-side ESP, and sorbent was injected upstream of the hot-side ESP.
Since the hot-side ESP is located upstream of the air preheater and collected most of the injected sorbent, it might have protected the air preheater from pluggage. Solvay Chemical also reported no increase in opacity (i.e., no brown plume caused by NO 2 emissions) at this site. It is unclear if this was due to the injection and removal of the sorbent at high temperature, as well as the short contact time between the sorbent and the flue gas. Most power plants are not equipped with hot side ESPs. Instead, sorbent is normally collected in a baghouse or cold-side ESP located downstream of the air preheater, where temperatures are much lower and contact times between the sorbent and the flue gas are normally much longer. A side effect of sodium injection is its impact on the fly ash/sorbent mixture collected by the particulate control device [EPRI, 2010]. Sodium compounds in the collected mixture are highly soluble. EPRI tests have shown that % of the sodium sulfates that would be formed by the injection of Trona or sodium bicarbonate leach out of the fly ash/sorbent mixture on contact with water. These EPRI tests have also shown that the presence of soluble sodium compounds can increase the mobility of trace constituents such as arsenic, chromium, selenium, and vanadium. In fly ash samples from power plants using sodium sorbent for SO 3 reduction, alkali levels (measured as sodium oxide [Na 2 O]) were below the optional maximum of 1.5% for using the ash as a partial cement replacement in concrete. However, for samples from power plants using sodium sorbent for SO 2 control, the alkali levels were %, well above the optional maximum. In the cases where sodium compounds are used for HCl control at plants with high HCl levels (100 ppm or more), the projected increase in ash sodium content could be greater than 30%. 2-30

47 2.9.5 Summary As full-scale application of dry sorbent injection for HCl capture is extremely limited, further investigations are needed over a range of power plant designs, coal types, and air pollution controls to fully evaluate its HCl (and SO 2 ) removal performance and impacts on balance-ofplant operations. Based on EPRI s limited available short-term data, it is unclear whether power plants firing medium- and high-chloride coals would be able to meet the HCl (or SO 2 ) standard using dry sorbent injection. There are also potential side effects of dry sorbent injection including reduced mercury capture efficiency with activated carbon, solid waste management issues, and the possible formation of a brown plume which require further investigations to fully characterize References ADA-ES, TOXECON Retrofit for Multi-Pollutant Control on Three 90-MW Coal- Fired Boilers; Topical Report: Performance and Economic Assessment of Trona-Based SO 2 /NOx Removal at the Presque Isle Power Plant. Topical Report prepared for DOE-NETL and We Energies by ADA Environmental Solutions, Inc., August 25. Campbell, T., Mercury Control with Activated Carbon: Results from Plants with High SO3, Presented at the Power Plant Air Pollutant Control Mega Symposium, Baltimore, MD (August 2008). ADA Environmental Solutions, Inc. Davidson, H., Dry Sorbent Injection for Multi-pollutant Control Case Study, Presented at the CIBO Industrial Emissions Control Technology VIII Conference, Portland, ME (August 2010). Solvay Chemicals Inc. Day, K., Duct Injection for Controlling SO 3, SO 2, CO 2, Hg & NOx, Presented at the Electric Power Conference, Rosemont, IL (May 2007). O Brien & Gere. EPRI, Full Scale Demonstrations of Dry Sodium Injection Flue Gas Desulfurization at City of Colorado Springs Ray D. Nixon Power Plant. 
Proceedings: 1986 Joint Symposium on Dry SO 2 and Simultaneous SO 2 /NOx Control Technologies, Raleigh, NC, June, EPRI, Assessing Air Pollution Control Options at the Hudson Station of Public Service Electric and Gas. Palo Alto, CA: TR EPRI, Impacts of Sodium-based Reagents on Coal Combustion Product Characteristics and Performance. Palo Alto, CA: Kong, Y., J.M. de La Hoz, M. Wood, M. Atwell, and T. Lindsay, Dry Sorbent Injection of Sodium Bicarbonate for SO 2 Mitigation, Presented at Power-Gen International 2008, Orlando, FL (December 2008). Solvay Chemicals Inc. and Ullman Associates Inc Approximately 12% of the 439 coal-fired facilities listed in EPA s PART II and III ICR database are potential area sources Under the currently proposed utility National Emission Standards for Hazardous Air Pollutants (NESHAP) rule, EPA has elected not to distinguish between major sources and area sources. Under the Clean Air Act guidelines, major sources are defined as facilities that emit more than 2-31

10 tpy of any individual HAP compound, or more than 25 tpy of all HAP compounds in aggregate. Facilities that do not qualify as major sources are classified as area sources.

EPRI reviewed the emissions from each EGU included in the 2010 Part II and III ICR to determine which coal-fired facilities potentially could be classified as area sources. These determinations were based on the maximum potential to emit, taking into account actual annual heat input values developed as part of a previous EPRI emission modeling project [EPRI, 2009] and using actual capacity factor data from the Edison Electric Institute (EEI) [EEI, 2011]. The final list of potential area source coal-fired facilities is provided in Appendix C. Of the approximately 439 coal-fired facilities listed in EPA's Part II and III ICR database, 51 (approximately 12%) are potential area sources. Note that this list includes only those facilities with emissions data reported in response to EPA's 2010 ICR. Additional facilities that were not tested in either the Part II or III ICR could also potentially be classified as area sources; however, EPRI does not have the necessary emissions data to make this determination. Such additional sources would likely include stations with smaller total megawatt (MW) capacity that fire coals with low chlorine and fluorine content, such as PRB subbituminous or western bituminous coals.

Hydrogen chloride and hydrogen fluoride are the HAPs emitted in the largest quantities; these are the compounds that typically cause coal-fired EGUs to exceed the 10 and 25 tpy major source criteria. Therefore, EPRI's analysis focused on using measured emissions data for HCl and HF from units tested as part of the 2010 ICR to calculate maximum annual emissions for each ICR test unit. Since the 10 and 25 tpy criteria for major sources apply at the facility level rather than the unit level, EPRI summed emissions across all units at a facility to derive an annual facility total.
If emissions were not available for all units, ICR test data for similarly configured sister units were averaged and used to estimate emissions for units not tested. The reported lb/mmbtu emission factors for HCl and hydrogen fluoride (HF) were then used in conjunction with 2007 actual annual heat input data per unit (i.e., trillion Btu heat input for 2007) from a previous EPRI project and corresponding EEI capacity factor data from 2007 to estimate a maximum annual emission rate per unit. Facilities with total HCl and HF emissions less than 10 tpy of each species and less than 25 tpy in aggregate were classified as potential area sources.

At low environmentally relevant levels of HCl and HF, acute health effects (including those used to derive values for risk assessment) include irritation of the nose, eyes, and throat. Low level chronic exposures may include cellular damage in the respiratory system (HCl and HF) or skeletal abnormalities (HF) [CalEPA, 2003].

Emissions of HAPs metals, although individually much smaller than HCl and HF emissions, can contribute a few tons per year in aggregate at the largest MW capacity facilities. Therefore, as a final check, facilities that were identified as potential area sources and were also large MW capacity stations with HCl plus HF emissions greater than or equal to 10 tpy were evaluated further. EPRI investigated whether adding total HAPs metal emissions from the ICR database (where available) to the acid gas emissions would put any of the facilities over the 25 tpy limit. None of the facilities that qualified as potential area sources based on their HCl and HF emissions exceeded the 25 tpy limit when metals were included.
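The screening steps described above (per-unit emissions from an emission factor and heat input, summed to a facility total and compared with the 10 and 25 tpy criteria) can be sketched as follows. This is an illustration under stated assumptions, not EPRI's actual procedure: the dictionary keys and function names are hypothetical, the capacity factor adjustment is omitted because the text does not say how it enters the calculation, and a value exactly at a threshold is treated as major to match the "less than" wording.

```python
LB_PER_TON = 2000.0
SINGLE_HAP_LIMIT_TPY = 10.0  # major-source criterion for any one HAP
TOTAL_HAP_LIMIT_TPY = 25.0   # major-source criterion for all HAPs combined


def unit_tons_per_year(ef_lb_per_mmbtu, heat_input_trillion_btu):
    """Annual emissions in tons from an emission factor and annual heat input."""
    heat_input_mmbtu = heat_input_trillion_btu * 1.0e6  # 1 trillion Btu = 1e6 MMBtu
    return ef_lb_per_mmbtu * heat_input_mmbtu / LB_PER_TON


def classify_facility(units, metals_tpy=0.0):
    """Sum HCl and HF over all units at a facility and apply the 10/25 tpy screen.

    units: list of dicts with hypothetical keys 'hcl_ef' and 'hf_ef' (lb/mmbtu)
           and 'heat_input_tbtu' (trillion Btu per year).
    metals_tpy: optional facility total of HAPs metals, for the final check.
    """
    hcl = sum(unit_tons_per_year(u["hcl_ef"], u["heat_input_tbtu"]) for u in units)
    hf = sum(unit_tons_per_year(u["hf_ef"], u["heat_input_tbtu"]) for u in units)
    total = hcl + hf + metals_tpy
    # ASSUMPTION: values exactly at a threshold are treated as major.
    is_major = (
        hcl >= SINGLE_HAP_LIMIT_TPY
        or hf >= SINGLE_HAP_LIMIT_TPY
        or total >= TOTAL_HAP_LIMIT_TPY
    )
    label = "major source" if is_major else "potential area source"
    return label, hcl, hf, total


if __name__ == "__main__":
    # Hypothetical two-unit facility, for illustration only.
    facility = [
        {"hcl_ef": 0.0003, "hf_ef": 0.0001, "heat_input_tbtu": 12.0},
        {"hcl_ef": 0.0004, "hf_ef": 0.0001, "heat_input_tbtu": 10.0},
    ]
    print(classify_facility(facility))
```

Passing a nonzero `metals_tpy` mirrors the final check in the text, where aggregate metals are added to the acid gas total before comparing against 25 tpy.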

49 References CalEPA, California Environmental Protection Agency. Air Toxics Hot Spots Program Guidance Manual for Preparation of Health Risk Assessments, California EPA Office of Environmental Health Assessment. August. EEI, Edison Electric Institute (EEI) unit-level capacity factor data for 2007 through 2009, accessed EPRI, Updated Hazardous Air Pollutants (HAPs) Emissions Estimates and Inhalation Human Risk Assessment for U.S. Coal-Fired Electric Generating Units. Palo Alto, CA: Additional measurements are needed to adequately characterize the variability of trace metals with normal plant operations EPRI [2011] recently conducted a study to assess the long-term variability of trace metals emissions from a coal-fired boiler and to determine if metals emissions could be correlated to PM emissions. The study was conducted at a coal-fired electric utility boiler burning a blend of eastern bituminous and PRB coals, and equipped with SCR for NOx control, an ESP for PM control, and a wet FGD to control SO 2 emissions. This power plant was selected because an ongoing PM CEM study is being conducted at the facility. Trace metals were continuously monitored using the Cooper Environmental Services Xact 640 multimetal CEMs system. Over 712 hours (about one month) of continuous emissions data for metals and PM were collected around March The results of this study showed that the metals emissions at the stack exhaust did not vary significantly with extended averaging periods from 3 hours up to 168 hours. Selenium emissions exhibited approximately the same relative variance over the test period as FPM. This study characterized emissions for the March time frame at this one facility, and this one-month study may not fully characterize this facility s longer-term (i.e. over a whole calendar year) emissions variability. 
In addition, further investigations are necessary to characterize the variability at additional power plants firing different coal types including only eastern bituminous coal compared with a PRB blend as the trace metal emissions variability of one site likely does not represent the entire industry. During some time periods, the predicted metals concentrations correlated well with the measured metals concentrations and showed similar averages and relative variances. However, during other time periods, factors in addition to PM appeared to impact trace element emissions significantly. Some of the variance during these periods can be attributed to sampling issues (i.e., transport flow rate), but sometimes changes in the metals concentrations could not be attributed to changes in instrument operation and there was no corresponding change in the PM concentration. Earlier work by EPRI [2002] found that the trace element concentration in the coal is another independent variable impacting trace element emissions. It is likely that additional factors impact the fate and distribution of trace elements in flue gas. Further studies are needed to gain a complete understanding of the correlation between stack metals concentrations and PM concentration in flue gas. As noted earlier, the ICR database provides only a snapshot of HAPs stack emissions. Emissions variability related to power plant processes will differ for each HAP (or surrogate), 2-33

50 depending on its chemistry and fate in combustion and air pollution control unit operations. For example, particulate-phase metal emissions are believed to be affected by the coal metal concentration, the overall particulate matter collection efficiency, and factors that affect the concentration of the metals onto fine particulates, which are less efficiently captured in most ESPs. These factors are not well understood and can relate to the general volatility of each metal, as well as to other constituents in the fly ash. One significant source of process variability is the coal, since many facilities burn a variety of coals from multiple sources and mines. To properly characterize trace element emissions variability, more and longer investigations are needed at a number of additional power plants burning coals of various ranks References EPRI, Emission Factors Handbook. Palo Alto, CA: EPRI, Multi-Metals Emission Variability: Assessment of Continuous Multi-Metals Measurements from a Coal-Fired Boiler. Palo Alto, CA: Trace metal emissions differ between distillate (No. 2) and residual (No. 6) oil combustion EPRI s review of the ICR data concluded that emissions of trace metals, including nickel, differ between distillate (No. 2) and residual (No. 6) oil. Distillate and residual oil are different grades of fuel oil produced by refining crude oil. They contain different levels of ash, sulfur and trace metals, and thus would be expected to have different emission characteristics. In the proposed rule, EPA used emissions data from all liquid oil-fired units (excluding sites that co-fired natural gas) to develop the MACT standard for new and existing units. ICR data from distillate and residual oil were combined to develop proposed regulations for total metals (including mercury), as well as for HCl and HF. 
In crude oil refining, different products are produced for various uses by chemical reactions and physical separations that reflect differences in the molecular structure of the hydrocarbons involved. Distillate oil has physical properties distinctly different from those of residual oil, as shown in Table

51 Table 2-7 Comparison of Distillate and Residual Oil Properties* Parameter Distillate (No. 2) Oil Residual (No. 6) Oil Flash point, F Water and sediment, % Saybolt Universal viscosity, 100 F Sulfur, % (typical) 0.2 < 1 (low S), < 4% (high S) Ash, % (typical) < $/gal (Feb 2011 typical) $3 $2 * Source: Perry, RH and Chilton, CH, Chemical Engineers Handbook, McGraw Hill Book Company, 5th ed., p Distillate oil is more refined than residual oil, and consequently sells at a significant premium. In power generation, the consumption of residual oil dwarfs the consumption of distillate oil. All six of the ICR EGUs that burned distillate oil reported capacity factors of less than 1%, reflecting the high cost of burning this fuel. In developing the MACT standards for the liquid oil category, EPA grouped emission results from distillate and residual oil together: the seven EGUs used to calculate the MACT floor included five that burned distillate and only one that burned residual oil. Two units from Mitchell Power Station [Office of Regulatory Information Systems (ORIS) Code 3181] were incorrectly reported as firing residual oil when they in fact burned distillate oil. Figure 2-10 shows the cumulative frequency distribution of total metal emissions from the six test sites firing distillate oil versus the 40+ units firing residual oil. The best-performing 12% of the residual oil-fired EGUs have emissions that are nearly ten times higher than those of the five distillate oil-fired units. This trend is directly related to fuel metals content, as none of the distillate oil EGUs employs particulate emission controls. In contrast, about one-third of the residual oil sites have some form of PM control. Figure 2-11 shows that emissions of nickel, the major metallic element present in fuel oils, are also far lower from distillate oil-fired EGUs than from residual oil-fired EGUs. 
EPRI s evaluation indicated no significant differences between distillate and residual oil in the emission of acid gases (HCl and HF). 2-35

Figure 2-10 Total Metal Emissions from Distillate and Residual Oil-fired Units

Figure 2-11 Nickel Emissions from Distillate and Residual Oil-fired Units (nickel emissions in lb/mmbtu, log scale, versus data percentile; series: residual oil-fired units and distillate oil-fired units)

2.13 Particulate-phase metals correlate with filterable PM (as well as total PM) emissions for oil-fired units

EPRI evaluated the relationship between HAPs trace metal emissions and particulate matter (TPM and FPM) emissions to determine whether particulate parameters could be used as surrogates for metal emissions. Figure 2-12 shows average nickel emissions versus average FPM emissions for 41 EGUs for which both sets of measurements were available. A power function fit of the data shows a regression coefficient (r²) of 0.61 (Table 2-8); the correlation is statistically significant at a greater than 95% confidence level.

Figure 2-12 Nickel Emissions as a Function of FPM Emissions for Oil-fired EGUs

Table 2-8 shows the results of correlation analyses for the other HAPs trace metals. All HAPs metals measured in the ICR have a higher correlation with FPM than with TPM. Mercury, a volatile metal in combustion power systems, does not correlate with either FPM or TPM. For a population of 41 data pairs, a regression coefficient (r²) below 0.1 is not statistically significant at the 95% level. Based on these results, FPM is a better predictor than TPM for all non-mercury HAPs metals. The correlations of individual and total non-mercury metals with FPM for oil-fired EGUs are statistically significant at the 95% level, similar to the FPM correlations for coal-fired EGUs.
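A power function fit of the kind referred to above is commonly computed as an ordinary least-squares line through log-transformed data. The sketch below shows that approach using only the standard library. It assumes the reported r² values are computed in log space, which the text does not state, and the function name and the sample data are hypothetical.

```python
import math


def power_fit(x, y):
    """Fit y = a * x**b by least squares on log10-transformed data.

    Returns (a, b, r2), where r2 is the squared correlation in log space.
    All x and y values must be positive.
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least three paired observations")
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    syy = sum((v - my) ** 2 for v in ly)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    b = sxy / sxx                 # exponent of the power function
    a = 10 ** (my - b * mx)       # multiplicative constant
    r2 = sxy ** 2 / (sxx * syy)   # coefficient of determination in log space
    return a, b, r2


if __name__ == "__main__":
    # Hypothetical FPM and nickel emission pairs (lb/mmbtu), for illustration only.
    fpm = [0.002, 0.005, 0.010, 0.020, 0.050]
    nickel = [1.0e-5, 4.0e-5, 6.0e-5, 2.0e-4, 3.5e-4]
    a, b, r2 = power_fit(fpm, nickel)
    print(f"Ni = {a:.3g} * FPM^{b:.2f}, r2 = {r2:.2f}")
```

Fitting in log space weights relative rather than absolute deviations, which suits emissions data that span several orders of magnitude.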

Table 2-8 Correlations Between Particulate-Phase Trace Element Emissions and Particulate Emissions at Oil-fired EGUs

Method 29 | FPM: R², Number of Data Pairs | TPM: R², Number of Data Pairs
Antimony, Arsenic, Beryllium, Cadmium, Cobalt, Chromium, Lead, Nickel, Manganese, Mercury, Selenium, Total Non-Hg Metals
R² = correlation coefficient using a power regression

2.14 EPA's assumption that 65% of nickel emissions from liquid oil-fired EGUs are as carcinogenic as nickel subsulfide is overly conservative

In the risk assessment conducted to support the MACT rulemaking, EPA assumed that 65% of nickel emissions from liquid oil-fired EGUs are in the form of insoluble, crystalline species that are as carcinogenic as nickel subsulfide (Ni₃S₂) [EPA-HQ-OAR ]. This assumption is overly conservative. Recent measurements by EPRI and others found that nickel emissions from residual-oil combustion are primarily soluble nickel sulfate, with lesser amounts of nickel/magnesium oxide and nickel ferrite. No regulatory agency, including EPA, currently provides a cancer unit risk or other dose-response value for nickel sulfate or other soluble nickel compounds for use in risk assessment.

In recent measurements [Huggins et al., 2011] to evaluate the chemical form of nickel on particulates emitted from liquid oil-fired boilers, EPRI and others determined that the nickel in the PM samples was primarily nickel sulfate (NiSO₄·6H₂O), with lesser amounts of nickel/magnesium oxide [(Ni, Mg)O] and/or nickel ferrite (NiFe₂O₄). Potentially carcinogenic nickel sulfide compounds were absent, within the detection limits of the methods (± 3% of total nickel), in all 21 PM samples investigated. X-ray diffraction (XRD) and X-ray absorption fine structure spectroscopy (XAFS) were used to determine the nickel species in 21 PM samples collected at the stacks of eight residual oil-fired EGUs located in Florida, Hawaii, and New

56 York. The stacks were sampled isokinetically using a modified EPA Method 17 sampling train maintained above 290 C, well above the sulfuric acid dew point temperature. The results of this study are described in detail in a peer reviewed paper entered into the rulemaking docket. Earlier work by the University of Louisville and EPRI [EPRI, 1999] indicated that 3 26% of the total nickel emissions were composed of sulfidic forms of nickel. However, it could not be determined whether nickel subsulfide was present due to the limitations of the indirect (i.e., operationally defined) speciation method employed, sequential extraction. The lack of nickel subsulfide data and uncertainty about the carcinogenicity of the various nickel species were cited by EPA in its assumption that nickel emissions from residual oil combustion are 65% as carcinogenic as nickel subsulfide. In contrast, Huggins and coworkers [Huggins et al., 2011] used direct and definitive speciation techniques to identify the nickel species to be primarily nickel sulfate, and to a lesser extent nickel/magnesium oxide and nickel ferrite. The absence of nickel sulfides is significant because sulfidic nickel compounds are generally considered to be the most highly carcinogenic nickel compounds. To date, soluble nickel compounds, including nickel sulfate, have not demonstrated the ability to induce cancer in either animal bioassays or epidemiological studies [Goodman et al., 2009; Haber et al., 2000; NTP, 1996; TERA, 1999]. Limited evidence suggests that soluble nickel compounds may promote the carcinogenicity of insoluble nickel compounds as found in nickel refining exposure scenarios (presence of substantial nickel subsulfide). However, the California EPA Office of Environmental Health Hazard Assessment (OEHHA) derived cancer risk estimates for all nickel compounds (soluble and insoluble), but based on evidence from insoluble nickel (subsulfide, nickel refinery exposures) [CalEPA, 2009]. 
These findings indicate that EPA s assumption that the nickel compound mixture emitted from U.S. liquid oil-fired power plants is 65% as carcinogenic as nickel subsulfide is overly conservative and should be re-assessed References CalEPA, California Environmental Protection Agency. Technical Support Document for Cancer Potency Factors: Methodologies for Derivation, Listing of Available Values, and Adjustments to Allow for Early Life Stage Exposures. California EPA Office of Environmental Health Assessment. May. EPA-HQ-OAR Non-Hg Case Study Chronic Inhalation Risk Assessment for the Utility MACT Appropriate and Necessary Analysis. March 16, EPRI, Nickel Speciation Measurements at Oil-Fired Power Plants. Palo Alto, CA: TR Goodman J.E., Prueitt R.L., Dodge D.G., Thakali S., Carcinogenicity Assessment of Water-Soluble Nickel Compounds, Critical Reviews in Toxicology 39 (5), Haber L.T., Erdreich G.L., Diamond A.M., Maier R., et al., Hazard Identification and Dose-response of Inhaled Nickel Soluble Salts, Regulatory Toxicology and Pharmacology, 31,

Huggins F. E., Galbreath K.C., Eylands K.E., Van Loon L.L., et al., Determination of Nickel Species in Stack Emissions from Eight Residual Oil-Fired Utility Steam-Generating Units, Environmental Science and Technology, 45 (14), EPA-HQ-OAR

NTP, National Toxicology Program. NTP Technical Report on the Toxicology and Carcinogenesis Studies of Nickel Sulfate Hexahydrate (CAS No ) in F344/N Rats and B6C3F1 Mice (Inhalation Studies). NTP TR 454, NIH Publication No , U.S. Dept. of Health and Human Services, National Institutes of Health, Washington, D.C. July.

TERA, Toxicology Excellence for Risk Assessment. Toxicological Review of Soluble Nickel Salts. March

Fuel oil metals analysis methods are not sensitive enough to support compliance monitoring for limited-use oil EGUs

EPA is proposing that limited-use oil EGUs may use fuel analyses, rather than stack test data, to comply with the HAPs emission limits for mercury, hydrogen chloride, and non-mercury metals. Each EGU must submit a request to EPA to qualify for this alternative compliance method. However, the detection limits of the analytical methods for non-mercury metals in oil are too high to support compliance verification. In the ICR Part III database, most of the fuel oil metal concentrations are non-detect values, with detection limits ranging from 0.25 to 10 parts per million by weight (ppmw). This range is equivalent to approximately 10⁻⁶ to 10⁻⁴ lb/mmbtu, while the proposed MACT metal emission limits for existing oil-fired units range from 10⁻⁷ to 10⁻⁶ lb/mmbtu. Table 2-9 shows the number of sample runs and the number of runs below detection limit for non-mercury metals in the ICR Part III fuel database.

The analytical data summarized in Table 2-9 were obtained using the analytical methods specified by EPA in the initial ICR document. Fuel samples from several oil-fired EGUs were analyzed with ASTM D6357, the method specified in the ICR for coal, not oil.
This method gave lower detection limits, on the order of 0.01 ppmw, or approximately 10⁻⁷ lb/mmbtu. However, the use of ASTM D6357 for oil samples may give inaccurate results for metals because the ashing step that works well for coal can cause the loss of metals in oil samples, resulting in erroneously low metal concentrations.

Table 2-9 Individual Sample Run Counts for Metals in ICR V4 Part III Fuel Oil Samples*

Element | Count of All Fuel Oil Metal Measurements (Including Non-detect Values) | Count of Non-detect Values
Antimony, Arsenic, Beryllium, Cadmium, Chromium, Cobalt, Lead, Manganese, Nickel, Selenium
* The counts in this table represent individual sample runs, not EGU averages.

Fuel oil metals concentrations would need to be quantifiable below 0.01 ppmw to demonstrate compliance. The analytical techniques currently used for fuel oil samples by most laboratories cannot measure accurately at those levels. To allow fuel oil monitoring to be used as a compliance method for limited-use EGUs, a new or improved standard certified method for fuel oil will be required.

The Electronic Reporting Tool is not ready for use in compliance test reporting

EPA has requested input on the use of the Electronic Reporting Tool (ERT) for compliance test reporting [FR 25036]. EPRI is very familiar with the ERT, having received and reviewed over 250 draft ERT files from the Part III ICR. EPRI summarized concerns relating to the ERT in a March 17, 2011 letter to EPA, in response to a request to comment on the draft guidance document "Recommended Procedures for Development of Emissions Factors and Use of the WebFIRE Emissions Factor Database." Many of those comments also apply to the proposed use of the ERT for compliance test reporting. Some of the issues of greatest concern for the MACT proposal are as follows:

• Usability is very poor: ICR respondents expressed a high degree of frustration with the data entry format and time requirements. Manual entry of data introduced transcription errors that were not present in the laboratory reports.
• No error checking or validation capability: Very few of the 250 draft ERT files reviewed by EPRI were free of major errors. Often, the program failed to complete emissions calculations with no indication of the reason.

• Does not support data quality evaluation: The ERT does not store information needed to understand sample results, such as results of analysis of method and field blanks and laboratory quality control samples.
• Data flags confuse detected and non-detected results: A major degradation in data usability is caused by EPA's requirement to assign a DLL (detection level limited) flag to emission measurements with one analytical fraction above detection limit and others below detection limits. Based on the ICR guidance, both of the Method 29 samples shown below would be considered non-detects by EPA:

Sample A: front half = 0.15 µg, back half = < 0.1 µg, Total = 0.25 µg DLL
Sample B: front half = 3,200 µg, back half = < 0.01 µg, Total = 3,200 µg DLL

Although EPA has announced enhancements to the ERT, no modifications have been provided for public review. Therefore, EPRI cannot comment on whether future revisions will better support compliance test reporting. EPRI is not familiar with the proposed alternative reporting tool, Emissions Collection and Monitoring Plan System (ECMPS).
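The flagging concern raised in the list above can be made concrete with a short sketch. It sums the analytical fractions of a run, counting each non-detect at its detection limit (consistent with the Sample A total of 0.25 µg), assigns a flag, and reports how much of the total actually comes from non-detected fractions. The DLL flag is taken from the text; the "ADL" and "BDL" labels for the all-detected and all-non-detect cases, and the function name, are assumptions used here for illustration.

```python
def summarize_run(fractions):
    """Summarize one sample run made up of analytical fractions.

    fractions: list of (mass_ug, detected) pairs. For a non-detect,
               mass_ug is the detection limit of that fraction.
    Returns (total_ug, flag, nondetect_share).
    """
    if not fractions:
        raise ValueError("at least one fraction is required")
    total = sum(mass for mass, _ in fractions)
    if total <= 0:
        raise ValueError("total mass must be positive")
    n_detected = sum(1 for _, detected in fractions if detected)
    if n_detected == len(fractions):
        flag = "ADL"  # ASSUMED label: every fraction above detection limit
    elif n_detected == 0:
        flag = "BDL"  # ASSUMED label: every fraction below detection limit
    else:
        flag = "DLL"  # mixed case described in the text
    nondetect_share = sum(m for m, d in fractions if not d) / total
    return total, flag, nondetect_share


if __name__ == "__main__":
    sample_a = [(0.15, True), (0.1, False)]     # front half, back half
    sample_b = [(3200.0, True), (0.01, False)]
    for name, run in (("A", sample_a), ("B", sample_b)):
        total, flag, share = summarize_run(run)
        print(f"Sample {name}: total = {total:g} ug, flag = {flag}, "
              f"non-detect share = {share:.4%}")
```

Both samples receive the same DLL flag even though the non-detected fraction makes up 40% of Sample A's total and a negligible part of Sample B's, which is the loss of information the comment describes.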


3 ENVIRONMENTAL FATE AND TRANSPORT, EXPOSURE AND HUMAN HEALTH ISSUES, AND RISK ANALYSES

There are many critical issues, errors, and problems related to the underlying data, calculations, assumptions, and models used to predict the environmental fate, transport, exposure, risk and human health outcomes, as put forward in the proposed MACT rule and its supporting documents, that are not addressed by EPA. These issues, errors, and problems make it very difficult to comment on the substance of the proposal, but nevertheless require a detailed discussion and evaluation, which is provided in this section. Appendices E through J offer supporting information.

3.1 Findings do not corroborate hot spots due to excess local deposition of mercury near U.S. fossil-fired power plants

As part of its quantitative risk analysis of HAPs emissions, EPA evaluated the potential for excess mercury deposition that might result in mercury "hot spots" within 50 km of U.S. EGUs [EPA, 2011]. The Agency used the Community Multiscale Air Quality (CMAQ) modeling system to predict mercury deposition on a national scale for 2005 and 2016. With this information in hand, EPA calculated the excess mercury deposition within 50 km of individual EGUs. First, EPA calculated the average U.S. EGU-attributable mercury deposition within a 500-km radius around an individual EGU. Then, EPA calculated the average U.S. EGU-attributable mercury deposition within 50 km of an individual EGU to characterize local plus regional deposition near the facility. Finally, by subtracting the 500-km deposition from the 50-km deposition, the Agency arrived at a value for excess local mercury deposition attributable to each EGU, and called that a "hot spot." For 2005, EPA's calculations showed that average excess local mercury deposition for all U.S. EGUs was about 120% of average U.S. EGU regional deposition. For the top 10% of EGU mercury emitters, local deposition was around 3.5 times the regional average.
By 2016, excess local mercury deposition remained about 3 times the regional average for these top emitters, even though absolute amounts of deposition declined.

EPA bases its estimates of excess deposition on incorrect CMAQ model estimates of mercury deposition (see EPRI Comments, Section 3.2). Furthermore, the Agency uses the term "hot spot" without formal definition, provides no threshold level, ignores non-EGU sources of mercury in its hot spot calculations, and fails to provide detailed information in the Technical Support Document (TSD) [EPA, 2011] needed to evaluate calculations of excess deposition. The comments below show, based on scientific evidence, that hot spots due to utility emissions are highly unlikely.

In claiming the presence of hot spots, EPA needs to provide a formal, technical, and scientifically sound definition of what it considers to be a mercury deposition hot spot. This definition should include a single, national threshold value (not a region-specific value) which, when exceeded, identifies a possible hot spot. The Agency then needs to provide detailed information to allow evaluation of its calculations.
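The excess-deposition calculation attributed to EPA above (average EGU-attributable deposition within 50 km, minus the average within 500 km) can be sketched as follows. This is an illustration of the arithmetic as described in these comments, not EPA's implementation. It assumes equal-area grid cells so that an unweighted mean is appropriate, assumes the 500-km average includes the inner 50 km, and uses hypothetical function names and data.

```python
import math


def mean_within(cells, center, radius_km):
    """Mean deposition over grid cells whose centers lie within radius_km."""
    cx, cy = center
    values = [dep for x, y, dep in cells
              if math.hypot(x - cx, y - cy) <= radius_km]
    if not values:
        raise ValueError("no grid cells within the requested radius")
    return sum(values) / len(values)


def excess_local_deposition(cells, center, local_km=50.0, regional_km=500.0):
    """Return (excess, excess_ratio) for one EGU location.

    cells: list of (x_km, y_km, egu_deposition) tuples for equal-area cells.
    excess is the local mean minus the regional mean; excess_ratio expresses
    that excess as a fraction of the regional mean.
    """
    local = mean_within(cells, center, local_km)
    regional = mean_within(cells, center, regional_km)  # includes the inner cells
    excess = local - regional
    return excess, excess / regional


if __name__ == "__main__":
    # Hypothetical deposition field (arbitrary units), for illustration only.
    grid = [(0, 0, 4.0), (30, 0, 3.0), (120, 0, 1.5), (300, 50, 1.0), (450, 0, 0.5)]
    print(excess_local_deposition(grid, center=(0, 0)))
```

Because deposition from any source declines with distance, the local mean exceeds the regional mean for essentially any emitter, so this quantity is positive by construction; that is the point made in the hot spot definition discussion.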

Hot spot definition

In EPA's use of the term, a mercury "hot spot" simply refers to excess mercury deposition. The Agency fails to provide a justification for this usage, which lacks rigor and offers no threshold or reference point for use in defining "excess." Furthermore, since all EGUs add to local deposition, subtracting regional U.S. EGU deposition will inevitably result in a positive, or "excess," value.

A mercury hot spot is more commonly described as a specific location characterized by elevated concentrations of mercury (exceeding a well-established criterion, such as a reference concentration [RfC]) when compared to its surroundings. There are important considerations to take into account when defining and identifying hot spots, especially since mercury sources frequently overlap and each location has site-specific characteristics. Identification of mercury hot spots should not be restricted, as it is in EPA's analysis, to locations where concentrations can be attributed to a single source or sector [Evers et al., 2007]. Others have defined a hot spot as a spatially large region in which environmental concentrations far exceed expected values, that is, concentrations 2 to 3 standard deviations above the relevant mean [Sullivan, 2005]. Finally, EPA previously defined a (utility) hot spot as a waterbody that is a source of consumable fish with methylmercury tissue concentrations, attributable solely to utilities, greater than EPA's methylmercury water quality criterion of 0.3 mg/kg (milligrams per kilogram) [EPA, 2005].

It is unclear why EPA changed from defining a hot spot by fish tissue methylmercury concentration to defining a hot spot by depositional excess. It is also unclear why EPA does not constrain its hot spot definition with a critically important standardized expected value.

Modeling and measurement studies show no utility hot spots

EPA's use of a 50-km radius to calculate hot spots is flawed.
Modeling studies show that deposition of mercury emitted from power plants is not confined to a 50-km radius around the plants. For example, Seigneur et al. [2006] calculated that emissions from five randomly selected power plants contributed less than 8% (plume model), less than 14% (Eulerian model at 84-km resolution), or less than 10% (Eulerian model at 16.7-km resolution) to total mercury deposition within a 50-km radius of the source plants. According to plume model calculations, more than 96% of mercury emitted from these plants traveled beyond 50 km from the sources. Likewise, grid-based Eulerian models predicted that more than 91% (coarse resolution) or more than 95% (fine resolution) of mercury emitted from the plants traveled beyond 50 km.

Measurements indicate that mercury deposited near power plants comes from various sources. Landing et al. [2010] measured mercury wet deposition from November 2004 through December 2007 at three sites located primarily downwind, at distances of 4.7, 5.5, and 24.5 km, from the coal-fired Plant Crist in Pensacola, FL. During this period, Plant Crist emitted about 230 pounds of mercury annually, about 85% of which was reactive gaseous mercury [EPRI, 2010]. Landing et al. [2010] estimated that 22-33% of wet-deposited mercury at these sites came from coal combustion, including regional and local sources. The remaining 67-78% came from the global background. Using the same data from these same wet deposition sites, Caffrey et al. [2010] found that mercury deposition and concentrations did not differ in a statistically significant manner among the three sites. Furthermore, these mercury deposition and concentration values were similar to those from Mercury Deposition Network (MDN) sites along the Northern Gulf of Mexico coast, which are more than 50 km away from Plant Crist.

In December 2009, a wet scrubber came online at Plant Crist and has operated continuously since then. Using ratios of mercury to trace metals (arsenic and selenium) in precipitation collected at the same sites in 2010 (post-scrubber), Krishnamurthy et al. [2011] reported that mercury deposition due to local and regional sources had changed by between -10% and +6% at these sites, relative to historic measurements. These changes were thought to represent upper bound estimates, since the researchers assumed that all mercury, arsenic, and selenium measured in wet deposition came from local and regional coal combustion sources, although this is not the case. This finding stands in strong contrast to the fact that Plant Crist's wet scrubber reduces total mercury emissions by about 70% and reduces emissions of reactive gaseous mercury (RGM, Hg²⁺, divalent mercury; the water-soluble and precipitable form, believed by EPA to deposit locally) by about 85%. Taken collectively, these findings show that any increased local deposition due to EGUs, and any deposition changes due to changes in EGU emissions, are small and within the range of natural variability.

Furthermore, mercury concentrations are not always highest at sites closest to a major source. For example, Kolker et al. [2010] demonstrated that concentrations of atmospheric reactive gaseous mercury, gaseous elemental mercury (GEM, Hg⁰), and fine particulate mercury (Hg-PM2.5) were lower when measured 25 km from a 1114 MW coal-fired EGU than when measured 100 km away. These findings contradict the idea, implicit in EPA's hot spot analysis, that RGM decreases with distance from a large point source.

EPA refers readers to the TSD for more detailed information about mercury hot spots.
Unfortunately, the TSD presents no information, summary statistics, or calculations showing how excess deposition within 50 km of an EGU source was obtained. By assessing only mercury deposition attributable to EGUs, EPA fails to place that deposition in the context of all other sources of mercury deposition. Nor does the Agency explain why deposition from the top 10% of EGU mercury emitters does not decline, despite substantial reductions in modeled mercury emissions from those sources between 2005 and 2016.

References

Caffrey J.M., Landing W.M., Nolek S.D., Gosnell K.J., Bagui S.S., Bagui S.C., Atmospheric Deposition of Mercury and Major Ions to the Pensacola (Florida) Watershed: Spatial, Seasonal, and Inter-annual Variability, Atmospheric Chemistry and Physics, 10,
EPA, CFR Part 63 [OAR ; FRL ] RIN 2060 AM96. Revision of December 2000 Regulatory Finding on the Emissions of Hazardous Air Pollutants From Electric Utility Steam Generating Units and the Removal of Coal- and Oil-Fired Electric Utility Steam Generating Units From the Section 112(c). Final rule, March 29.
EPA, EPA-452/D Technical Support Document: National-Scale Mercury Risk Assessment Supporting the Appropriate and Necessary Finding for Coal- and Oil-Fired Electric Generating Units, March, Appendix G, pages
EPRI, Atmospheric Deposition of Mercury, Trace Metals and Major Ions to a Watershed Around a Coal-Fired Power Plant. Palo Alto, CA:

Evers D.C., Han Y.-J., Driscoll C.T., Kamman N.C., Goodale M.W., Lambert K.F., Holsen T.M., Chen C.Y., Clair T.A., Butler T., Biological Mercury Hotspots in the Northeastern United States and Southeastern Canada, Bioscience, 57 (1),
Kolker A., Olson M.L., Krabbenhoft D.P., Tate M.T., Engle M.A., Patterns of Mercury Dispersion from Local and Regional Emission Sources, Rural Central Wisconsin, USA, Atmospheric Chemistry and Physics, 10,
Krishnamurthy N., Landing W.M., Caffrey J.M., Rainfall Deposition of Mercury and Other Trace Elements to the Northern Gulf of Mexico. Presented at the 10th International Conference on Mercury as a Global Pollutant, Halifax, Nova Scotia, Canada, July 27.
Landing W.M., Caffrey J.M., Nolek S.D., Gosnell K.J., Parker W.C., 2010. Atmospheric Wet Deposition of Mercury and Other Trace Elements in Pensacola, Florida, Atmospheric Chemistry and Physics, 10,
Seigneur C., Lohman K., Vijayaraghavan K., Jansen J., Levin L., Modeling Atmospheric Mercury Deposition in the Vicinity of Power Plants, Journal of the Air & Waste Management Association, 56,
Sullivan T., The Impacts of Mercury Emissions from Coal Fired Power Plants on Local Deposition and Human Health Risk. Presented at the Pennsylvania Mercury Rule Workgroup Meeting, October

Modeled mercury deposition is subject to critical uncertainties

EPA used the CMAQ model v4.7.1 to predict U.S. total mercury deposition on a national scale for 2005 and 2016 [EPA, 2011a]. CMAQ simulates the numerous physical and chemical processes involved in the formation, transport, and destruction of ozone, particulate matter, and air toxics such as mercury. Inputs to the model include emissions and meteorology, as well as data describing initial conditions and boundary conditions. The Agency neither evaluates the CMAQ model extensively against real-world measurements nor evaluates the use of CMAQ to match point sources to specific watersheds in order to identify hot spots.
In addition, EPA's explanation of model inputs in its proposed rulemaking and supporting documents is vague and poor. Thus, many uncertainties remain that influence CMAQ performance in predicting mercury deposition under the 2005 and 2016 scenarios. These uncertainties include, but are not limited to:

• Initial and boundary conditions: Since mercury deposition is driven primarily by chemistry, it is necessary to understand the impact of initial and boundary conditions [Lin and Pehkonen, 1999; Lin et al., 2007; Pongprueksa et al., 2008].
• In-plume mercury reduction: Alternative chemical reactions that can occur in EGU stack plumes, such as in-plume reduction of Hg²⁺ to Hg⁰, must be addressed.
• Advanced modeling capabilities: Updates to enhance CMAQ capabilities (such as the Advanced Plume-in-grid Treatment) must be considered.
• Depositional processes: Modeling of wet and dry depositional processes requires further investigation.

Hence, CMAQ is useful for predicting regional and national patterns of deposition, but it has limitations when used for modeling small areas of localized deposition and thus for identifying hot spots. EPA needs to address the issues described above in its evaluations of mercury deposition predicted by CMAQ to see how they affect model performance and output. Any evaluation must completely specify the input variables, parameters, and data used. Model performance can be improved, and any improvements will change the predicted total mercury deposition.

Boundary and initial conditions

EPA did not provide a sensitivity analysis on boundary conditions. However, it is known that predicted mercury deposition relies heavily on the amount of gaseous elemental mercury used to define the boundary and initial conditions of a model. For example, Pongprueksa et al. [2008] demonstrated this effect while simulating atmospheric mercury on a regional scale. They found that increasing the amount of gaseous elemental mercury in the boundary condition by 1 nanogram per cubic meter (1 ng/m³) caused the predicted monthly deposition of total mercury to increase by 1270 nanograms per square meter (ng/m²) in the continental United States. Model initial conditions have a similar, but weaker, effect. Pongprueksa et al. [2008] found that increasing the amount of gaseous elemental mercury in the initial condition by 1 ng/m³ increased predicted deposition in the continental United States by 250 ng/m². EPA has not provided similar sensitivity analyses, but they need to be reported, most appropriately as part of a more comprehensive model performance evaluation.

This is especially important because mercury emissions from Asia (the region immediately upwind of North America, which affects U.S. mercury deposition more than any other region) are expected to continue to increase [Jaffe et al., 2005; Jaffe et al., 2008; Pacyna et al., 2010; Pirrone et al., 2010; Streets et al., 2009; Weiss-Penzias et al., 2006]. This increase will have implications for the amount of mercury in the boundary and initial conditions. However, these emission changes have not been accounted for in EPA's modeling exercise, leading to an overestimate of U.S. EGU-attributable deposition in 2016.

Another aspect of boundary and initial conditions is related to how much U.S. EGUs contribute to total global emissions of mercury. Anthropogenic sources of mercury to the atmosphere have been extensively studied, with the most recent global estimates at about 2500 megagrams (Mg) emitted annually. About 18% of these global anthropogenic emissions come from EGUs, with U.S. EGUs making up approximately 2.5% and 1.2% of this global anthropogenic total in 2005 and 2010, respectively [Pacyna et al., 2010; Pirrone et al., 2010; Streets et al., 2009; EPA, 2011b]. Hence, considering the relatively small contribution of U.S. EGUs, defining boundary and initial conditions accurately and correctly is even more important.

In-plume reduction of Hg²⁺ to Hg⁰

EPA's failure to include in-plume reduction of reactive gaseous mercury (Hg²⁺) to gaseous elemental mercury (Hg⁰) is a significant shortcoming of its analyses. The chemical reactions that reduce reactive gaseous mercury to gaseous elemental mercury are another source of uncertainty in mercury atmospheric modeling, and the choice of reduction mechanism can influence model predictions, as shown by Lin et al. [2007], Pongprueksa et al. [2008], and Lohman et al. [2006].

In a sensitivity analysis of the CMAQ-Hg model, Lin et al. [2007] and Pongprueksa et al. [2008] replaced aqueous Hg(II)-HO₂ reduction with either reactive gaseous mercury reduction by carbon monoxide (CO) (5 x cm³ molecule⁻¹ s⁻¹) or reactive gaseous mercury photoreduction (1 x 10⁻⁵ s⁻¹). Using either alternative reaction allowed the CMAQ-Hg model to predict mercury wet deposition in closer agreement with deposition measured by the MDN.

Lohman et al. [2006] simulated in-plume chemical transformations using the Reactive & Optics Model of Emissions (ROME) with two reduction pathways: a pseudo-first-order decay of reactive gaseous mercury of 0.3 h⁻¹, and an empirical reaction of reactive gaseous mercury with SO₂ of 8 x cm³ molecule⁻¹ s⁻¹. Results showed better agreement between the simulations and the measurements of mercury concentrations in power plant plumes.

Reduction of reactive gaseous mercury to gaseous elemental mercury has been reported in power plant plumes. Supporting data include atmospheric concentrations of speciated mercury measured downwind of power plant stacks and model predictions [Edgerton et al., 2006; Lohman et al., 2006]. A detailed description of in-plume reduction reactions is provided in EPRI Comments, Section 3.4.

At the very least, EPA needs to provide a sensitivity analysis that shows how inclusion of in-plume reduction of reactive gaseous mercury to gaseous elemental mercury changes model results.

Advanced Plume-in-grid Treatment

EPA did not assess model performance using available CMAQ updates, although advances in modeling capabilities help to reduce uncertainties in predicting mercury deposition. For example, the Advanced Plume-in-grid Treatment (APT) is a CMAQ update that allows better resolution of sub-grid-scale processes associated with emissions from elevated point sources, such as EGUs. CMAQ-APT has shown improved performance in predicting mercury deposition, as well as in predicting the behavior of NOx, SO₂, ozone (O₃), and PM [Vijayaraghavan et al., 2006, 2009].
Using CMAQ-APT to model mercury in the stack plumes² of the top 30 mercury-emitting power plants in the United States, Vijayaraghavan et al. [2008] demonstrated (1) improved performance in predicting mercury wet deposition compared with a purely Eulerian grid-based model, (2) partial correction of wet deposition over-predictions downwind of coal-fired power plants in the northeastern United States, and (3) decreases of approximately 10% in simulated dry and wet deposition over large areas of the eastern United States, with larger decreases occurring near the power plants selected for APT analysis.

EPA needs to include the latest updates to CMAQ, in this case APT, that have clearly been shown to improve model performance and to partially correct over-predictions of deposition, especially near power plants. Such inclusions will result in more realistic predictions of total mercury deposition.

² Plumes are embedded within a three-dimensional grid-based Eulerian air quality model, and the model is extended to include a comprehensive treatment of mercury processes, i.e., gas-phase adsorption of reactive gaseous mercury on atmospheric particulate matter and the reduction of reactive gaseous mercury to elemental mercury by sulfur dioxide (as a proxy for in-plume chemical reduction of Hg²⁺ to Hg⁰; see also Section 3.4).

Depositional processes

EPA fails to provide first-hand information on the wet and dry deposition processes (such as the wet/dry deposition ratio) used in its model, although this information is important for accurate predictions of mercury wet and dry deposition. Hence, it is reasonable to assume that the CMAQ-Hg model was run with default settings for mercury chemistry, which predict that about 35% of total mercury deposition occurs through wet processes and 65% through dry processes (a wet/dry deposition ratio of about 0.5), with little seasonal variation between January and July [Pongprueksa et al., 2008]. Modifying mercury chemistry in the model to include seasonal factors (such as solar radiation, precipitation, and availability of oxidizing agents) can introduce seasonal variation in overall deposition, but the wet/dry deposition ratio remains the same [Pongprueksa et al., 2008].

The wet/dry deposition ratio predicted by CMAQ-Hg does not match mercury deposition measurements. Direct and indirect measurements show that the wet/dry deposition ratio for mercury in the continental United States averages around 3 (ranging between 0.1 and 16.7), that is, about 6 times higher than CMAQ-Hg predicts [Engle et al., 2010; Lombard et al., 2011; Lyman et al., 2007; Lyman et al., 2009; Zhang et al., 2009].

EPA needs to assess how applying measured wet/dry deposition ratios for mercury affects deposition predicted by CMAQ-Hg, and how predicted values compare to, for example, MDN data.

Grid cell size

EPA fails to provide detailed context and background information on the effect of grid size on CMAQ model results. To predict atmospheric deposition, CMAQ averages emissions data over an area known as a grid cell.
The Agency used a 36-km grid resolution (i.e., 36 x 36 km) to establish incoming air quality concentrations (boundary conditions) along the boundaries of 12 x 12 km grids (144 km²). EPA used only the 12 x 12 km grids in determining the impact of changes in mercury emissions on changes in mercury deposition. This choice, along with the boundary condition issues outlined in EPRI Comments, Section 3.2.1, raises numerous concerns.

First, CMAQ predicts only an average mercury concentration for an entire grid cell. For example, if there is only one mercury source in a grid cell, then that source's emissions will be averaged over the entire grid cell. Such averaging causes an artificially fast dilution and may smooth out areas of high and low deposition. APT (see EPRI Comments, Section 3.2.3) resolves this problem.

Second, although the ability to identify large areas of localized high deposition is important in the current proposed rulemaking, a 12 x 12 km grid is too coarse to pinpoint smaller areas of localized high deposition.

Third, anglers are likely to catch fish from several waterbodies. Thus, a grid larger than the current 12 x 12 km would better account for such common fishing patterns. Conversely, a larger grid would also decrease the model's ability to simulate smaller areas of localized high deposition.
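The smoothing effect of grid-cell averaging described above can be illustrated with simple arithmetic. The sketch below is not CMAQ code; the function name and the deposition values are hypothetical. It averages a 3 x 3 block of 12-km cells, one of which contains a localized peak, into the single 36-km cell that covers the same area.

```python
def coarse_cell_average(fine_cells):
    """Average a rectangular block of fine-grid values into one coarse value."""
    values = [v for row in fine_cells for v in row]
    return sum(values) / len(values)


# Hypothetical deposition values (arbitrary units) on a 3 x 3 block of
# 12-km cells; only the center cell holds a localized peak.
fine = [
    [1.0, 1.0, 1.0],
    [1.0, 10.0, 1.0],
    [1.0, 1.0, 1.0],
]

peak = max(max(row) for row in fine)       # highest value seen at 12-km resolution
coarse = coarse_cell_average(fine)         # value seen by one 36-km cell
area_ratio = (36 * 36) / (12 * 12)         # 1296 km2 versus 144 km2

print(f"12-km peak: {peak}, 36-km average: {coarse}, area ratio: {area_ratio}")
```

In this illustration the coarse cell reports one fifth of the fine-grid peak. The same reasoning applies one level down: a 12 x 12 km cell averages away any feature smaller than 144 km².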

Finally, EPA needs to provide detailed and rigorous background information regarding the effects of grid size on CMAQ model results, in the context of over- and underestimation of predicted changes in deposition.

Again, EPA needs to provide a much more detailed and rigorous overview of the inputs and variables used in CMAQ that are crucial to computing predicted mercury (total, wet, and dry) deposition values. The current description [EPA, 2011a], as pointed out previously, is vague, sparse in detail, and too general to allow a critical review of the model and its performance.

References

Edgerton E.S., Hartsell B.E., Jansen J.J., Mercury Speciation in Coal-fired Power Plant Plumes Observed at Three Surface Sites in the Southeastern US, Environmental Science & Technology, 40,
Engle M.A., Tate M.T., Krabbenhoft D.P., Schauer J.J., Kolker A., Shanley J.B., Bothner M.H., Comparison of Atmospheric Mercury Speciation and Deposition at Nine Sites across Central and Eastern North America, Journal of Geophysical Research, 115, D18306, doi: /2010jd
EPA, 2011a. EPA-454/R Air Quality Modeling Technical Support Document: Point Source Sector Rules. U.S. Environmental Protection Agency Office of Air Quality Planning and Standards, Air Quality.
EPA, 2011b. EPA-452/D Technical Support Document: National-Scale Mercury Risk Assessment Supporting the Appropriate and Necessary Finding for Coal- and Oil-Fired Electric Generating Units. March.
Jaffe D., Prestbo E., Swartzendruber P., Weiss-Penzias P., Kato S., Takami A., Hatakeyama S., Kajii Y., Export of Atmospheric Mercury from Asia, Atmospheric Environment, 39,
Jaffe D., Strode S., Fate and Transport of Atmospheric Mercury from Asia, Environmental Chemistry, 5, 121, doi: /en
Lin C.-J., Pehkonen S.O., The Chemistry of Atmospheric Mercury: A Review, Atmospheric Environment, 33,
Lin C.-J., Pongprueksa P., Bullock Jr. O.R., Lindberg S.E., Pehkonen S.O., Jang C., Braverman T., Ho T.C., Scientific Uncertainties in Atmospheric Mercury Models II: Sensitivity Analysis in the CONUS Domain, Atmospheric Environment, 41,
Lohman K., Seigneur C., Edgerton E., Jansen J., Modeling Mercury in Power Plant Plumes, Environmental Science & Technology, 40,
Lombard M.A.S., Bryce J.G., Mao H., Talbot R., Mercury Deposition in Southern New Hampshire, , Atmospheric Chemistry and Physics Discussions, 11,

Lyman S.N., Gustin M.S., Prestbo E.M., Marsik F.L., Estimation of Dry Deposition of Atmospheric Mercury in Nevada by Direct and Indirect Methods, Environmental Science & Technology, 41,
Lyman S.N., Gustin M.S., Prestbo E.M., Kilner P.I., Edgerton E., Hartsell B., Testing and Application of Surrogate Surfaces for Understanding Potential Gaseous Oxidized Mercury Dry Deposition, Environmental Science & Technology, 43,
Pacyna E.G., Pacyna J.M., Sundseth K., Munthe J., Kindbom K., Wilson S., Steenhuisen F., Maxson P., Global Emission of Mercury to the Atmosphere from Anthropogenic Sources in 2005 and Projections to 2020, Atmospheric Environment, 44,
Pirrone N., Cinnirella S., Feng X., Finkelman R.B., Friedli H.R., Leaner J., Mason R., Mukherjee A.B., Stracher G.B., Streets D.G., Telmer K., Global Mercury Emissions to the Atmosphere from Anthropogenic and Natural Sources, Atmospheric Chemistry and Physics, 10,
Pongprueksa P., Lin C.-J., Lindberg S.E., Jang C., Braverman T., Bullock Jr. O.R., Ho T.C., Chu H.-W., Scientific Uncertainties in Atmospheric Mercury Models III: Boundary and Initial Conditions, Model Grid Resolution, and Hg(II) Reduction Mechanism, Atmospheric Environment, 42,
Streets D.G., Zhang Q., Wu Y., Projections of Global Mercury Emissions in 2050, Environmental Science & Technology, 43,
Vijayaraghavan K., Karamchandani P., Seigneur C., Plume-in-grid Modeling of Summer Air Pollution in Central California, Atmospheric Environment, 40,
Vijayaraghavan K., Karamchandani P., Seigneur C., Balmori R., Chen S.-Y., Plume-in-grid Modeling of Atmospheric Mercury, Journal of Geophysical Research, 113, D24305, doi: /2008jd
Vijayaraghavan K., Zhang Y., Seigneur C., Karamchandani P., Snell H.E., Export of Reactive Nitrogen from Coal-fired Power Plants in the U.S.: Estimates from a Plume-in-grid Modeling Study, Journal of Geophysical Research, 114, D04308, doi: /2008jd
Weiss-Penzias P., Jaffe D., Swartzendruber P., Dennison J.B., Chand D., Hafner W., Prestbo E.,
Observations of Asian Air Pollution in the Free Troposphere at Mt. Bachelor Observatory in the Spring of 2004, Journal of Geophysical Research, 110, D10304, doi: /2005jd
Zhang L., Wright L.P., Blanchard P., A Review of Current Knowledge Concerning Dry Deposition of Atmospheric Mercury, Atmospheric Environment, 43,

EPA's calculated contributions of U.S. EGU mercury emissions to deposition and fish tissue levels represent upper bounds of actual contributions

Using the CMAQ model, EPA estimated how much of the mercury deposited to each U.S. watershed came from U.S. EGUs. The Agency then used the ratio of EGU deposition to total mercury deposition to estimate the health risk of ingesting fish whose tissues contain methylmercury (MeHg) from U.S. EGU sources. This apportionment of total risk relies on the Mercury Maps (MMaps) approach developed by EPA's Office of Water. The MMaps approach establishes a proportional relationship between mercury deposition over a watershed and resulting fish tissue methylmercury levels, assuming that a number of criteria are met [EPA, 2011a; EPA, 2011b].

However, the origin of atmospheric mercury that is deposited to watersheds in the United States is still poorly understood. The relative contributions of local, regional, and global anthropogenic sources, as well as natural sources of mercury, are likely to vary across the United States. It is important to characterize these sources to assess the efficacy of the present rulemaking. Current research shows that models of mercury atmospheric fate and transport overestimate the local and regional impacts of some anthropogenic sources, such as U.S. EGUs. Thus, calculated contributions to mercury deposition and fish tissue methylmercury levels from these sources represent upper bounds of actual contributions [Seigneur et al., 2003; Seigneur et al., 2004].

EPA fails to provide a detailed discussion of its results based on currently available scientific data; these results should be presented as lower and upper bound estimates.

Modeled and measured relative contribution of EGUs to mercury deposition

Global modeling studies show that only 20-33% of all the mercury deposited within the continental United States comes from North American anthropogenic sources. Those sources include EGUs, which contribute about 27% of the anthropogenic mercury emitted in the United States [Seigneur et al., 2004]. Seigneur et al. [2004] used a global chemical transport model and a continental chemical transport model (TEAM) to calculate the contribution of North American anthropogenic sources to total mercury deposition for low, average, and high mercury emission scenarios.
Their calculations yielded a range of 25-32%, which they defined as the lower and upper bounds of the U.S. anthropogenic contribution to mercury deposition within the country. In another global modeling study, Travnikov [2005] found that 30-33% of mercury deposited in North America is of North American origin, while 21-24% comes from Asia. Using the GEOS-CHEM model, Selin and Jacob [2008] found that North American anthropogenic emissions contributed, on average, 20% of the total mercury deposited within the continental United States, corroborating previous findings by Selin et al. [2007].

Most of the reactive gaseous mercury deposited by wet processes originates in the global atmospheric mercury pool. In their GEOS-CHEM study, Selin and Jacob [2008] found that 60% of reactive gaseous mercury deposited by wet processes within the United States comes from scavenging mercury (removing it from the gas stream) in the free troposphere. The rest of the reactive gaseous mercury deposited by wet processes comes from scavenging within the U.S. atmospheric boundary layer, where oxidation of gaseous elemental mercury is the principal source of reactive gaseous mercury.

EPA's predictions of mercury wet deposition are overestimates. In the Air Quality Modeling Technical Support Document: Point Source Sector Rules (AQM TSD), Table III-3, page 9, the Agency reports that modeled mercury wet deposition shows a mean bias of 34% (annual average normalized) and a mean error of 52% (annual average normalized). In other words, wet deposition is overestimated by 34%.
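For readers unfamiliar with these statistics, the sketch below shows how a normalized mean bias and a normalized mean error are commonly computed from paired modeled and observed values. It assumes that the "normalized" bias and error in Table III-3 follow these common definitions; the AQM TSD's exact formulas are not reproduced here. The function names and the paired values are hypothetical, constructed so that the result matches the 34% and 52% figures cited above.

```python
def normalized_mean_bias(modeled, observed):
    """Sum of (model - observation) divided by sum of observations, in percent."""
    if len(modeled) != len(observed) or not observed:
        raise ValueError("need equal-length, non-empty series")
    return 100.0 * sum(m - o for m, o in zip(modeled, observed)) / sum(observed)


def normalized_mean_error(modeled, observed):
    """Sum of |model - observation| divided by sum of observations, in percent."""
    if len(modeled) != len(observed) or not observed:
        raise ValueError("need equal-length, non-empty series")
    return 100.0 * sum(abs(m - o) for m, o in zip(modeled, observed)) / sum(observed)


# Hypothetical paired wet deposition values (arbitrary units), not MDN data.
observed = [10.0, 20.0, 30.0, 40.0]
modeled = [23.0, 11.0, 45.0, 55.0]

nmb = normalized_mean_bias(modeled, observed)
nme = normalized_mean_error(modeled, observed)
print(f"NMB = {nmb:.0f}%, NME = {nme:.0f}%")
```

Under these definitions, a positive bias means the model over-predicts in aggregate, and an error larger than the bias indicates that some individual pairs are under-predicted even though the net bias is positive.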

Unrealistic wet deposition values flag problems with the performance of EPA's CMAQ model. In Table III-3, predicted total mercury wet deposition for the 4th quarter is reported as -0.80 (minus 0.80) micrograms per square meter (µg/m²). Although net dry deposition can sometimes be negative, a negative value for wet deposition is physically impossible, even after manipulation or assimilation of precipitation data, which EPA does not report. The reported wet deposition bias for the 4th quarter is also negative (-1.27) and could be generated only by negative concentrations of mercury, which are non-physical. It appears that the negative value is not a typographical error but an error somewhere in the analysis, likely the result of inaccurate post-processing or manipulation of the data, or some other mistake that needs to be corrected but cannot be pinpointed because of the lack of detail provided in the AQM TSD.

It would be prudent for EPA to present its predicted deposition values in light of the modeled and measured data available. In addition, EPA needs to check for errors in the CMAQ model, correct them, redo the modeling, and compare the new results with those currently presented.

Comments on MMaps approach

The MMaps approach establishes a proportional relationship between mercury deposition to a watershed and resulting fish tissue methylmercury levels, assuming that certain criteria are met.

"... under certain conditions (e.g., Hg deposition is the primary loading to a watershed and near steady-state conditions have been reached), a fractional change in Hg deposition to a watershed will ultimately be reflected in a matching proportional change in the levels of MeHg in fish." [TSD, page 6]

"The MMaps approach and underlying analyses (see Section 1.3 and Appendix E), support a proportional relationship between mercury deposition and fish tissue MeHg levels within a given watershed, such that changes in deposition will be reflected in changes in fish tissue levels." [TSD, page 48]

EPA acknowledges limitations of the MMaps approach.

"The first limitation is that MMaps is based on the assumption of a linear, steady-state relationship between concentrations of MeHg in fish and present day air deposition mercury inputs. We [EPA] expect that this condition will likely not be met in many waterbodies ..." [RIA, page 5-24]

However, the Agency concludes:

"Despite these limitations, EPA is unaware of any other tool for performing a national-scale assessment of the change in fish MeHg from reductions in atmospheric deposition of Hg." [TSD, page 78]

These statements ignore contradictory evidence, and EPA makes assumptions that are not supported by the scientific data available to date:

• Other models are available: EPA fails to recognize, evaluate, and possibly use EPRI's Mercury Cycling Model (MCM) in either its steady-state or dynamic version. This model was developed expressly to evaluate the relationship between changes in atmospheric mercury deposition to waterbodies and changes in fish tissue methylmercury levels. EPRI's MCM has been found to be applicable and useful under several environmental conditions [Chen et al., 2008; Chen and Herr, 2010; Harris et al., 2011], showing its suitability for the purposes for which EPA used MMaps.

• The assumption of a proportional relationship between atmospheric mercury deposition and fish tissue methylmercury levels lacks scientific support: TSD Section 1.3 and Appendices E and F are general descriptions of MMaps that provide no scientific support for the assumption of a proportional relationship. EPA assumes a proportional relationship, but fails to specify what it may be: 1:1, 1:2, or something else.

Data that demonstrate a linear reduction in fish tissue methylmercury in response to a reduction in atmospheric mercury deposition do not exist. Data from the Mercury Experiment to Assess Atmospheric Loading in Canada and the United States (METAALICUS) study [Harris et al., 2007] and other studies [Orihel et al., 2007] describe deposition increases into low trophic-level lakes, not deposition decreases. These studies are partial demonstrations in individual watersheds that may show non-linear responses to changes in mercury deposition.

EPA does mention the METAALICUS study [TSD, pages 69-70], but continues to assume a linear, direct relationship between changes in mercury deposition and changes in fish methylmercury levels. For example, EPA states that "U.S. EGU-attributable Hg fish tissue levels are directly based on U.S. EGU Hg deposition (at the watershed level)" [TSD, page 44] and provides Figure 2-17 accompanied by the statement: "This plot allows consideration for whether there appears to be a correlation between these two factors at the watershed level." [TSD, page 45]

It is clear from the plot that there is no demonstrated relationship between mercury deposition and fish tissue methylmercury concentrations at the watershed level (that is, between watersheds), which EPA correctly points out [TSD, page 48].
But without further evidence, EPA claims that this relationship is expected to hold within a given watershed [TSD, page 48]. In fact, the U.S. Geological Survey national waterway study showed that sheet flow and drainage, not deposition, dominated input to the waterbodies it surveyed [Scudder et al., 2009]. Sheet flow and drainage could well contain mercury, complicating the relationship that EPA claims is linear and direct. Hence, MMaps provides no insight into whether U.S. EGU-attributable methylmercury levels in fish tissue are directly based on U.S. EGU atmospheric mercury deposition. The Agency needs to provide

1. the exact proportional relationship used in modeling,
2. scientific evidence for a relationship between mercury deposition and fish tissue methylmercury concentrations within watersheds, and
3. an assessment that compares the results of other available models with those of MMaps, to see whether other models that predict the relationship between mercury deposition to a watershed and resulting fish tissue methylmercury levels give more accurate and more realistic results than MMaps.

There is no provision for lag time in response to deposition change: EPA acknowledges that response lag time influences the perceived benefits of decreasing mercury deposition from U.S. EGUs and that MMaps fails to incorporate this information:

"the lag period changes in fish tissue (and hence changes in IQ [intelligence quotient]) can range from less than 5 years to more than 50 years, with an average time span of one to three decades." [EPA, 2005]

"If a lag in the response of MeHg levels in fish were assumed, the monetized benefits could be significantly lower." [RIA, page 5-78]

"the MMaps model does not provide any information on the lag of response." [RIA, page 5-25]

Scientific evidence for a lag time in response to deposition change is compelling. Results of the METAALICUS study show that there is a lag time (and a non-proportional response): after 3-4 years, mercury levels in biota increased 30-40% in response to a mercury deposition increase of 120%. Although EPA cites these outcomes [TSD, page 70], it chooses to use the MMaps model, which does not incorporate lag time. Numerous factors influence lag time: watershed characteristics [Grigal et al., 2002]; the potential for watersheds to act as legacy sources that release mercury when disturbed [Yang et al., 2002]; the magnitude of emission reductions and subsequent changes in atmospheric deposition relative to the amount of mercury already in an ecosystem [Krabbenhoft et al., 2007]; the distance of an ecosystem from mercury sources [Lindberg et al., 2007]; and the declining availability, over time, of mercury deposited to aquatic ecosystems for uptake by biota [Orihel et al., 2008].
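The practical difference between an instantaneous proportional response and a lagged response can be illustrated with a minimal sketch. It assumes, purely for illustration, that fish tissue levels approach a new steady state exponentially (a first-order lag) with a hypothetical time constant; the functional form, the function names, and every parameter value are assumptions and are not taken from MMaps, MCM, or the METAALICUS results.

```python
import math


def proportional_response(baseline, deposition_change):
    """Fish-tissue level if it tracked deposition instantly and 1:1."""
    return baseline * (1.0 + deposition_change)


def lagged_response(baseline, deposition_change, years, time_constant_years):
    """Fish-tissue level after `years`, approaching the proportional
    steady state exponentially (first-order lag, an assumed form)."""
    target = proportional_response(baseline, deposition_change)
    realized = 1.0 - math.exp(-years / time_constant_years)
    return baseline + (target - baseline) * realized


if __name__ == "__main__":
    baseline = 1.0             # normalized fish-tissue MeHg level
    deposition_change = -0.10  # hypothetical 10% cut in deposition
    tau = 20.0                 # hypothetical time constant, in years
    steady = proportional_response(baseline, deposition_change)
    for years in (5, 20, 50):
        level = lagged_response(baseline, deposition_change, years, tau)
        print(f"year {years:2d}: lagged {level:.3f}, proportional {steady:.3f}")
```

Under these assumptions, the lagged level stays above the proportional estimate at every finite time, so the reduction realized in any given year is smaller than a steady-state calculation would credit.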
The MMaps method assumes that steady state has been achieved, when in reality mercury emissions and deposition are changing: Atmospheric deposition of mercury can enter a waterbody in one of two ways. The first is through direct deposition onto the waterbody's surface. The second is by way of deposition onto the terrestrial portion of the watershed (soils and vegetation), some of which eventually travels by way of evasion, runoff, and erosion into the waterbody. Therefore, lag times would need to be included in the modeling and be able to vary from watershed to watershed, and sometimes even from waterbody to waterbody within a watershed. Another problem with the instantaneous steady-state assumption is that the emission rates of mercury due to U.S. sources have been decreasing for more than a decade, while emissions due to sources outside the U.S. have been increasing (see also EPRI Comments, Section 3.2.1). Therefore, the system is not at steady state, a basic premise of the model (see Appendix G, 2.11 for further information).

Given the demonstrated lag time in response to deposition change, it is logical to conclude that a lag time needs to be incorporated in MMaps to adjust the current overestimation of how much fish tissue methylmercury levels decrease in response to decreases in mercury deposition attributable to U.S. EGUs. As a consequence, MMaps' overestimation of monetized health benefits will be partially corrected.

As discussed above, MMaps assumes a direct and linear, steady-state relationship between fish tissue methylmercury levels and present-day inputs from atmospheric mercury deposition, without any lag time. Based on the lack of scientific support for these critical assumptions, it is reasonable to conclude that MMaps is not suited to predict the health benefits of reducing mercury deposition from fossil-fired EGUs unless the changes outlined above are incorporated.

References

Chen C.W., Herr J.W., Simulating the Effect of sulfate Addition on Methylmercury Output from a Wetland, Journal of Environmental Engineering, 136 (4), , doi: /_asce_ee
Chen C.W., Herr J.W., Goldstein R.A., Model Calculations of Total Maximum Daily Loads of Mercury for Drainage Lakes, Journal of the American Water Resources Association, 44 (5),
EPA, Standards of Performance for New and Existing Stationary Sources: Electric Utility Steam Generating Units. 40 CFR Part, 60, 63, 72, and 75. p 184.
EPA, 2011a. Regulatory Impact Analysis of the Proposed Toxics Rule: Final Report. March 20.
EPA, 2011b. EPA-452/D Technical Support Document: National-Scale Mercury Risk Assessment Supporting the Appropriate and Necessary Finding for Coal- and Oil-Fired Electric Generating Units. March.
Grigal D.F., Inputs and Outputs of Mercury from Terrestrial Watersheds: A Review, Environmental Review, 10, Harris, R.C., Pollman C., Landing W., Hutchinson D., Evans D., Axelrad D., Morey S.L., Sunderland E., Rumbold D., Dukhovskoy D., Adams D., Vijayaraghavan K., Mercury Cycling, Bioaccumulation and Human Exposure in the Gulf of Mexico. Presented at the 10th International Conference on Mercury as a Global Pollutant, Halifax, Nova Scotia, Canada, July 27. Harris R.C., Rudd J.W.M., Amyot M., Babiarz C.L., Beaty K.G., Blanchfield P.J., Bodaly R.A., Branfireun B.A., Gilmour C.C., Graydon J.A., Heyesk A., Hintelmann H., Hurley J.P., Kelly C.A., Krabbenhoft D.P., Lindberg S.E., Mason R.P., Paterson M.J., Podemski C.L., Robinson A., Sandilands K.A., Southworth G.R., St. Louis V.L., Tatem M.T., Whole-ecosystem Study Shows Rapid Fish-mercury Response to Changes in Mercury Deposition,

75 Keeler G.J., Landis M.S., Norris G.A., Christianson E.M., Dvonch J.T., Sources of Mercury Wet Deposition in Eastern Ohio, USA, Environmental Science & Technology, 40, Krabbenhoft D.P., Engstrom D., Gilmour C., Harris R., Hurley J., Mason R., Monitoring and Evaluating Trends in Sediment and Water Indicators. In Harris R., Krabbenhoft D., Mason R., Murray M.W., Reash R., Saltman T. (Eds.), Ecosystem Responses to Mercury Contamination: Indicators of Change. New York: Society of Environmental Toxicology and Chemistry (SETAC) North America Workshop on Mercury Monitoring and Assessment, CRC, pp Orihel D.M., Paterson M.J., Blanchfield P.J., Bodaly R.A., Gilmour C.C., Hintelmann H., Temporal Changes in the Distribution, Methylation, and Bioaccumulation of Newly Deposited Mercury in an Aquatic Ecosystem, Environmental Pollution, 154, Orihel D.M., Paterson M.J., Blanchfield P.J., Bodaly R.A., Hintelmann H., Experimental Evidence of a Linear Relationship between Inorganic Mercury Loading and Methylmercury Accumulation by Aquatic Biota, Environmental Science & Technology, 41, Scudder B.C., Chasar L.C., Wentz D.A., Bauch N.J., Brigham M.E., Moran P.W., Krabbenhoft D.P., Mercury in fish, bed sediment, and water from streams across the United States, : U.S. Geological Survey Scientific Investigations Report , 74 p. 
Seigneur C., Vijayaraghavan K., Lohman K., Karamchandani P., Scott C., Global Source Attribution for Mercury Deposition in the United States, Environmental Science & Technology, 38, Seigneur C., Karamchandani P., Vijayaraghavan K., Shia R.-L., Levin L., On the Effect of Spatial Resolution on Atmospheric Mercury Modeling, Science of the Total Environment, 304, Selin N.E., Jacob D.J., Seasonal and Spatial Patterns of Mercury Wet Deposition in the United States: Constraints on the Contribution from North American Anthropogenic Sources, Atmospheric Environment, 42, Selin N.E., Jacob D.J., Park R.J., Yantosca R.M., Strode S., Jaegle L., Jaffe D., Chemical Cycling and Deposition of Atmospheric Mercury: Global Constraints from Observations, Journal of Geophysical Research, 112, D02308, doi: /2006jd Travnikov O., Contribution of the Intercontinental Atmospheric Transport to Mercury Pollution in the Northern Hemisphere, Atmospheric Environment, 39, Yang H., Rose N.L., Battarbee R.W., Boyle J.F., Mercury and Lead Budgets for Lochnagar, a Scottish Mountain Lake and Its Catchment, Environmental Science & Technology, 36,

3.4 Scientific evidence describes mercury reactions in power plant plumes that alter the relative contribution of U.S. coal-fired EGUs to local and regional mercury deposition

Recently published measurements, field studies, and modeling studies find more gaseous elemental mercury in coal-fired power plant plumes traveling downwind of their sources than would be predicted from stack emissions alone. In these studies, the ratio of gaseous elemental mercury to reactive gaseous mercury in plumes some distance from their source is higher than predicted. Such observations suggest that reactive gaseous mercury rapidly reduces to gaseous elemental mercury as the plume disperses. The reactions implied alter the relative contribution of U.S. coal-fired EGUs to local and regional mercury deposition and should be included in any model calculations of predicted total mercury deposition.

EPA does not consider this scientific evidence in its proposed rule, supporting documents, or model used to predict the contribution of mercury emissions from U.S. EGUs to total mercury deposition. EPA needs to enhance the mercury chemistry routines in its CMAQ model to implement in-plume conversion of reactive gaseous mercury to gaseous elemental mercury, as there is significant scientific evidence supporting this conversion.

EPRI and EPA in-plume field studies

There is field evidence to support the occurrence of in-plume conversion. In the fall of 2002, EPRI began a series of in-plume field studies to measure the results of apparent mercury chemical reactions in plumes from coal-fired EGUs. The first in-plume study was conducted at Plant Bowen, GA, using a fixed-wing aircraft as an airborne sampling platform. Analyzing plume samples for speciated mercury, researchers found that reactive gaseous mercury levels decreased slightly in samples taken 12 miles downwind of the stack, as compared to levels in stack samples. The ratio of gaseous elemental mercury to reactive gaseous mercury was 84% of the in-stack ratio; in other words, elemental mercury concentrations in the plume were 16% higher than those measured in the stack. Researchers suggested a combination of deposition and/or chemical changes in the plume to explain these results [Prestbo et al., 2004].

In the summer of 2003, EPRI conducted a second in-plume study at Plant Pleasant Prairie, WI. This study found a 44% reduction in the fraction of reactive gaseous mercury between the stack exit and the first sampling location (1500 feet downwind), and a 66% reduction from the stack to 5 miles downwind, with no additional reduction between 5 and 10 miles downwind [EPRI, 2005; EPRI, 2006].

Finally, in February 2008, EPRI conducted a third in-plume study at Plant Crist, FL. During this 10-day study, EPRI worked closely with Southern Company and the EPA Office of Research and Development. This time the sampling platform was a lighter-than-air dirigible, chosen by EPA on the theory that it would enhance in-plume sampling time compared to a fixed-wing aircraft. Analyses of in-plume, flue gas, and coal samples showed around 4% conversion of reactive gaseous mercury to gaseous elemental mercury in the plume at about 1 km downwind of the stack tip. These observations agree with those from the previous two EPRI in-plume studies [Landis et al., 2009; Ter Schure et al., 2011]. Tables 3-1 and 3-2 summarize all three in-plume studies and their results.

Table 3-1 Summary of EPRI-Sponsored In-Plume Field Studies

Item | Plant Bowen | Pleasant Prairie | Plant Crist
---- | ----------- | ---------------- | -----------
Field Campaign Date | 09/21 to 10/19/02 | 08/24 to 09/04/03 | 02/18 to 03/01/08
Owner | Southern Company | WE Energies | Southern Company
Coal Type Burned | KY and WV eastern bituminous | Powder River Basin (PRB) subbituminous | Blend: South American, U.S. eastern
Boiler Type | Tangentially-fired, low NOx burners | Opposed-fired, low NOx burners | Wall-fired, low NOx burners
Units: Number and MWs | 4: 2 x 700 MW, 1 x 900 MW, 1 out-of-service | 2: 2 x 617 MW | 4: 1 x 320 MW, 1 x 500 MW, 2 out-of-service
Control Configurations | Units 1 & 2: SCR (bypassed on Unit 2), ESP; Unit 4: ESP | Unit 1: CESP; Unit 2: SCR, CESP | Unit 6: CESP; Unit 7: SCR, CESP
Active Stacks | Two | One | One
In-Plume Sampling Platform | Turboprop DHC Twin Otter | Turboprop DHC Twin Otter | Skyship 600B
Sampling Distances from Plume | Upwind; downwind: 1500 ft, 6 nm, and 12 nm | Upwind; downwind: 1500 ft, 5 mi, and 10 mi | Upwind; downwind at variable distances, representing plume dilution ratios of ~500, ~2500, ~7500, and ~
In-Plume Measurements | Hg0, RGM, Hg Part, NOx, SO2 | Hg0, RGM, Hg Part, NOx | Hg0, Total Hg, Hg Part, NOx, SO2, CO2, RGM (manual)
Mercury Speciation (% of total) | 8-31% Hg0, 69-92% Hg2+ (depending on unit) | 66% Hg0, 34% Hg2+ | 83% Hg0, 16% Hg2+, 1% Hgp
Sampling Protocol | Upwind sample first, then 10 passes at each downwind distance | Upwind sample first, then 10 passes at each downwind distance | Upwind sample first, then hovering at each downwind dilution ratio distance for 60 minutes
In-Stack Measurements | Total Hg, Hg0 using OH-method and CMMs; SO3, total particulates | Total Hg, Hg0 using OH-method and CMMs; NOx | Total Hg, Hg0 using OH-method and CMMs; NOx, SO3, trace elements (EPA Method 29)
Other Measurements | Daily coal and ESP hopper ash samples | None | Daily coal samples, gaseous ions, particulate ions

79 Table 3-2 Measured Percent In-Plume Conversion of Hg2+ to Hg0 at Corresponding Effective Stack Distance (straight-line down wind distance) for the Three In-Plume Studies Plant Bowen Pleasant Prairie Plant Crist Hg 2+ Conversion to Hg 0 (%) < 16 ~ Effective Stack Distance (km) ~ ~ ~ Field observations and modeling studies Scientists generally assume that reactive gaseous mercury present in plumes from coal-fired EGUs deposits relatively close to its source, while most of the gaseous elemental mercury in plumes disperses into the global atmospheric mercury pool [Schroeder and Munthe, 1998; Mason and Sheu, 2002; Pehkonen and Lin, 1998]. But recent field and modeling studies demonstrating in-plume reduction of reactive gaseous mercury to gaseous elemental mercury suggest alternative dynamics [Vijayaraghavan et al., 2008; Edgerton et al., 2006; Lohman et al., 2006 and references therein; Zhao et al., 2006; Pongprueksa et al. 2008]. Edgerton, et al. [2006] measured gaseous elemental mercury, reactive gaseous mercury, and particulate mercury in 41 plumes at three South Eastern Aerosol Research (SEARCH) sites with nearby EGUs in the southeastern United States. In 21 samples, total mercury (reactive gaseous mercury + gaseous elemental mercury) was conserved from stack emission to measurement site. The dominant species was gaseous elemental mercury, constituting 84% of the samples. But emission estimates predicted that gaseous elemental mercury would represent only 42% of the mercury observed in the plumes. In the same study, Edgerton, et al. [2006] observed that ratios of reactive gaseous mercury to sulfur dioxide were lower by a factor of 2 4 than expected in 41 in-plume samples free of precipitation. In-plume reduction of reactive gaseous mercury to gaseous elemental mercury might account for that observation. To further investigate the in-plume reduction of mercury, Lohman, et al. 
[2006] simulated chemical transformations in 9 of the 41 in-plume samples described above using the ROME model. These simulations did not reproduce the depletion in reactive gaseous mercury measured by Edgerton, et al [2006], which could not be explained by dry deposition or by cloud chemistry. However, when incorporating two possible reduction pathways for reactive gaseous mercury into ROME, results showed better agreement between the simulations and the measurements. The two reduction pathways were a pseudo-first-order decay of 0.3 h -1, and an empirical reaction with SO 2 of 8 x cm 3 molecule -1 s -1. Although Pongprueksa, et al. [2008] addressed conditions in the ambient atmosphere not in plumes they found improved agreement between CMAQ-simulated mercury deposition and mercury wet deposition measured at MDN sites when they implemented reduction of gaseous reactive mercury by CO and/or photoreduction, compared to reduction by the hydroperoxyl 3-19

radical (HO2). The empirical second-order rate constant of 5 x cm^3 molecule^-1 s^-1 for CO was comparable to the empirical reduction with SO2 described above by Lohman et al. [2006]. The photoreduction mechanism had an average first-order rate of 1 x 10^-5 s^-1 (ranging from 10^-6 to 10^-3).

Chemical mechanism studies

Chloride is one of the most important ligands to react with reactive gaseous mercury. Scientists assume that reactive gaseous mercury in an aqueous environment exists primarily as mercuric chloride (HgCl2). But the mechanisms of interaction between mercury in flue gas and chlorine-containing species, SO2, nitrogen oxide (NO), and water (H2O) remain ambiguous. In bench reactor tests using simulated flue gas, Zhao et al. [2006] showed that mercury oxidation rates decreased in the presence of SO2, NO, and water vapor, while mercury reduction rates increased. Removing water from the flue gas blend caused these inhibition or promotion effects to disappear. In other tests, mercury reduction rates increased as flue gas temperature rose, while reduction rates declined as dichlorine (Cl2) concentrations increased [Zhao et al., 2006]. Chemists explain these observations as follows: SO2, NO, and H2O affect the concentrations of chloride ions and free chlorine in flue gas; hence, they determine how effectively chlorine will form the chlorinated sites on unburned carbon needed to oxidize gaseous elemental mercury.

Plume dilution chamber studies

Scientists use a plume dilution chamber (PDC) to simulate plume conditions in the atmosphere by measuring mercury species in flue gas entering the chamber and tracking chemical transformations of those species as the plume dilutes. To date, EPRI has conducted several plume dilution chamber studies [Prestbo et al., 2004; Laudal and Prestbo, 2001]. They include recent studies at Plant Pleasant Prairie, WI (August 2003) and Plant Bowen, GA (October 2002), as well as those at the Energy & Environmental Research Center (EERC) at the University of North Dakota (March 2000) and at WEPCO Presque Isle Power Plant, WI (February 1995).

When simulating rain in the PDC, researchers typically observe a continuous, gradual increase in gaseous elemental mercury, suggesting sulfur dioxide-mediated conversion of reactive gaseous mercury in water droplets to gaseous elemental mercury, followed by diffusion to the droplet interface and transfer to the gas phase. In an alternative scenario with no chemical conversion, dissolved gaseous elemental mercury diffuses to the droplet interface, where mass transfer to the gas phase occurs over the course of the simulation. Given the extremely low water solubility of gaseous elemental mercury, the second explanation is much less likely than the first.

After reviewing the scientific evidence, EPRI finds substantial support for the finding that there is more gaseous elemental mercury in coal-fired power plant plumes traveling downwind of their sources than would be predicted from stack emissions alone. However, EPA does not use this scientific evidence in its model to predict the contribution of mercury emissions from U.S. EGUs to total mercury deposition. EPA needs to enhance the mercury chemistry routines in its CMAQ model to implement in-plume conversion of reactive gaseous mercury to gaseous elemental mercury.
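To give a sense of scale for the pseudo-first-order reduction pathway of 0.3 h^-1 that Lohman et al. [2006] added to ROME, the following minimal sketch computes the fraction of reactive gaseous mercury that would remain after a given plume travel time under that single pathway. The function names and the wind speed in the usage example are hypothetical, and the sketch deliberately ignores deposition, dilution, and every other reaction; it is not a representation of ROME or CMAQ.

```python
import math

# Pseudo-first-order decay rate, per hour, as cited from Lohman et al. [2006].
K_PSEUDO_FIRST_ORDER = 0.3


def travel_time_hours(distance_km, wind_speed_m_per_s):
    """Straight-line plume travel time for a constant wind speed."""
    return distance_km * 1000.0 / wind_speed_m_per_s / 3600.0


def rgm_fraction_remaining(hours, k_per_hour=K_PSEUDO_FIRST_ORDER):
    """Fraction of reactive gaseous mercury left after first-order decay."""
    return math.exp(-k_per_hour * hours)


def half_life_hours(k_per_hour=K_PSEUDO_FIRST_ORDER):
    """Time for half of the reactive gaseous mercury to be reduced."""
    return math.log(2.0) / k_per_hour


if __name__ == "__main__":
    wind = 5.0  # hypothetical wind speed in m/s, for illustration only
    for km in (1.0, 10.0, 20.0):
        t = travel_time_hours(km, wind)
        print(f"{km:5.1f} km: {t:.2f} h, fraction remaining {rgm_fraction_remaining(t):.3f}")
    print(f"half-life: {half_life_hours():.2f} h")
```

Because the decay is exponential in travel time, the converted fraction depends strongly on wind speed and downwind distance, which is one reason field results at different distances are not directly comparable.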

81 3.4.5 References Edgerton E.S., Hartsell B.E., Jansen, J.J., Mercury Speciation in Coal-fired Power Plant Plumes Observed at Three Surface Sites in the Southeastern US, Environmental Science & Technology, 40, EPRI, Evaluation of Mercury Speciation in a Power Plant Plume. Palo Alto, CA: EPRI, Mercury Chemistry in Power Plant Plumes. Palo Alto, CA: Landis M., Ryan J., Oswald E., Jansen J., Monroe L., Walters J., Levin L., ter Schure, A.F.H, Laudal D., Edgerton E., Plant Crist Mercury Plume Study. Presented at Air Quality VII, Arlington, VA, October 27. Laudal D.L, Prestbo E., Investigation of the Fate of Mercury in a Coal Combustion Plume Using a Static Plume Dilution Chamber. Final Report for U.S. Department of Energy. Contract No. DE-FC-26-95FT Lohman K., Seigneur C., Edgerton E., Jansen J., Modeling Mercury in Power Plant Plumes, Environmental Science & Technology, 40, Mason R.P., Sheu G.R., Role of Ocean in the Global Mercury Cycle, Global Biogeochemical Cycles, 16, 1093, doi: /2001gb Pehkonen S.O., Lin C.-J., Two-phase Model of Mercury Chemistry in the Atmosphere, Atmospheric Environment, 32, Pongprueksa P., Lin C.-J., Lindberg S.E., Jang C., Braverman T., Bullock Jr, O.R., Hog T.C., Chuh H.-W., Scientific Uncertainties in Atmospheric Mercury Models III: Boundary and Initial Conditions, Model Grid Resolution, and Hg(II) Reduction Mechanism, Atmospheric Environment, 42, Prestbo E., Levin L., Jansen J.J., Monroe L., Laudal D., Schulz R., Dunham G., Aljoe W., Valente R.J., Michaud D., Swartzendruber P., Interconversion of emitted atmospheric mercury species in coal-fired power plant plumes. Presented at the 7th International Conference on Mercury as a Global Pollutant, Ljubljana, Slovenia; RMZ-Materiali in Geookolje, 2004, 51, Prestbo E.M., Calhoun J., Brunette B., Palidini M., Laudal D., Schulz R., Use of a Dilution Chamber to Measure Stack Emissions and Near-Term Transformations. Presented at EPRI Plume Workshop, EPRI, Palo Alto, CA, USA. 
Schroeder W., Munthe J., Atmospheric Mercury An Overview, Atmospheric Environment, 32, Ter Schure, A., Caffrey J., Gustin M., Holmes C., Hynes A., Landing B., Landis M., Laudel D., Levin L., Nair U., Jansen, J., Ryan J., Walters, J., Schauer J., Volkamer R., Waters D., Weiss P., An Integrated Approach to Assess Elevated Mercury Wet Deposition and Concentrations in the South Eastern United States. Presented at the 10th International Conference on Mercury as a Global Pollutant, Halifax, Nova Scotia, Canada, July

82 Vijayaraghavan K., Karamchandani P., Seigneur C., Balmori R., Chen S.-Y., Plume-ingrid Modeling of Atmospheric Mercury, Journal of Geophysical Research, 113, D24305, doi: /2008jd Zhao Y., Mann M.D., Olson E.S., Pavlish J.H., Dunham G.E., Effects of Sulfur Dioxide and Nitric Oxide on Mercury Oxidation and Reduction under Homogeneous Conditions, Journal of the Air & Waste Management Association, 56, EPRI s comprehensive sector-wide inhalation risk assessment on all 470 coal-fired generating facilities identified no cancer or non-cancer health risks above regulatory risk threshold, in contrast to EPA s 16 case studies assessment In 2008, EPRI initiated a comprehensive evaluation of HAPs emissions and potential inhalation risks attributable to those emissions from coal-fired electric utilities, based on updated sectorwide data for all units with capacity greater than 25 MW [EPRI, 2009]. As part of this research, EPRI prepared revised emission estimates for the entire, current coal-fired fleet (as of base case year 2007) to combine with stack air dispersion modeling for derivation of health risk estimates for each unit. Emissions estimates for mercury and non-mercury HAPs (e.g., trace elements, acid gases, and select organic compounds) were derived through a series of steps reported in EPRI [2009] and summarized below: Developed HAPs and mercury concentration datasets for coals categorized by coal source and geography (county/state/coal region). Developed coal type specific consumption rates for each EGU, including those units burning coal blends, using publically available fuel data. Obtained individual power plant parameters for 2007 base year operations (e.g., control technologies, physical configurations, stack particulate emission rate, stack parameters, etc.). Updated existing emission correlations and emission factors. 
As detailed in the EPRI report [2009], these updated correlations, HAP-specific emission factors, plant configuration parameters, and fuel consumption data (including blended coal composition data) were used to estimate annual emissions (mass/year basis) for each power plant unit. These annual emissions estimates (e.g., lb/tbtu in 2007) served as specific input for the tiered inhalation health risk assessment conducted by EPRI and AECOM in EPRI and AECOM followed guidance from EPA s Office of Air Quality Planning and Standards (OAQPS) in designing the study s tiered approach, based primarily on guidelines published in EPA s Air Toxics Risk Assessment Reference Library [EPA, 2004] with additional input from OAQPS staff. In summary, the tiered approach evaluated chronic non-cancer, acute non-cancer, and cancer risk for a comprehensive group of HAPs in the following three scenarios: Tier 1: Screening level inhalation risk assessment on all 470 coal-fired U.S. generating facilities with a total of 825 stacks (base year 2007) using EPA s SCREEN3 model which is based on ISCST3 dispersion algorithms and applies a generic set of meteorological conditions. 3-22

Tier 2: Inhalation risk assessment using the EPA Human Exposure Model (HEM3-AERMOD) for a subset of 198 power plants identified as highest risk in Tier 1.

Risk evaluation for facilities having stacks located within 50 km of one another: Two-level analysis of 100 facility groups (consisting of two to 10 power plants) with the potential for overlapping plumes.

Summary results for the 470 individual coal-fired power plants included the following:

- Comparison of the overall Tier 1 modeling to the more-refined Tier 2 modeling indicated that Tier 2 risk was substantially lower: 10% of the corresponding Tier 1 risk. Even at the 95th percentile level, Tier 2 risk was only 24.1% of the corresponding Tier 1 risk.
- No individual power plant assessment resulted in a modeled health risk exceeding the EPA recommended thresholds: non-cancer hazard quotient (HQ) > 1, or cancer risk greater than 1 x 10^-6 (1 in a million).
- The 10 facilities with the highest cancer risk had values ranging from 7.14 x 10^-7 to 9.78 x 10^-7, with all values below the 1 x 10^-6 threshold.
- The 10 facilities with the highest chronic non-cancer risk had values ranging from HQ to 0.668, with all values below the HQ > 1 threshold.
- The 10 facilities with the highest acute non-cancer risk had values ranging from HQ to 0.295, with all values below the HQ > 1 threshold.
- The primary chemical drivers of cancer risk were arsenic (average of 76%) and hexavalent chromium (17%), with minor contributions from other trace metals (7%).
- The primary chemical drivers of the chronic non-cancer risk were chlorine (average of 97%) and hydrogen chloride (1%).
- The primary chemical drivers of the acute non-cancer risk were arsenic (average of 52%) and acrolein (9%), with additional contributions from hydrogen chloride, chlorine, and hydrogen fluoride.

Summary results for the two-level assessment of the 100 facility groups with potentially overlapping plumes included the following:

- The screening assessment, using simple addition of component facility maximum risk, identified 22 facility groups for refined analysis.
- Further analysis of these 22 facility groups, based on refined HEM3-AERMOD modeling with risks summed across the group on a receptor-by-receptor basis, identified two groups of facilities with potentially overlapping plume domains that could result in risks above the cancer threshold.
- The refined, combined cancer risk for the 22 facility groups ranged from 3.95 x 10^-7 to 1.21 x 10^-6.
- The two highest facility groups marginally exceeded the 1 x 10^-6 cancer risk threshold: four facilities on the Illinois/Indiana border at 1.03 x 10^-6 and five facilities on the Ohio/Pennsylvania border at 1.21 x 10^-6.
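The difference between the two levels of group analysis described above can be sketched as follows. The screening level adds each facility's maximum risk regardless of where that maximum occurs, while the refined level sums risks at each receptor and then takes the highest receptor total. The function names, receptor labels, and risk values are all hypothetical and chosen only to show why the screening sum is the more conservative of the two; this is not the HEM3-AERMOD procedure itself.

```python
CANCER_RISK_THRESHOLD = 1e-6  # 1 in a million


def screening_group_risk(facility_risks):
    """Add each facility's maximum risk, wherever that maximum occurs."""
    return sum(max(risks.values()) for risks in facility_risks)


def refined_group_risk(facility_risks):
    """Sum risks receptor by receptor, then take the highest receptor total."""
    receptors = set().union(*(risks.keys() for risks in facility_risks))
    return max(
        sum(risks.get(receptor, 0.0) for risks in facility_risks)
        for receptor in receptors
    )


if __name__ == "__main__":
    # Hypothetical modeled cancer risks at two shared receptors.
    group = [
        {"receptor_1": 6e-7, "receptor_2": 1e-7},  # facility A
        {"receptor_1": 1e-7, "receptor_2": 5e-7},  # facility B
    ]
    screen = screening_group_risk(group)
    refined = refined_group_risk(group)
    print(f"screening: {screen:.2e}, above threshold: {screen > CANCER_RISK_THRESHOLD}")
    print(f"refined:   {refined:.2e}, above threshold: {refined > CANCER_RISK_THRESHOLD}")
```

In this illustration the two facilities' maxima fall at different receptors, so the screening sum crosses the threshold while the receptor-by-receptor total does not.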

84 Given the conservative aspects of this analysis, such as the presumed 70-year lifetime exposure at a fixed outdoor location, the actual inhalation risks are likely to be well below the significance level. In summary, a comprehensive tiered inhalation risk assessment using EPA-prescribed methods with improved emission factors, fuel data, and confirmed stack parameters did not identify significant health risks (cancer or non-cancer) among U.S. coal-fired power plants (as they existed in 2007). These results contrast with those presented by EPA for its non-mercury case studies on 16 (15 coal-fired) power plants [EPA, 2011]. As further described in EPRI Comments, Section 3.6, several issues appear to underlie these differences, indicating the need for EPA to reevaluate its assessment and to undertake a Tier 3 risk assessment for any facility of concern. In a Tier 3 multi-pathway risk assessment using EPA-prescribed methods along with improved data and analytical functionality, EPRI [EPRI, 2011] found no significant health or ecological risks for aquatic or terrestrial receptors from mercury and arsenic emissions at a modeled coalfired generating facility. Appendix J presents a summary of this multi-pathway risk study References EPRI, Updated Hazardous Air Pollutants (HAPs) Emissions Estimates and Inhalation Human Health Risk Assessment for U.S. Coal-Fired Electric Generating Units. EPRI, Palo Alto, CA: EPRI, Multi-Pathway Human Health and Ecological Risk Assessment for a Model Coal- Fired Power Plant. AECOM Report to Electric Power Research Institute (EPRI), Palo Alto, CA. August. EPA, Air Toxics Risk Assessment Reference Library, Volume 2, Facility-Specific Assessment, U.S. EPA Office of Air Quality Planning and Standards. EPA B. April. 
Strum M., Thurman J., Morris M., MEMORANDUM to Docket EPA-HQ-OARL , Non-Hg Case Study Chronic Inhalation Risk Assessment for the Utility MACT Appropriate and Necessary Analysis, March EPA s 16 case studies for inhalation risk assessment need to be reevaluated due to erroneous data that affect EPA s final risk numbers Summary of EPA s risk assessment To support its appropriate and necessary analysis of coal- and oil-fired EGUs, EPA evaluated the maximum lifetime chronic inhalation risk for cancer and non-cancer health outcomes from utility emissions of HAPs other than mercury. The Agency selected several facilities as Chronic Inhalation Risk Assessment (CIRA) case studies, and documented the methods and results of these case studies in Strum, et al. [2011]. Before ICR [EPA, 2009] emissions data were available, EPA selected an initial set of eight case study facilities. 3 These facilities had the highest estimated cancer and non-cancer risks, based on 3 Xcel Bayfront, SC&E Canadys, Dominion Chesapeake Energy Center, Exelon Cromby Generating Station, Spruance Genco, PSI Energy Wabash River, HECO Waiau, and Dominion Yorktown 3-24

85 the 2005 National Emissions Inventory (NEI) data analyzed using HEM-3. For these facilities, EPA reported that nickel (Ni), hexavalent chromium (Cr+6), and arsenic (As) were the cancer risk drivers; non-cancer risks, such as those from hydrogen chloride, were not significant. After ICR emissions data became available, EPA selected eight additional case study facilities, 4 based on magnitude of emissions, heat input values (throughput), and level of emission control. Thus, the full set of case studies included 16 facilities, all selected for elevated risk potential. To calculate chronic inhalation exposure to emissions from these 16 case study facilities, EPA first computed actual emissions and potential emissions 5 on an EGU unit-by-unit basis. Then the Agency modeled emissions dispersion using AERMOD. In this exercise, each boiler or combination of boilers at a facility was modeled as an individual emission point, and census block centroids within 20 km of each facility were used as model receptors. Inputs to AERMOD include 5 years of meteorological data, as well as 1992 land cover data used to calculate surface characteristics at the meteorological station sites. For the risk assessment for chronic inhalation exposures, EPA first estimated 5-year average HAPs ambient concentrations at census block centroids from the dispersion modeling. The estimated concentration at a census block centroid then became a surrogate for the concentration to which all people residing in that census block were chronically exposed. Then, maximum individual cancer risk (MIR) for each facility was calculated. 6 For its actual emissions case, EPA s analysis estimated the highest lifetime inhalation cancer risk from any of the 16 case study facilities was from Hawaiian Electric Company (HECO) Waiau, an oil-fired facility where a greater than 1 in a million risk (1.0 x 10-5 ) was driven by nickel emissions. 
EPA calculated that three coal-fired case study facilities had maximum cancer risks greater than 1 in a million: City Utilities of Springfield James River (8.0 x 10^-6); Dominion Chesapeake Energy Center (3.0 x 10^-6); and Conesville (3.0 x 10^-6). EPA's calculations determined that risks at these coal-fired facilities were driven by hexavalent chromium emissions. It is important to note that a screening level inhalation cancer risk of greater than 1 in a million indicates the need for more detailed study of the factors contributing to risk at those facilities. All 16 of the case study facilities had non-cancer target-organ-specific hazard index (TOSHI) values of less than one, with a maximum TOSHI value of 0.4 at oil-fired HECO Waiau.

In examining EPA's chronic inhalation risk assessment for the 16 case study facilities, EPRI has identified several critical issues, related to the underlying data, where flaws need to be addressed. As described below, these issues involve stack parameters, stack concentrations, emission rates, and maximum lifetime exposures.

[4] Cambria Cogen, Conesville, TVA Gallatin, City Utilities of Springfield James River, Ameren Labadie, PSNH Merrimack, Monticello Steam Electric Plant, and OG&E Muskogee.

[5] Actual emissions were developed directly from actual test data; potential emissions were developed when actual test data were not available for facilities/units. Emission factors for these units were calculated from similarly configured units at the facility (site-average across tested units) or from similarly configured units from other facilities tested (average across tested units with similar configuration).

[6] A risk due to continuous exposure to the maximum concentration, that is, 24 hours per day, 7 days per week, and 52 weeks per year for a 70-year period at the centroid of an inhabited census block.
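The MIR screening step described above can be sketched in a few lines. This is a minimal illustration, not EPA's HEM-3 implementation: it assumes that risk at each census block centroid is the sum over pollutants of modeled long-term concentration times a unit risk estimate (URE), and that the MIR is the largest such value among inhabited blocks. The function names, block identifiers, concentrations, and URE values are hypothetical placeholders.

```python
def block_cancer_risk(conc_ug_m3, ure_per_ug_m3):
    """Lifetime cancer risk at one receptor, summed across pollutants.

    Assumes risk = concentration (ug/m3) x unit risk estimate (per ug/m3)
    under continuous 70-year exposure at the block centroid.
    """
    return sum(conc * ure_per_ug_m3[name] for name, conc in conc_ug_m3.items())


def maximum_individual_risk(blocks, ure_per_ug_m3):
    """Return (block_id, risk) for the receptor with the highest modeled risk."""
    risks = {
        block_id: block_cancer_risk(conc, ure_per_ug_m3)
        for block_id, conc in blocks.items()
    }
    worst = max(risks, key=risks.get)
    return worst, risks[worst]


# Hypothetical UREs and 5-year average concentrations, for illustration only.
ure = {"Cr6": 1e-2, "Ni": 5e-4}
blocks = {
    "block_A": {"Cr6": 1e-4, "Ni": 2e-3},
    "block_B": {"Cr6": 5e-4},
}
worst_block, mir = maximum_individual_risk(blocks, ure)
print(worst_block, mir)
```

Because the result is linear in each concentration, any error in an emission rate or stack parameter that shifts the modeled concentration at the controlling receptor carries through proportionally to the reported MIR.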

3.6.2 Several stack parameters, stack concentrations, and emission rates used by EPA are flawed and in some cases erroneous

A review of stack parameters, stack concentrations, and emission rates conducted by EPRI indicates that corrections to EPA emission estimates for arsenic, chromium, and nickel are warranted. EPRI reviewed the supporting information file provided by EPA ("Casestudy_emis_26apr2011 for EPRI.xls", called the EPA case study spreadsheet in these Comments) to evaluate EPA's methodology for estimating emissions from those case study facilities where EPA estimated a lifetime cancer risk at or greater than 1 in a million. EPA did not provide data for case study facilities where the Agency estimated a lifetime cancer risk at less than 1 in a million.

For EGUs tested in the 2010 ICR, EPA appears to have used the reported Part III emission test data to develop annual emissions estimates for use in its case study risk evaluation. For EGUs not tested as part of the ICR, EPA used (a) data from a sister EGU that was tested, or (b) an average emission factor (EF) derived from ICR-tested EGUs grouped by type of control system (e.g., ESP, fabric filter, ESP/wet FGD, etc.). In employing an arithmetic mean for the emissions data from each control class bin, EPA did not apply the proper statistical methodology for lognormally distributed data sets; the geometric mean should be used.

EPRI's review of EPA's supporting information indicates that several of the coal-fired case study facilities with an EPA-calculated cancer risk of greater than 1 in a million have chromium as the primary risk driver (Appendix E, Table E-1). This is contrary to previous health risk modeling results [EPRI, 2009] that did not show chromium as the primary cancer risk driver for any coal-fired EGU of greater than 25 MW capacity. Therefore, EPRI performed a more detailed evaluation of the chromium emissions data obtained from the ICR test sites. This evaluation is presented in Appendix E.
EPRI also found several errors in the stack parameters (e.g., number of stacks, stack flow, velocity) used by EPA for several case study EGUs. These errors would likely have implications for calculated emissions, as well as for the overall risk assessment. The following discussion highlights the results of EPRI's data quality assurance/quality control (QA/QC) review.

Seven EGUs have significantly elevated chromium concentrations and, consequently, anomalous emission factors: EPRI's review of the individual run data in EPA's data set found seven EGUs with elevated concentrations of chromium, arsenic, nickel, and sometimes manganese (Mn) (see Appendix E for further details). The average chromium concentration from these suspect runs was more than an order of magnitude higher than the average chromium concentration obtained in all the other runs. [7] Thus, two EGUs (Conesville Unit 3, James River Unit 5) had chromium EFs that were statistically significant outliers or extreme values (in this case, defined as the average of the EF data set plus three standard deviations, p < ). Moreover, in some cases the amount of chromium measured at the stack was greater than the amount entering with the coal (e.g., James River Unit 5, Gallatin Unit 2). This suggests potential metallic contamination in one or more runs, and data from these seven EGUs should be excluded from any risk analysis.

[7] EPA's average Cr emission factor for the Phase II 1-ESP bin was 5.62E-5 lb/mmbtu. If the 7 EGUs with suspected contamination are excluded, the average is 6.2E-6 lb/mmbtu.

Emission factors are not appropriately calculated: EPRI's review of the ICR data (Bin 1 ESP only) indicates that, based on EPA's own guidelines, the data are clearly not normally distributed. Thus, the arithmetic mean is not appropriate; the geometric mean is the more appropriate measure of central tendency for a lognormal distribution.

Emission factors are not differentiated by coal rank: For case study facilities not tested in the ICR, the EPA case study spreadsheet provides an assignment to a specific coal rank and control class. However, for case study facilities with ESPs not tested in the ICR, EPA apparently used a single average EF that is not differentiated by coal rank.

The difference between the Phase I and Phase II derivation of emission factors is not clear: Further explanation from EPA is needed to clarify what is meant by Phase I and Phase II EF derivation.

ICR test EGUs were omitted from the 1-ESP bin used by EPA to calculate the Phase II average emission factors: Sunbury, boiler 4, a coal-fired unit with ESP controls, should be included in the 1-ESP bin calculations. EPA should check the ICR data set to be sure all test units have been properly assigned to a control class bin for the emission factor calculations.

Several stack parameters used by EPA are erroneous: EPRI found that several of the stack parameters used by EPA are incorrect. This conclusion is based on consultation with the facilities listed in the EPA case study spreadsheet, and on EPRI's 2007 data [EPRI, 2009]. Appendix F, Table F-1 lists all the case study EGUs, EPA's stack parameters, and EPRI's corrected stack parameters. EPA needs to incorporate these corrected stack parameters in a re-assessment of the non-mercury chronic inhalation risks from the selected case study facilities.
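The statistical points above (arithmetic versus geometric mean for skewed emission factors, and the mean-plus-three-standard-deviations screen for extreme values) can be illustrated with a short sketch. The emission factor values below are hypothetical and are not ICR data; the function names are illustrative, and the screen is applied on the untransformed scale as described in the text.

```python
import math
import statistics


def geometric_mean(values):
    """Geometric mean: exponential of the mean of the natural logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def flag_extremes(values, k=3.0):
    """Return values above (arithmetic mean + k sample standard deviations)."""
    cutoff = statistics.mean(values) + k * statistics.stdev(values)
    return [v for v in values if v > cutoff]


# Hypothetical chromium EFs (lb/mmbtu) for one control-class bin:
# 19 typical units and one unit with a suspected contaminated run.
efs = [3e-6, 4e-6, 4e-6, 5e-6, 5e-6, 5e-6, 6e-6, 6e-6, 7e-6, 8e-6,
       3e-6, 4e-6, 5e-6, 5e-6, 6e-6, 6e-6, 7e-6, 4e-6, 5e-6, 5e-4]

print("arithmetic mean:", statistics.mean(efs))
print("geometric mean: ", geometric_mean(efs))
print("flagged extremes:", flag_extremes(efs))
```

In this made-up bin a single elevated unit pulls the arithmetic mean well above every typical value, while the geometric mean stays close to the bulk of the data. That is the behavior underlying the concern about applying a bin-average EF to untested units.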

3.6.3 EPRI's (2009) screening risk assessment finds lower risk numbers and warrants that EPA re-analyze its Tier 2 risk assessment and conduct a Tier 3 risk analysis

In 2009, EPRI conducted a Tier 2 screening risk assessment that included, among others, the case study facilities reported by EPA as at or above 1 in a million risk [EPRI, 2009; Strum et al., 2011]. Table 3-3 allows comparison of the results of EPRI's study with EPA's 2011 risk numbers. As presented in the table, the EPRI study found that all coal-fired facilities, including these eleven facilities, had maximum individual lifetime inhalation cancer risks below 1 in a million.

Table 3-3 Comparison of EPA Modeled Maximum Individual Lifetime Inhalation Cancer Risk (1 in a million) with EPRI (2009) Results

Facility | EPA 2011 | EPRI 2009*
Cambria Cogeneration
SC&E Canadys
Dominion Chesapeake Energy Center
Conesville
Exelon Cromby Generating Station
TVA Gallatin
City Utilities of Springfield James River
Ameren Labadie
PSNH Merrimack
Monticello Steam Electric Plant
OG&E Muskogee

* EPRI 2007 Emissions Estimates with EPRI 2007 Stack Parameters

Differences in the way modeling was applied contribute to the differences between the modeled risks in the EPA study and the EPRI study. For example, these differences relate to:

Meteorological data: EPRI 2009 and 2011 modeling use one year (mostly 1991) of data from the EPA HEM3-AERMOD website, while EPA 2011 modeling selected 5 years of data and, in some cases, used different sites. For the Muskogee facility specifically, EPA used Muskogee Davis Field for surface data while the HEM3 data are from Oklahoma City. However, both surface data and HEM3 data for the James River facility are from Springfield Regional Airport.

Emissions variability: The EPRI 2009 study used a constant annualized emission rate, while the EPA 2011 study applied an hourly utilization factor for each EGU. To the extent that variations in utilization factor are correlated with seasonal changes in meteorology, the modeled long-term concentrations would differ.

In calculating cancer risk, EPA took the average risk over 5 years, rather than over one year. Experience with modeling indicates that year-to-year variations in maximum annual average concentrations at a specific receptor can differ by a factor of about 1.5, and inter-site differences in meteorology can easily result in differences of more than a factor of 2.0. Thus, the differences between EPRI and EPA modeled risk results are within the expected range of variability given the differences in modeling methods and meteorological data. These differences underscore the cancer risk model's high sensitivity to input data selection.

A high level of conservatism is built into EPA's risk model. For example, the MIR implies that a person stays exactly at the centroid of a census block for 70 years (from cradle to grave), leading to an unrealistic over-estimation of the risks. In addition, ICR data are based on short-term (3-hour) measurements that are not representative of the long-term (decades) operation typical of U.S. EGUs. Given that most case study facilities have maximum lifetime risks at or below 1 in a million, these arguments lead EPRI to conclude that a Tier 3 risk assessment is warranted.

Summary

In summary, EPRI's review identified the following:

The ICR data for three case studies, as well as an additional 14 EGUs used to develop emission factors, have significantly elevated levels of chromium, likely due to contamination. These data should be excluded from any risk analyses.

EPA's EF methodology of employing arithmetic means is not appropriate since the data are lognormally distributed; geometric means should be used.

EPA's stack parameters are in some cases erroneous.

All of the EPRI modeled risks for coal facilities (for the base case year of 2007) are below 1 in a million.
High levels of conservatism are built into the risk models used.

Thus, EPA should recalculate its Tier 2 screening risk analyses using correct input variables and emission estimates, reassess the relevance of the current MIR to the estimated risk numbers, and conduct a Tier 3 risk assessment. Finally, EPA needs to assess how its incorrect parameters and input variables extrapolate to all other U.S. EGUs, and thus to the findings for its entire Tier 2 risk assessment.

References

EPA, Information Collection Effort for New and Existing Coal- and Oil-fired Electric Utility Steam Generating Units (EPA ICR No ; OMB Control Number ), December

EPRI, Updated Hazardous Air Pollutants (HAPs) Emissions Estimates and Inhalation Human Health Risk Assessment for U.S. Coal-Fired Electric Generating Units. EPRI, Palo Alto, CA:

Strum M., Thurman J., Morris M., MEMORANDUM to Docket EPA-HQ-OARL , Non-Hg Case Study Chronic Inhalation Risk Assessment for the Utility MACT Appropriate and Necessary Analysis, March

EPA provides no clear definition of subsistence, near-subsistence, or high-end populations at risk

The TSD lacks a clear definition, with supporting assumptions and terms, for the subsistence, near-subsistence, or high-end populations being evaluated in this watershed-level risk assessment. Variations of all three terms intermingle throughout the document. Although the TSD cites (see TSD, Section 1.2) an earlier EPA report [EPA, 2000] to define subsistence fishers as individuals who rely on noncommercial fish as a major source of protein, the TSD interprets this as self-caught fish consumption ranging from a fish meal (8 ounces) every few days to a large fish meal (12 ounces or more) every day (approximately grams per day), an interpretation that is not consistent with earlier EPA documents. The reasoning behind this consumption range definition, including any scientific support, is not provided [TSD, page 17, footnote 23].

The TSD Executive Summary (page 2, footnote 3) gives another Agency characterization of high-end consumption rates, describing a meal every 1-2 days as clearly subsistence. Elsewhere, the TSD Executive Summary (page 8, paragraph 3) states that the high-end percentile consumption rates (90th to 99th) (i.e., 120 grams per day (g/day) to greater than 500 g/day fish consumption) define particular populations of interest. In the past, the Agency has recommended various default consumption rates (in the general range of 130 to < 150 g/day) to provide default intakes for subsistence fishers under the Risk Assessment Guidance for Superfund (RAGS) [EPA, 1989; EPA, 1991] or the Fish Advisory Guidance [EPA, 2000]. These default consumption rates are derived from various studies and generally are based on 90th to 99th percentile distribution estimates.
Clarity about historic versus current TSD definitions of fish consumers at risk would assist readers in evaluating the current national scale risk assessment, particularly the influence of choosing targeted, small-sample survey data over the previous EPA recommendations.

The TSD narrative (including page 27, paragraph 4) seems to describe a body of peer-reviewed or other literature supporting the identification, selection, and extrapolation of the source populations chosen to represent subsistence fishing exposure and risk, including a variety of diverse populations in different regions of the country. However, only three studies are used in EPA's analyses [Burger, 2002; Schilling et al., 2010; Dellinger, 2004], and two other studies are only mentioned briefly in TSD Appendix C [Burger et al., 1999; Moya et al., 2008]. It is unclear what literature the Agency says generally supports the plausibility of high-end subsistence-like fishing to some extent across the watersheds. If other studies exist, then EPA should provide the values for comparison.

The three studies actually used to provide subsistence population estimates, which were extrapolated to the national scale, included a limited number of individuals living in diverse and localized areas, as briefly summarized below in Table 3-4.

Table 3-4 Summary of Studies Used by EPA to Estimate U.S. Subsistence Populations

 | Burger 2002 | Schilling et al. 2010 | Dellinger 2004
General Location | South Carolina | California, Central Valley | Great Lakes Area, Multiple Tribes & States
Study Site | Recreational Outdoor Show | River / River plus Community* | Social & Community Health
Study Type | Convenience, Interview, Recall 12-month | Convenience Angler, Intercept (w/ Community*), Recall 30-day | Convenience, Questionnaire, Recall 12-month
Population (N), % Consumers if Stated | Whites (415) 78%; Blacks (39) 79%; Females (149) 72% | Hispanics (45); Vietnamese (33); Lao (54)* | Chippewa/Ojibwa (822)

* Composite convenience survey design included both angler and community volunteers

Small sample sizes could result in less reliable exposure estimates for these subsets of the U.S. population. No studies are provided in the TSD or Appendix C comparing the degree of similarity, or dissimilarity, between local study source populations and subsistence fishers in the United States. Creel or other survey data for fish consumers could indicate a relative range of consumption for comparison purposes.

For transparency, the Agency should summarize any available supporting studies by basic study content, characteristics, design, size, demographics, dietary recall period, and fish intake rates by the demographic variables (e.g., sex, race, socioeconomic status/income, geographic area) important in the TSD. This would support the scientific validity of the assessment, and better illustrate the potential variability and uncertainty involved in extrapolating data from small populations to the national scale. It would also provide data for a sensitivity analysis of the relative influence these variables have on risk over- or underestimation.
By consistently summarizing all data available, EPA can avoid any appearance of selective presentation and application of the data.

References

Burger J., Stephens W.L., Boring C.W., Jujlinski M., Gibbons J.W., Gochfeld M., Factors in Exposure Assessment: Ethnic and Socioeconomic Differences in Fishing and Consumption of Fish Caught along the Savannah River, Risk Analysis, 19 (3),

Burger, J., Daily Consumption of Wild Fish and Game: Exposures of High End Recreationists, International Journal of Environmental Research and Public Health, 12 (4),

Dellinger J.A., Exposure Assessment and Initial Intervention Regarding Fish Consumption of Tribal Members of the Upper Great Lakes Region in the United States, Environmental Research, 95 (3),

EPA, Risk Assessment Guidance for Superfund (RAGS). EPA/540/1-89/002. December.

EPA, Risk Assessment Guidance for Superfund (RAGS). Part C 1991 EPA/ C. October.

EPA, National Guidance: Guidance for Assessing Chemical Contaminant Data for Use in Fish Advisories, Volume 2. EPA 823-B , November.

Moya J., Itkin C., Selevan S.G., Rogers J.W., Clickner R.P., Estimates of Fish Consumption Rates for Consumers of Bought and Self-caught Fish in Connecticut, Florida, Minnesota, and North Dakota, Science of the Total Environment, 15 (403) [1 3],

Schilling F., White A., Lippert L., Lubell M., Contaminated Fish Consumption in California's Central Valley Delta, Environmental Research, 110 (4),

EPA does not provide clear, documented support for selection and application of the cooking loss factor

EPA chose to use a cooking loss factor of 1.5 as a multiplier to modify calculations of the daily methylmercury intake rate to account for the concentration of methylmercury in fish tissue due to cooking (e.g., water evaporation) (see TSD, Section 1.3 and Appendix D). However, no justification for selecting this factor, which increases calculated daily intake estimates, is presented (e.g., TSD, page 9 and Appendix D). This factor increases estimated intake by 50%, thus increasing the daily methylmercury intake rate by a constant factor of 33% (using the formula in Appendix D) and also increasing any resulting hazard quotient (HQ) risk estimate by a similar factor.

The scientific support for this factor is a single citation [Morgan et al., 1997], offered without discussion about why 1.5 was selected. Morgan, et al. reported a range of methylmercury concentrations resulting from cooking fish using a variety of methods. Cooking increased tissue concentrations by 10-60% relative to the concentrations observed in raw fillets (walleye, lake trout). An investigation of methylmercury concentration in breaded, deep-fried largemouth bass reported cooking loss factors of [Burger et al., 2003].
However, other studies have found no or highly variable changes in methylmercury levels as a result of cooking fish [Armbruster et al., 1988; Gutenmann and Lisk, 1991; Farias et al., 2010; Perello et al., 2008; Torres-Escribano et al., 2011]. EPA should calculate and present results for a range of cooking loss values to illustrate the influence of the specific factor selected.

Additionally, EPA should make it clear that cooking loss factors are applied to consumption estimates from survey or interview studies that record estimates (including application of models or other visual aids) of as-consumed fish portions. These factors are not applicable to studies surveying recollection of raw or uncooked fish tissue portions [Burger et al., 2003]. In previous documents, including the Exposure Factors Handbook Update [EPA, 1997] and Guidance for Assessing Chemical Contaminant Data for Use in Fish Advisories [EPA, 2000], EPA suggests using uncooked fish values for exposure assessments and fish advisories if population-specific data are unavailable. It remains unclear why a default value of 1.5 was selected as an exposure modifier for use across all subpopulations in the present analysis, especially given the potential for large geographic and cultural differences in cooking practices.
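The kind of sensitivity calculation requested above could be sketched as follows. This is a minimal illustration under stated assumptions, not the Appendix D formula, which is not reproduced here: it assumes a simplified intake equation in which dose equals fish tissue concentration times consumption rate times the cooking factor, divided by body weight, and an HQ equal to dose divided by an RfD supplied by the caller. The function names, the default RfD of 0.1 micrograms per kg per day, and all input values are illustrative assumptions.

```python
def hazard_quotient(conc_ug_per_g, intake_g_per_day, body_weight_kg,
                    cooking_factor=1.0, rfd_ug_per_kg_day=0.1):
    """HQ under a simplified linear intake model (assumption, not Appendix D)."""
    dose = conc_ug_per_g * intake_g_per_day * cooking_factor / body_weight_kg
    return dose / rfd_ug_per_kg_day


def cooking_factor_sensitivity(factors, **exposure):
    """Map each candidate cooking factor to the resulting HQ."""
    return {f: hazard_quotient(cooking_factor=f, **exposure) for f in factors}


# Hypothetical exposure scenario; 120 g/day is the low end of the
# high-end consumption range quoted from the TSD.
scenario = dict(conc_ug_per_g=0.2, intake_g_per_day=120.0, body_weight_kg=65.0)
for factor, hq in cooking_factor_sensitivity([1.0, 1.1, 1.3, 1.5, 1.6], **scenario).items():
    print(f"cooking factor {factor:.1f} -> HQ {hq:.2f}")
```

Under this simplified model the HQ scales in direct proportion to the cooking factor, so a table of this form shows how many watershed exceedances depend on the choice of 1.5 rather than a value nearer 1.0.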

3.8.1 References

Armbruster G., Gerow K.G., Lisk D.J., The Effects of Six Methods of Cooking on Residues of Mercury in Striped Bass, Nutrition Reports International, 37,

Burger J., Dixon C., Boring C.S., Effect of Deep-frying Fish on Risk from Mercury, Journal of Toxicology and Environmental Health, Part A, 66 (9),

EPA, Exposure Factors Handbook Update. EPA May 1989, EPA 600-P August.

EPA, National Guidance: Guidance for Assessing Chemical Contaminant Data for Use in Fish Advisories, Volume 2. EPA 823-B , November.

Farias L.A., Favaro, D.I., Santos J.O., Vasconcellos M.B., et al., Cooking Process Evaluation on Mercury Content in Fish, Acta Amazonica, 40 (4),

Gutenmann, W.H. and Lisk D.J., Higher Average Mercury Concentration in Fish Fillets after Skinning and Fat Removal, Journal of Food Safety, 11,

Morgan J.N., Berry M.R., Graves R.L., Effects of Commonly Used Cooking Practices on Total Mercury Concentration in Fish and Their Impact on Exposure Assessments, Journal of Exposure Analysis and Environmental Epidemiology, 7 (1),

Perelló G., Martí-Cid R., Llobet J.M., Domingo J.L., Effects of Various Cooking Processes on the Concentrations of Arsenic, Cadmium, Mercury, and Lead in Foods, Journal of Agricultural and Food Chemistry, 156 (22),

Schilling F., White A., Lippert L., Lubell M., Contaminated Fish Consumption in California's Central Valley Delta, Environmental Research, 110 (4),

Torres-Escribano S., Ruiz A., Barrios L., Vélez D., Montoro R., Influence of Mercury Bioaccessibility on Exposure Assessment Associated with Consumption of Cooked Predatory Fish in Spain, Journal of the Science of Food and Agriculture, 91 (6),

EPA does not clearly define criteria for assignment of census tracts to HUC12 watershed

EPA combined two parameters with differing scales to establish the geographic unit used in the TSD risk assessment. Hydrologic Unit Code (HUC) watersheds average about 35 square miles in size, while the U.S. census tracts (CTs) used to identify watersheds relevant for subpopulations of interest cover a few tenths to hundreds of square miles (see TSD, Executive Summary, footnotes 10 on page 9, 11 on page 9, 21 on page 16; Sections 1.2 and 1.3; Appendix B). It remains unclear how these differences in geographic resolution were handled in the analyses.

EPA cites no criteria for assigning an individual CT to an individual watershed, and hence to a methylmercury intake level developed for that watershed. Were CTs assigned based on how much they intersected with a portion of, or were contained in, a watershed? Were they assigned based on tract centroids? In the case of multiple CTs assigned to one watershed, did only one tract, a subset of tracts, or all HUC-assigned tracts need to meet a minimum threshold (such as greater than 25 members of the subpopulation, or more than 25 people below the poverty level) to be included in the risk assessment for a subpopulation of interest?
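The questions above matter because different assignment rules can place the same tract in different watersheds. The sketch below contrasts two candidate rules, centroid containment and fractional area overlap, using axis-aligned rectangles as stand-ins for real tract and HUC12 polygons. Neither rule is known to be the one EPA applied; the geometry, identifiers, and threshold are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle standing in for a tract or watershed polygon."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def area(b: Box) -> float:
    return max(0.0, b.xmax - b.xmin) * max(0.0, b.ymax - b.ymin)


def overlap_area(a: Box, b: Box) -> float:
    w = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    h = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return w * h if w > 0 and h > 0 else 0.0


def assign_by_centroid(tract: Box, watersheds: Dict[str, Box]) -> List[str]:
    """Watersheds whose extent contains the tract centroid."""
    cx = (tract.xmin + tract.xmax) / 2
    cy = (tract.ymin + tract.ymax) / 2
    return [wid for wid, w in watersheds.items()
            if w.xmin <= cx <= w.xmax and w.ymin <= cy <= w.ymax]


def assign_by_overlap(tract: Box, watersheds: Dict[str, Box],
                      min_fraction: float = 0.0) -> List[str]:
    """Watersheds covering more than min_fraction of the tract's area."""
    return [wid for wid, w in watersheds.items()
            if overlap_area(tract, w) / area(tract) > min_fraction]


tract = Box(0, 0, 10, 10)
hucs = {"HUC_A": Box(-5, -5, 6, 6), "HUC_B": Box(8, 8, 20, 20)}
print(assign_by_centroid(tract, hucs))
print(assign_by_overlap(tract, hucs))
print(assign_by_overlap(tract, hucs, 0.25))
```

In this example the any-overlap rule assigns the tract to both watersheds, while the centroid rule and a 25% overlap threshold assign it to only one, which is why the criterion needs to be stated before the number of at-risk watersheds can be interpreted.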

The influence of these unspecified decision criteria for assigning CTs could bias exposure outcomes. For example, a single influential CT in a watershed could drive risk, even if the watershed had only a minimal number of fish samples. This possibility is a concern in urban areas, which account for the majority of CTs. Due to population densities, these CTs are more likely to be included in a risk analysis because, for example, they house more than 25 people living in poverty. Such influential CTs may drive the extremes of the distribution without regard to the actual number of high, self-caught fish consumers within their boundaries. Unfortunately, the potential for this exposure ascertainment bias could not be evaluated from the data presented by the Agency, nor was the possible influence of such bias tested by sensitivity analysis.

Using census tract assigned poverty as an indicator of subsistence fishing or high-end fish consumption lacks justification

EPA appears to use census tract assigned poverty as a surrogate measure of subsistence fishing or high-end fish consumption, with minimal justification. The Agency states that higher levels of self-caught fish consumption (approaching subsistence) have been associated with poorer populations [TSD, page 8, footnote 9] but provides no supporting scientific citations, and further states that

"EPA only assessed this generalized high-end female consumer scenario at those watersheds located in U.S. Census tracts with at least 25 individuals living below the poverty line." [ibid.]

It is unclear whether the 25 individuals represent 25 females, adult females, only females of child-bearing age, or 25 individuals regardless of age and sex (see additional discussion in the TSD, page 16, footnote 22; page 23, narrative and footnote 26). Although subsistence fishing can be associated with poverty, poverty is not an indicator of subsistence fishing or high-end fish consumption.
However, EPA assumes that poverty indicates the presence of at-risk fishing populations, regardless of the actual character or underlying distributions of the CT and HUC watershed combinations. In the Agency's risk analysis, any combination that meets the poverty threshold is weighted equally for the existence of a source population. The same reasoning holds for thresholds specific to race/ethnicity. For example, any watershed with at least one CT housing more than 25 Hispanic, Vietnamese, or Laotian residents, regardless of age, sex, and income, appears to be included (see TSD, Sections 1.3 and 2.6). This is true even though children born to women of childbearing age are the at-risk population. Such low, generalized thresholds may lead to the inclusion of watersheds actually lacking subsistence fishers in the target subpopulation, and to an overestimate of the number of watersheds representing health risks related to methylmercury.

It also remains unclear whether the poverty criterion was applied beyond the high-end female consumer scenario (see TSD, page 23 narrative). As stated later,

"having at least 25 people below the poverty line reflects the assumption that near-subsistence levels of fishing activity is [sic] more likely among individuals who are economically disadvantaged." [TSD, page 23, footnote 26]

As described in more detail below, derived risk estimates (see TSD, page 52, Table 2-8) indicate that poverty, race/ethnicity, or sex (as appropriate) were taken into account for at least some subgroups of interest such as high-end female consumers, poor white fishers in the Southeast, poor black fishers in the Southeast, and poor Hispanics. Some surveys have indicated the socioeconomic characteristics of subsistence level fishers, and related fish consumption. However, the lack of summary or tabulated data and descriptions of subpopulation distributions used in the analysis hinders the reader's ability to understand the analytical criteria used in the TSD assessment. By this EPA assumption, any densely populated urban census tract with a single fish tissue sample could be assigned to a potentially at-risk watershed, regardless of the actual degree of recreational or subsistence fishing taking place there.

Selection of watershed-level risk metrics not adequately addressed

In its watershed-level risk estimates using the RfD-based HQ metric and the IQ metric, EPA relies substantially on minimal citation from a limited selection of previously published reviews to support its outcome and risk modeling assumptions. These works include methylmercury reports from the National Research Council [NRC, 2000] and EPA's Integrated Risk Information System (IRIS) supporting documentation [EPA, 2000; EPA, 2001]; lead exposure and IQ effects methodology used by EPA [2007a, 2007b]; and the integrated assessment of methylmercury-to-IQ dose-response of Axelrad, et al. [2007]. EPA should integrate more recent and primary studies to support the selection criteria.
Discussion of several issues involving the consistency and potential influence of this earlier work would aid in interpreting Agency findings. These issues include:

Use of a linear dose-response model: A linear dose-response model was assumed for both the RfD-based HQ metric and the IQ metric without supporting explanation beyond the interpretation that this is NRC's preference [2000], and that it was easier to quantify IQ loss. However, additional discussion (perhaps in an appendix) of the potential effects on risk estimation of using a linear, non-threshold model would clarify the Agency's position. This is particularly important because the standard methylmercury RfD established by EPA assumes a threshold dose below which an appreciable risk of adverse effects is unlikely. In choosing a k-power model, the NRC committee did not evaluate whether methylmercury exposure data from the Faroe Islands [Grandjean et al., 1997; Budtz-Jørgensen et al., 2000] were better fit by a linear or non-linear model, or by a threshold or non-threshold model [NRC, 2000]. In the TSD (see Section 1.2, page 17, footnote 24), EPA states that no threshold was observed in the Faroe data set; but such an observation cannot be made since neither EPA nor others have been able to acquire and model the Faroe data. However, in the case of the Seychelles and Iraqi data sets, evidence of a threshold has been observed [Huang et al., 2000; Axtell et al., 2000; EPA, 2001]. The choice of an appropriate dose-response model remains a critical issue, given that exposure levels in the United States remain lower than those observed in the primary Faroe Islands study used to derive the methylmercury RfD [EPA, 2000; EPA, 2001] and to inform the IQ dose-response estimate in the TSD and Regulatory Impact Analysis (RIA) [Axelrad et al., 2007].

Assumptions and uncertainties involved in derivation of the RfD are not fully presented to support EPA's statements: In the TSD, EPA makes general statements that the actual methylmercury RfD is lower than the current EPA IRIS value. As noted above, the Agency claims that no threshold was observed in the Faroe Islands data, which were the primary driver for the RfD value, with the Seychelles and New Zealand data sets providing support for uncertainty factors [EPA, 2000; EPA, 2001]. Thus, EPA states that the risk analysis presented in the TSD actually represents an underestimate of the number of watersheds with at-risk populations. However, this appears to contradict the actual derivation and basis of the EPA IRIS RfD. Given this perspective, the TSD states that substantial populations remain at risk for neurobehavioral losses at exposure levels well below the RfD. Unfortunately, the Agency offers no citations or narratives discussing the scientific evidence to support these statements beyond referring to EPA documentation on IRIS [EPA, 2000; EPA, 2001].

Changes in IQ are not a well-defined health consequence of methylmercury exposure: Although the TSD does not derive the IRIS RfD value and the IQ dose-response function, the Agency nonetheless relies on the RfD value to demonstrate risk associated with modeled methylmercury exposure estimates and to justify statements that IQ-based risks are likely underestimated. But performance on neurobehavioral tests, not IQ tests, was the primary health endpoint in the Faroe Islands cohort used to derive the RfD [EPA, 2001; Grandjean et al., 1997; Budtz-Jørgensen et al., 2000].
In the TSD, EPA applied an integrated dose-response estimate relating IQ change to methylmercury exposure [Axelrad et al., 2007] modeled from a subset of neurobehavioral tests, common to all three major cohort studies (Faroe, Seychelles, and New Zealand), that were well-correlated to IQ [Grandjean et al., 1997; Budtz-Jørgensen et al, 2000; Meyers et al., 2003; Crump et al., 1998]. However, the underlying differences in confounders measured and included in the Faroe Islands study s final multivariate models add variability to this integrated estimate of IQ change with methylmercury exposure (including issues related to demography and exposure characteristics). Furthermore, as noted above, Axelrad, et al. [2007] were unable to directly access the Faroe Islands data and relied on a non-peer reviewed analysis provided by the study investigators [Budtz-Jørgensen et al., 2005]. The dose-response relationship between methylmercury exposure and IQ change was developed for marine fish and mammalian species, not freshwater fish: Studies in the Seychelles [Meyers et al., 2003] and New Zealand [Crump et al., 1998] involved populations highly exposed to marine fish, while populations in the Faroe Islands consumed sea mammals [Grandjean et al, 1997]. These are the cohort studies used to derive the methylmercury RfD and the IQ dose-response functions used in the present risk analysis. They introduce substantial variance in exposure patterns (fish type, seasonality, confounding chemical and dietary exposures, and other socioeconomic or home environment factors). The uncertainty introduced by using data from marine versus freshwater sources is unknown, but should be qualitatively described in the TSD. Of particular concern is the potential for residual confounding due to the presence of neurotoxic PCBs found in high levels in marine species (particularly pilot whale) consumed in the Faroe Islands. 
PCBs were measured in the biological fluids obtained from the study cohort (maternal serum and cord blood) [Fangstrom et al., 2000; Grandjean et al., 2001]. These cord blood PCB levels were highly associated with decreased performance on neurological function tests (including the sensitive Boston Naming Test) in the Faroes cohort at 7 years of age [Grandjean et al., 2001].

The appropriateness of using an IQ risk metric threshold of > 1 or > 2 points lost is questionable: The modeled relationship of IQ change to methylmercury exposure is derived from the research literature on lead exposure [EPA 2007a; EPA 2007b; Trasande et al., 2005; Salkever, 1995; Schwartz, 1994]. The size of the IQ loss (1- or 2-point threshold) is meant to represent the mean response across a population distribution, with greater concern (potential decrements) associated with the tails of the distribution. The reasoning behind EPA's choice to use this threshold, and its relative applicability to health effects of methylmercury exposure, are not described in the TSD (see TSD, Sections 1.2, 1.3, 2.6, Tables 2-6 and 2-7). It should be noted that substantial variations in IQ measures, including intra-individual variation in IQ test scores over time and variation between scores on different IQ tests, often exceed these thresholds. For example, a series of studies on personal variability in intelligence tests (Wechsler Adult Intelligence Scale) found statistically significant differences between the lowest and highest scores of 5 or more scaled IQ points (20% 9 points, or 3 standard deviations) [Matarazzo et al., 1988; Matarazzo and Prifitera, 1989]. Similar, large intra-individual variations ( 3 standard deviations) in test scores were observed in a more comprehensive battery of 15 neuropsychological tests that yielded 32 scores [Schretlen et al., 2003]. Changes in individual IQ scores over time (generally declining) have been demonstrated in children [Moffitt et al., 1992], with some evidence that socioeconomic, home environment, urban/suburban, or other factors may influence decline to a significant extent [Breslau et al., 2001].
Thus, the assumptions EPA made in deriving the methylmercury RfD and in extrapolating a dose-response relationship between methylmercury exposure and change in IQ influence the degree of uncertainty and variability in the TSD risk analyses. These assumptions influence the number of watersheds (and individuals) at risk, as well as the magnitude of the risk. Additional qualitative discussion about the uncertainty, beyond that offered in TSD Appendix F, would improve clear thinking about this important topic. However, quantitative uncertainty analyses, where possible, would provide a better supported range of risk estimates References Axelrad D.A., Bellinger D.C., Ryan L.M., Woodruff T.J., Dose-response Relationship of Prenatal Mercury Exposure and IQ: An Integrative Analysis of Epidemiologic Data, Environmental Health Perspectives, 115, Axtell C.D., Cox C., Myers G.J., Davidson P.W., Choi A.L., Cernichiari E., Sloane-Reeves J., Shamlaya C.F., Clarkson T.W., Association Between Methylmercury Exposure from Fish Consumption and Child Development at Five and a Half Years of Age in the Seychelles Child Development Study: An Evaluation of Nonlinear Relationships, Environmental Research, 84 (2), Breslau N., Chilcoat H.D., Susser E.S., Matte T., Liang K.Y., Peterson E.L., Stability and Change in Children s Intelligence Quotient Scores: A Comparison of Two Socioeconomically Disparate Communities, American Journal of Epidemiology, 154 (8), Budtz-Jørgensen E., Debes F., Weihe P., Grandjean P., Adverse Mercury Effects in 7 Year Old Children Expressed as Loss in IQ. Report to the U.S. Environmental Protection Agency. Department of Biostatistics, University of Copenhagen. EPA-HQ-OAR

98 Budtz-Jørgensen E., Grandjean P., Keiding N., White R.F., Weihe P, Benchmark Dose Calculations of Methylmercury Associated Neurobehavioral Deficits, Toxicology Letters, , Crump K.S., Kjellstrom T., Shipp A.M., Silvers A., Steward A., Influence of Prenatal Mercury Exposure upon Scholastic and Psychological Test Performance: Benchmark Analysis of a New Zealand Cohort, Risk Analysis, 18, EPA, Integrated Risk Information System. Methylmercury. EPA, Water Quality Criterion for the Protection of Human Health: Methylmercury. Final. Office of Science and Technology. EPA-823-R January. EPA, 2007a. Review of the National Ambient Air Quality Standards for Lead: Policy Assessment of Scientific and Technical Information, OAQPS Staff Paper, Office of Air Quality Planning and Standards, Research Triangle Park, NC, EPA-452/R November. EPA, 2007b. Lead: Human Exposure and Health Risk Assessments for Selected Case Studies, Volume I. Human Exposure and Health Risk Assessments Full Scale, and Volume II, Appendices. Office of Air Quality Planning and Standards, Research Triangle Park, NC, EPA- 452/R a. November. 
Fangstrom B., Athanasiadou M., Bergman A., Grandjean P., Weihe P Levels of PCBs and Hydroxylated PCB Metabolites in Blood from Pregnant Faroe Island Women, Human Exposure, 48, Grandjean P., Weihe P., White R.F., Debes F., Araki S., Yokoyama K., et al., Cognitive Deficit in 7-year Old Children with Prenatal Exposure to Methylmercury, Neurotoxicology and Teratology, 19, Grandjean P., Weihe P., Burse V., Needham L., Storr-Hanse E., Heinzow B., et al., Neurobehavioral Deficits Associated with PCB in 7 Year Old Children Prenatally Exposed to Seafood Neurotoxicants, Neurotoxicology and Teratology, 23, Huang L.S., Cox C., Myers G.J., Davidson P.W., Cernichiari E., Shamlaya C.F., Sloane-Reeves J., Clarkson T.W., Exploring Nonlinear Association between Prenatal Methylmercury Exposure from Fish Consumption and Child Development: Evaluation of the Seychelles Child Development Study Nine-year Data Using Semiparametric Additive Models, Environmental Research, 97 (1), Matarazzo J.D., Prifitera A., Subtest Scatter and Premorbid Intelligence: Lessons from the WAIS-R Standardization Sample, Psychological Assessment, 1, Matarazzo, J.D., Daniel M.H., Prifitera A., Herman D.O., Inter-subset Scatter in the WAIS-R Standardization Sample, Journal of Clinical Psychology, 44, Meyers G.J., Davidson P.W., Cox C., Shamlaya C.F., Palumbo D., Cernichiari E., et al., Prenatal Methylmercury Exposure from Ocean Fish Consumption in the Seychelles Child Development Study, Lancet, 361,

99 Moffitt T.E., Caspi A., Harkness A.R., et al., The Natural History of Change in Intellectual Performance: Who Changes? How Much? Is it Meaningful? The Journal of Child Psychology and Psychiatry, 34, NRC, Toxicological Effects of Methylmercury. Committee on the Toxicological Effects of Methylmercury, Board on Environmental Studies and Toxicology. National Academy Press. Washington, DC. Salkever D.S., Updated Estimates of Earnings Benefits from Reduced Exposure of Children to Environmental Lead, Environmental Research, 70, 1 6. Schretlen D.J., Munro C.A., Anthony J.C., Pearlson G.D., Examining the Range of Normal Intra Individual Variability in Neuropsychological Test Performance, Journal of the International Neuropsychological Society, 9, Schwartz J., Societal Benefits of Reducing Lead Exposure, Environmental Research, 66, Transande L., Landrigan P.J., Schechter C., Public Health and Economic Consequences of Methyl Mercury Toxicity to the Developing Brain, Environmental Health Perspectives, 113, Schilling F., White A., Lippert L., Lubell M., Contaminated Fish Consumption in California s Central Valley Delta, Environmental Research, 110 (4), Unexplained uncertainties remain in underlying methodology used to estimate individual IQ loss due to methylmercury exposure In the RIA, as well as TSD, EPA estimation of IQ losses associated with levels of maternal methylmercury exposure used the integrated dose-response value developed by Axelrad, et al. [2007]. This method selects common subtests from the three primary maternal-child methylmercury-exposed cohorts in Faroe Islands, Seychelles, and New Zealand to create a single coefficient to estimate IQ changes associated with maternal hair mercury concentrations ( 0.18 IQ points per ppm hair mercury; 95% confidence limits to ). Again, it should be noted that none of the primary studies, including the Faroe Islands cohort (see previous EPRI Comments discussion sections), originally measured full IQ. 
Although the integrated hypothetical full-scale IQ measure is made more robust by using data from three studies (positive and negative), EPA still relied on summary data results that include some uncertainty in the demonstration of a low-dose linear response. Additionally, this relatively small incremental change in IQ (0.18 on a 100-point IQ scale) associated with a 1 ppm mercury increase in hair cannot itself be measured for an individual, but must be estimated for the total population (through a series of analytical and exposure assumptions) to derive estimates at the national level (e.g., IQ points saved in RIA, Table 5-7). In fact, the individual effects estimates for the average IQ loss per exposed child were defined out to the fourth or fifth decimal place (see RIA, Tables 5-6 and 5-7) to indicate the potentially (and relatively) small effects across the different exposure scenarios.
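
The sensitivity of an individual IQ-loss estimate to the assumed shape of the dose-response curve can be illustrated with a short calculation. The sketch below is illustrative only and is not EPA's or Axelrad et al.'s method: it takes the magnitude of the integrated slope cited above (0.18 IQ points per ppm hair mercury) and compares a linear, non-threshold model with a simple threshold model. The threshold value of 1.0 ppm and the example exposure levels are hypothetical placeholders, not values from the TSD or RIA.

```python
# Illustrative comparison of two dose-response shapes for IQ loss.
# Only the slope magnitude comes from the text; everything else is assumed.

SLOPE_IQ_PER_PPM = 0.18           # IQ points per ppm hair mercury (cited slope magnitude)
HYPOTHETICAL_THRESHOLD_PPM = 1.0  # assumption for illustration only


def iq_loss_linear(hair_hg_ppm: float) -> float:
    """Linear, non-threshold model: loss is proportional to exposure."""
    return SLOPE_IQ_PER_PPM * hair_hg_ppm


def iq_loss_threshold(hair_hg_ppm: float,
                      threshold_ppm: float = HYPOTHETICAL_THRESHOLD_PPM) -> float:
    """Threshold model: no loss at or below the threshold, linear above it."""
    return SLOPE_IQ_PER_PPM * max(0.0, hair_hg_ppm - threshold_ppm)


if __name__ == "__main__":
    # Hypothetical hair mercury levels in ppm.
    for exposure in (0.5, 1.0, 2.0, 5.0):
        print(f"{exposure:4.1f} ppm  "
              f"linear={iq_loss_linear(exposure):.3f}  "
              f"threshold={iq_loss_threshold(exposure):.3f}")
```

Under these assumptions, the two models diverge most at low exposures, where the linear model still assigns a loss and the threshold model assigns none. This is the exposure range most relevant to the U.S. population.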

100 References Axelrad D.A., Bellinger D.C., Ryan L., Woodruff T.J., Dose-response Relationship of Prenatal Mercury Exposure and IQ: An Integrative Analysis of Epidemiologic Data, Environmental Health Perspectives, 115 (4), Underlying primary study for monetizing IQ (effect of IQ on lifetime earnings) of limited validity EPA relies on valuation models to translate estimates of IQ losses into reductions in lifetime earnings, based on analyses of lead exposure and IQ reduction [Schwartz, 1994; Salkever, 1995; Gross et al., 2002]. Use of the unit-value method to estimate the average effect of a 1-point IQ loss on future earnings (net) has precedent in EPA regulation. Although the population variability and uncertainties in modeling have been discussed extensively in lead regulation and supporting documentation [EPA, 2007a; EPA, 2007b], it should be noted that the national data set used to estimate the relationship between IQ and lifetime earnings Bureau of Labor Statistics National Longitudinal Survey of Youth does not contain a direct measure of IQ. Rather, the survey includes an Armed Forces Qualification Test (AFQT) that is not equivalent to a full-scale IQ test, but has been scaled to estimate IQ (translate or score each AFQT percentile on a normal distribution with a mean 100 and standard deviation of 15). An early analysis conducted at the request of the military reported a correlation between AFQT scores and scores on a full-scale IQ test, the Wechsler Intelligence Scale for Children [Office Secretary of Defense, 1980]. Therefore, in light of the sometimes small, often fractional shifts in IQ in the RIA IQ loss/reductions analysis, the relative loss of precision in scaling between the AFQT and a full-scale IQ test needs to be considered References EPA, 2007a. 
Review of the National Ambient Air Quality Standards for Lead: Policy Assessment of Scientific and Technical Information OAQPS Staff Paper, Office of Air Quality Planning and Standards, Research Triangle Park, NC, EPA-452/R November. EPA, 2007b. Lead: Human Exposure and Health Risk Assessments for Selected Case Studies, Volume I. Human Exposure and Health Risk Assessments Full Scale and Volume II, Appendices. Office of Air Quality Planning and Standards, Research Triangle Park, NC, EPA- 452/R a. November. Grosse S.D., Matte T.D., Schwartz J., Jackson R.L., Economic Gains Resulting from the Reduction in Children s Exposure to Lead in the United States, Environmental Health Perspectives, 110 (6), Office Secretary of Defense, Relationship between AFQT and IQ Inclusion for the ASVAB Backup Book. Memo to Mr. Danzig from Mr. A.J. Martine. September 4th, Salkever D.S., Updated Estimates of Earnings Benefits from Reduced Exposure of Children to Environmental Lead, Environmental Research, 70,

Schwartz J., Societal Benefits of Reducing Lead Exposure, Environmental Research, 66,

Potential cardiovascular effects due to methylmercury exposure appear overstated given equivocal nature of studies

Increased risk of cardiovascular disease (CVD) following exposure to methylmercury from fish consumption remains an open question in research and related regulatory decision making. A variety of epidemiological studies ranging in size, design, type of methylmercury exposure marker, and cardiovascular outcome have attempted to quantify a potential relationship between methylmercury exposure and CVD. However, results have been equivocal. EPA convened a group of investigators associated with previous studies or risk assessments of CVD and methylmercury exposure to determine whether sufficient data exist to develop a dose-response analysis for regulatory decision making. Unfortunately, the quality and outcomes of such expert surveys are a function of the individuals selected by the convener, in this case EPA, who included principal investigators of the two selected studies. As summarized in Roman, et al. [2011], the workshop participants qualitatively assessed the available epidemiologic and toxicological literature, using their professional judgment to develop recommendations for the Agency. In the TSD and the proposed rule (76 FR, No. 85, May 3, 2011, see pp ), EPA uses this workshop report as support for a causal relationship between methylmercury exposure and CVD. This appears to be an overstatement, considering results from large, well-conducted, environmentally relevant U.S. prospective cohort studies reporting no increased risk for cardiovascular events associated with biological markers of methylmercury exposure [Yoshizawa et al., 2002; Mozaffarian et al., 2011]. In Roman, et al.
[2011], the authors also concluded that sufficient data exist to develop a dose-response value to quantify the relationship between methylmercury exposure and at least one CVD outcome, myocardial infarction (MI). They based their conclusion on results of four epidemiologic studies: two European studies reporting a positive, statistically significant association [Guallar et al., 2002; Virtanen et al., 2005]; one null U.S. study [Yoshizawa et al., 2002]; and one Swedish study finding an inverse relationship (methylmercury exposure associated with decreased MI risk) [Hallgren et al., 2001]. Roman, et al. did not evaluate the most recent U.S. report by Mozaffarian, et al. [2011] that found no relationship between methylmercury biomarkers and CVD risk. They reported "Moderate" epidemiological strength of evidence for the biological plausibility of methylmercury-related MI, increasing the classification to "Moderate to Strong" if intermediary effects such as oxidation ("Moderate to Strong"), atherosclerosis ("Moderate"), heart rate variability ("Strong"), and hypertension ("Weak") are taken into account. Roman, et al. recommended the two positive European studies [Guallar et al., 2002; Virtanen et al., 2005] for use in establishing a dose-response value. However, this seems premature. Cardiovascular disease, including MI, remains a complex, multi-etiological disease process with a large number of known and unknown risk factors. To assess the relative contribution of any single environmental causal agent requires a systematic review using a standardized set of causal or weight-of-evidence criteria for supporting study inclusions, exclusions, or other decision making in quantitative or qualitative analyses. Unfortunately, the report by Roman, et al. [2011] does not present evidence of such a formalized analysis. This makes it difficult to assess the

102 scope of scientific support for the workshop s final recommendations. The small number of available studies precludes any conclusive decision, especially a robust quantitative result. The large U.S.-based cohort studies, particularly the most recent by Mozaffarian, et al. [2011], have several strengths, including evaluation of fatal and nonfatal MI risk, inclusion of women and men, and substantial evaluation of a range of potential risk or confounding factors (e.g., demographics, fish consumption, clinical and familial CVD markers, lifestyle habits, etc). More research is needed in this area, especially since the few mechanistic high-dose experimental studies are of limited value for extrapolating biological effects to exposure ranges relevant to U.S. populations, demographics, and underlying risk structure. In the context of science to support regulatory action in United States it would be pertinent not to exclude all epidemiological studies on methylmercury exposure and MI risk, but rather to apply a set of standardized benchmark dose models to all of the available U.S. and European studies, both negative and positive References Guallar E., Sanz-Gallardo M.I., van t Veer P., Bode P., Aro A., et al., Mercury, Fish Oils and the Risk of Myocardial Infarction, New England Journal of Medicine, 347, Hallgren C.G.., et al., Markers of High Fish Intake Are Associated with Decreased Risk of a First Myocardial Infarction, British Journal of Nutrition, 86, Mozaffarian D., Shi, P., Morris J.S., Spiegelman D., et al., Mercury Exposure and Risk of Cardiovascular Disease in Two U.S. 
Cohorts, New England Journal of Medicine, 364, Roman H.A., Walsh T.L., Coull B.A., Dewailly E., et al., Evaluation of the Cardiovascular Effects of Methyl Mercury Exposures: Current Evidence Supports Development of a Dose-response Function for Regulatory Benefits Analysis, Environmental Health Perspectives, 119 (5), Virtanen J.K., Rissanen T.H., Voutilainen S., Toumainen T.P., et al., Mercury, Fish Oils, and Risk of Acute Coronary Events and Cardiovascular Disease, Coronary Heart Disease, and All-cause Mortality in Men in Eastern Finland, Arteriosclerosis, Thrombosis, and Vascular Biology, 25, Yoshizawa K., Rimm E.B., Morris J.S., Spate, V.L., et al., Mercury and the Risk of Coronary Heart Disease in Men, New England Journal of Medicine, 347, EPA appears to rely on an inappropriate apportionment of health benefits for the HAPs rule analysis with the majority due to co-benefits of reduced mortality resulting from reduced PM 2.5 emissions and the double-counting of benefits, especially for short-term SO 2 and NO 2 NAAQS The RIA estimates the benefits of the rule will be between $53 and $140 billion depending upon the assumptions used in the benefits assessment. The overwhelming majority of these benefits are tied to co-benefits related to reductions in PM 2.5 -related mortality, in which there are estimated to be between 6,800 to 17,000 fewer deaths yearly (RIA, Tables 1-2, 1-3). 3-42

Appropriateness of considering PM2.5 benefits for this rule; possibility of double-counting

Given the importance of PM co-benefits, it is reasonable to ask whether or not they should be addressed by PM regulations per se. Indeed, consideration of the PM National Ambient Air Quality Standard (NAAQS) is underway, and EPA is expected to propose a new standard by the end of 2011 and finalize that standard in It is unclear how many of the benefits that would be achieved by this new PM standard are included in the benefits estimated in this rule. PM co-benefits have been used to justify several recent air quality regulations. In this document EPA discussed the Clean Air Transport Rule and appears to suggest that the PM co-benefits associated with the MACT rule are independent of the Clean Air Transport Rule, although the discussion of this independence would benefit from greater articulation. There are other recent regulations that also claim PM co-benefits, such as the recently promulgated short-term SO2 and NO2 primary NAAQS. The benefits of these regulations also need to be addressed in the RIA to demonstrate that their potential co-benefits are not included among those associated with the proposed MACT rule. Clearly, if there is any change in the PM NAAQS, this change must be incorporated into the analysis presented here to ensure that the estimated benefits of the MACT rule will not be achieved via a new PM NAAQS. It should also be noted that if the PM NAAQS is indeed protective of the public health, there should be no public health benefits below the current NAAQS; nevertheless, more than 99.5% of the estimated benefits of the MACT rule are in areas where PM is currently in compliance, and 70% of the benefits are in areas where current annual PM2.5 levels are less than 10 µg/m³ (see RIA, Figures 6-14, 6-15).

Derivation of benefit estimates

There are three major steps in the derivation of these estimates.
The first is the choice of a dose-response function that relates the ambient concentrations of a pollutant, in this case PM2.5, to mortality. Second, the impact of the rule on changes in ambient concentrations of PM2.5 must be estimated. The dose-response relationship is then applied to before and after conditions to estimate the net mortality benefit of the rule. Estimates of the number of deaths are then monetized. The comments in this section address primarily the choice of the dose-response function.

EPA relies on limited number of studies with potential bias on benefits outcome

The assessment for PM2.5-estimated mortality benefits considers two studies [Pope et al., 2002; Laden et al., 2006] to derive dose-response functions. The benefits are estimated using EPA's BenMAP software. Some justification for the choice of these studies is given in the RIA and in Appendix F of the BenMAP User's Manual. Other studies are dismissed as having study populations of a selective nature; but this is true of all studies, including those used by EPA. Are six cities [Laden et al., 2006] in the northeastern quadrant of the United States representative of the United States as a whole? Moreover, the Laden, et al. study included only Caucasian subjects. It has also been noted that the Pope, et al. study is of a cohort that is likely of a higher socioeconomic class than the population as a whole. Appendix I lists several of the available cohort studies that could be used to estimate dose-response functions, and several of these studies show no or smaller effects than the studies used by EPA in its analysis. For example, if EPA had used the Veterans study [Lipfert et al., 2000; Lipfert et al., 2003; Lipfert et al., 2009]

or more recent studies of a Medicare cohort [Greven et al., 2011] or [Puett et al., 2011], it would have found no statistically significant co-benefits associated with the proposed new rule. This comment does not suggest that the results of the latter studies replace those used by EPA; rather, it argues that EPA should include the entirety of results from the literature in its derivation of estimates. (It should be noted that there are also some recent positive studies, as well as negative studies.) Consideration of all relevant studies would undoubtedly yield a wider range of benefit estimates than given in the RIA. There is also the issue of the choice of dose-response function from a given study. For example, Laden, et al. estimate mortality risks for two time periods and for a combined time period. EPA chose to use the results for the combined period. However, it can be argued that the most recent time period is most relevant to the present, as the pollution mix and concentrations have changed. If EPA used the second period results to estimate dose-response, the relative risk estimate would decrease from 1.16 to 1.13 and the lower confidence limit would decrease from 1.07 to Use of the second period dose-response would lead to lower estimates of benefits and of the confidence limits around those estimates. Since most studies present several sets of results, it is important to acknowledge that the choice of result (dose-response) influences the estimates and uncertainty range. Again, all estimates need to be considered, rather than just one result selected by EPA. The RIA supports its analysis by citing an expert solicitation study [Roman et al., 2011] and noting that results of the solicitation support the choice of the two studies EPA used. It should be noted that the results of any expert solicitation are a function of the experts chosen to participate.
In this case, the experts chosen included co-authors of the two studies cited by EPA and included three individuals from the same institution. The experts' opinions are not independent from the studies cited by EPA, since some of the same authors/experts were involved.

Treatment of uncertainty appears rudimentary and in conflict with National Academy of Sciences recommendations to EPA

As a minimum, response estimates should be considered in EPA's uncertainty analysis. EPA asked the National Academy of Sciences to review its methods to derive public health benefits. The result was a study by the National Research Council (NRC), whose report [NRC, 2002] stated that EPA should move the assessment of uncertainty from its ancillary analyses into its primary analyses to provide a more realistic depiction of the overall degree of uncertainty. It also noted that EPA should continue to use sensitivity analyses but should attempt to include more than one source of uncertainty at a time. It should be noted that the uncertainty analyses presented in this report follow neither of these recommendations. The only quantitative uncertainties presented are those for the particular dose-response estimates taken from the two studies cited by EPA. Other uncertainties are listed in several tables, with no attempt to quantify them. Paramount in the consideration of these uncertainties is the limited number of relevant studies included by EPA. Other uncertainties may be important, but likely have less influence on the total monetary estimates presented in the RIA.
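
The effect of selecting one relative risk rather than another from the same study, as discussed above for Laden, et al., can be sketched with a simple health impact calculation. The example below assumes a log-linear concentration-response form, which is commonly used in benefits analysis; whether it matches the exact configuration EPA applied in BenMAP is an assumption. The relative risks of 1.16 and 1.13 come from the text. The 10 µg/m³ increment to which they are assumed to apply, the baseline mortality rate, the population, and the PM2.5 reduction are all hypothetical values chosen for illustration.

```python
import math


def avoided_deaths(relative_risk: float,
                   rr_increment_ugm3: float,
                   delta_pm_ugm3: float,
                   baseline_rate: float,
                   population: float) -> float:
    """Annual deaths avoided under an assumed log-linear health impact function.

    beta is the log relative risk per ug/m3; the avoided fraction of baseline
    deaths for a concentration reduction delta_pm is 1 - exp(-beta * delta_pm).
    """
    beta = math.log(relative_risk) / rr_increment_ugm3
    return baseline_rate * population * (1.0 - math.exp(-beta * delta_pm_ugm3))


if __name__ == "__main__":
    # Hypothetical scenario inputs (not taken from the RIA).
    scenario = dict(rr_increment_ugm3=10.0,  # assumed increment for the reported RR
                    delta_pm_ugm3=1.0,       # assumed PM2.5 reduction
                    baseline_rate=0.008,     # assumed annual all-cause mortality rate
                    population=1_000_000)    # assumed exposed population

    for label, rr in (("combined period", 1.16), ("second period", 1.13)):
        print(f"{label}: RR={rr:.2f} -> {avoided_deaths(rr, **scenario):.1f} avoided deaths")
```

Under these assumed inputs, the lower relative risk yields a proportionally lower estimate of avoided deaths, before any monetization step is applied. Repeating the calculation across the full set of published relative risks would give one simple view of the wider range of benefit estimates described above.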

Methodology in primary health studies used by EPA for benefits estimate appears flawed

It is also important to note that the methodology employed in the two studies that EPA used to estimate mortality benefits has recently been questioned. A recent paper [Greven et al., 2011] suggests that the methodologies used by Laden, et al. and Pope, et al. may be flawed because of potential confounding by location-specific variables and variables that are trending on the national level. This could overestimate the benefits associated with changes in PM concentrations. With a revised methodology that overcame this problem, Greven, et al. then examined a data set of 18.2 million Medicare enrollees (aged 65 or older) in 814 areas across the United States. The areas chosen were those in which the zip code centroid was within six miles of an air quality monitor. Death from all causes was considered for the period. The results suggested that there is not any change in life expectancy for a reduction in PM2.5. This result needs to be factored into the benefits assessment by EPA. In addition, it is imperative that the methodological issue raised by Greven, et al. be addressed by EPA; if the methods employed in the studies used by EPA are indeed flawed, they cannot be used unless the flaws are corrected.

EPA ignores differential toxicity for species of PM

There is also little discussion in the report of the possible differential effects of different PM components on mortality. The Veterans study found no effect of PM per se, but of several PM components, such as elemental carbon. Indeed, this study suggested that some pollutants not routinely monitored are more highly associated with mortality than those pollutants regulated under NAAQS. Increasing evidence is pointing towards carbon-containing particles, whose concentrations and contributions to exposure are not likely to change as a result of the MACT rule. Yet there is no discussion of this topic within the study.
Again as a minimum, this issue should be considered in the uncertainty analyses EPA-assigned values to mortality benefits are inconsistent with those of other federal agencies Finally there is the issue of the monetary values assigned to estimated lives lost. It should be noted that EPA values are higher than those assigned by other federal agencies [New York Times, 2011]. The reasons for this inconsistency clearly need be addressed. It should also be noted that EPA is in the process of reviewing the monetary value assigned to a life; hence the values in this draft report should be updated accordingly. The current values are based upon extrapolations from older studies, and they may no longer be appropriate. It would also be important to EPA to consider the age distribution of impacted deaths, as it could be very different from those of the populations used to derive loss-of-life values in general References Greven S., Dominici F., Zeger S., An Approach to the Estimation of Chronic Air Pollution Effects Using Spatio-temporal Information, Journal of the American Statistical Association, 106 (494), Laden F., Schwartz J., Speizer F.E., Dockery D.W., Reduction in Fine Particulate Air Pollution and Mortality: Extended Follow-up of the Harvard Six Cities Study, American Journal of Respiratory and Critical Care Medicine, 173 (6),

106 Lipfert F.W., Perry H.M. Jr, Miller J.P., Baty J.D., Wyzga R.E., Carmody S.E., The Washington University-EPRI Veterans' Cohort Mortality Study: Preliminary Results, Inhalation Toxicology, 12, Suppl. 4, Lipfert F.W., Perry H.M. Jr, Miller J.P., Baty J.D., Wyzga R.E., Carmody S.E., Air Pollution, Blood Pressure, and Their Long-term Associations with Mortality, Inhalation Toxicology, 15 (5), Lipfert F.W., Wyzga R.E., Baty J.D., Miller J.P., Air Pollution and Survival within the Washington University-EPRI Veterans Cohort: Risks Based on Modeled Estimates of Ambient Levels of Hazardous and Criteria Air Pollutants, Journal of the Air & Waste Management Association, 59 (4), New York Times, January 18, and February 16 issues. NRC, Estimating the Public Health Benefits of Proposed Air Pollution Regulations. National Academy Press, Washington, DC. Pope C.A. 3rd, Burnett R.T., Thun M.J., Calle E.E., Krewski D., Ito K., Thurston G.D., Lung Cancer, Cardiopulmonary Mortality, and Long-term Exposure to Fine Particulate Air Pollution, Journal of the American Medical Association, 287 (9), Puett R.C., Hart J.E., Suh H., Mittleman M., Laden F., Particulate Matter Exposures, Mortality and Cardiovascular Disease in the Health Professionals Follow-up Study, Environmental Health Perspectives, 31 March. Roman H.A., Walsh T.L., Coull B.A., Dewailly E., et al., Evaluation of the Cardiovascular Effects of Methyl Mercury Exposures: Current Evidence Supports Development of a Dose-response Function for Regulatory Benefits Analysis, Environmental Health Perspectives, 119 (5), Utility acid gas emissions, which are predominantly HCl, are insignificant contributors to acidification of ecosystems in the United States These comments pertain to III.D. (4); 76 Federal Register, No. 85, May 3, 2011 p and p EPA states that the deposition of hydrochloric acid (HCl) would further exacerbate the impacts of acidifying deposition from sulfur and nitrogen compounds. 
The extent to which acidifying deposition from sulfur and nitrogen compounds affects ecosystems is difficult to assess, as evidenced in the recent notice of proposed rulemaking for revision of the Secondary NAAQS of NOx and SOx [76 Federal Register, No. 147, August 1, 2011 pp ]. Notwithstanding the uncertainties in the effects from sulfur and nitrogen compounds, the impact of acid gas emissions, such as hydrochloric acid, is insignificant. In fact, the only citation in support of EPA's statement of concern over HCl emissions is a recent manuscript [Evans et al., 2011] that does not reflect conditions relevant to the United States. The work of Evans, et al. [2011] on the environmental effects of HCl deposition on acidification of waterbodies contained within peatlands in the United Kingdom (U.K.) serves as the basis for EPA's discussion of the potential environmental impacts of HCl deposition in the proposed

MACT rule. However, the Evans, et al. study is an inappropriate study to apply to the United States for three primary reasons:

1. U.S. coals have substantially lower chlorine (Cl) content than U.K. coals.
2. U.S. soils are predominantly of different composition than the peatland U.K. soils discussed in Evans, et al.; i.e., the United States has very limited geographical extent of histosols, which are organic soils including peatlands, bogs, moors, and mucks.
3. The extent of chloride (Cl⁻) mobility observed by Evans, et al. cannot be generalized to all ecosystems.

Additionally, there is no evidence to suggest that HCl should be of concern in the United States with respect to environmental acidification. The main reasons are as follows:

• Emissions of HCl from power plants, and HCl emissions overall in the United States, are negligible, amounting to less than 0.7% of emissions with an acidifying potential (on a molar equivalent or charge equivalent basis).
• Anthropogenic SOx and HCl emissions have been reduced (on a relative, i.e., percentage, basis) by similar amounts from 1998 to 2009. Although a clear decrease in sulfate deposition has been observed by the National Atmospheric Deposition Program (NADP), there has been no signal of decreased HCl deposition over the United States, implying that any contribution from anthropogenic HCl emissions is negligible compared to the background contribution.

These and other factors combine to demonstrate that HCl emissions from power plants are negligible contributors to ecosystem acidification in the United States. Accordingly, HCl emissions have not been the focus of past U.S. assessments of acidification. To the extent that chloride deposition has been included in such assessments, it has been to include the contribution from sea-salt aerosol and background HCl levels for completeness.

Summary of Evans, et al. [2011] research

The Evans, et al.
analysis, titled "Hydrochloric Acid: An Overlooked Driver of Environmental Change," primarily investigated trends in the chemical composition of wet deposition samples (26 sites in the U.K. Acid Deposition Monitoring Network, two-week sampling period) from , surface water samples (22 sites in the U.K. Acid Waters Monitoring Network) from , and soil samples (3 sites in the Environmental Change Network) from . The deposition samples were analyzed for the non-marine components of chloride (Cl⁻) and sulfate (SO₄²⁻) by subtracting out the sea-salt Cl⁻ and SO₄²⁻ based on ratios to sodium (Na⁺). The main source of non-marine chloride is HCl emissions from coal combustion in EGUs. Results were grouped by zones defined previously by non-marine sulfate deposition that are stated to represent more or less contribution from major industrial emission sources. The surface water and soil samples were intended to provide additional trend information, but the authors explain that neither could be used to reliably separate marine from non-marine Cl⁻ or SO₄²⁻. In

particular, the soil samples were solely from histosols, i.e., heavily organic soils, including peatlands, moors, and bog lands.

The manuscript provides the trends in non-marine Cl⁻ which, despite substantial interannual variability, decreased in wet deposition samples at every site. Larger decreases were observed at sites with the highest initial non-marine Cl⁻ concentrations and in geographic zones thought to be most influenced by industrial emissions. Non-marine sulfate followed generally similar temporal and spatial patterns. Previously reported changes in nitrogen deposition could not account for the observed soil and water trends. The authors suggest that these trends are consistent with the 95% reduction in HCl emissions from power plants over the past 20 years.

The trends in total Cl⁻ (separated into non-marine Cl⁻) in surface waters were similar to, or exceeded, those in rainfall non-marine Cl⁻, and also were largest in the zone most influenced by industrial emissions. Similarly, decreases in total Cl⁻ and SO₄²⁻ were observed at the three soil sites. At one peat soil site, sulfate and nitrate had remained constant since the early 1990s due to complete retention in soils. The authors state that "despite the capacity of peats to retain most or all pollutant SO₄²⁻ and NO₃⁻ [nitrate]... the waters did acidify and are now recovering." The authors conclude that the concurrent increase in pH at that site is due to the observed reduction in soil Cl⁻. This would require that chloride was not retained in soils, but rather that Cl⁻ mobility and related persistence led to acidification.

In summary, the authors suggest that 30–40% of the acidification recovery in the United Kingdom since the 1980s could have resulted from heretofore unconsidered changes in HCl, and not solely changes in sulfate.

Detailed comments on the use of Evans, et al. regarding U.S. ecosystem acidification

The proposed rule states that EPA is also concerned about the potential impacts of HCl and other acid gas emissions on the environment. When HCl gas encounters water in the atmosphere, it forms an acidic solution of hydrochloric acid. In areas where the deposition of acids derived from emissions of sulfur and NOx are causing aquatic and/or terrestrial acidification, with accompanying ecological impacts, the deposition of hydrochloric acid would further exacerbate these impacts. Recent research [Evans et al., 2011] has, in fact, determined that deposition of airborne HCl has had a greater impact on ecosystem acidification than anyone had previously thought, although direct quantification of these impacts remains an uncertain process. (III.D. (4); 76 FR, May 3, 2011 p )

The Evans, et al. manuscript is cited as supporting evidence for the ecosystem acidification impacts of HCl emitted from power plants in the proposed rule (e.g., 76 FR, May 3, 2011 p and p ). However, it is inappropriate to apply these results in such a manner because they are not representative of U.S. ecosystems. Additionally, when U.S.-appropriate data are examined, there is no evidence to suggest that HCl is a concern for environmental acidification under U.S. conditions.

The analysis of U.K. ecosystems is inapplicable to the United States for three reasons.

1. The chloride content of coals in the United Kingdom and the resulting emission rates of HCl in coal-fired EGUs in the United Kingdom are vastly different from U.S. coals and HCl emission rates. The coals present in the United Kingdom have high chloride content [Yudovich and Ketris, 2006; Vassileva et al., 2000; Tillman et al., 2009; Spears, 2005]. This can often be as much as 10 or more times the chlorine present in U.S. coals. In fact, the U.S. coals with the highest chloride content (Illinois Basin coals) still have 5 times lower chloride content than representative U.K. coals. As a result, chloride emissions to the atmosphere and potential deposition rates and impacts suggested in the Evans study are not representative of those for U.S. coals.

2. The soil generally found throughout the United Kingdom has a different composition than the range of U.S. soils. Moreover, the soil samples specifically targeted in the Evans analysis were taken from three highly organic soils known as histosols. Histosols include peatlands, moors, bogs, and mucks. Evans, et al. report that the peatlands absorbed nearly all the nitrogen and sulfur, thereby inhibiting the ability of nitrate and sulfate to acidify waters; chloride, on the other hand, exhibited higher mobility in the soils and did contribute to acidification. These results are not necessarily surprising given U.K. conditions and much higher deposition rates due to higher HCl emission rates from anthropogenic sources (as noted above), but they have limited applicability in the United States. The United States has only a small area of peatlands, or of histosols in general. Peatlands are one type of histosol, a class of soils defined by high organic content. The locations of histosols present in the United States are shown in several United States Department of Agriculture (USDA) maps, as presented for example in Figure 3-1.
By definition, histosols are primarily organic soils, and thus waterbodies in histosol areas are most often dominated by organic acids. As a result, reducing the input of inorganic acids is unlikely to change their pH. In the United States, such waterbodies show little sensitivity due to the high DOC (dissolved organic carbon) content that controls the pH. As the pH decreases with increasing DOC content, the fraction of total aluminum that is present in the inorganic monomeric form increases, thereby increasing the bioavailability of aluminum, which in turn determines the toxicity of the waters to aquatic biota.

Figure 3-1 Map of U.S. Histosol-type Soils
Available at: Accessed June 20,

3. The Evans analysis claims that chloride is not taken up at all by soils and plants, thus making it highly mobile and persistent. Although this has been generally accepted historically, researchers are finding that there may in fact be some retention of chloride in soils and plants. Several recent papers have suggested that chloride accumulated by plants can be stored in the vegetative material, including roots and litter [Kauffman et al., 2003], or be chemically converted to a volatile form before emission to the atmosphere [Hamilton et al., 2003]. Similarly, various soils can act as either a source or a sink for chloride through biological activity or mineralization of soil organic matter [Oberg 1998; Kauffman et al., 2003; Rodstedth et al., 2003; Lovett et al., 2005; Bastviken et al., 2007]. Plant and soil accumulation of chloride is much less than that of sulfur, and certainly much less than that of nitrogen (which is taken up practically in its entirety), but it is a new finding that must be considered in the context of different soils and vegetation. Thus, a key premise in the Evans, et al. study, that chlorine is mobile and not retained in the soils and vegetation, is likely not universally applicable to all ecosystems.

Additionally, there is no or very limited indication that HCl deposition could be an environmental concern in the United States:

• U.S. anthropogenic HCl emissions are, on a charge equivalent basis, negligible compared to the sum of other emissions of concern with respect to ecosystem acidification, such as sulfur oxides (SOx), nitrogen oxides (NOx), and reduced nitrogen (NHx). HCl is the predominant acid gas (e.g., HCl, HF) in primary anthropogenic emissions. However, U.S. emissions of HCl are negligible compared to other primary emissions (SOx, NOx, NHx) that undergo processing in the atmosphere to form material that can deposit to ecosystems and lead to the potential acidification of ecosystems. (In the case of NHx deposition, which is largely in the form of ammonia or ammonium deposition, there is an additional step of nitrification by microorganisms.) Based on an analysis of the NEI and Toxics Release Inventory (TRI) trends shown in Tables 3-5 and 3-6, total HCl emissions have been consistently less than 0.7% of the sum of SOx, NOx and NHx emissions on a molar equivalent (also referred to as charge equivalent) basis. It is important to compare emissions on a molar/charge equivalent basis, because this is what is most important from an acid-neutral-base characterization standpoint.
• Although there has been a clear signal of reduced sulfate deposition in NADP monitors in the Midwest (far away from coastal influences; west of the Appalachians), there has been no signal of HCl reductions, as illustrated by the NADP maps in Appendix H.
• Power plant SOx and HCl emissions have been reduced (on a relative, i.e., percentage, basis) by a similar amount from 1998 to 2009, approximately 56%, although following different trajectories of reduction. As shown in Figure 3-2, comparable trends are also present for total anthropogenic SOx and HCl emissions.
Although a clear decrease in sulfate deposition has been observed by the NADP, there has been no signal of decreased HCl deposition over the United States, thereby implying any contribution from anthropogenic HCl emissions is negligible compared to the background contribution. Particularly in areas not affected by coastal influences, such as the Midwestern United States, the anthropogenic signal of chloride deposition cannot be separated from the background. In coastal regions of the United States, emissions and chemistry on sea-salt particles dominate chloride deposition and exhibit a strong degree of interannual variability.

Figure 3-2 HCl and SOx Emissions for EGUs and All Sources Normalized to 1998
Legend: EGU SO2 (Norm 1998); EGU HCl (Norm 1998); All SO2 (Norm 2009); All HCl (Norm 2009)
Source: Tables 3-5 and 3-6

An additional comment is that the Evans analysis did not include dry deposition data. Thus, even for the specific case of U.K. HCl deposition rates and peatland soils, it is hard to discern the true meaning and implications of the work, since it is based on incomplete data.

Summary

In summary, given that the United States has different (lower chlorine content) coals, different emissions (indiscernible from the background), and generally different soils from those specifically targeted in the Evans, et al. study (histosols), the Evans results do not apply to the United States. These and other comments noted above indicate that HCl is a negligible contributor to environmental acidification in the United States; accordingly, HCl emissions have not been the focus of past U.S. assessments of acidification. To the extent that chloride deposition has been included in past U.S. assessments of acidification, it has been to include the contribution from sea-salt aerosol and background HCl levels for completeness.

Table 3-5 HCl Emissions Relative to Other Emissions of Concern with Respect to Acidification in the United States ( )

Source Category
EGU HCl (lb) 552,351, ,519, ,486, ,866, ,864, ,297, ,112,944
ALL HCl (lb) 611,028, ,284, ,525, ,182, ,363, ,002, ,859,166
EGU SO2 kton 13,416 12,583 11,396 10,850 10,436 10,425 10,415
EGU NOx kton 6,232 5,721 5,330 4,917 4,709 4,404 4,098
EGU NH3 kton
EGU HCl kton
ALL SO2 kton 18,943 17,544 16,348 15,932 14,774 14,797 14,820
ALL NOx kton 24,347 22,843 22,599 21,546 21,135 20,464 19,793
ALL NH3 kton 4,940 4,857 4,907 3,689 4,133 4,136 4,138
All HCl kton
EGU SO2 Geq
EGU NOx Geq
EGU NH3 Geq
EGU HCl Geq

kton = thousand US short tons, can also be written as 000 ton, thousand ton, or 10³ ton
Geq = giga equivalents, can also be written as eq, billion eq, or 10⁹ eq

Table 3-5 (continued) HCl Emissions Relative to Other Emissions of Concern with Respect to Acidification in the United States ( )

Source Category
ALL SO2 Geq
ALL NOx Geq
ALL NH3 Geq
ALL HCl Geq
EGU (SO2 + NH3 + NOx) Geq
ALL (SO2 + NH3 + NOx) Geq 1,287 1,214 1,178 1,080 1,063 1,050 1,037
EGU HCl/(SO2 + NH3 + NOx) eq ratio 1.3% 1.6% 1.7% 1.6% 1.7% 1.7% 1.6%
ALL HCl/(SO2 + NH3 + NOx) eq ratio 0.6% 0.7% 0.7% 0.7% 0.7% 0.7% 0.7%

Sources:
data: Adapted from ; 2008 NH3 emissions from EGU and all sources estimated by equating to 2007; EPA has yet to post 2008 NH3 emissions on the cited webpage.
data: EGU data is generated from ; 2009 data for all sources only takes into account decreases in EGU emissions relative to 2008, i.e., all other emissions are kept at 2008 levels.

eq = the amount of a substance that will either react with or supply one mole of hydrogen ions (H+) in an acid-base reaction, or react with or supply one mole of electrons in a redox reaction
eq ratio = ratio of two quantities based on molar equivalents (calculated by dividing the mass by the equivalent weight, also known as the gram equivalent weight or gram equivalent mass)
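The charge-equivalent comparison defined in the notes to Table 3-5 can be reproduced approximately for the first data column of the table. The following is a minimal sketch, not the calculation EPRI performed: it assumes that NOx mass is reported as NO2, that SO2 supplies two equivalents per mole and NOx, NH3, and HCl one each, and that standard molar masses apply. The ALL HCl entry is rounded to 611 million lb because its trailing digits are not legible in the table. All function and variable names are illustrative.

```python
# Sketch: convert emission masses to giga-equivalents (Geq) of potential acidity.
GRAMS_PER_SHORT_TON = 907_184.74
GRAMS_PER_LB = 453.59237

# species -> (molar mass in g/mol, equivalents per mole).
# Treating NOx as NO2 is an assumption, not something the tables state.
SPECIES = {
    "SO2": (64.066, 2),
    "NOx": (46.0055, 1),
    "NH3": (17.031, 1),
    "HCl": (36.461, 1),
}


def geq_from_grams(species, grams):
    """Mass in grams divided by equivalent weight, expressed in 10^9 eq."""
    molar_mass, eq_per_mol = SPECIES[species]
    return grams / molar_mass * eq_per_mol / 1e9


def geq_from_kton(species, kton):
    """kton means thousand US short tons, as in the table notes."""
    return geq_from_grams(species, kton * 1_000 * GRAMS_PER_SHORT_TON)


def geq_from_lb(species, pounds):
    return geq_from_grams(species, pounds * GRAMS_PER_LB)


# First data column of Table 3-5, all sources (kton).
all_sources_kton = {"SO2": 18_943, "NOx": 24_347, "NH3": 4_940}

acidifying_geq = sum(geq_from_kton(s, v) for s, v in all_sources_kton.items())
hcl_geq = geq_from_lb("HCl", 611.0e6)  # rounded; trailing digits not legible
hcl_ratio = hcl_geq / acidifying_geq

print(f"SO2 + NOx + NH3: {acidifying_geq:,.0f} Geq")
print(f"HCl: {hcl_geq:.1f} Geq ({hcl_ratio:.2%} of the sum)")
```

Under these assumptions the sum comes out near 1,280 Geq, slightly below the 1,287 Geq listed in the table, and the HCl share is about 0.6%, consistent with the first entry of the ALL HCl eq ratio row.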

Table 3-6 HCl Emissions Relative to Other Emissions of Concern with Respect to Acidification in the United States ( )

Source Category / (2009/1998)
EGU HCl (lb) 514,179, ,148, ,009, ,141, ,578,891 44% 56%
ALL HCl (lb) 565,802, ,771, ,408, ,675, ,941,875 46% 54%
EGU SO2 kton 10,404 9,404 8,941 7,552 5,816 43% 57%
EGU NOx kton 3,792 3,446 3,320 3,006 2,066 33% 67%
EGU NH3 kton % -300%
EGU HCl kton % 56%
ALL SO2 kton 14,844 13,655 13,006 11,429 9,693 51% 49%
ALL NOx kton 19,122 18,110 17,321 16,339 15,399 63% 37%
ALL NH3 kton 4,143 4,135 4,131 4,131 4,131 84% 16%
All HCl kton % 54%
EGU SO2 Geq % 57%
EGU NOx Geq % 67%
EGU NH3 Geq % -300%
EGU HCl Geq % 56%

Table 3-6 (continued) HCl Emissions Relative to Other Emissions of Concern with Respect to Acidification in the United States ( )

Source Category / (2009/1998)
ALL SO2 Geq % 49%
ALL NOx Geq % 37%
ALL NH3 Geq % 16%
ALL HCl Geq % 54%
EGU (SO2 + NH3 + NOx) Geq % 59%
ALL (SO2 + NH3 + NOx) Geq 1, % 38%
EGU HCl/(SO2 + NH3 + NOx) eq ratio 1.7% 1.7% 1.7% 1.6% 1.4%
ALL HCl/(SO2 + NH3 + NOx) eq ratio 0.7% 0.7% 0.7% 0.6% 0.4%

Estimate

Sources:
data: Adapted from ; 2008 NH3 emissions from EGU and all sources estimated by equating to 2007; EPA has yet to post 2008 NH3 emissions on the cited webpage.
data: EGU data is generated from ; 2009 data for all sources only takes into account decreases in EGU emissions relative to 2008, i.e., all other emissions are kept at 2008 levels.
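The two percentage columns of Table 3-6 can be checked against the kton rows that remain legible in Tables 3-5 and 3-6. The sketch below assumes that the first data column of Table 3-5 is 1998 and the last annual column of Table 3-6 is 2009, and that the two percentage columns are the 2009/1998 ratio and one minus that ratio. The dictionary and function names are illustrative.

```python
# (first-column value, last-annual-column value) in kton, copied from the
# legible rows of Tables 3-5 and 3-6. The 1998/2009 labels are an assumption.
EMISSIONS_KTON = {
    "EGU SO2": (13_416, 5_816),
    "EGU NOx": (6_232, 2_066),
    "ALL SO2": (18_943, 9_693),
    "ALL NOx": (24_347, 15_399),
    "ALL NH3": (4_940, 4_131),
}


def normalized_trend(first, last):
    """Return (last/first, fractional reduction relative to first)."""
    ratio = last / first
    return ratio, 1.0 - ratio


trends = {name: normalized_trend(*pair) for name, pair in EMISSIONS_KTON.items()}

for name, (ratio, reduction) in trends.items():
    print(f"{name}: {ratio:.0%} of first-year level, {reduction:.0%} reduction")
```

The rounded results reproduce the percentages printed for these five rows, for example 43% and 57% for EGU SO2.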

References

Bastviken D., Thomsen F., Svensson T., Karlsson S., Sande P., Shaw G., Matuche M., Oberg G., Chloride Retention in Forest Soil by Microbial Uptake and by Natural Chlorination of Organic Matter, Geochimica et Cosmochimica Acta, 71,
Evans C.D., Monteith D.T., Fowler D., Cape J.N., Brayshaw S., Hydrochloric Acid: An Overlooked Driver of Environmental Change, Environmental Science & Technology, 45,
Hamilton J.T.G., McRoberts W.C., Keppler F., Kalin R.M., Harper D.B., Chloride Methylation by Plant Pectin: An Efficient Environmentally Significant Process, Science, 301, 206, doi: /science
Kauffman S.J., Royer D.L., Chang S., Berner R.A., Export of Chloride after Clear-cutting in the Hubbard Brook Sandbox Experiment, Biogeochemistry, 63,
Lovett G.M., Likens G.E., Buso D.C., Driscoll C.T., Bailey S.W., The Biogeochemistry of Chlorine at Hubbard Brook, New Hampshire, USA, Biogeochemistry, 72,
Oberg G., Chloride and Organic Chlorine in Soil, Acta Hydrochimica et Hydrobiologica, 26 (3),
Rodstedth M., Stahlberg C., Sanden P., Oberg G., Chloride Imbalances in Soil Lysimeters, Chemosphere, 52,
Spears D.A., A Review of Chlorine and Bromine in Some United Kingdom Coals, International Journal of Coal Geology, 64,
Tillman D.A., Duong D., Miller B., Chlorine in Solid Fuels Fired in Pulverized Fuel Boilers: Sources, Forms, Reactions, and Consequences: A Literature Review, Energy & Fuels, 23,
Vassileva S.V., Eskenazyb G.M., Vassilevaa C.G., Contents, Modes of Occurrence and Origin of Chlorine and Bromine in Coal, Fuel, 79,
Yudovich Y.E. and Ketris M.P., Chlorine in Coal: A Review, International Journal of Coal Geology, 67,


A APPENDIX: MACT FLOOR AND ICR DATA QUALITY ISSUES

This Appendix lists confirmed and suspected errors and data quality issues that EPRI has identified in the MACT floor calculations and in the ICR Part II or Part III data underlying those calculations. The list is not comprehensive; it includes only observations that EPRI has made in the course of evaluating the proposed limits. EPA should keep the following limitations of EPRI's review in mind:

• EPRI has not performed a full evaluation of the ICR data. Although we conducted a detailed quality review of some draft Part III test reports, our review encompassed only about half of the Part III EGUs. We posted a report summarizing the findings of those reviews to the docket [EPA-HQ-OAR ]. Issues similar to those identified in our report are expected to be present in the tests that EPRI did not review. In addition, the recipients of EPRI's reviews were not required to make corrections based on our observations, and EPRI did not verify that corrections were made.
• EPRI has not reviewed final stack test reports, ERT files or supporting documentation for the Part III tests and is not able to perform that review in the time available for comment.
• EPRI does not have access to test reports and other documentation for the ICR Part II emissions tests.

A.1 Incorrect Heat Rate

Heat rates are incorrect for 53 EGUs listed in the MACT floor spreadsheets. The errors affected ICR facilities with multiple boilers that contribute steam to a common generator. In these cases, the gross capacity in megawatts (electric) (MWe) for the entire facility was entered into the Part II report rather than an allocated capacity for each boiler. EPA divided the maximum heat input (in million British thermal units per hour [MMBtu/h]) by the gross capacity (MWe) to obtain a heat rate in million British thermal units per megawatt-hour (MMBtu/MWh) and applied that heat rate to each boiler at the facility.
When EPA used the heat rate to convert emissions in lb/MMBtu to emissions in lb/MWh, the result was an emission rate that is several times too low. An example of this problem is shown below. This error produces heat rates that are not realistic for fossil-fuel fired power plants. The 53 EGUs with heat rate errors are listed in Table A-1.

Example: AES Beaver Valley (ORIS 10676) has three 40-MWe units and one 20-MWe unit feeding one generator. The Part II report for this facility lists the following information:

Max_heat_input of the 40 MWe units = 550 MMBtu/h
Max_heat_input of the 20 MWe unit = 285 MMBtu/h
MWe_capacity (gross) of facility = 135 MWe

EPA used the reported values to calculate heat rates for each unit as follows:

Heat rate applied by EPA to each 40 MWe boiler: 550 * 1000/135 = 4,074 British thermal units per kilowatt-hour (Btu/kWh)
Heat rate applied by EPA to the 20 MWe boiler: 285 * 1000/135 = 2,111 Btu/kWh
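The arithmetic in this example, and the size of the resulting understatement, can be sketched as follows. This is an illustration only: it assumes the nominal 40 MWe and 20 MWe ratings are the appropriate allocated capacities (the source does not state the allocation EPRI would use), and the plausibility band is a hypothetical screening range, not a value from the docket.

```python
def heat_rate_btu_per_kwh(max_heat_input_mmbtu_h, capacity_mwe):
    """MMBtu/h divided by MW gives MMBtu/MWh; multiplying by 1000 gives Btu/kWh."""
    return max_heat_input_mmbtu_h * 1000.0 / capacity_mwe


# Hypothetical screening band for a fossil-fired steam unit (assumption).
PLAUSIBLE_RANGE = (7_000.0, 20_000.0)


def is_plausible(rate_btu_per_kwh):
    low, high = PLAUSIBLE_RANGE
    return low <= rate_btu_per_kwh <= high


FACILITY_CAPACITY_MWE = 135.0  # gross capacity entered for the whole facility

# (label, max heat input in MMBtu/h, nominal unit capacity in MWe)
boilers = [("40-MWe boiler", 550.0, 40.0), ("20-MWe boiler", 285.0, 20.0)]

results = {}
for label, heat_input, unit_capacity in boilers:
    as_applied = heat_rate_btu_per_kwh(heat_input, FACILITY_CAPACITY_MWE)
    allocated = heat_rate_btu_per_kwh(heat_input, unit_capacity)
    results[label] = (as_applied, allocated, allocated / as_applied)
    print(
        f"{label}: applied {as_applied:,.0f} Btu/kWh "
        f"(plausible: {is_plausible(as_applied)}), "
        f"allocated {allocated:,.0f} Btu/kWh, "
        f"understated by a factor of {allocated / as_applied:.2f}"
    )
```

Because lb/MWh scales directly with heat rate, the same factors (roughly three and six) carry through to the converted emission rates.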

A typical heat rate for a bituminous coal-fired unit is about 12,000 Btu/kWh. A quality control check would have identified this error. The incorrect heat rates produced erroneously low estimates of emissions in lb/MWh for the 40- and 20-MWe boilers (three and six times too low, respectively). Due to the impact on emissions, units with this problem are prevalent among the lowest emitting plants sorted by lb/MWh. The impacted units were used as the basis for several future plant limits for coal (PM, total metals, antimony, and beryllium).

A.2 MACT Spreadsheet Calculations

EPRI noted two discrepancies between EPA's description of the procedure used to derive MACT floor limits and the implementation of that procedure.

1. The EGUs used as the basis for new unit limits for chromium and selenium are not the lowest emitting units. No explanation is offered for why those EGUs were selected. This discrepancy is present both in the coal MACT PM workbook [EPA-HQ-OAR ] and in the revised MACT floor memo [EPA-HQ-OAR ].
2. The procedures used to calculate the upper prediction limit (UPL) were not consistent across all parameters and EGU categories. In some cases, the mean emission was calculated from the lowest test series reported for each unit, while in others it was calculated from all test series reported across all of the units in the MACT pool.

EPRI also noted that in calculating a UPL for TPM and metals for coal-fired EGUs, EPA did not include all Part II and III test series in the variability term, as it did for mercury. Instead, only the lowest test series for each EGU was included in the UPL variability term, giving a much lower variability and hence a lower MACT floor for existing EGUs. The methods used to calculate the mean and variability for each EGU category and parameter are listed in Table A-2.
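The effect of the choice of variability pool can be illustrated with a small sketch. These comments do not reproduce the UPL equation, so the form used here, UPL = mean + t × sqrt(s² × (1/n + 1/m)), is an assumption adopted for illustration. The test-series values, the t multiplier, and the number of future runs m are all hypothetical and are not ICR data. In practice the t multiplier would depend on the degrees of freedom, which this sketch ignores.

```python
from math import sqrt
from statistics import mean, variance


def upl(center, var, n, m, t_value):
    """Assumed prediction-limit form: center + t * sqrt(var * (1/n + 1/m))."""
    return center + t_value * sqrt(var * (1.0 / n + 1.0 / m))


# Hypothetical test series per unit (lb/MMBtu); not taken from the ICR.
series_by_unit = {
    "unit_a": [1.0, 1.4, 2.1],
    "unit_b": [0.8, 1.9],
    "unit_c": [1.2, 1.3, 2.6],
}

lowest = [min(values) for values in series_by_unit.values()]
all_series = [x for values in series_by_unit.values() for x in values]

T_VALUE = 2.9    # hypothetical placeholder for the Student's t multiplier
FUTURE_RUNS = 3  # hypothetical m

# Mean and variance both taken from the lowest test series of each unit.
upl_lowest_only = upl(mean(lowest), variance(lowest), len(lowest), FUTURE_RUNS, T_VALUE)

# Mean of the lowest series, variance of all series.
upl_mixed = upl(mean(lowest), variance(all_series), len(all_series), FUTURE_RUNS, T_VALUE)

print(f"UPL, variance of lowest series only: {upl_lowest_only:.2f}")
print(f"UPL, variance of all series:         {upl_mixed:.2f}")
```

With these made-up numbers, restricting the variance term to the lowest test series yields the lower limit, which is the direction of the effect described for TPM and metals.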
A.3 Miscellaneous Data Errors

EPRI has identified many confirmed or suspected errors in individual EGU emissions values that could potentially impact MACT floor calculations. EPRI considers an error to be confirmed if we have verified a unit conversion error or data transcription error from the facility's ICR report, or have confirmed a reporting error with the facility owner. Suspected errors are those that present illogical results (e.g., a metal emission hundreds of times higher than the input fuel concentration) but that EPRI could not confirm. Table A-3 lists the errors identified in the course of our evaluations. This list does not represent all errors in the ICR database.

It is critical that EPA implement basic quality control checks to identify and correct errors in the ICR data before they are used as the basis for MACT floor calculations. Checks that should be performed include:

• Inspect outliers: review the test reports for the unusually high and low emitting units for each parameter. Plotting the distribution of values can aid in identifying likely outliers.
• Screen for unit conversion errors: calculate the ratios of emissions reported in the ICR database (e.g., micrograms per dry standard cubic meter [µg/dscm] to pounds per million British thermal units [lb/MMBtu]).

• Inspect highly variable emissions results: differences in emissions of several orders of magnitude between the Part II and Part III data are likely to indicate a unit conversion problem or some other error. This basic check would have identified the mercury conversion problem.
• Compare metal emissions from the stack with emissions calculated from the fuel analysis: if stack emissions greatly exceed fuel input, this indicates either sample contamination or a calculation error.

A.4 Missing ICR Test Data

Test results for eight liquid oil-fired EGUs in EPA's ICR Part III Access database (version 4) are missing from the oil MACT floor spreadsheet [EPA-HQ-OAR ] and supporting memorandum [EPA-HQ-OAR ]. These results should be added and the UPL calculation revised to consider all EGUs that submitted test results, or EPA should provide justification for not including them.

PREPA Aguirre, Unit 1 (facility ID 70000)
PREPA Aguirre, Unit 2 (facility ID 70000)
PREPA South Coast (Costa Sur), Unit 3 (facility ID 70002)
PREPA South Coast (Costa Sur), Unit 6 (facility ID 70002)
PREPA San Juan, Unit 7 (facility ID 70003)
PREPA San Juan, Unit 8 (facility ID 70003)
PREPA San Juan, Unit 9 (facility ID 70003)
PREPA San Juan, Unit 10 (facility ID 70003)

In addition, test results from No. 6 oil unit Cabras, Unit 1 (ORIS G1) were submitted late; these should be included in the MACT limit calculation.

A.5 Bibliography

EGU ICR Part I/II Access Database, dated March 16, 2011, posted to EPA web site
EGU ICR Part III Access Database, dated March 16, 2011, posted to EPA web site
EPA-HQ-OAR Revised Coal Mercury MACT floor spreadsheet: dated May 18,
EPA-HQ-OAR Coal HCl MACT floor spreadsheet.
EPA-HQ-OAR Coal PM MACT floor spreadsheet.
EPA-HQ-OAR Oil MACT floor spreadsheet.
EPA-HQ-OAR Pet Coke MACT floor spreadsheet.

EPA-HQ-OAR Data Quality Evaluation of Hazardous Air Pollutants Measurements for the US Environmental Protection Agency's Electric Utility Steam Generating Units Information Collection Request. EPRI, Palo Alto, CA:
EPA-HQ-OAR National Emission Standards for Hazardous Air Pollutants (NESHAP) Maximum Achievable Control Technology (MACT) Floor Analysis for Coal- and Oil-fired Electric Utility Steam Generating Units REVISED, dated May 18,

Table A-1 Heat Rate Errors

Plant Name | Boiler ID | ORIS Code | Max_heat_input (MMBtu/h) | MWe_capacity (MWe gross) | Heat Rate Used by EPA (Btu/kWh) | Type of Error
Lamar Plant ,091 Heat input
Bowen ,807 Heat input
Edwardsport ,982 Unit capacity, heat input (fuel oil unit)
Edwardsport ,982 Unit capacity
Edwardsport ,982 Unit capacity
Edwardsport ,982 Unit capacity
AES Westover, LLC ,935 Unit capacity
AES Westover, LLC ,935 Unit capacity
Roxboro Steam Electric Plant 3A ,669 Unit capacity
Roxboro Steam Electric Plant 3B ,669 Unit capacity
Roxboro Steam Electric Plant 4A ,656 Unit capacity
Roxboro Steam Electric Plant 4B ,656 Unit capacity
R. E. Burger ,470 Unit capacity
R. E. Burger ,470 Unit capacity
Glen Lyn ,720 Unit capacity
Glen Lyn ,720 Unit capacity
Rodemacher ,990 Unit capacity
Rodemacher ,990 Unit capacity

Table A-1 (continued) Heat Rate Errors

Plant Name | Boiler ID | ORIS Code | Max_heat_input (MMBtu/h) | MWe_capacity (MWe gross) | Heat Rate Used by EPA (Btu/kWh) | Type of Error
Mayo Electric Generating Facility 1A ,726 Unit capacity
Mayo Electric Generating Facility 1B ,726 Unit capacity
North Branch 1A ,822 Unit capacity
North Branch 1B ,989 Unit capacity
Anclote Power Plant ,097 Unit capacity
Anclote Power Plant ,491 Unit capacity
Grant Town Power Plant BLR1A ,105 Unit capacity
Grant Town Power Plant BLR1B ,105 Unit capacity
Edgecombe Genco, LLC 1A ,148 Unit capacity
Edgecombe Genco, LLC 1B ,148 Unit capacity
Edgecombe Genco, LLC 2A ,148 Unit capacity
Edgecombe Genco, LLC 2B ,148 Unit capacity
Chambers Cogeneration LP Boil ,867 Unit capacity
Chambers Cogeneration LP Boil ,867 Unit capacity
Cedar Bay Generating Company L.P. CBA ,796 Unit capacity
Cedar Bay Generating Company L.P. CBB ,796 Unit capacity
Cedar Bay Generating Company L.P. CBC ,796 Unit capacity

Table A-1 (continued) Heat Rate Errors

Plant Name | Boiler ID | ORIS Code | Max_heat_input (MMBtu/h) | MWe_capacity (MWe gross) | Heat Rate Used by EPA (Btu/kWh) | Type of Error
AES Hawaii BLR A ,032 Unit capacity
AES Hawaii BLR B ,032 Unit capacity
AES Thames A ,817 Unit capacity
AES Thames B ,817 Unit capacity
AES Beaver Valley ,074 Unit capacity
AES Beaver Valley ,074 Unit capacity
AES Beaver Valley ,074 Unit capacity
AES Beaver Valley ,111 Unit capacity
Morgantown Energy Facility CFB ,879 Unit capacity
Morgantown Energy Facility CFB ,810 Unit capacity
Hopewell ,368 Unit capacity
Hopewell ,206 Unit capacity
Southampton Power Station ,500 Unit capacity
Southampton Power Station ,574 Unit capacity
TES Filer City Station ,335 Unit capacity
TES Filer City Station ,335 Unit capacity
Scrubgrass Generating Company L.P. #2 CFB Boiler ,186 Unit capacity
Scrubgrass Generating Company L.P. #1 CFB Boiler ,701 Unit capacity

Table A-2 Inconsistencies in UPL Calculation Method for Existing EGUs

Category | Parameters | Basis of UPL in lb/MMBtu | Basis of UPL in lb/MWh
All Coal | Mercury | Mean of Lowest Test Series; Variance of All Test Series | Mean of All Test Series; Variance of All Test Series
Coal ≥ 8,300 Btu/kWh | Mercury | Mean of Lowest Test Series; Variance of All Test Series | Mean of All Test Series; Variance of All Test Series
Coal < 8,300 Btu/kWh | Mercury | Mean of Lowest Test Series; Variance of All Test Series | Mean of Lowest Test Series; Variance of All Test Series
Coal with ACI | Mercury | Mean of Lowest Test Series; Variance of All Test Series | Mean of Lowest Test Series; Variance of All Test Series
All Coal | HCl | Mean of Lowest Test Series; Variance of All Test Series | Mean of Lowest Test Series; Variance of All Test Series
All Coal | HF | Mean of Lowest Test Series; Variance of All Test Series | Mean of Lowest Test Series; Variance of All Test Series
All Coal | SO2 | Mean of All Test Series; Variance of All Test Series | Mean of All Test Series; Variance of All Test Series
All Coal | PM | Mean of All Test Series; Variance of Lowest Test Series | Mean of All Test Series; Variance of Lowest Test Series
All Coal | Total Metals | Mean of Lowest Test Series | Mean of Lowest Test Series
All Coal | Individual Metals | Mean of Lowest Test Series; Variance of Lowest Test Series | Mean of Lowest Test Series; Variance of Lowest Test Series
All Oil | All parameters | Mean of All Test Series; Variance of All Test Series | Mean of All Test Series; Variance of All Test Series
All Pet Coke | All parameters | Mean of All Test Series; Variance of All Test Series | Mean of All Test Series; Variance of All Test Series
IGCC | All parameters | Mean of All Test Series; Variance of All Test Series | Mean of All Test Series; Variance of All Test Series

Note: Inconsistent UPL calculations may not have impacted the MACT floor in all cases, as some floors were based on detection limits or beyond-the-floor calculations.

Table A-3
Confirmed and Suspected Errors in ICR Data

Plant: Montville
Parameters affected: Mercury
Description of error: Part III mass and µg/m³ values are not consistent with results in lb/mmbtu. Reporting error: results in lb/mmbtu are 1000 times too high.
Error status: Confirmed

Plant: Norwalk
Parameters affected: Mercury
Description of error: Emission rates reported to EPA are 1000 times lower than values in MACT spreadsheet and Access database.
Error status: Confirmed

Plant: Middletown
Parameters affected: Mercury
Description of error: Emission rates reported to EPA are 1350 times lower than values in MACT spreadsheet and Access database.
Error status: Confirmed

Plant: Northside (Unit 1A, ORIS code 667)
Parameters affected: Lead
Description of error: Average lead emissions are 100 times the fuel content due to an outlier in the Part III Run 1 result. A laboratory or reporting error is suspected.
Error status: Suspected

Plant: Vermillion (Units 1/2, ORIS code 897)
Parameters affected: Acid gases
Description of error: Part III acid gas data from Joppa Steam (ORIS 887) are duplicated in the Part III Access database and MACT floor spreadsheet as Vermillion Units 1 and 2 (ORIS 897). Vermillion did not test for acid gases in the ICR.
Error status: Confirmed

Plant: Harding St
Parameters affected: Metals
Description of error: Part III metals emissions results for Harding St. (ORIS 990) Unit 70 listed in EPA's Access database do not match the data submitted to EPA.
Error status: Confirmed

Plant: Walter Scott
Parameters affected: Filterable PM
Description of error: Part II filterable PM measurement of 1 x 10^-6 lb/mmbtu from 8/2007 test is in error. Reported value of lb/mmbtu was transcribed incorrectly from hard copy as lb/h.
Error status: Confirmed

Plant: Green River
Parameters affected: Acid gases
Description of error: Part III acid gas data from Ghent (ORIS 1356) Unit 4 are duplicated in the Part III Access database and MACT floor spreadsheet as Green River Unit 5 (ORIS 1357). Green River did not test for acid gases in the ICR.
Error status: Confirmed

Table A-3 (continued)
Confirmed and Suspected Errors in ICR Data

Plant: Reid Gardner U
Parameters affected: Selenium
Description of error: Average coal Se emissions value of lb/mmbtu listed in the MACT PM spreadsheet is equivalent to about 3000 ppmw. Part III Se fuels concentrations (1–2 ppm) are not consistent with this value. A data entry or unit conversion error is suspected. Docket does not contain sufficient data to confirm error.
Error status: Suspected

Plant: BL England
Parameters affected: Mercury
Description of error: Error in high Part II value impacted a test series in the revised coal Mercury MACT floor spreadsheet of 2.43E-6 lb/mmbtu µg/dscm lb/h. Data entry or conversion error in the Part II data.
Error status: Confirmed

Plant: Ashville (Unit 1d, ORIS code 2706)
Parameters affected: Selenium
Description of error: The average coal Se emission of 0.07 lb/mmbtu in the coal PM MACT floor spreadsheet is about 1000 ppmw, which is abnormally high. The coal Se concentration listed in the Part III Access database is only 2–3 µg/g. The Run 1 heating value (HHV_dry) is a factor of 1000 too low ( Btu/lb), which impacted the calculation of the lb/mmbtu coal Se value.
Error status: Confirmed

Plant: Mitchell Power Station 1 and
Parameters affected: All
Description of error: Facility owner indicates that fuel type was incorrectly reported as No. 6 oil. Actual fuel type was distillate (No. 2) oil.
Error status: Confirmed

Plant: Scrubgrass 1/
Parameters affected: Selenium
Description of error: ICR Part III Run 2 Se emission value is suspected to be a factor of 1000 too high. A data entry error in the units of mass is suspected. Docket does not contain laboratory data to confirm.
Error status: Suspected

Table A-3 (continued)
Confirmed and Suspected Errors in ICR Data

Plant: AES Somerset
Parameters affected: Mercury
Description of error: EPA entered a value of 0.13 lb/mmbtu for Hg into the Part II Access database rather than the value reported in the Part II Hg test report of lb/tbtu. Emission value is 1000 times too high.
Error status: Confirmed

Plant: Oak Grove
Parameters affected: All non-mercury metals
Description of error: Results of Part II metals compliance testing were entered into the Part I/II report in units of µg/dscm when they should have been entered in µg/dscf. This error caused emission rates in lb/mmbtu to be too low. MACT floors for antimony and lead for new coal units are impacted, as are MACT floors for existing units for some non-mercury metals.
Error status: Confirmed

Plant: Bonanza
Parameters affected: PM
Description of error: Part II PM results reported in µg/dscm should have been in mg/dscm. Microgram amounts are not detectable with PM methods. Results are 1000 times too low. Docket does not contain sufficient information to confirm error.
Error status: Suspected

Plant: Logan Generating Co. B
Parameters affected: HCl
Description of error: All Part II runs were below detection limit except for one, which is reported at 16 ppmvd 0.5 lb/h. Data entry or conversion error.
Error status: Confirmed

Plant: Edensburg
Parameters affected: Mercury
Description of error: Coal Hg emission average of 65 lb/mmbtu is much too high. May be reported in the wrong units.
Error status: Suspected

Plant: Cedar Bay Generating Co. (Units CBA, CBB, CBC)
Parameters affected: Filterable PM
Description of error: Part III mass results reported in mg in the ERT files appear as µg in the Part III Access database. Emissions in the coal PM MACT spreadsheet are 1000 times too low.
Error status: Confirmed

Table A-3 (continued)
Confirmed and Suspected Errors in ICR Data

Plant: Indiantown Cogeneration 1,2 and
Parameters affected: Filterable PM, Filterable PM 2.5
Description of error: Part III mass results reported in mg in the ERT file for this facility are probably in µg. A mass of mg is not detectable with PM methods. Docket does not contain sufficient information to confirm error.
Error status: Suspected

Plant: TS Power BLR
Parameters affected: HCl, HF
Description of error: A 1000x error in HCl and HF is present in the Part III ICR Access database. The database values are ~1E-8 lb/mmbtu. EPA corrected these in the coal Acid Gas MACT floor spreadsheet to ~1E-5 lb/mmbtu. Values should be corrected in the database as well.
Error status: Confirmed
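Several of the selenium entries in Table A-3 were identified by comparing a reported emission rate with the fuel concentration that the rate would imply. The Python sketch below illustrates that kind of plausibility check. The function names, the 13,000 Btu/lb heating value, and the factor-of-two tolerance are illustrative assumptions and are not taken from the ICR data; the 0.07 lb/mmbtu emission rate and the roughly 3 µg/g fuel selenium content are the Ashville figures quoted in the table.

```python
def implied_fuel_ppmw(emission_lb_per_mmbtu, hhv_btu_per_lb):
    """Fuel concentration (ppmw) needed to supply an emission rate,
    assuming every bit of the element in the fuel is emitted.

    (lb element / 1e6 Btu) * (Btu / lb fuel) = lb element per 1e6 lb fuel = ppmw
    """
    return emission_lb_per_mmbtu * hhv_btu_per_lb


def exceeds_fuel_input(emission_lb_per_mmbtu, hhv_btu_per_lb,
                       fuel_ppmw, tolerance=2.0):
    """Flag an emission rate that implies more of the element than the fuel holds.

    `tolerance` is a hypothetical allowance for measurement scatter.
    """
    implied = implied_fuel_ppmw(emission_lb_per_mmbtu, hhv_btu_per_lb)
    return implied > tolerance * fuel_ppmw


# Assumed heating value for illustration only (not from the docket).
ASSUMED_HHV_BTU_PER_LB = 13_000.0

implied = implied_fuel_ppmw(0.07, ASSUMED_HHV_BTU_PER_LB)
suspect = exceeds_fuel_input(0.07, ASSUMED_HHV_BTU_PER_LB, fuel_ppmw=3.0)
print(f"implied fuel Se: {implied:.0f} ppmw, suspect: {suspect}")
```

With the assumed heating value, the implied fuel concentration is on the order of 1000 ppmw, which is consistent with the description in the table and far above the measured fuel content.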

B APPENDIX: METHOD SENSITIVITY FOR ICR DATA USAGE AND COMPLIANCE WITH PROPOSED MACT LIMITS

B.1 Introduction

For EGUs to comply with the emission limits in the proposed EGU MACT rule, test methods and/or performance specifications for continuous monitors must be available that can measure the target HAPs or surrogates accurately at those stack gas concentrations. For each of the proposed MACT limits for coal- and liquid oil-fired EGUs, EPRI attempted to determine whether an accurate measurement could be obtained at those limits by a competent laboratory using the test methods required by the ICR for this rulemaking. EPRI did not attempt to determine whether the reported results of individual stack tests conducted for Part III of the ICR could be measured accurately by the test method. That task would require access to more detailed documentation of field and laboratory quality control than was reported by most of the ICR respondents.

The sensitivity of a particular test method varies among laboratories for many reasons, including the instrument type and model, the amount of sample analyzed, the expertise of the analyst, and the care taken to minimize background contamination. In addition to variations in lab performance, errors in calculating and reporting emissions can produce values that fall outside of the normal range of detectable concentrations. In reviewing ICR test reports, EPRI observed that many respondents incorrectly reported detection limits for multi-fraction methods, failing to sum the fractions correctly. Another common error was to blank-correct to a concentration below the method detection limit. Both of those errors can produce reported emission values below the actual capability of a method.

The terms most often used in discussion of method sensitivity are the detection limit and the quantitation limit.
The statistical basis of these terms and the appropriate methodology for applying them to laboratory methods are contentious, and are a subject of ongoing discussion among EPA, laboratories, and the regulated industries. It is not EPRI's intention to address the statistical approaches to assessing detection and quantitation here. However, to understand the capabilities of stack test methods, it is necessary to define several terms:

Analytical detection limit is the value that EPA requested ICR respondents to report for emissions measured below detection limits in the ICR. EPA did not define the term, but it is likely that they intended it to be equivalent to the method detection limit.

Method detection limit (MDL) is defined in 40 CFR Part 136, Appendix B as the minimum concentration of a substance that can be measured and reported with 99% confidence that the analyte concentration is greater than zero and is determined from analysis of a sample in a given matrix containing the analyte. The MDL is determined by carrying seven or more aliquots of a sample containing the analyte at a concentration one to five times the estimated detection limit (EDL) through the entire analytical procedure (including laboratory sample preparation). The MDL is calculated as approximately three times (3.14 for 7 aliquots) the standard deviation of the concentration measured in the replicate analyses.
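The MDL calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a laboratory procedure: the function name and the replicate values are hypothetical, and the multipliers are the standard one-sided 99% Student's t values for n - 1 degrees of freedom (3.143 for seven aliquots, matching the 3.14 quoted in the definition).

```python
import statistics

# One-sided 99% Student's t values, keyed by degrees of freedom (n - 1).
T_99_ONE_SIDED = {6: 3.143, 7: 2.998, 8: 2.896, 9: 2.821}


def method_detection_limit(replicates):
    """MDL = t(n-1, 0.99) * sample standard deviation of the replicates."""
    n = len(replicates)
    if n < 7:
        raise ValueError("an MDL study needs at least seven replicate aliquots")
    dof = n - 1
    if dof not in T_99_ONE_SIDED:
        raise ValueError("t value not tabulated for this number of replicates")
    return T_99_ONE_SIDED[dof] * statistics.stdev(replicates)


# Hypothetical replicate results (ug/L) for a sample spiked near the
# estimated detection limit; these numbers are illustrative only.
replicates_ug_per_l = [1.9, 2.1, 2.0, 2.3, 1.8, 2.2, 2.0]
mdl = method_detection_limit(replicates_ug_per_l)
print(f"MDL = {mdl:.2f} ug/L")
```

Because the result depends only on the scatter of the replicates, an MDL study run under ideal conditions will yield a low value regardless of how the method behaves on field samples.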

Reporting limit (RL) is a value used by a laboratory, below which measured concentrations will be reported as non-detected (ND) or less than RL.

Quantitation limit can be defined as the smallest concentration of analyte greater than the detection limit where the measurement accuracy meets the objectives of the intended purpose. There is little agreement on how a quantitation limit should be determined. Some of the definitions in common use include:

Ten times the standard deviation of a blank (a sample that does not contain the analyte). This quantity is generally termed the limit of quantitation (LOQ).

Ten times the standard deviation of samples at the estimated detection limit, or 3.18 times the MDL. The EPA minimum level (ML), used in most EPA water methods, employs this definition.

The lowest concentration of standard solution used to calibrate an instrument (the lowest calibration point).

A quantitation limit based on a laboratory MDL study will generally be lower than the method can accurately determine in practice, for several reasons including:

An MDL study is often run by a single analyst, on a single instrument, on a single day, immediately after calibrating the instrument. Therefore the results ignore many sources of within-laboratory variability and may overestimate the capability of the method.

The matrix used in MDL studies is generally clean water (for liquid samples) or clean sand (for solid samples) spiked with the chemicals of interest. These prepared samples are very different from a sample that has been exposed to flue gas from a fossil fuel-fired power plant. The MDL samples do not contain co-occurring chemicals that can interfere with chemical measurements. The MDL sample matrix is also likely to have different chemical characteristics (e.g., pH, ionic strength) from a flue gas sample.

A quantitation limit based on an analytical or method detection limit does not reflect measurement variability associated with sampling and sample handling.
Sources of sampling variability that are not reflected in an MDL study include:
1. Sample train glassware cleaning
2. Sampling media preparation and handling
3. Measurement of the sampled gas volume
4. Transport of the sample gas to the collecting medium
5. Recovery of the sample in the field
6. Sample preservation and shipping

The preferred approach to determining the minimum concentration that can be measured by a stack test method is to conduct field tests at stationary sources using multiple sampling trains. The results indicate the precision of the entire measurement system, including all sampling and laboratory procedures. As precision typically worsens at lower concentrations, the tests should be conducted across the entire concentration range over which the method is intended to be used. The lowest concentration at which the method provides acceptable precision would be the quantitation limit of the method. This approach was used in the ReMAP project, an EPA-supported study of several of the test methods used in the ICR [ASME, 2001].

The multiple train approach only indicates method precision; it does not provide information about whether the method is measuring the true stack gas concentration. To evaluate method biases, other approaches are required, including dynamic spiking of the flue gas sample and the use of specialized analytical techniques.

In evaluating the ICR test methods, EPRI considered multiple sources of information, including analytical detection limits stated in the test methods, the detection and/or reporting limits provided by the ICR respondents, and the ReMAP study. However, the ReMAP study tests are fairly old (conducted in the 1970s through 1990s) and some of the methods have improved greatly. Thus, the ReMAP results are generally not as useful in evaluating the present-day capability of methods as is their performance in the ICR Part III testing.

B.2 Mercury and Non-mercury Metals

B.2.1 Metals by EPA Methods 29 and 30B

The ICR required respondents to measure non-mercury metals by Method 29 using an analytical technique of inductively coupled plasma mass spectrometry (ICP-MS). Methods allowed for mercury were Method 29 and Method 30B. Method 29 for mercury was allowed only for stacks with emissions greater than 1 microgram per cubic meter (µg/m³), using the analytical procedures specified in ASTM D , the Ontario Hydro Method. About 80% of the Part III mercury tests were performed using Method 30B.

Five sources of information were used to evaluate the sensitivity of these methods within the range of concentrations observed in the ICR Part III testing, and at the proposed MACT limits for existing and future coal- and liquid oil-fired EGUs. They included:
1. Analytical detection limits, if provided in the method
2. EPRI survey of ICR Part III MDLs
3. Comparison of MACT limits to MDLs
4. Comparison of MACT limits to below detection limit (BDL)-flagged ICR test results
5. Results of the ReMAP evaluation of multi-train sampling data

B Analytical Detection Limits

Method 29/6020: Neither Method 29 nor Method 6020 lists detection limits; Method 29 states that ICP-MS detection limits are generally lower by a factor of 10 than detection limits using EPA Method 6010, and lower by a factor of three for beryllium. Although this guidance is only approximate, it can be used to calculate an analytical detection limit at the stack. Assuming that (a) all of the sample is digested, (b) the final liquid volume for analysis is the method-specified 300 milliliters (ml) for analytical fraction 1 and 150 ml for analytical fraction 2A, and (c) the minimum stack gas volume of sample required by the ICR (120 cubic feet, ft³) was collected, the analytical detection limits in lb/tbtu calculated for each non-mercury metal are shown in Table B-1. Based on this comparison, and assuming that the analytical quantitation limit for Method 6020 is at least 3 times the detection limit, Method 29 does not have adequate sensitivity to measure antimony accurately at the alternative individual metal MACT limits for existing units. At the individual metals MACT limits for new coal EGUs, most metals would not be measured accurately, as indicated by the colored cells in Table B-1. Note that the new coal EGU MACT limits for chromium and selenium are questionable, as discussed in Appendix A.

Table B-1
Comparison of Coal MACT Limits to Approximate Analytical Detection Limits

Column headings (all in lb/tbtu): MACT Floor for New Coal Units (note 1); MACT Floor for Existing Coal Units (note 2); Approximate Analytical Detection Limit (note 3), Methods 29/30B

Antimony
Arsenic
Beryllium
Cadmium
Chromium
Cobalt
Lead
Manganese
Mercury
Nickel
Selenium

1 Calculated from ratios of MACT Floors for New and Existing Coal Units in lb/gwh
2 For mercury, MACT floor for coal units ≥ 8,300 Btu/kWh
3 Calculated from EPA guidance on Method 6020 detection limits
4 Quantitation limit of ASTM D

ASTM D states a lower limit of quantitation for mercury of 0.5 µg/m³, which corresponds to a stack emission for coal-fired units of approximately 0.45 lb/tbtu. This level of sensitivity is adequate to measure mercury at the MACT limit for existing coal units, but not at the limit for new units.

As discussed earlier, analytical detection limits do not consider the many sampling- and matrix-related factors that can raise detection limits at the stack. However, method sensitivity could also be better in practice than indicated in Table B-1 due to improvements in instrumentation and collection of larger sample volumes than required by the ICR. Therefore, this comparison was not given as much weight as the ICR results in the determination of method adequacy.

Method 30B: This method does not specify an MDL or require that one be determined. The tester must estimate in advance of testing the minimum sample volume that will provide an adequate mass of mercury for analysis for a specific stationary source and analytical technique. It is possible, however, to state a minimum mass of mercury that must be collected on a sorbent tube to obtain an accurate measurement by a given analytical method (e.g., thermal desorption/cold vapor atomic absorption spectrometry (CVAAS) or acid extraction/cold vapor atomic fluorescence spectrometry (CVAFS)). Based on EPRI's reviews of many ICR Part III test reports, almost all samples were analyzed using thermal desorption/CVAAS (e.g., Ohio Lumex). EPRI's research indicates that at least 10 nanograms (ng) should be collected on the front bed of the Method 30B sorbent tube to ensure accurate results. This minimum mass is required for two reasons: to minimize inaccuracy due to instrumental gas flow fluctuations that can distort the desorption peak shape, and to raise the sample mass sufficiently above the background sorbent mercury level. Sorbent blank mercury levels range from about 0.1 ng to 2–3 ng per tube section. The importance of the minimum loading is corroborated by the ICR test results. Test runs with loadings below 10 ng generally failed method quality criteria for one or more of the three samples [either the second bed breakthrough criterion or the relative difference (RD) criterion].

Assuming a sampling flow rate of 1 liter per minute (0.06 cubic meters per hour) and the minimum 2-hour sampling duration required in the ICR, a 10 ng loading on a sorbent trap corresponds to an emission rate of about 0.08 lb/tbtu for coal-fired EGUs.
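The arithmetic behind the 0.08 lb/tbtu figure can be reproduced with a short Python sketch. The function names are hypothetical. The conversion factor of about 0.9 lb/tbtu per µg/m³ for coal units is the approximate value this appendix uses elsewhere (1 µg/dscm is stated to be approximately 0.9 lb/tbtu for coal units, and 0.5 µg/m³ approximately 0.45 lb/tbtu); it is an approximation, so the result is indicative only.

```python
# Approximate coal-unit conversion used in this appendix:
# 1 ug/m3 of mercury in stack gas is roughly 0.9 lb/tbtu.
LB_PER_TBTU_PER_UG_M3_COAL = 0.9


def sampled_volume_m3(flow_l_per_min, duration_h):
    """Gas volume drawn through the sorbent trap, in cubic meters."""
    return flow_l_per_min * 60.0 * duration_h / 1000.0


def emission_at_loading(mass_ng, flow_l_per_min, duration_h,
                        factor=LB_PER_TBTU_PER_UG_M3_COAL):
    """Emission rate (lb/tbtu) at which a trap collects `mass_ng` of mercury."""
    concentration_ug_m3 = (mass_ng / 1000.0) / sampled_volume_m3(
        flow_l_per_min, duration_h)
    return concentration_ug_m3 * factor


# 10 ng minimum loading, 1 L/min, 2-hour run (the ICR minimum duration).
two_hour = emission_at_loading(10.0, 1.0, 2.0)
# Doubling the run time halves the lowest measurable emission rate.
four_hour = emission_at_loading(10.0, 1.0, 4.0)
print(f"2-hour run: {two_hour:.3f} lb/tbtu, 4-hour run: {four_hour:.4f} lb/tbtu")
```

The minimum measurable emission rate scales inversely with sampled volume, which is why lower limits can be reached only through longer runs or higher sampling flow rates.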
That emission rate (about 0.08 lb/tbtu) is much higher than the proposed MACT limit for new coal EGUs (calculated as lb/tbtu). Some low-emitting ICR units were sampled for longer durations (e.g., 4 hours) and at higher sample volumes (e.g., 2 cubic meters per hour), so that adequate mass was collected. However, at a recent EPRI CEM workshop, several stack testers reported leakage and equipment failure at high sample flow rates. Based on this information, EPRI has classified the adequacy of Method 30B to measure at the proposed MACT limit for new coal EGUs as uncertain.

For liquid oil-fired EGUs, it is uncertain whether Method 30B can measure mercury accurately at the proposed MACT limit for existing EGUs, for the reasons discussed above. Method 30B will be unable to measure mercury accurately at the proposed limit for new units without an excessively long sampling duration. The monitoring requirements in the proposed rule require mercury collected on the sorbent trap to be at least one-half of the standard (approximately lb/tbtu). To collect 10 ng of mercury at this concentration with a flow rate of 1 liter per minute would take about 30 hours.

B EPRI Review of Method 29 ICR Part III MDLs

EPRI reviewed laboratory reports from ICR Part III tests to determine the range of MDLs reported. The source of this information was about 250 draft test reports provided by participants in EPRI's ICR data quality review program [EPRI, 2010a; EPA-HQ-OAR ]. EPRI did not have resources to review all test reports; instead, we identified the laboratories and stack testers that provided analytical services for a large number of the ICR tests and analyzed one report for each stack tester/laboratory pairing. For Method 29 non-mercury metals, we reviewed the data sets from 10 ICR units. For Method 29 mercury, each of 24 test reports accessible to EPRI was reviewed.

Each selected laboratory report was reviewed to determine if the laboratory report listed MDLs. In some cases, the laboratory report included MDLs but RLs were entered into the ERT. Where an MDL was reported in units of mass (usually µg), EPRI calculated the equivalent emission in lb/mmbtu using the minimum sampling volume required by the ICR and a standard F-factor for coal combustion. MDLs for all fractions of the sampling train were summed to obtain the final MDL.

For non-mercury metals, five of the ten laboratory reports that we reviewed contained sufficient information to calculate MDLs. The rest either did not contain that information or omitted some of the required analytical fractions. For mercury, only four of the 24 test reports contained sufficient information to calculate MDLs. Table B-2 lists the minimum and maximum analytical detection limits (in lb/tbtu) identified in this process.

Table B-2
Results of EPRI Survey of ICR Part III Method 29 MDLs*

Column headings: Minimum MDL (lb/tbtu); Maximum MDL (lb/tbtu)

Antimony
Arsenic
Beryllium
Cadmium
Chromium
Cobalt
Lead
Manganese
Mercury
Nickel
Selenium

*Based on a limited sampling of ICR test reports

B Evaluation of Method 29 ICR Data

An important factor in evaluating the sensitivity of stack test methods is what the laboratory determines can actually be measured. As discussed earlier, there are many reasons why methods cannot measure real-world samples down to the level indicated by an analytical MDL. For some of the Method 29 metals, BDL-flagged values were reported as high as 10,000 times the highest MDL verified by EPRI. Although some of these extremely high BDL emissions may be reporting errors or inappropriate analytical procedures, most often they indicate analytical interferences that required the laboratory to dilute the sample.
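The conversion of a mass-based MDL to an emission rate, as described in the EPRI review above, can be sketched as follows. This is an illustration under stated assumptions rather than EPRI's actual calculation: the function name, the example fraction MDLs, and the oxygen level are hypothetical. The 120 ft³ sample volume is the ICR minimum cited in this appendix, and 9,780 dscf/mmbtu is the EPA Method 19 dry F-factor for bituminous coal; the coal rank EPRI assumed is not stated in the text.

```python
UG_PER_LB = 453.59237e6           # micrograms per pound
FD_BITUMINOUS_DSCF_PER_MMBTU = 9780.0  # EPA Method 19 dry F-factor (assumed rank)
ICR_MIN_SAMPLE_VOLUME_DSCF = 120.0


def mdl_lb_per_tbtu(fraction_mdls_ug,
                    sample_volume_dscf=ICR_MIN_SAMPLE_VOLUME_DSCF,
                    fd=FD_BITUMINOUS_DSCF_PER_MMBTU,
                    o2_pct_dry=0.0):
    """Convert per-fraction mass MDLs (ug) to a stack MDL in lb/tbtu.

    The MDLs of all sampling-train fractions are summed first, then the
    Method 19 relation E = C * Fd * 20.9 / (20.9 - %O2) is applied.
    """
    total_ug = sum(fraction_mdls_ug)
    conc_lb_per_dscf = total_ug / UG_PER_LB / sample_volume_dscf
    lb_per_mmbtu = conc_lb_per_dscf * fd * 20.9 / (20.9 - o2_pct_dry)
    return lb_per_mmbtu * 1.0e6   # 1 tbtu = 1e6 mmbtu


# Hypothetical MDLs for two analytical fractions of one metal (ug).
example_fractions_ug = [0.5, 0.25]
mdl_no_dilution = mdl_lb_per_tbtu(example_fractions_ug)
mdl_at_7pct_o2 = mdl_lb_per_tbtu(example_fractions_ug, o2_pct_dry=7.0)
print(f"{mdl_no_dilution:.3f} lb/tbtu at 0% O2, "
      f"{mdl_at_7pct_o2:.3f} lb/tbtu at 7% O2")
```

Omitting one fraction from the sum, the reporting error noted in the introduction to this appendix, would understate the MDL by that fraction's share of the total.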

EPRI's data quality review of the ICR tests determined that many of the emission values that were flagged as BDL were based on reporting limits, not detection limits. However, given the impracticality of resolving and correcting the basis for each ICR test, EPRI decided to assume that BDLs do represent detection limits. Given the extremely large range of BDL values, this decision is not expected to interfere with a comparison to MACT limits.

For each metal, we plotted the cumulative frequency distributions of all test runs and of BDL test runs from the Part III ICR and compared those data to the range of MDLs (in lb/tbtu) from the EPRI review and to the MACT floor limits for existing and new EGUs. Separate comparisons were made for units firing coal and liquid oil. Three pieces of information were used to decide whether Method 29 provides adequate sensitivity to measure metals at the alternative MACT limits:
1. The lowest MDL from the EPRI review multiplied by 10: This arbitrary multiplier was used to account for the range of MDLs reported by the labs, as well as the higher concentration above the method detection limit that is required to obtain accurate quantitation.
2. The percentage of all ICR test runs that were flagged BDL: A high percentage of BDL values indicates poor sensitivity for real-world samples.
3. The median BDL value: If many BDL values were reported and the median BDL value was above a MACT floor limit, this indicates that the method does not provide adequate sensitivity in real-world samples.

For some metals, it was not clear from this evaluation whether the method could measure accurately at the MACT limits. Generally, this was due to contradictory information, for example where the MDLs were well below the MACT limits but many high BDL values were reported. For those comparisons, EPRI stated that the method adequacy was uncertain. Two examples are shown below to illustrate EPRI's evaluation process.
Figures B-1 and B-2 show cumulative frequency distributions for the ICR Part III test runs of arsenic from coal-fired EGUs. Figure B-1 shows all test runs, while Figure B-2 shows only the BDL test runs. The MACT limit for existing EGUs is more than 10 times the minimum MDL from EPRI's survey, and the median BDL value was less than the limit. These factors led EPRI to conclude that Method 29 can accurately quantify arsenic at the MACT limit for existing units. However, 17% of all test runs were flagged BDL, and the median BDL value was 20 times the proposed limit for new coal-fired EGUs. The proposed limit is lower than the range of MDLs reported by laboratories in EPRI's survey of the ICR data. These facts indicate that Method 29 cannot measure arsenic adequately at the proposed limit for new coal EGUs.

Figure B-3 shows a cumulative frequency distribution for all ICR Part III lead test runs from liquid oil-fired EGUs, a metal for which Method 29 does provide adequate sensitivity at both the existing and new MACT floors. No ICR test runs were flagged BDL. The minimum MDL from EPRI's survey multiplied by 10 is below the MACT floor limit for new and existing units.
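The three-part screening described above can be expressed as a small Python sketch. The text does not give an explicit rule for combining the three indicators, so the verdict logic here is a hypothetical reading of the arsenic and lead examples: "Yes" when both the MDL and BDL indicators agree that the method is adequate, "No" when both indicate it is not, and "Uncertain" when they conflict. All names and numbers below are illustrative and are not ICR data.

```python
from statistics import median


def screen_method(limit, min_mdl, run_values, bdl_flags, mdl_multiplier=10.0):
    """Screen method sensitivity against one MACT limit (all in lb/tbtu).

    Returns the three indicators used in the evaluation plus a verdict
    based on an assumed combination rule.
    """
    bdl_values = [v for v, flagged in zip(run_values, bdl_flags) if flagged]
    pct_bdl = 100.0 * len(bdl_values) / len(run_values)

    # Indicator 1: lowest surveyed MDL times 10 should not exceed the limit.
    mdl_ok = min_mdl * mdl_multiplier <= limit
    # Indicator 3: median BDL value should not exceed the limit.
    median_bdl = median(bdl_values) if bdl_values else None
    bdl_ok = median_bdl is None or median_bdl <= limit

    if mdl_ok and bdl_ok:
        verdict = "Yes"
    elif not mdl_ok and not bdl_ok:
        verdict = "No"
    else:
        verdict = "Uncertain"
    return {"pct_bdl": pct_bdl, "median_bdl": median_bdl,
            "mdl_ok": mdl_ok, "verdict": verdict}


# Hypothetical test runs; True marks a run flagged as BDL.
runs = [5.0, 3.0, 0.8, 0.6, 2.0, 4.0]
flags = [False, False, True, True, False, False]

existing = screen_method(limit=2.0, min_mdl=0.02, run_values=runs, bdl_flags=flags)
new = screen_method(limit=0.03, min_mdl=0.02, run_values=runs, bdl_flags=flags)
mixed = screen_method(limit=0.5, min_mdl=0.02, run_values=runs, bdl_flags=flags)
print(existing["verdict"], new["verdict"], mixed["verdict"])
```

The percentage of BDL runs is reported but not used in the verdict, because the text describes it qualitatively and gives no threshold for what counts as a high percentage.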

Figure B-1
Cumulative Frequency Distribution of All Coal-fired EGU Arsenic Test Runs

Figure B-2
Cumulative Frequency Distribution Showing Only Coal-fired EGU BDL Arsenic Test Runs

Figure B-3
Cumulative Frequency Distribution Showing All Liquid Oil-fired EGU Lead Test Runs

It is difficult to evaluate the adequacy of Method 29 to measure total metals or total non-mercury metals at the proposed alternative limits, as the limits are not calculated from the individual metals tests but are the result of a separate statistical calculation on total metals reported for each EGU. The approach used by EPRI was to compare the proposed MACT limits with the sum of the lowest MDLs from the EPRI review multiplied by 10 (2.5 lb/tbtu). However, that approach may be too optimistic as to the method's capability to measure at the limit, as the MDLs do not reflect the actual performance of the method on stack samples.

EPRI's method adequacy determination for coal-fired EGUs is summarized in Table B-3. Method 29 was determined to be sensitive enough to accurately measure all metals at the proposed MACT limits for existing coal-fired EGUs. However, Method 29 is not sensitive enough to quantify five of the HAPs metals (antimony, arsenic, beryllium, cadmium, and mercury) at the MACT limits for future coal-fired EGUs, and may not be sensitive enough for four other metals (cobalt, lead, manganese, and nickel). The sensitivity of Method 29 is adequate to quantify total non-mercury metals at the proposed alternative limit for existing and future coal-fired EGUs. The method adequacy determination for new coal-fired EGUs ≥ 8,300 Btu/lb also applies to the limits for new IGCCs, where those are the same as for coal EGUs. Mercury cannot be measured at the limit for new IGCCs ( lb/gwh) by any of the proposed methods.

EPRI's method adequacy determination for liquid oil-fired EGUs is summarized in Table B-4. Method 29 was determined to be sensitive enough to accurately measure all metals, except beryllium and mercury, at the proposed alternative individual metal MACT limits for existing liquid oil-fired EGUs. Method 29 is not sensitive enough to accurately measure five of the HAPs metals (arsenic, beryllium, cadmium, mercury, and selenium) at the proposed alternative MACT limits for future liquid oil-fired EGUs. The sensitivity of Method 29 is adequate to quantify total non-mercury metals at the total metals limits for both existing and future liquid oil-fired EGUs.

As a final observation, EPA stated that Method 29 should not be used in the ICR Part III for stacks with mercury emissions under 1 µg/dscm (approximately 0.9 lb/tbtu for coal units and 0.7 lb/tbtu for oil units). However, 31% of the coal-fired EGU test runs and 100% of the liquid oil-fired EGU test runs measured with Method 29 had emissions values below those levels and by EPA's definition should be considered inaccurate. Therefore, EPA should consider whether omitting those test results from their evaluation would improve the accuracy of the MACT floor determination. Additionally, according to this guidance, Method 29 should not be used for compliance with the proposed MACT floor for existing units of 1.2 lb/tbtu, as that value is fairly close to the level at which EPA states the method is not applicable.
Table B-3
Adequacy of Method 29 and Method 30B for Coal-fired EGUs

Column headings, in order: Methods 29/30B; MACT Floor for New Coal Units (note 1), lb/tbtu; MACT Floor for Existing Coal Units (note 2), lb/tbtu; Percent of Values BDL; Minimum MDL x 10, lb/tbtu; Median BDL Value, lb/tbtu; Method Sensitivity is Adequate At: New Unit Limit; Existing Unit Limit

Antimony % No Yes
Arsenic % No Yes
Beryllium % No Yes
Cadmium % No Yes
Chromium % Yes (note 3) Yes
Cobalt % Uncertain Yes
Lead % Uncertain Yes
Manganese % Uncertain Yes
Mercury (Method 30B) % No MDL 0.01 Uncertain Yes
Mercury (Method 29) % No Yes
Nickel % Uncertain Yes
Selenium % Yes (note 3) Yes
Total Metals 4.0 NA NA Yes
Total non-Hg Metals 40 NA NA Yes

1 Calculated from ratios of MACT Floors for New and Existing Coal Units in lb/gwh
2 For mercury, MACT floor for coal units >8,300 Btu/kWh
3 Based on new unit limit selected in MACT spreadsheet
No MDL - Method 30B detection limits are site-specific. Lowest emitting unit had adequate Hg loading on trap.
NA - not applicable.

Table B-4
Adequacy of Method 29 and Method 30B for Liquid Oil-fired EGUs

Column headings, in order: Method 29/30B; MACT Floor for New Liquid Oil Units (note 1), lb/tbtu; MACT Floor for Existing Liquid Oil Units, lb/tbtu; Percent of Values BDL; Minimum MDL x 10; Median BDL Value; Method Sensitivity is Adequate At: New Unit Limit; Existing Unit Limit

Antimony % Yes Yes
Arsenic % No Yes
Beryllium % No No
Cadmium % No Uncertain
Chromium % 0.68 NA Yes Yes
Cobalt % Yes Yes
Lead % NA Yes Yes
Manganese % 0.07 NA Yes Yes
Mercury (Method 30B) <1% No MDL 0.01 No Uncertain
Mercury (Method 29) % 0.3 NA No No
Nickel % NA Yes Yes
Selenium % No Yes
Total Metals NA 3.0 NA Yes Yes

1 Calculated from ratios of MACT Floors for New and Existing Liquid Oil Units in lb/gwh or lb/mwh
2 MACT proposal requires mercury on trap to be >half of the limit. Thus, the actual sensitivities required are and lb/tbtu
NA - not applicable. No BDL results were reported.
No MDL - Method 30B detection limits are site-specific.

B ReMAP Study

This EPA-sponsored project [ASME, 2001] evaluated the Method 29 measurement precision of seven metals: antimony, arsenic, beryllium, cadmium, chromium, lead, and mercury. Results were compiled from field studies of municipal waste combustors (full-scale and pilot units) that conducted simultaneous sampling with multiple trains. The analytical technique used for each field study was not stated. The results were used to determine the precision of the method across the range of concentrations in the tests, and also to model precision for a wider range of concentrations.

The ReMAP report concluded that Method 29 imprecision is within an acceptable range of 13–18% RSD above 20 µg/dscm, but that the imprecision increases asymptotically below 10 µg/dscm (about 9 lb/tbtu for coal units). The median concentrations of the ICR emissions were at least an order of magnitude lower than 9 lb/tbtu, which could indicate that method precision is poorer than 18% RSD.
However, no information was provided in the ReMAP report on the analytical technique used. It is likely that instrument sensitivity has improved since the ReMAP tests. Thus, the ReMAP conclusions on Method 29 have limited usefulness and were not considered in EPRI's evaluation.

B.2.2 Continuous Mercury Monitor

Current installations of mercury measurement systems fall into two categories: continuous mercury monitors (CMMs) and sorbent trap systems. There are a number of commercially available CMMs; however, the EGU market is dominated by two vendors, Thermo Scientific and Tekran. A significant portion of the EGU industry has opted to use the sorbent trap approach rather than a CMM.

Sorbent trap monitoring methods were discussed earlier, under the subheading of Method 30B. When used in a long-term monitoring application, the sorbent trap may be left in the stack for a period of days to several weeks. This extended duration increases the mercury loading on the trap and consequently increases the sensitivity of the method. A sorbent trap system operated for periods of days to weeks is expected to have sufficient sensitivity to monitor mercury at the proposed MACT limits for all EGUs. However, the manual test methods that must be used to perform Relative Accuracy Test Audits (RATAs) on the instrumental systems do not have sufficient sensitivity to measure at the new unit limits, as discussed earlier. There is a concern with potential failure of sorbent traps to capture or retain mercury during long-term sampling in power plants that have elevated emissions of sulfur dioxide and sulfuric acid. There also may be a loss of the mandated third-bed mercury spike in these circumstances, causing the test run to fail quality control criteria.

Installation of a CMM requires the addition of an umbilical line separate from the one serving the SO2/NOx CEM system. The probe and sample line for CMM systems are maintained at much higher temperatures than for conventional CEMs. This has presented serious problems for plant instrument technicians. Many CMM systems failed on startup due to overheating or electrical shorting somewhere in the tube bundle. The lines are much larger, heavier, and more expensive than conventional lines.

Tekran CMM systems operate at higher temperatures, due to the design decision to transport oxidized mercury from the probe to the analyzer before converting it to elemental mercury just prior to the measurement. This higher temperature has caused more sample line failures. Fewer problems have been reported with the Tekran analyzer, which has a long history in the ambient monitoring arena, and uses a gold trap with argon carrier gas to avoid oxygen quenching in the final measurement.
The CMM systems supplied by Thermo Scientific, which convert the oxidized mercury to elemental mercury at the probe and therefore operate the sample lines at a somewhat lower temperature, overall have had fewer reported problems. The Thermo Scientific systems have a unique system that uses nitrogen for dilution to avoid oxygen quenching during measurement. The nitrogen is generated in the system, and its use requires additional quality control checks. The very hot probe boxes in CMMs have presented service problems for both suppliers, posing a hazard to service personnel and shortening component life.

Other common problems relate to certification of the mercury calibrators. To maintain National Institute of Standards and Technology (NIST) certification, the calibrators must be recertified periodically against vendor calibrators maintained as NIST prime calibrators. Both Tekran and Thermo Scientific have reported problems in coordinating with NIST to maintain their certification. NIST has also failed to resolve a discrepancy between the head-space elemental standard and the liquid-based oxidized calibrators.

The EERC of North Dakota conducted a study to investigate the low-level measurement capability of CMMs from Thermo Scientific and Tekran, with support from EPRI, DOE, and the Illinois Clean Coal Institute. The objectives of the study were to investigate the low-level measurement capabilities and variability of CMMs. EPRI reported that the quantification level of the Tekran system was 0.1 micrograms per normal cubic meter (µg/nm³), compared to 0.4 µg/nm³ for the Thermo Scientific system [EPRI, 2010b]. These concentrations are equivalent to

about lb/tbtu. Method 30B was used as the basis for comparison, with extended sampling times used to reach the low levels of mercury. Based on this study, CMMs from both vendors can measure mercury accurately at the proposed MACT limit for existing coal-fired EGUs. However, neither CMM would be able to measure mercury accurately at the revised MACT limit for new coal-fired EGUs greater than or equal to 8,300 Btu/lb, or at the proposed limits for either existing or new liquid oil-fired EGUs.

B.3 Acid Gases and Sulfur Dioxide

EPRI reviewed method sensitivity for HCl, HF, and SO2 only. Method CTM-033 for hydrogen cyanide (HCN), which was included in this HAPs test group for the ICR, had extremely poor performance, including elevated detection limits and blank contamination. In addition, the inability of most testers to maintain the required basic pH in the CTM-033 impingers led EPRI to conclude that the ICR data for HCN are biased low [EPRI, 2010a].

B.3.1 HCl and HF by Methods 26/26A

The ICR required respondents to measure HCl and HF by Method 26A. Methods 26 and 320 (Fourier transform infrared spectroscopy [FTIR]) were also allowed for dry stacks. Most tests on coal-fired EGUs were done with Methods 26A or 26. Liquid oil-fired EGUs were split about evenly between Methods 26/26A and Method 320. EPRI's ICR data quality review indicated that the Method 320 data quality varied greatly among testing contractors [EPRI, 2010a]. Because FTIR is a technically complex, relatively expensive method that is not widely used, we have not included it in this evaluation.

Four sources of information were used to evaluate the sensitivity of Methods 26/26A within the range of concentrations observed in the ICR Part III testing, and at the proposed MACT HCl and HF limits:

1. Analytical detection limits
2. EPRI survey of ICR Part III MDLs
3. Comparison of MACT limits to MDLs and BDL-flagged ICR test results
4. Results of the ReMAP study of parallel train Method 26 data

B Analytical Detection Limit for Method 26A

Method 26A states that the typical analytical detection limit for HCl is 0.2 micrograms per milliliter (μg/ml) and that the detection limits for other analytes (e.g., HF) should be similar. Assuming that 300 ml of liquid is recovered from the acidified impingers and the basic impingers, this would result in an MDL of 120 μg. Assuming that the sample volume is 2.5 cubic meters, the minimum required by the ICR, this mass corresponds to an emission rate for a coal-fired EGU of 4.4E-5 lb/mmbtu.

Comparing this analytical detection limit with the MACT proposal, the limit for existing coal-fired EGUs (0.002 lb/mmbtu) is well above the analytical detection limit. The limits for HCl and HF in existing oil units (3E-4 and 2E-4 lb/mmbtu, respectively) are above the analytical detection limit; however, the limit for all future EGUs (0.3 lb/gwh, equivalent to

3E-5 lb/mmbtu) is not. Unless better sensitivity can be obtained than indicated in Method 26A, this method would not be able to measure HCl or HF accurately in future EGUs at the proposed limit.

B EPRI Survey of Part III MDLs

EPRI reviewed laboratory test reports to identify MDLs for Method 26A using the same approach described above for Method 29. Of the eleven test reports reviewed, five reported usable MDL values. The minimum and maximum emissions in lb/mmbtu corresponding to those MDLs are shown in Table B-5.

Table B-5 EPRI Review of ICR MDLs for Method 26A

        Minimum MDL (lb/mmbtu)    Maximum MDL (lb/mmbtu)
HCl     3.2E-6                    6.9E-5
HF      1.6E-6                    1.2E-4

B Comparison of MACT Limits to MDLs and ICR Part III BDL-flagged Data

EPRI compared the proposed MACT limits to the MDL range from the EPRI survey and the cumulative frequency distribution of ICR Part III HCl and HF data, as discussed previously for metals. Figure B-4 shows the cumulative frequency distribution for BDL-flagged HCl measurements at coal-fired EGUs. For coal-fired EGUs, about 80% of the HCl test runs flagged as BDL were below the proposed limit for existing facilities; 20% were higher than the limit. Comparing the data against the limit for new EGUs, 11% of BDL values were below the limit; 89% reported higher non-detect values.

Figure B-5 shows the cumulative frequency distribution for BDL-flagged HCl measurements at liquid oil-fired EGUs. Seventy-five percent of the HCl emissions results flagged BDL were below the limit for existing units; 25% were higher than the limit. Eighty-eight percent of liquid oil-fired EGU test values flagged as BDL were above the new unit limit of 3E-5 lb/mmbtu; only three test runs were lower than the new unit limit. These were three runs from the East River facility (ORIS 2493) that appear to be outliers in the HCl distribution. For HF, most of the BDL-flagged values were above the limit for new EGUs but below the limit for existing EGUs.
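The Method 26A detection limit arithmetic discussed above can be sketched as a short calculation. This is an illustrative sketch rather than EPRI's worksheet: the function name is hypothetical, the F-factor of 9,780 dscf/mmbtu is the default quoted elsewhere in this appendix for particulate methods, and the 7% stack oxygen used for the dilution correction is an assumption, since the text does not state the oxygen level behind its 4.4E-5 lb/mmbtu figure.

```python
G_PER_LB = 453.59237     # grams per pound
FT3_PER_M3 = 35.3147     # cubic feet per cubic meter


def mass_to_lb_per_mmbtu(mass_ug, sample_volume_m3, f_factor=9780.0, o2_pct=7.0):
    """Convert a collected mass (micrograms) to an emission rate in lb/mmbtu.

    Uses the dry-basis EPA Method 19 form E = C * Fd * 20.9 / (20.9 - %O2).
    f_factor (dscf/mmbtu) and o2_pct are assumed values, not taken from the text.
    """
    mass_lb = mass_ug * 1e-6 / G_PER_LB
    volume_dscf = sample_volume_m3 * FT3_PER_M3
    conc_lb_per_dscf = mass_lb / volume_dscf
    return conc_lb_per_dscf * f_factor * 20.9 / (20.9 - o2_pct)


# Values quoted in the text: 120 ug detection limit, 2.5 m3 minimum ICR sample volume.
detection_limit = mass_to_lb_per_mmbtu(120.0, 2.5)
existing_coal_limit = 2e-3   # lb/mmbtu, proposed limit for existing coal-fired EGUs
new_unit_limit = 3e-5        # lb/mmbtu, proposed limit for new EGUs

print(f"detection limit: {detection_limit:.2e} lb/mmbtu")
print("below existing coal limit:", detection_limit < existing_coal_limit)
print("below new unit limit:", detection_limit < new_unit_limit)
```

Under these assumptions the detection limit falls well below the existing coal-fired limit but above the new unit limit, which is the comparison the text draws.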

Figure B-4 Cumulative Frequency Distribution Showing Only Coal-fired BDL HCl Test Runs

147 Figure B-5 Cumulative Frequency Distribution Showing Only Liquid Oil-fired BDL HCl Test Runs A summary of EPRI s review of the ICR Part III HCl and HF data is shown in Tables B-6 and B-7, for coal- and liquid oil-fired EGUs, respectively. Table B-6 Adequacy of Method 26/26A for Coal-fired EGUs MACT Floor for New MACT Floor for Percent of Values New Unit Limit Existing Unit Limit Coal Units 1 Existing Coal Units BDL MDL x 10 BDL Value Method 26/26A lb/mmbtu lb/mmbtu lb/mmbtu lb/mmbtu Hydrogen chloride 2 3.0E E-03 24% 3.2E E-04 No Yes 1 Calculated from ratios of MACT Floors for New and Existing Coal Units in lb/gwh Minimum 2 Omitted 2 much lower BDL values (TS Power) corrected by EPA in MACT floor spreadsheet Median Method Sensitivity is Adequate At: B-17

148 Table B-7 Adequacy of Method 26/26A for Liquid Oil-fired EGUs MACT Floor for New Liquid Oil MACT Floor for Existing Liquid Oil Percent of Minimum Median Method Sensitivity is Adequate At: New Unit Limit Existing Unit Limit Units 1 Units Values BDL MDL x 10 BDL Value Method 26/26A lb/mmbtu lb/mmbtu lb/mmbtu lb/mmbtu Hydrogen chloride 5.0E E-04 24% 3.20E E-04 Uncertain Yes Hydrogen fluoride 5.0E E-04 54% 1.60E E-04 Uncertain Yes 1 Calculated from ratios of MACT Floors for New and Existing Liquid Oil Units in lb/gwh or lb/mwh EPA derived the HCl and HF limits for new EGUs by selecting the lowest test run value in the lowest test series of the lowest emitting coal-fired EGU and multiplying that emission by a factor of three, rather than by a statistical analysis of the ICR data. This procedure, carried out without consideration for outlier identification or correction of laboratory and reporting errors, resulted in limits for new EGUs that are not achievable with the methods used in the ICR. For coal-fired EGUs, the new unit HCl limit is based on the Logan Generating Company, Unit 1 (ORIS 10043) lowest test value (of 6 test series) of 8.7E-5 lb/mwh. That BDL value multiplied by three produced the MDL x 3 MACT floor value of 3E-4 lb/mwh. However, inspection of the other Part II and III data reported for this EGU shows that the average emissions reported for the other test series (all flagged BDL) ranged from 7E-4 to 0.03 lb/mwh. Based on these results, HCl could not be measured at the MACT limit in any other test series in the same unit used to calculate the floor. Raising the sample volume from the 2.5 cubic meters required by the ICR to the 4 cubic meters proposed in the rule could potentially improve the method sensitivity sufficiently to measure HCl at the new unit limit. However, it has not been demonstrated that the method will be accurate at those concentrations. 
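The two calculations in the preceding paragraph, the "lowest run times three" floor and the effect of a larger sample volume, can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the inverse scaling of detection limit with sample volume assumes a fixed analytical mass detection limit and recovered liquid volume. It uses the 4.4E-5 lb/mmbtu analytical detection limit quoted earlier for a 2.5 cubic meter sample.

```python
def mdl_times_three_floor(lowest_run_value):
    """Floor value obtained by multiplying the lowest test run by three."""
    return 3.0 * lowest_run_value


def scale_detection_limit(detection_limit, base_volume_m3, new_volume_m3):
    """Scale an in-stack detection limit to a different sample volume.

    Assumes the analytical mass detection limit is fixed, so the in-stack
    limit varies inversely with the volume of gas sampled.
    """
    return detection_limit * base_volume_m3 / new_volume_m3


# Logan Unit 1 lowest HCl test value quoted in the text (lb/mwh).
floor_lb_per_mwh = mdl_times_three_floor(8.7e-5)

# Analytical detection limit at the ICR volume, rescaled to the proposed rule volume.
dl_icr = 4.4e-5                                      # lb/mmbtu at 2.5 m3
dl_rule = scale_detection_limit(dl_icr, 2.5, 4.0)    # lb/mmbtu at 4 m3
new_unit_limit = 3e-5                                # lb/mmbtu

print(f"MDL x 3 floor: {floor_lb_per_mwh:.2e} lb/mwh")
print(f"detection limit at 4 m3: {dl_rule:.2e} lb/mmbtu")
print("below new unit limit at 4 m3:", dl_rule < new_unit_limit)
```

The rescaled detection limit sits only slightly below the new unit limit, which is consistent with the caution that accuracy at those concentrations has not been demonstrated.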
For liquid oil-fired EGUs, the new unit HCl and HF limits are based on the East River (ORIS 2493) lowest test value times 3. As noted earlier, this unit appears to be a low outlier, reporting BDL values about ten times lower than any other emission value (detected or not). It would be more appropriate to set the limit at a concentration that most qualified labs could measure accurately in stack gas samples. B ReMAP Study of Method 26 This EPA-sponsored project [ASME, 2001] evaluated multi-train stack test data from three sources. One was a municipal waste combustor (MWC); the other two units are not described in the project report. The study concluded that at 40 mg/dscm (about 0.04 lb/mmbtu), 99 of 100 future triplicate measurements will be within 11% of the true concentration. This concentration is 20 times higher than the MACT limit for existing coal-fired plants (0.002 lb/mmbtu) and about 1,300 times the MACT limit for new units. At 1 mg/dscm (9E-4 lb/mmbtu), 99 of 100 future triplicate measurements were predicted to be within ± 20% of the true value. At a stack gas concentration of 0.5 mg/dscm (4E-5 lb/mmbtu), the precision of the method (%RSD) fell to 30%. An acceptable level of precision for a test method is generally considered to be ± 15%. Thus, the ReMAP findings indicate that the lower limit of adequate precision for Method 26 is about 9E-4 lb/mmbtu. Over 90% of the HCl and HF ICR measurements made at coal-fired units B-18

were below 9E-4 lb/mmbtu, indicating that the great majority of the ICR measurements may not be accurate. The proposed MACT limit for existing coal EGUs (0.002 lb/mmbtu) falls in the range where acceptable accuracy is expected. However, the MACT floors for new coal and liquid oil-fired EGUs are much lower (3E-5 and 5E-5 lb/mmbtu, respectively) and the test methods are not expected to perform accurately at those concentrations.

B.3.2 HCl Continuous Emissions Monitor

There are three likely candidate technologies for continuous measurement of gas-phase HCl: FTIR, tunable diode laser (TDL), and gas filter correlation (GFC). There is little or no experience with any of these technologies on EGUs, although all three have been used in Europe on municipal waste incinerators. EPA has approved a reference method using FTIR for HCl in cement plant emissions. GFC is widely used on waste incinerators in Europe for HCl monitoring, but U.S. experience with this technology on EGUs is limited to carbon dioxide (CO2) monitoring. All of these technologies operate in the near infrared range and are to some extent subject to interference from water vapor. However, both FTIR and TDL systems are able to resolve this interference.

All of the suppliers of HCl monitors claim a detection limit of 0.2 ppm or better for a 1-meter measurement path, improving in proportion to duct diameter. For a 10-meter-diameter duct that would provide a sensitivity of 0.02 ppm, which would be sufficient to measure HCl at the proposed MACT limits for new coal- and oil-fired EGUs. However, with increasing measurement path there is a tradeoff between increased sensitivity and loss of signal strength. Particulate and entrained droplets in the flue gas can degrade the signal strength in an in situ application in wet stacks. Demonstration of in situ techniques in a wet stack environment needs to be conducted. An alternative approach is to extract a gas sample and measure HCl.
Sample transport in a hot, wet flue gas is problematic, and an extractive system would require multipass measurement cells to achieve sufficient path length for the required sensitivity. Currently, the only installation on a U.S. EGU is an in situ TDL on a dry stack. In that application, there was sufficient intensity to measure HCl and to identify process variations. There is no experience with HCl monitors on a wet stack. The preferred installation by the suppliers is to make an in situ measurement, or at least place the instrument close coupled to the stack, to avoid problems with condensation of HCl during transport.

Key tasks that need to be completed before an HCl monitor can be deployed and used for compliance measurements include:

- Determine detection and quantitation limits
- Quantify interferences of likely species in flue gas
- Validate CEMs in a wet stack
- Develop alternative extractive sampling techniques
- Develop low-level calibration standards
- Develop and validate an EPA performance specification
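The path-length scaling that suppliers cite for optical HCl monitors can be sketched as below. This is a minimal illustration, assuming the detection limit scales inversely with optical path length; the function name is hypothetical, and the sketch deliberately ignores the loss of signal strength from particulate and droplets that the text identifies as the offsetting tradeoff.

```python
def path_detection_limit_ppm(dl_at_one_meter_ppm, path_length_m):
    """Detection limit for a given optical path, assuming inverse scaling with path length.

    dl_at_one_meter_ppm is the vendor-claimed limit for a 1-meter path.
    Signal attenuation along the path is not modeled.
    """
    if path_length_m <= 0:
        raise ValueError("path length must be positive")
    return dl_at_one_meter_ppm / path_length_m


# Vendor-claimed 0.2 ppm at 1 meter, evaluated for several duct diameters.
for diameter_m in (1.0, 5.0, 10.0):
    limit = path_detection_limit_ppm(0.2, diameter_m)
    print(f"{diameter_m:4.1f} m duct: {limit:.3f} ppm")
```

An extractive system with a multipass cell would enter the same relationship through its effective path length rather than the duct diameter.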

B.3.3 Sulfur Dioxide by Continuous Emission Monitor

Most Part II and III ICR data were obtained using CEMs (EPA Method 6C or equivalent); a few sites used manual test methods (EPA Method 6). Pulsed fluorescence analyzers are the technology most often used to monitor SO2 at EGUs. This technology is an adaptation of ambient SO2 monitoring equipment and is capable of very low detection limits. The ambient monitors have detection limits in the parts per billion (ppb) range, which corresponds to a range of detection limits in diluted flue gas of about 1E-5 to 5E-4 lb/mmbtu. However, typical models installed on EGUs are set for ranges that are somewhat higher and have detection limits around 0.5 ppm.

Current EPA protocols require annual relative accuracy test audits and include incentives to maintain the relative accuracy (RA) within 7.5%. At the emission levels typically found in power plants, the impact of sample line losses has not been an issue and operators have minimal problems meeting the 7.5% RA. However, when trying to measure very low concentrations, sample line losses may make the RA criterion more difficult to meet. It is reasonable to expect that equipment will be available to measure SO2 in the flue gas at the proposed alternative MACT limits for new and existing coal-fired EGUs (approximately 0.04 lb/mmbtu and 0.2 lb/mmbtu, respectively), even accounting for the dilution-extractive sampling systems.

B.4 Filterable and Condensable Particulate Test Methods

The ICR required selected respondents to measure FPM and CPM. The proposed rule sets limits on TPM for existing and new coal-fired EGUs at 0.03 lb/mmbtu and 0.05 lb/mwh (approximately lb/mmbtu), respectively. A TPM measurement requires the use of two test methods: one for FPM and one for CPM. Thus, the detection limit for TPM is the sum of the detection limits of the FPM and CPM test methods.
Particulate matter is measured by weighing the collected residue (gravimetric analysis); thus, the procedure used to determine a method detection limit (standard deviation of replicate spiked samples) is not applicable to these methods. Although it is possible to weigh a sample multiple times to calculate an MDL for the gravimetric measurement, or to spike replicate samples with some surrogate for filterable particulate, the variability in that measurement will be outweighed by variability of the sampling procedure, sample recovery, and sample preparation steps of the method. There are two sources that can provide information on the stack concentration at which particulate methods provide accurate results. They include: 1. The EPA methods list a minimum mass needed to obtain sufficient gravimetric precision. As indicated above, this quantity is likely to overestimate the capability of the methods, but can provide a lower bound detection limit. 2. The ReMAP study [ASME, 2001] modeled the precision of FPM at a range of stack gas concentrations. EPRI used a combination of the above two information sources to evaluate whether the sensitivity of particulate methods is adequate to measure PM emissions at the proposed MACT limits. B-20

One source of information that is not useful for this assessment is the detection flag in the Part III ICR test results. Most ICR respondents either did not flag PM data or flagged them as above detection limit (ADL) even if the measured weight was below the stated precision limit of the method [EPRI, 2010a].

B.4.1 FPM by Methods 5/29 and OTM-27

The ICR required FPM tests on stacks with wet FGD units (wet stacks) to be conducted with Method 5, which uses a heated, out-of-stack filter. Some testers combined the Method 5 FPM measurement with a Method 29 metals analysis, reporting FPM from the probe and filter of the Method 29 sampling train. FPM tests on dry stacks were required to be conducted with OTM-27 (promulgated in December 2010 as revised Method 201A), which uses particulate sizing cyclones and an in-stack filter at stack temperature. The FPM method required for monitoring compliance with the TPM limit is Method 5 (for wet and dry stacks). A PM CEM can be used to monitor compliance with TPM, in place of Method 5 testing.

As discussed above, the percentage of ICR emission results that were flagged BDL was not used as a factor in EPRI's evaluation, as very few respondents applied the flags correctly to gravimetric methods. Neither Method 5 nor OTM-27 establishes a method detection limit for FPM; however, each method states a minimum mass that will allow acceptable precision in the gravimetric measurement.

Method 5 has two sample fractions, the filter and the rinsate of the sampling probe and line. Each fraction is weighed and the weights are summed to obtain the final result. The particulate filter is repeatedly dried and weighed until consecutive weighings are within 0.5 mg. The final weights are recorded to the nearest 0.1 mg. EPRI calculated the lower limit of method precision based on 0.5 mg times 2 (1 mg).
Assuming the 4-hour sampling period required by the ICR, a sample volume of 120 dry standard cubic feet (dscf) and a default F-factor of 9780 dscf/mmbtu at 0% oxygen (O 2 ), FPM residue can be measured accurately at 3E-4 lb/mmbtu. Very few ICR test runs reported FPM results lower than this: 1% of coal-fired EGU test runs and 3% of liquid oilfired EGU test runs in the ICR Part III Access database would be considered inaccurate based on this metric. OTM-27 (and Method 201A) has four fractions (the filter and three acetone rinses) that are repeatedly dried and weighed until consecutive weighings are within 0.5 mg or within 1% of total weight less tare weight. The final weights are recorded to the nearest 0.1 mg. EPRI calculated the lower limit of method precision based on 0.5 mg times 4 (2 mg). Based on this evaluation, Method 201A should be able to measure accurately at 6E-4 lb/mmbtu. About 16% of coal-fired EGU test runs in the Part III Access database, but only 3% of liquid oil-fired EGU test runs, fell below this emission value. The ReMAP study [ASME, 2001] evaluated the precision of Method 5 from multi-train tests conducted at five facilities (one coal-fired power plant and four municipal solid waste combustors). The study concluded that the Method 5 RSD was essentially constant at emission rates between mg/dscm. The best estimate for the RSD ranged from about 5 12%, indicating that the method performed well in this emission range. However, 59% of the ICR test runs at coal-fired EGUs and 34% of the test runs at liquid oil-fired EGUs reported FPM B-21

152 emissions below 15 mg/dscm (approximately 0.01 lb/mmbtu), indicating that method precision is likely to be poorer. Additional parallel train studies at low emission rates are needed to determine method performance. It is not possible to state with any confidence whether Method 5 sensitivity is adequate to support compliance monitoring at the proposed MACT limits for TPM, as that will depend on the relative proportions of FPM and CPM in the stack gas. However, for a stack gas where PM emissions are predominantly FPM, the above methods will be able to measure accurately at the proposed TPM limit (0.03 lb/mmbtu) for existing coal-fired EGUs. At the proposed limit for new EGUs (0.05 lb/mwh, approximately lb/mmbtu), Method 5 should still be sufficient to measure accurately by gravimetric analysis. However, whether the overall method accuracy, considering sampling variability, will be adequate at that emission rate is unknown. Field studies conducted by EPRI and others have identified problems with the accuracy of Method 5 at low emission levels. Results can be biased high or low depending on how the sample is collected. A recent study conducted by AEP and EPRI found that the material used for the particulate filter (glass fiber versus quartz fiber filter) can produce a significant bias in the results due to acid gas sorption on the alkaline material of glass fiber filters [EPRI, 2011]. The temperature of the sampling probe and filter also affects the amount of FPM collected, with the lower temperature collecting more FPM. The ICR Part III data include tests conducted at two different temperatures (250 F and 320 F) and with both types of filters. These variables introduce additional uncertainty into conclusions drawn from the ICR data, in addition to normal method variability. B.4.2 CPM by Methods OTM-28 and 202 The ICR required CPM to be measured using OTM-28; this method was promulgated in December, 2010 with significant changes as revised Method 202. 
Most ICR test contractors used OTM-28; a few used the original Method 202. Because none of the ICR tests used revised Method 202, which is proposed to be required for compliance monitoring, the conclusions of EPRI s evaluation of method adequacy are highly uncertain. Neither method establishes a method detection limit. OTM-28 requires analysis of two sample fractions: inorganic (aqueous-phase) CPM and organic CPM. An ammonium hydroxide correction is applied to the inorganic fraction. The method allows for blank correction of residue in the field blanks of both fractions; however, the blank correction must be applied to the final CPM total (not the individual fractions) and the correction amount is limited to 2.0 mg. The median total CPM in blanks reviewed by EPRI exceeded 2.0 mg; thus, some of the ICR test results may have been biased high by this method restriction. No data sets had a gross inorganic catch less than 1 mg; however, 12 data sets reported an organic catch less than 1.0 mg. The data indicate that the MDLs were not an issue for the inorganic fraction, but reporting limits/mdls may impact the organic fraction at some sites. Gravimetric detection limits ranged from mg for the inorganic fraction and from mg for the organic fraction. Assuming a 4-hour sampling period, an ICR sample volume of 120 cubic feet and a default F-factor of 9,780 dscf/mmbtu at 0% O 2, total CPM emission rates corresponding to these detection limits range from 1.1E-4 lb/mmbtu to 3.5E-4 lb/mmbtu. The reporting limits for the inorganic fraction ranged from mg; for the organic fraction they B-22

153 were all 1.0 mg. For the purpose of this evaluation, EPRI used a total CPM emission of 2E-4 lb/mmbtu (the middle of the range) to compare to the MACT limits. None of the total CPM emissions measured in coal- or oil-fired EGUs were lower than this value. The proposed TPM limit for new coal-fired EGUs is 0.05 lb/mwh (equivalent to lb/mmbtu), about 20 times higher. This indicates that gravimetric imprecision is not a limiting factor for use of Method 202 at the proposed MACT limits. It is not possible to state with any confidence whether the Method 202 sensitivity is adequate to support compliance monitoring at the proposed MACT limits for TPM, as that will depend on the relative proportions of FPM and CPM in the stack gas. However, for a stack gas where the PM emissions are predominantly CPM, the mass collected in a 4-hour test will be sufficient for an accurate gravimetric measurement at the proposed TPM limit (0.03 lb/mmbtu) for existing coal units. At the proposed limit for new EGUs, approximately lb/mmbtu, the collected PM residue is also sufficient to measure accurately by gravimetric analysis. However, limited results of field testing indicate that the overall method accuracy may not be adequate at that emission rate. A field study conducted by AEP at several coal-fired power plants found that revised Method 202 has high between-run variability [EPRI, 2011]. In four replicate tests conducted with the same sampling conditions (probe temperature, PM filter type and sampling duration), the relative percent difference (RPD) of the CPM measurements ranged from 25 78%. These were samples taken at different times and were not parallel train samples; however, process monitoring indicated that there were no upsets during the sampling periods. FPM samples taken from the same sampling trains had lower (6 27%) variability. CPM emissions in the samples ranged from lb/mmbtu to lb/mmbtu. 
The RPD of the replicates at the lowest emission rate, about twice the proposed MACT TPM limit for new plants, was 78%. These findings indicate that the method does not provide sufficient precision to support compliance monitoring at the proposed limit for new coal-fired EGUs, and may not provide sufficient precision at the proposed limit for existing coal-fired EGUs. An EPRI laboratory study of OTM-28 determined that modifications from the original Method 202 had reduced the positive bias due to aqueous-phase conversion of gaseous sulfur dioxide to sulfuric acid (a CPM species) compared to original Method 202 [EPRI, 2009]. However, the study also found that OTM-28 did not appear to capture sulfuric acid aerosol effectively. As sulfuric acid is the predominant CPM species in coal-fired power plants, this finding may indicate that the revised Method 202 has a negative bias. B.4.3 TPM by Methods 5 and 202 Based on the measurement limits determined above, the combination of Method 5 and Method 202 is sensitive enough to provide accurate gravimetric measurement at the proposed MACT limits for both existing and new coal-fired EGUs. The combined detection limit for a 2-hour sampling duration using these methods is 5E-4 lb/mmbtu, about ten times lower than the proposed limit for new coal-fired EGUs. However, the precision of Method 202 may not be sufficient at the new EGU limit to support that determination. B-23
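The gravimetric limits behind the combined TPM figure can be reproduced approximately with the sketch below. It is illustrative only: the function name is hypothetical, the 9,780 dscf/mmbtu F-factor and 120 dscf sample volume are the values quoted in the text, and the 7% stack oxygen used to correct from the 0% O2 basis is an assumption not stated in the source, so the results match the reported 3E-4, 6E-4, and 5E-4 lb/mmbtu only to about one significant figure.

```python
G_PER_LB = 453.59237   # grams per pound


def gravimetric_limit_lb_per_mmbtu(mass_mg, volume_dscf, f_factor=9780.0, o2_pct=7.0):
    """Emission rate corresponding to a minimum weighable mass.

    Applies E = C * Fd * 20.9 / (20.9 - %O2); o2_pct is an assumed value.
    """
    mass_lb = mass_mg * 1e-3 / G_PER_LB
    return mass_lb / volume_dscf * f_factor * 20.9 / (20.9 - o2_pct)


# Method 5: two fractions x 0.5 mg; Method 201A: four fractions x 0.5 mg.
fpm_method5 = gravimetric_limit_lb_per_mmbtu(1.0, 120.0)
fpm_method201a = gravimetric_limit_lb_per_mmbtu(2.0, 120.0)

# Mid-range CPM value used in the text for the comparison to MACT limits.
cpm_mid = 2e-4

# The TPM detection limit is the sum of the FPM and CPM limits.
tpm_limit = fpm_method5 + cpm_mid

print(f"Method 5 FPM limit:    {fpm_method5:.1e} lb/mmbtu")
print(f"Method 201A FPM limit: {fpm_method201a:.1e} lb/mmbtu")
print(f"Combined TPM limit:    {tpm_limit:.1e} lb/mmbtu")
```

The sum covers only gravimetric sensitivity; the between-run variability reported for Method 202 is a separate limitation that this calculation does not capture.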

154 B.4.4 FPM by Continuous Particulate Monitor There are almost 70 continuous particulate mass monitors (PM CEMs) operating on EGUs in the United States. Over 40 are installed on wet stacks (following FGD systems). The process for certifying and calibrating PM CEMS is described in EPA Performance Specification 11 (PS-11). PS-11 requires a minimum of 15 Method 5 test runs to develop a calibration curve for the CEM. As a practical matter, the lowest point on the calibration curve is the normal operating point of the CEM. The only way to expand the operating range is to de-tune the power plant control equipment to provide up-scale readings to the instrument while conducting the Method 5 manual testing. While this approach has been demonstrated with some success on dry stacks by de-tuning the ESP, it has not been attempted on units equipped with fabric filters. Wet stack applications present further challenges, since de-tuning an ESP upstream of a wet FGD has less effect on the outlet emissions and increases the risk of producing off-specification gypsum. In these cases, the calibration curve for the PM CEMs consists of the normal operating point and an artificial zero. There is not a lot of documentation available on the installed PM CEMs equipment; however, all of the systems now installed on wet stacks have indicated some plugging problems. Usually this has been handled by frequent cleaning of the probes. The experience at an EPRI demonstration project has shown similar issues, with some systems working better than others. All of the vendors continue to make improvements and try alternatives. Dry stack systems fare better, especially where cross-stack installations are possible. One particular model had problems with data loss due to sunlight interference at the height of the day. The sensitivity of the PM CEMs is limited by the lowest FPM emission that can be accurately and precisely measured by Method 5. 
In EPRI-sponsored PS-11 testing, the lowest range of PM emissions used for calibration was 2 to 5 milligrams per cubic meter (mg/m³), approximately to lb/mmbtu. This level of sensitivity is adequate to measure FPM at the proposed TPM limits for existing coal-fired EGUs (0.03 lb/mmbtu), but may not be sensitive enough to measure at the proposed TPM limit for new coal-fired EGUs (approximately lb/mmbtu). As discussed earlier, a lower detection limit for Method 5 can be obtained by extending the sampling time. However, longer sampling times will increase the time required to conduct the full 15 runs required by PS-11. PM CEMs measure only FPM; thus, this technology is not directly applicable to monitoring compliance with the proposed TPM limit for coal-fired EGUs.

B.5 References

ASME, 2001. Reference Method Accuracy and Precision (ReMAP): Phase I, Precision of Manual Stack Emission Measurements, Final Report.

EPRI, 2009. Evaluation of Alternative Condensible Particulate Matter Methods. EPRI, Palo Alto, CA:

EPRI, 2010a. Data Quality Evaluation of Hazardous Air Pollutants Measurements for the US Environmental Protection Agency's Electric Utility Steam Generating Units Information Collection Request. EPRI, Palo Alto, CA: EPA-HQ-OAR

EPRI, 2010b. Determining the Variability of Continuous Mercury Monitors at Low Mercury Concentrations. EPRI, Palo Alto, CA:

EPRI, 2011. Impact of Sampling Procedures on Results of Filterable and Condensable Particulate Stack Test Methods. EPRI, Palo Alto, CA:


157 C APPENDIX: LIST OF POTENTIAL AREA SOURCE FACILITIES AND ASSOCIATED EMISSIONS ESTIMATES Table C-1 Potential Area Source Facilities and Associated Emissions Estimates ORIS Code Plant Name Max HCl (tons/yr) Max HF (tons/yr) Max Total (tons/yr) 60 Whelan Energy Center Unit 1 (WEC1) Holcomb Cholla Valmont Hayden Hammond Joliet * 897 Vermilion * 1382 HMP&L Station Two Henderson Hoot Lake Hawthorn Montrose * 2535 AES Cayuga, LLC Hamilton Seward Hatfield's Ferry Power Station * 3295 Urquhart Navajo Generating Station * 6016 Duck Creek * 6041 H L Spurlock Station Nearman Creek Nebraska City * 6170 Pleasant Prairie Healy C-1

158 Table C-1 Potential Area Source Facilities and Associated Emissions Estimates (continued) ORIS Code Plant Name Max HCl (tons/yr) Max HF (tons/yr) Max Total (tons/yr) 6639 R D Green Louisa * 6761 Rawhide J K Spruce Polk Neil Simpson II Ray D Nixon Logan Generating Plant Chambers Cogeneration LP AES Hawaii Hopewell Southampton Power Station Silver Bay Power * INDIANTOWN COGENERATION L.P Mecklenburg Power Station Sandow Station Roanoke Valley I Spruance Genco, LLC Birchwood Power Facility Roanoke Valley II Red Hills Generating Facility Wygen Hardin Generator Project Elm Road Generating Station TS Power Plant Wygen AES Puerto Rico Cogeneration Facility * Denotes facilities where the additional contribution of total metals was evaluated to determine if adding total metals pushed the total annual HAPs emissions over the major source 25 tpy limit. None of these sites was estimated to exceed the limit with total metals included. C-2
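The screening logic described in the Table C-1 footnote can be sketched as follows. This is an assumption-laden illustration, not the procedure EPRI used: the function name and the example tonnages are hypothetical, the 25 tpy combined-HAP threshold is the one cited in the footnote, and the 10 tpy single-HAP threshold is the companion Clean Air Act major source criterion, which this appendix does not quote. Individual metals are not checked separately against the single-HAP threshold here.

```python
SINGLE_HAP_LIMIT_TPY = 10.0   # major source threshold for any one HAP (not quoted in the source)
TOTAL_HAP_LIMIT_TPY = 25.0    # major source threshold for combined HAPs (cited in the footnote)


def is_potential_area_source(max_hcl_tpy, max_hf_tpy, total_metals_tpy=0.0):
    """Return True if estimated emissions stay below both major source thresholds.

    total_metals_tpy is added to the combined total only; it is a sum over
    several metals, so it is not tested against the single-HAP threshold.
    """
    combined = max_hcl_tpy + max_hf_tpy + total_metals_tpy
    single_ok = max(max_hcl_tpy, max_hf_tpy) < SINGLE_HAP_LIMIT_TPY
    return single_ok and combined < TOTAL_HAP_LIMIT_TPY


# Hypothetical facilities (tons/yr); these are not values from Table C-1.
print(is_potential_area_source(6.0, 2.0))        # both thresholds met
print(is_potential_area_source(12.0, 1.0))       # HCl alone exceeds 10 tpy
print(is_potential_area_source(9.0, 9.0, 8.0))   # combined total exceeds 25 tpy
```

The optional metals argument mirrors the footnote's check of whether adding total metals pushes a facility over the combined threshold.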

D APPENDIX: INDEPENDENT REVIEW OF THE UPPER PREDICTION LIMIT (UPL) STATISTICAL METHODOLOGY

Date: July 5, 2011
Subject: National Emission Standards for Hazardous Air Pollutants (NESHAP) Maximum Achievable Control Technology (MACT) Floor Analysis for Coal and Oil-fired Electric Utility Steam Generating Units
From: Paul Switzer, PhD
To: Babu Nott, Senior Program Manager, Environment Sector, Electric Power Research Institute, Palo Alto, CA

INTRODUCTION

I am responding to your request to review the revised EPA/RTI document dated May 18, 2011 [EPA/RTI] describing a proposal for calculating national emission standards for hazardous air pollutants using a Maximum Achievable Control Technology [MACT] for coal- and oil-fired electric utility steam generating units [EGU], together with associated materials including spreadsheet calculations. The statistical procedures described in the EPA memorandum were used to calculate MACT emissions limits that are proposed for adoption by the U.S. Environmental Protection Agency [EPA]; my comments are intended to form part of EPRI's response. Due to the shortness of the comment period, my comments address only the principal statistical issues as I see them.

My comments draw on my 45 years of teaching and research experience as professor of statistics at Stanford University, particularly in areas of environmental statistics. I also served in various editorial capacities, including as Theory & Methods editor for the Journal of the American Statistical Association. I have had many interactions with EPA, including a pair of sabbaticals at the agency, and have given short courses at the EPA.

The articulation and rationalization of the proposed Upper Prediction Limit [UPL] is based on statistical analyses and interpretations of recent data from currently operating EGUs. In this instance I truly hope that EPA will consider the comments that follow as an opportunity to address statistical errors contained in the UPL definition, rationalization, and calculation in order to strengthen the credibility and defensibility of its proposed emission limits. I would certainly welcome an opportunity for peer review of these comments. The goal of reduced EGU emissions utilizing better plant technology should go hand-in-hand with utilizing better statistical technology for the articulation of emissions standards.

Some of my comments will refer specifically to verbatim text, reproduced below, that is extracted from the 5/18/11 document and appears on pages 4-6. For convenience of later reference the text citations are lettered A-I, with some parts italicized by me. The individual paragraphs of my own comments, which follow these text citations, are numbered for convenience of later referencing.

SELECTED TEXT CITATIONS

A. The level of confidence represents the level of protection afforded to facilities whose emissions are in line with the best performers, and consequently, the level of confidence is not arbitrary. For example, a 99 percent level of confidence means that a facility whose emissions are in line with the best performers has one chance in 100 of exceeding the floor limit.

B. A prediction interval for a single future observation (or an average of several test observations) is an interval that will, with a specified degree of confidence, contain the next (or the average of some other pre-specified number) of randomly selected observation(s) from a population.

C. In other words, the UPL estimates what the upper bound of future values will be, based upon present or past background samples taken.

D. The UPL consequently represents the value at which we can expect the mean of future observations for the HAP or HAP surrogate emissions to fall within a specified level of confidence, based upon the results of an independent sample from the same population.

E. This formula encompasses all the data point to data point variability. Its predictions derive from the data set to which it is applied, and thus can be applied to any type of data.

F. The form of the equation differs somewhat depending upon the nature of the data set to which it is applied. To this end, the data sets were evaluated for each HAP and HAP surrogate to ascertain whether the data were normally distributed, or distributed in some other manner (i.e., lognormally).. This approach is more accurate and obtained more representative results than a more simplistic normal distribution assumption.

G. For data sets where the number of available EGUs was 15 or more, use of the UPL was based on assuming a normal distribution based on the Central Limit Theorem (Durrett, 1995). The Central Limit Theorem states that regardless of the shape of the original distribution, if the distribution has a finite mean (μ) and variance (σ²), the sampling distribution of the mean approaches a normal distribution with a mean of (μ) and a variance of σ²/n as N, the sample size, increases (Durrett, 1995).

H. When the sample size is smaller than 15 and the distribution of the data is unknown, the Central Limit Theorem cannot be used to support the normality assumption. Statistical test of the kurtosis, skewness, and goodness of fit test are then used to evaluate the normality assumption.

I. The 99th confidence UPL was selected as a reasonable upper limit because only 1% of future tests of the MACT pool of lowest emitting EGUs will exceed the limit if they are performing as well as the emission test data indicate (i.e., these EGUs will be below or achieve the limit 99 percent of the time in the future).

INTERPRETATION OF THE UPL AS THE 99TH PERCENTILE

1. The first step in calculating currently proposed emissions limits for existing EGUs is to pool measurement data from a selected group of the best-performing comparable EGUs. I refer to this data pool as the baseline data. My comments do not address any issues related to the selection of the best-performing EGUs or the decision to pool their measurement data.

2. The baseline data are treated in the EPA/RTI proposal as though they comprise independent identically distributed measurements from a conceptual baseline population, i.e., a random sample from some distribution. Thus inter-unit heterogeneity, intra-unit correlation, and temporal autocorrelation have all been ignored.

3. Baseline data should be compatible with the form of anticipated compliance measurements. The statistical properties of the spot concentration measurements in the baseline data pool are not the same as the statistical properties of 30-day averages anticipated for compliance monitoring. Thus a UPL calculated assuming compliance measurement data like those in the baseline pool will not be relevant if we do not have such data compatibility.

4. As a starting point for these comments, I adopt the point of view explicitly referenced in EPA/RTI, i.e., that only 1% of monitored emissions for a compliant EGU should exceed the UPL [see text extracts A, B, C, I]. Indeed, if data were plentiful [say several hundred] then the empirical 99th percentile of the pooled baseline data would be a simple, nonparametric, nearly unbiased estimate of the desired UPL. There are no a priori assumptions imposed regarding the shape of the distribution. If a compliant EGU had emissions that mimicked the baseline pool then there would be a 1% chance that a future measurement would exceed the empirical 99th percentile, as required for a UPL.

5. However, baseline data pools are typically not large enough for direct calculation of an empirical, fully nonparametric, 99th percentile. But using a parametric model for the shape of the whole distribution, such as a normal or lognormal distribution, can severely bias the estimation of the 99th percentile. Instead, one should consider using direct modeling of the tail of the distribution without making assumptions about the shape of the distribution for smaller concentrations, and this is possible and preferable even with moderate sample sizes. Statistical methods directly aimed at estimating the upper tail of a distribution can tolerate measurement errors at low concentrations and do not require special handling of data below detection limits.

6. Any estimate of the 99th percentile will carry uncertainty because the available baseline data are themselves only a sample from the baseline population of concentration measurements. When the available baseline data are few the uncertainty will be substantial and the question arises whether and how emissions standards should recognize and incorporate this baseline uncertainty. For example, if the 99th percentile estimate based on the limited sample data was approximately unbiased there would still be about an even chance that the estimate is too high or too low relative to the true population 99th percentile. To be reasonably confident that the true baseline 99th percentile falls below a proposed emissions limit would require an appeal to statistical tolerance limits for determination of an emissions limit that explicitly recognizes the uncertainty of 99th percentile estimates.

7. The foregoing comments are based on the premise that the baseline data sample mimics the form of anticipated compliance data with respect to measurement protocols, sampling intervals, and variability over time. When this is not the case then the statistical properties of the baseline data would not represent what one would expect from a compliant EGU. For example, spot concentration measurements used in the baseline data pool have statistical properties that will differ from those for rolling 30-day averages anticipated for compliance monitoring.

8. Another issue concerns the interpretation of the proposed emission standards regarding the 1% of exceedances that would be expected for monitored EGUs whose emissions mimic the baseline pool of high-performing EGUs. The units within the baseline pool, themselves, would exceed a 99th percentile UPL some of the time and would, themselves, be therefore considered non-compliant.

IRRELEVANCE OF THE CENTRAL LIMIT THEOREM

9. EPA/RTI's UPL proposal, based on the two-sample t-statistic, does not provide an estimate of a 99th percentile, the stated goal for the UPL. The proposal appeals to the Central Limit Theorem [CLT] for its justification, based on the large sample size [n>15] of the baseline data pool. However, a large sample size for the baseline pool is not sufficient for an appeal to CLT because of the small sample size [m=1 or m=3] of the anticipated application to putative future data. See also the text reference G above.
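The concern in paragraph 9 can be illustrated with a small simulation. The sketch below is not the EPA/RTI spreadsheet calculation; it assumes the usual normal-theory prediction limit, mean + t × s × sqrt(1/n + 1/m), a hypothetical baseline pool of n = 30 measurements, and a tabulated t quantile for 29 degrees of freedom. The function names, the pool size, and the two simulated populations are all illustrative assumptions.

```python
import math
import random

N_BASELINE = 30    # hypothetical baseline pool size (assumption)
T_99_DF29 = 2.462  # 99th percentile of Student's t, 29 d.f. (standard table value)


def t_upl(baseline, m):
    """Normal-theory upper prediction limit for the mean of m future values."""
    n = len(baseline)
    mean = sum(baseline) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in baseline) / (n - 1))
    return mean + T_99_DF29 * sd * math.sqrt(1.0 / n + 1.0 / m)


def exceedance_rate(draw, m=1, reps=20000, seed=1):
    """Fraction of simulated future m-run averages that exceed the t-based UPL."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(reps):
        # A fresh baseline pool and a fresh future average from the SAME population.
        baseline = [draw(rng) for _ in range(N_BASELINE)]
        future = sum(draw(rng) for _ in range(m)) / m
        if future > t_upl(baseline, m):
            exceed += 1
    return exceed / reps


def draw_normal(rng):
    """Hypothetical normally distributed measurement."""
    return rng.gauss(10.0, 1.0)


def draw_lognormal(rng):
    """Hypothetical strongly skewed (lognormal) measurement."""
    return rng.lognormvariate(0.0, 1.0)
```

Calling `exceedance_rate(draw_normal)` and `exceedance_rate(draw_lognormal)` compares the two cases. For normal data the long-run rate should sit near the nominal 1%; for the skewed population it generally does not, even though the baseline pool is well above the n = 15 threshold.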

10. The appeal to the CLT requires both baseline and future sample averages be based on sufficiently large sample sizes. Failing this, the use of normal distribution theory to calculate the UPL based on the t-statistic requires that the monitored data themselves be a sample from a normal distribution. This is rarely the case for trace concentration data, which are typically strongly skewed. Tests that examine normality of the baseline data, as used by EPA/RTI for small sample sizes, have almost no power to detect non-normality precisely because of the small sample sizes.

11. In any event, there is no connection between this UPL and the 99th percentile of the pooled baseline distribution, which was the explicit objective of the calculation. For example, suppose that a monitored EGU produces emissions measurements that do exactly match the typically skewed non-normal distribution of the baseline pool data. Then UPL calculations based on the t-statistics, regardless of the baseline sample size, are irrelevant to exceedance rate. Even the fraction of baseline measurements exceeding the t-distribution UPL may be nothing like 1%.

DATA SELECTION

12. Data selection according to the values of the measurements will surely bias subsequent calculations. For example, it is proposed to select only the smallest of three available measurements from a measurement run for calculating mean values of the pooled baseline data. Data selection of this kind might be justified only if similar data selection was anticipated for future compliance monitoring, but this has not been proposed.

ASSESSMENT OF VARIABILITY

13. Another issue relates to capturing relevant sources of emissions variability in determining emissions limits, as noted in the EPA/RTI document. In light of anticipated compliance monitoring protocols an important source of variability would be the temporal variability on time scales corresponding to the proposed compliance requirements, e.g., a 30-day rolling average. If the baseline data used for determining emissions limits do not capture temporal variability applicable to measurements reported for compliance monitoring then even baseline EGUs may well be out of compliance. The important point is that emissions limits need to recognize the full range of within-unit variability dictated by the statistics appropriate to compliance monitoring protocols.

DATA WEIGHTING

14. When baseline units contribute unequal numbers of measurements to the baseline pool then it is important that the weight contributed by a unit should not be merely reflective of the number of measurements that happen to be available at the time of pooling. Instead the goal should be to give all units in the pool equal weight for any subsequent calculations. However, the EPA/RTI formulas and corresponding spreadsheet calculations suggest that each individual measurement is given equal weight, as opposed to equal weighting for each EGU in the baseline pool. For example, the EPA/RTI calculations effectively give three times as much weight to baseline EGUs with three measurements vis-à-vis those with single measurements. The proper pooling of the baseline measurements can readily be accomplished by assigning a weight to each measurement that is inversely proportional to the number of available measurements for that unit. All subsequent calculations would then use the weighted data. Weighting of data could in some cases yield substantially different results for UPL calculations compared with unweighted calculations, for example if units with more data have generally higher [or lower] concentrations than units with fewer data.
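The weighting described in paragraph 14 can be sketched as follows. This is a minimal illustration only: the unit names and measurement values are hypothetical, and `pooled_means` is an assumed helper, not part of the EPA/RTI spreadsheets. Each measurement receives a weight inversely proportional to the number of measurements available for its unit, normalized so that every unit contributes the same total weight.

```python
def pooled_means(measurements_by_unit):
    """Return (equal-weight-per-measurement mean, equal-weight-per-unit mean)."""
    units = [vals for vals in measurements_by_unit.values() if vals]
    all_values = [v for vals in units for v in vals]

    # Unweighted pooling: every measurement counts equally, so a unit
    # with three measurements carries three times the weight.
    per_measurement = sum(all_values) / len(all_values)

    # Unit-balanced pooling: weight = 1 / (number of units * measurements in unit),
    # so the weights within each unit sum to 1 / (number of units).
    k = len(units)
    per_unit = sum(v / (k * len(vals)) for vals in units for v in vals)
    return per_measurement, per_unit


# Hypothetical baseline pool: one unit with three runs, one with a single run.
baseline = {"EGU-A": [3.0, 3.0, 3.0], "EGU-B": [1.0]}
unweighted_mean, unit_weighted_mean = pooled_means(baseline)
```

In this hypothetical pool the unweighted mean is pulled toward the unit with more data, while the unit-weighted mean treats both units equally. The same weights would carry through to any weighted variance used in a UPL calculation.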

NORMAL VS LOGNORMAL

15. EPA/RTI proposes choosing between normal and lognormal population models when there are few baseline data in the pool. See text citations F and G. This issue would not arise if one directly estimated the distribution tail rather than assume a distributional shape that encompasses the lower concentrations; see my earlier remarks in para 5. In any event, power to discriminate between normal and lognormal models is very low for the small sample sizes involved, while the particular selection of one of these two distribution models can nevertheless substantially impact the model-dependent UPL calculation. The claim for increased accuracy using normal/lognormal selection was not substantiated [text citation F]. Furthermore, if the lognormal model is to be applied to average values of future measurements, rather than to individual measurements, then the relevance of fitting a lognormal model to individual measurements is unclear. Whatever procedure is used, considerable uncertainty will always remain when high percentiles are inferred from small sample sizes. For this reason, it is highly recommended that sufficient data be a requirement for the setting of emissions limits in order to avoid questionable dependence on indefensible model choices.

Reference

Richard Durrett. Probability: Theory and Examples. 2nd Edition. Duxbury Press, Pacific Grove, CA,


E APPENDIX: CASE STUDY RISK ASSESSMENT REVIEW OF EPA EMISSIONS CALCULATIONS AND ASSOCIATED DATA

E.1 Review of Chromium Concentrations and Emissions Data

Table E-1 provides a list of the stations that exceeded the 1 × 10⁻⁶ cancer risk criterion by EPA's calculations. The Agency's risk calculations for James River, Conesville, and Gallatin were all based on measured ICR data where elevated chromium (Cr) levels were present, as noted above. The risk calculation for Chesapeake was based on the Phase I ESP average EF, which also contains the seven sites noted as having potential chromium contamination issues.

Table E-1 Facilities with EPA Cancer Risk 1E-6 or Greater

Operator | Facility | EPA Risk Value | EPA Risk Driver | EPA Control Class for Metals | Source of EF Used in EPA Risk Calculation for Risk Driver Metal (Cr, As, Ni)
Spring-Mo | James River | 8E-6 | Cr+6 | Units 3-5: Subbit ESP | U3: avg of ICR U4 and 5; U4 and U5: ICR
Dominion | Chesapeake | 3E-6 | Cr+6 | U1-4: Bit ESP | Phase I ESP avg
AEP | Conesville | 3E-6 | Cr+6 | U3: Bit ESP; U4-6: Bit ESP FGDw | U3: ICR; U4-6: Phase 2 ESP FGDw avg
TVA | Gallatin | 1E-6 | Cr+6 | U1-4: Subbit ESP | U2: ICR; U1,3,4: ICR from U2
NU | Merrimack | 1E-6 | As | U1: Bit ESP; U2: Bit ESP w/ACI and DSI | U1: Phase 2 ESP avg; U2: ICR
OG&E | Muskogee | 1E-6 | As | U4-6: Subbit ESP | U4-6: Phase 2 ESP avg
Dominion | Yorktown U1&U2 (Coal) | 1E-6 | Cr+6 | U1-2: Bit ESP | U1-2: Eval. Incomplete note in case study spreadsheet. Not sampled in ICR
Dominion | Yorktown U3 (Oil) | 1E-6 | Cr+6 | U3: No. 6 oil, no PM control | U3: ICR
HECO | Waiau (Oil) | 10E-6 | Ni | U3-8: No. 6 oil, no PM control | U3-5: ICR U6 data; U6: ICR; U7: ICR; U8: ICR U7 data

Cr+6 = hexavalent chromium; As = arsenic; Ni = nickel
Subbit = subbituminous coal; Bit = bituminous coal
U = unit; ESP = electrostatic precipitator; FGDw = wet flue gas desulfurization; ACI = activated carbon injection; DSI = dry sorbent injection

EPRI reviewed the Casestudy_emis_26apr2011 for EPRI.xls supporting information file (subsequently called the "EPA case study spreadsheet") to evaluate EPA's methodology for estimating emissions for those case study plants where EPA estimated a lifetime cancer risk at or greater than 1 in a million. The data sets that EPA selected to calculate the average emission factors used to estimate risk for sites not sampled as part of the ICR were also reviewed.

For units tested in the 2010 ICR, EPA appears to have used the actual Part III emission test data to develop the actual annual emission estimates used in their case study risk evaluation. If a unit was not tested as part of the ICR, EPA either used data from a similar ICR-tested sister unit or used an average emission factor derived from units tested in the ICR, grouped by type of control system (e.g., ESP, fabric filter, ESP/wet FGD, etc.). These average emission factor bins by control class do not appear to have taken coal rank into account. Thus all ICR data, including data from the random 50 test units, were used by EPA to calculate average control class-based emission factors used in the Agency's risk analyses.

In addition, the data sets produced by EPA for this analysis were not properly statistically described. EPA used arithmetic average values for the bins, when the data sets are often not normally distributed. Per the EGU MACT Floor Analysis Memo dated March 16, 2011 by Jeff Cole of RTI International, when the ratio of the skewness to its standard error and the ratio of kurtosis to its standard error exceed 2.0, lognormal distributions are to be used for data statistics. In addition, a number of data sets from tested sites are possible outlier values and should be further evaluated or removed from the statistics.

A review of this supporting information indicates that many of the coal-fired facilities with greater than 1 in a million risk had chromium as the primary risk driver in EPA's analysis, contrary to previous health risk modeling conducted by EPRI [EPRI, 2009] whose results did not show chromium to be the primary cancer risk driver for any coal-fired EGU greater than 25 MW. Therefore, EPRI performed a more detailed evaluation of the chromium emissions data obtained from ICR test sites. The result of this review can be summarized as follows:

- The data sets for all of the bins were examined for normality, and were determined to be lognormally distributed; thus a geometric mean should be used to model untested plant emissions.
- Possible outliers were identified and should be removed from test data sets before specific plant emissions are modeled or the average of bins calculated.

E.2 Data Set Distribution Issues

As described in the March 16, 2011 memo by RTI, data sets in the MACT analyses were to be examined for normality by two ratios: skewness and kurtosis. The 50 sites in Bin 1 (ESP only) were examined for arsenic, chromium, and nickel. The average, standard deviation, skewness, kurtosis (and their error estimates) for both normal and lognormal transformed values have been calculated and are presented in Table E-2. As stated by RTI, when the ratios of skewness and kurtosis to their error values exceed 2.0, the normal distribution is rejected. By these criteria, arsenic, chromium, and nickel data are clearly not normally distributed. Consequently the arithmetic average is not appropriate for estimating the bin mean value, and a geometric mean is appropriate for developing emission factors.
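The screening described in E.2 can be sketched in a few lines. The exact skewness, kurtosis, and standard error formulas used by RTI and EPRI are not stated here, so this sketch assumes simple moment estimators and the common large-sample approximations sqrt(6/n) and sqrt(24/n) for the standard errors; the function names and sample values are hypothetical.

```python
import math
import statistics


def normality_ratios(values):
    """Return (skewness / SE, excess kurtosis / SE) for a data set.

    Assumes moment estimators and large-sample standard errors
    sqrt(6/n) and sqrt(24/n); other conventions give somewhat different ratios.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return skew / math.sqrt(6.0 / n), excess_kurt / math.sqrt(24.0 / n)


def bin_summary(emission_factors):
    """Summarize a bin of positive emission factors in normal and log10 space."""
    logs = [math.log10(x) for x in emission_factors]
    return {
        "arithmetic_mean": statistics.fmean(emission_factors),
        "geometric_mean": statistics.geometric_mean(emission_factors),
        "ratios_normal_space": normality_ratios(emission_factors),
        "ratios_log_space": normality_ratios(logs),
    }


# Hypothetical bin: mostly low values with a few high ones (lb/mmbtu).
example_bin = [1e-6, 2e-6, 1.5e-6, 3e-6, 2.5e-6, 1e-6, 4e-5, 2e-6, 1.2e-6, 9e-5]
summary = bin_summary(example_bin)
```

A ratio above 2.0 in normal space that falls below 2.0 in log space would support using the geometric mean for the bin. A ratio that stays high in log space points instead to individual values that deserve an outlier review.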

Table E-2 Arithmetic and Logarithmic Data Set Statistics for Bin 1

Statistic Arsenic log As Chromium log Cr Nickel log Ni
Mean* 2.01E E E E E E-6
Std Deviation 5.53E E E E E E-1
Skewness SE Skew Ratio Skewness Kurtosis (0.16) SE Kurt Ratio Kurtosis (0.23)

* The log means have been transformed back to base 10 for comparison to the arithmetic means
Ratio values exceeding 2.0, indicating the data are not normally distributed

Note that both the skewness and kurtosis ratios exceed 2.0 for all three data sets in normal space. In log space, both nickel and chromium still show excessive skewness, indicating the high values in those data sets may be outliers.

E.3 Review of Chromium Data

Based on EPRI's review of the data, metallic contamination was suspected at some of the ICR test sites that showed elevated levels of nickel and in some cases manganese, as well as chromium. Although the Method 29 blank sample train data from these suspect ICR test sites did not typically indicate a problem with sample system or reagent contamination for chromium, nickel, and manganese, the presence of elevated levels for all three of these elements suggests some other source of metallic contamination.

Chromium data from the ICR data set were evaluated on the basis of ppmw in the stack particulate matter, in an attempt to determine a reasonable range of enrichment factors and thus to develop another tool to identify sites with possible contamination issues. Figure E-1 below shows all the individual run ICR data for coal-fired units, with stack FPM plotted versus chromium in the stack particulate (i.e., lb/mmbtu Cr divided by lb/mmbtu FPM, expressed on a ppmw basis). Higher chromium on a ppmw basis is expected at lower FPM emissions, due to enrichment effects in the fine particulate. However, there are a significant number of individual test runs that show high chromium levels in the stack particulate at high stack FPM emissions, suggesting a problem with these data. By comparison, an analogous plot of chromium data, from the historical data set used to derive the emission correlations employed in EPRI's 2009 risk modeling, is shown in Figure E-2 [EPRI 2009]. The plots are generally similar, but the historical data do not show the same issue with outlier data points at higher FPM emissions.
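The ppmw screening quantity described above is a simple ratio, sketched below. The function name and the input values are hypothetical and are not taken from the ICR database. The only conversion assumed is that 1 TBtu equals 10⁶ MMBtu, so that a chromium emission rate reported in lb/tbtu can be placed on the same basis as a filterable PM rate in lb/mmbtu.

```python
MMBTU_PER_TBTU = 1.0e6  # 1 trillion Btu = 1e6 million Btu


def chromium_in_particulate_ppmw(cr_lb_per_tbtu, fpm_lb_per_mmbtu):
    """Chromium content of stack particulate, in parts per million by weight."""
    if fpm_lb_per_mmbtu <= 0:
        raise ValueError("filterable PM emission rate must be positive")
    cr_lb_per_mmbtu = cr_lb_per_tbtu / MMBTU_PER_TBTU
    # Mass of Cr per mass of particulate, scaled to ppmw.
    return cr_lb_per_mmbtu / fpm_lb_per_mmbtu * 1.0e6


# Hypothetical run: 540 lb/tbtu chromium at 0.03 lb/mmbtu filterable PM.
example_ppmw = chromium_in_particulate_ppmw(540.0, 0.03)
```

Because the two factors of 10⁶ cancel, the result is numerically the lb/tbtu chromium rate divided by the lb/mmbtu FPM rate. This makes it easy to spot runs where chromium would have to make up an implausibly large share of the particulate.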

Figure E-1 ICR Chromium Data for Coal-fired Units (Individual Test Runs)

Figure E-2 Historical Chromium Data for Coal-fired Units Used in EPRI's 2009 Emission and Risk Modeling (Unit Averages)

The suspect runs for chromium shown in Figure E-1 are associated with the units shown in Table E-3. Those units used by EPA in the calculation of the Phase II average emission factor for various control class bins are noted. The greatest impact likely occurs for the 1-ESP and 2-FF bin average EFs, since these groups contain the largest number of sites with potential contamination issues.

In some cases (e.g., James River U5, Gallatin U2, Valley B1, Valley B3, and Craig C1), the amount of chromium measured at the stack was greater than the amount entering with the coal, when coal and stack emission values were compared on a lb/mmbtu basis. This is another indication of possible metallic contamination issues for stack gas measurements at these sites.

For example, Table E-4 provides a summary of Part III individual run ICR coal and stack measurement data from James River Unit 4 and Unit 5. Run 2 at Unit 4 shows both chromium and nickel emissions that are 5 to 10 times higher than Runs 1 and 3, yet the particulate emission rates for all three runs are comparable. Stack emission levels for Run 2 were at or above the inlet coal chromium concentration on a lb/tbtu basis. When expressed on a ppmw basis in the stack particulate, the Run 2 data indicate 18,000 ppmw (nearly 2 wt%) chromium in the particulate matter. This is inconsistent with other ICR data and historical test data, as shown previously in Figures E-1 and E-2. Likewise, all runs at Unit 5 show elevated chromium levels in the stack particulate matter, as well as stack emission levels that are greater than the inlet coal levels for Runs 1 and 2.

Table E-3 ICR Coal-fired EGUs with Suspect Chromium and Nickel Run Values

ORIS | Facility | Unit Config ID | Runs Impacted | EPA Use in Risk Assessments
47 | Colbert | 4 | All | Phase II 1-ESP average EF calculation
1218 | Fair Station | U2 | Run 2 | Phase II 1-ESP average EF calculation
2161 | James River | Unit_4_JRPS | Run 2 | Phase II 1-ESP average EF calculation
2161 | James River | Unit_5_JRPS | All | Phase II 1-ESP average EF calculation
2840 | Conesville | CV3 | All | Phase II 1-ESP average EF calculation
3403 | Gallatin | 2 | Run 2 | Phase II 1-ESP average EF calculation
3942 | Albright | Unit_1 | Run 1 | Phase II 1-ESP average EF calculation
4042 | Valley | VAPP-B1 | All | Phase II 2-FF average EF calculation
4042 | Valley | VAPP-B2 | All | Phase II 2-FF average EF calculation
4042 | Valley | VAPP-B3 | All | Phase II 2-FF average EF calculation
4042 | Valley | VAPP-B4 | All | Phase II 2-FF average EF calculation
4078 | Weston | W3* | Run 3 | Phase II 5-FF,Wet average EF calculation
6021 | Craig | C1 | All | Phase II 5-FF,Wet average EF calculation
 | Cambria Cogen | 001 | Run |
 | AES Hawaii | 001 | Run |
 | AES Hawaii | 002 | Run |
 | Roanoke Valley I | Boiler 1 | Run 1 | Phase II 12-ACI,FF average EF calculation

Table E-4 Example of Chromium and Nickel ICR Data for James River, Units 4 and 5 (ORIS 2161)

Unit No. Run No. Coal lb/tbtu* Stack Emissions lb/tbtu Chromium Nickel Chromium Nickel Filterable PM ** PM Equivalent Chromium (ppmw)
Unit , , ,500
Unit , , ,100

* Derived from fuel data reported in the OTM fuel data file posted to the docket. Coal data from James River units were not found in the EPA ICR Part III database, indicating EPA did not load fuel data supplied by the test site to the ICR database. Concentrations ranged from 4 to 5 ppmw chromium and nickel in the coal samples reported.
** Method 29 filterable particulate (lb/mmbtu basis).

E.4 Updated Sites Average Emission Values or Emission Factors

Annual mass emission rates for each of the EPA case study facilities were developed using revised emission values or revised emission factors for various control class bins. For sites tested in the ICR, revised annual emission values were calculated by omitting the suspect individual run values listed above in Table E-3 from the site average factor. If all individual runs for a unit were suspect, a revised average emission factor for the appropriate control class bin was used to estimate emissions.

Revised average emission factors for various control class bins were developed using the Phase II data sets presented in the EPA case study spreadsheet supporting data file as the starting point. Runs from units with suspect chromium data sets (Table E-3) were then excluded from the chromium and nickel average emission factor calculations for the control configurations that apply to the list of case study plants. In addition, EPRI developed revised emission factors by coal rank within each applicable control class bin. The resulting final set of revised emission factors used for units listed in the EPA case study spreadsheet is summarized in Table E-5 and compared to the original Phase II emission factors calculated by EPA in that spreadsheet. Note that no adjustments were made to EPA's Phase II average arsenic emission factors for various control class bins.

Table E-5 shows both the arithmetic average and the geometric mean value for each category. Note that the geometric mean, which may be the better statistic for most cases, is close to the arithmetic mean for the categories from which suspect data have been excluded. The arithmetic mean and geometric mean revised EPRI factors shown in Table E-5 are based on the same data sets from which test sites with suspect data have been excluded.

Table E-5 Revised Control Class Average Emission Factors for Coal-fired Units Used in EPRI Case Study Risk Reassessment

EPA Control Class Emission Factor Bin: 1-ESP arith | 1-ESP geo | 4-ESP, FGD Wet arith | 4-ESP, FGD Wet geo | 5-FF, FGD Wet arith | 5-FF, FGD Wet geo

Arsenic
EPA Factor (Phase II) 2.01E E E E E E-7
Revised EPRI Factor, All Coals 2.01E E E E E E-7
Revised EPRI Factor, Bituminous 3.57E E E E E E-7
Revised EPRI Factor, Subbituminous 1.72E E E E E E-7

Chromium
EPA Factor (Phase II) 5.62E E E-6 1.8E E E-6
Revised EPRI Factor, All Coals 6.18E E E-6 1.8E E E-6
Revised EPRI Factor, Bituminous 8.39E E E E E E-6
Revised EPRI Factor, Subbituminous 3.62E E E E E E-7

Nickel
EPA Factor (Phase II) 4.92E E E E E E-6
Revised EPRI Factor, All Coals 9.16E E E E E E-6
Revised EPRI Factor, Bituminous 1.26E E E E E E-6
Revised EPRI Factor, Subbituminous 5.23E E E E E E-6

Values reported in lb/mmbtu
arith = arithmetic mean; geo = geometric mean
ESP = electrostatic precipitator; FF = fabric filter; FGD wet = wet flue gas desulfurization
Category factors more than 5x different

The arithmetic mean values listed in Table E-5 were used by EPRI to estimate actual annual emissions for one or more units from the following plants listed in EPA's case study spreadsheet: Canadys (ORIS 3280), Chesapeake (ORIS 3803), Conesville (ORIS 2840), Cromby (ORIS 3159), Labadie (ORIS 2103), Merrimack (ORIS 2364), Monticello (ORIS 6147), and Muskogee (ORIS 2952). All actual annual emission rates were calculated based on the actual annual heat input values, as presented in EPA's case study spreadsheet as MMBtu per year.
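The annual emission calculation described above is a product of an emission factor and an annual heat input. The sketch below uses hypothetical input values, not values from the EPA case study spreadsheet, and the function names are illustrative.

```python
LB_PER_SHORT_TON = 2000.0


def annual_emissions_lb(ef_lb_per_mmbtu, heat_input_mmbtu_per_yr):
    """Annual mass emissions (lb/yr) from an emission factor and heat input."""
    return ef_lb_per_mmbtu * heat_input_mmbtu_per_yr


def annual_emissions_tons(ef_lb_per_mmbtu, heat_input_mmbtu_per_yr):
    """Same quantity expressed in short tons per year."""
    return annual_emissions_lb(ef_lb_per_mmbtu, heat_input_mmbtu_per_yr) / LB_PER_SHORT_TON


# Hypothetical unit: emission factor of 2.0E-6 lb/mmbtu, 5.0E7 MMBtu/yr heat input.
example_lb = annual_emissions_lb(2.0e-6, 5.0e7)
example_tons = annual_emissions_tons(2.0e-6, 5.0e7)
```

Because the result scales linearly with the emission factor, a bin average that is several times too high carries directly into the annual emission estimate and into any risk value derived from it.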

In EPRI's 2009 risk modeling project [EPRI 2009], emission correlations were developed by EPRI based on historical measurement data for HAPs trace elements and then used to estimate emissions from each coal-fired unit in the industry. These correlations predict lb/tbtu emission factors for each trace element, based on the following inputs: trace element content of the coal (ppmw), ash content of the coal (ash fraction), and the stack particulate emission rate (lb/mmbtu). These correlations represent modified versions of the correlations adopted by EPA for estimating emissions from coal-fired units in the AP-42 Compilation of Air Pollutant Emission Factors. An example of these emission factor correlations is provided in Figure E-3 for arsenic.

[Figure E-3: logarithmic plot of arsenic emissions (lb/tbtu) versus coal ppmw * PM / ash%, with data series for ESP, FF, FGDd/FF, and FGDw units and a fitted trend line y = 2.92x^0.77]

Figure E-3 EPRI Emission Factor Correlation for Arsenic Emissions from Coal-fired EGUs [EPRI 2009]

EPRI believes that these correlations provide a more accurate representation of emissions compared to the average emission factor approach used by EPA for case study sites not measured as part of the ICR. For example, EPA's case study risk assessment included the four bituminous/ESP units at the Chesapeake Station, but none of these units was tested in the ICR. Therefore EPA appears to have used average emission factors derived from ICR measurements for units equipped with ESPs to estimate emissions from Chesapeake. EPRI reviewed the ICR Part II database and found that there were sufficient site-specific measurement data regarding coal composition and stack PM emission rates at Chesapeake to derive emission factors based on the EPRI correlations.
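A correlation of this form can be evaluated as sketched below. The power-law coefficients are read from the trend line in Figure E-3, interpreted here as y = 2.92·x^0.77; that reading of the exponent, the use of ash expressed as a fraction (as stated in the text, although the figure axis is labeled ash%), and the input values are all assumptions made for illustration.

```python
def arsenic_ef_lb_per_tbtu(coal_as_ppmw, pm_lb_per_mmbtu, ash_fraction,
                           coeff=2.92, exponent=0.77):
    """Power-law emission factor correlation, y = coeff * x ** exponent.

    x = coal arsenic (ppmw) * stack PM (lb/mmbtu) / coal ash (fraction).
    coeff and exponent are assumed readings of the Figure E-3 trend line.
    """
    if ash_fraction <= 0:
        raise ValueError("ash fraction must be positive")
    x = coal_as_ppmw * pm_lb_per_mmbtu / ash_fraction
    return coeff * x ** exponent


# Hypothetical unit: 2 ppmw arsenic in coal, 0.03 lb/mmbtu PM, 10% ash.
example_ef = arsenic_ef_lb_per_tbtu(2.0, 0.03, 0.10)
```

Unlike a single bin average, an estimate of this kind responds to site-specific coal composition and particulate control performance: a lower-arsenic coal or a lower stack PM rate yields a proportionally lower predicted emission factor.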

As an illustration, Table E-6 summarizes the coal (arsenic and ash) and stack particulate measurement data for Chesapeake (ORIS 3803) extracted from the Part II database. The coal arsenic composition shown in Table E-6 is a tonnage-weighted value derived from Part II fuel shipment analyses reported for 2009 at Chesapeake. It was assumed that this tonnage-weighted average coal arsenic value was representative of fuels fired at all four units at Chesapeake. The stack particulate emission values for each unit were derived from annual stack measurements conducted at each unit from 2005 through 2009, as reported in the Part II ICR database.

Table E-6 also provides a comparison of the EPA average emission factor for arsenic for the ESP bin with the arsenic emission factor derived using the EPRI emission factor correlation. The unit-specific emission factors derived from the EPRI correlation (0.7 to 2.4 lb/tbtu) are significantly lower than the EPA average emission factor for ESP units derived from the ICR test data (20 lb/tbtu). Chesapeake burns approximately 80% South American coal and 20% Central Appalachian coal. The ICR Part II annual coal shipment data for 2009 showed that the South American coal samples all contained about 2 ppmw arsenic, which is more consistent with arsenic levels in subbituminous coals than in eastern bituminous coals. The geometric mean emission factor (Table E-6) for arsenic is 1.1 lb/tbtu, which is generally consistent with the estimates developed using the EPRI correlation.

Table E-6
Example Comparison of EPA Average Emission Factor and EPRI Emission Factor Correlation
Chesapeake Station (ORIS 3803), Arsenic

Columns: Unit No.; Part II ICR Data: Coal Arsenic (ppmw)*, Coal Ash (fraction)*, Stack Particulate Emissions (lb/mmbtu)**; Arsenic Emission Factor (lb/tbtu): EPRI Correlation, EPA Average Factor for 1-ESP Bin

Unit Unit Unit Unit

* Tonnage-weighted average coal composition based on 2009 fuel shipment records reported in Part II ICR database. Assumed to be representative of the coal fired at all four units.
** Average of annual stack particulate emission measurements from 2005 to 2009, as reported in the Part II ICR database.

E.5 References

EPA, Exposure Factors Handbook Update. EPA May 1989, EPA 600-P August.

EPRI, Emission Factors Handbook: Guidelines for Estimating Trace Substance Emissions from Fossil Fuel Steam Electric Plants. EPRI, Palo Alto, CA:

EPRI, Updated Hazardous Air Pollutants (HAPs) Emissions Estimates and Inhalation Human Health Risk Assessment for U.S. Coal-Fired Electric Generating Units. EPRI, Palo Alto, CA:


F APPENDIX: CASE STUDY RISK ASSESSMENT
EPRI INHALATION RISK ASSESSMENT FOR SELECTED COAL-FIRED POWER PLANTS

Table F-1
Comparison of EPA 2011 and EPRI 2007 Stack Parameters

Columns: Facility; EPA: Latitude, Longitude, Temp (K), Height (m), Diameter (m), Velocity (m/s); EPRI: Latitude, Longitude, Temp (K), Height (m), Diameter (m), Velocity (m/s)

Cambria Cogen Cambria Cogen SC&E Canadys SC&E Canadys SC&E Canadys Dominion Chesapeake Energy Center Dominion Chesapeake Energy Center Dominion Chesapeake Energy Center Dominion Chesapeake Energy Center Conesville Conesville Conesville Conesville Exelon Cromby Generating Station Exelon Cromby Generating Station TVA Gallatin TVA Gallatin TVA Gallatin TVA Gallatin City Utilities of Springfield - James River City Utilities of Springfield - James River City Utilities of Springfield - James River AmerenUE - Labadie AmerenUE - Labadie AmerenUE - Labadie AmerenUE - Labadie PSNH - Merrimack PSNH - Merrimack Monticello Steam Electric Plant Monticello Steam Electric Plant Monticello Steam Electric Plant OG&E - Muskogee OG&E - Muskogee OG&E - Muskogee

Table F-2
Comparison of EPA 2011, and EPRI 2011 and 2007 Emission Rates

Columns: Facility Name; EPA Value: Arsenic (ton/yr), Cr VI (ton/yr), Nickel Subsulfide (ton/yr, 65% of total Ni); EPRI All Coals Value: Arsenic (ton/yr), Cr VI (ton/yr), Nickel Subsulfide (ton/yr, 65% of total Ni); EPRI Coal Rank Value: Arsenic (ton/yr), Cr VI (ton/yr), Nickel Subsulfide (ton/yr, 65% of total Ni); EPRI 2007 Value: Arsenic (ton/yr), Cr VI (ton/yr), Nickel Subsulfide (ton/yr, 65% of total Ni)

Cambria Cogen Cambria Cogen NM NM NM SC&E Canadys SC&E Canadys SC&E Canadys Dominion Chesapeake Energy Center Dominion Chesapeake Energy Center Dominion Chesapeake Energy Center Dominion Chesapeake Energy Center Conesville Conesville Conesville Conesville NM NM NM Exelon Cromby Generating Station Exelon Cromby Generating Station NM NM NM TVA Gallatin TVA Gallatin NM NM NM TVA Gallatin TVA Gallatin NM NM NM City Utilities of Springfield - James River City Utilities of Springfield - James River City Utilities of Springfield - James River AmerenUE - Labadie AmerenUE - Labadie AmerenUE - Labadie AmerenUE - Labadie PSNH - Merrimack PSNH - Merrimack Monticello Steam Electric Plant Monticello Steam Electric Plant Monticello Steam Electric Plant OG&E - Muskogee OG&E - Muskogee OG&E - Muskogee

NM = not modeled

G APPENDIX: ITEMIZED COMMENTS ON TECHNICAL SUPPORT DOCUMENT FATE AND TRANSPORT, HEALTH, AND RISK

G.1 Errors Identified

These comments address material contained in the Technical Support Document: National-Scale Mercury Risk Assessment Supporting Appropriate and Necessary Finding for Coal- and Oil-fired Electric Generating Units. EPA-452/D, March 2011.

Page 4. It is stated that, "Because measurements for the dry deposition of Hg do not currently exist, the modeled dry deposition performance could not be evaluated." However, there are a number of experimental methods used in the measurement of the dry deposition of atmospheric mercury, most of which have been reviewed by Zhang et al. [2009]. These include enclosure methods, such as the dynamic flux chambers (DFC); the use of surrogate surfaces (SS); and micrometeorological methods, including the modified Bowen ratio (MBR) method, the aerodynamic (AER) method, and the relaxed eddy accumulation (REA) method [Zhang et al., 2009]. Hence, there are available measurements (direct and indirect) for dry deposition, which the Agency should use to evaluate the dry deposition performance of its model.

Page 5 (and page 34). Table ES-1 (and Table 2-1) shows a comparison of total and U.S. EGU-attributable Hg deposition (in μg/m²) for the 2005 and 2016 scenarios, with the statistics based on CMAQ results interpolated to the watershed level and calculated using all ~88,000 watersheds in the United States. Table ES-2 (and Table 2-2) shows the same comparison for the same watersheds, but as the percentage of total Hg deposition attributable to U.S. EGUs for 2005 and 2016. When comparing the actual numbers in Table ES-1 with the percentages in Table ES-2, there is a discrepancy. For example, for the 2005 scenario, the 99th percentile total Hg deposition and U.S. EGU-attributable Hg deposition are reported as 58.32 and 7.77 μg/m², respectively. Expressed as a percentage, this is 13.32% [7.77 / 58.32 * 100%].
However, Table ES-2 shows that under the 2005 scenario U.S. EGU-attributable Hg deposition is 30% at the 99th percentile level. The same holds for the 2016 scenario: total deposition is calculated as μg/m², with 2.41 μg/m² reported as attributable to U.S. EGUs, which is 4.28% [2.41 / *100%]. However, Table ES-2 reports this as 11%. All percentiles, as well as the mean and median, show a discrepancy between Table ES-1 and Table ES-2.

Page 7. Table ES-4 shows a comparison of total and U.S. EGU-attributable fish tissue MeHg concentrations for the 2005 and 2016 scenarios, and Table ES-5 shows the same comparison expressed as a percentage. However, the percentages listed in Table ES-5 are not those obtained when the concentrations reported in Table ES-4 are converted to percentages. For example, under the 2005 scenario, at the 75th percentile the fish MeHg concentration attributable to U.S. EGUs is estimated at 0.032 ppm and the total fish tissue MeHg concentration is reported as 0.39 ppm. As a percentage of total MeHg levels, the U.S. EGU-attributable fish tissue MeHg concentration is 8.21% [0.032 / 0.39 * 100%]. However, Table ES-5 reports 14%. Similar discrepancies are found for the other percentiles and the mean. Since the reported percentages in Table ES-5 are for the 2,461 watersheds included in the risk assessment, whereas no such description is provided for Table ES-4, it is possible that Table ES-4 represents all ~88,000 watersheds, although that

seems unlikely. If that is the case, then EPA should also report how it derived the percentages for the 2,461 watersheds from MeHg fish tissue concentrations for the ~88,000 watersheds.

Page 11. It is stated that, "Reflecting current emissions, U.S. EGUs can contribute up to 11% of total Hg deposition (for the 99th percentile watershed in the 2016 scenario)." However, based on the first comment and the apparent miscalculation of percentages in Table ES-2, this number should be 4.28%.

Page 12. It is stated that, "U.S. EGUs are estimated to contribute up to 18% of fish tissue MeHg levels in the 2016 scenario (for the 99th percentile)." But, based on the second comment and Table ES-4, this should be 3.86%.

Page 16, footnote 21. No basis or explanation is given to justify why there should be at least 25 poor Hispanics in a U.S. Census tract in a watershed. EPA refers to Section 1.3 for additional detail, but no explanation is given there, except that the number 25 refers to Vietnamese fishers. In addition, Section 1.3 refers to Appendix C for additional information on source populations, but Appendix C also fails to provide an explanation for EPA's choice of the number 25.

Page 24. It is unclear what the exact mathematical relationship is between the matching proportional change in the levels of MeHg in fish and a fractional change in mercury deposition to a watershed. EPA refers to Appendix E of the TSD, but Appendix E provides only a general description of MMaps, not an in-depth analysis of EPA's assumed (near) steady-state and matching proportional change function.

Page 29. It is stated that, "This watershed coverage (which is only about 4% of the watersheds in the U.S.), leaves much of the country not covered by the analysis, including a substantial number of watersheds with relatively elevated levels of U.S. EGU-related mercury deposition."
It is unclear from this statement what exactly this substantial number of watersheds is and what exactly the relatively elevated levels of U.S. EGU-related mercury deposition are.

Page 34, Table 2-3. This table reports a comparison of percent reduction of total mercury deposition and U.S. EGU-attributable deposition, based on comparing the 2016 scenario against the 2005 scenario, with values based on CMAQ results interpolated to the watershed level and reflecting trends across all ~88,000 watersheds in the United States. The reported percentage values are apparently calculated from the mercury deposition values (μg/m²) reported in Table 2-1.

Table 1: Reprint of Table 2-1 and Table 2-3 with inclusion of EPRI's re-calculated percent changes based on the values reported in Table 2-1

Columns: Statistics; 2005 scenario: Total Hg Deposition, U.S. EGU-attributable Hg Deposition; 2016 scenario: Total Hg Deposition, U.S. EGU-attributable Hg Deposition; EPA's calculated percentage (Table 2-3): percent change in total Hg deposition, percent change in U.S. EGU-attributable Hg deposition; EPRI's calculated percentage: percent change in total Hg deposition*, percent change in U.S. EGU-attributable Hg deposition**

Mean % NC -3.86% NC
Median % -41% -3.83% %
75th percentile % -70% -3.63% %
90th percentile % -80% -2.86% %
95th percentile % -85% -4.59% %
99th percentile % -91% -3.58% %

* (total Hg deposition, 2016 scenario - total Hg deposition, 2005 scenario) / (total Hg deposition, 2005 scenario) × 100%
** (U.S. EGU-attributable Hg deposition, 2016 scenario - U.S. EGU-attributable Hg deposition, 2005 scenario) / (U.S. EGU-attributable Hg deposition, 2005 scenario) × 100%
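The two calculations used throughout these comments, the EGU-attributable share of a total and the percent change between scenarios, can be reproduced with a few lines of code. The sketch below uses only values quoted in the comments (the 2005 99th percentile deposition values, the 2005 75th percentile fish tissue values, and the mean fish tissue concentrations discussed for Table 2-4); the function names are illustrative.

```python
# Minimal sketch of the two checks applied in these comments.
# Function names are illustrative; input values are those quoted in the text.


def share_percent(part, total):
    """Return part as a percentage of total."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return part / total * 100.0


def percent_change(new, old):
    """Return the percent change from old to new."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100.0


egu_share_2005 = share_percent(7.77, 58.32)    # 99th percentile Hg deposition, 2005
fish_share_2005 = share_percent(0.032, 0.39)   # 75th percentile fish tissue MeHg, 2005
fish_change = percent_change(0.29, 0.31)       # mean total Hg fish tissue, 2016 vs 2005

print(f"{egu_share_2005:.2f}%")
print(f"{fish_share_2005:.2f}%")
print(f"{fish_change:.2f}%")
```

Run as written, this gives 13.32%, 8.21%, and -6.45%, matching the figures EPRI reports rather than the 30%, 14%, and -6.7% in EPA's tables.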

From Table 1, it follows that the percent changes reported by EPA in Table 2-3 are incorrect, unless EPA has done additional unreported calculations to derive the percentages reported in Table 2-3. If the percent changes reported in Table 2-2 and Table 2-3 are incorrect, and it appears they are, then the text following and citing these numbers (e.g., on page 35) is also incorrect.

Page 43, Table 2-4. The total and U.S. EGU-attributable percent changes (2016 versus 2005) in Hg fish tissue concentrations reported in Table 2-4 are incorrectly rounded. For example, the mean percent change in total Hg fish tissue concentrations between 2016 and 2005 (0.29 and 0.31, respectively) is, based on the data provided in the table, 6.45% lower (-6.45% change; [(0.29 - 0.31) / 0.31] * 100%) and not 6.7% lower as EPA reports. The difference probably arises because EPA used the unrounded values. However, when values are reported to 2 or 3 decimals, the calculated percentage should also be based on these 2 or 3 decimals, not on all decimals, and should itself be reported to 2 or 3 decimals; i.e., the percent change in mean total Hg fish tissue concentrations between 2016 and 2005 is -6.45% (and not -6.7%) and the percent change in mean U.S. EGU-attributable Hg fish tissue concentrations is % (and not -65%), etc.

Page 43, Table 2-5. It is unclear why the fractions reported in Table 2-5 differ from the fractions reported in Table 2-4. See also the second comment above for further details, since Tables ES-3, ES-4 and Tables 2-4, 2-5 are practically identical.

Page 51, Tables 2-6 and 2-7. Percentile risk estimates are reported for the high-end female consumer population assessed nationally for the 2005 and 2016 scenarios.
Interestingly, for 2016, IQ losses (points) and RfD-based hazard quotients (HQs) due to total Hg deposition at the 90th percentile and above are higher than those in 2005. However, EPA estimates in this proposed rulemaking that total Hg deposition decreases from 2005 to 2016 (see, for example, Tables ES-1, ES-2, 2-1 and 2-3). In addition, EPA asserts that the underlying concentration-response function for IQ loss estimates is linear. Based on these two arguments, IQ losses and RfD-based HQs should be proportionally lower in 2016 compared to 2005, not higher. EPA states that the IQ loss estimates at the 95th and 99th percentile are subject to greater uncertainty, but since a linear concentration-response function underlies these calculations, uncertainty should not increase at higher percentiles.

Page 63, 3rd bullet. It is stated that, "Comparing the magnitude of Hg fish tissue levels with total Hg deposition (as characterized at the watershed-level) suggests that there is not a strong correlation." This statement is based on Figure 2-17, which has been inserted below for clarity.

It follows from Figure 2-17 that there is absolutely no linear relationship between 2005 Hg deposition and Hg fish tissue concentrations. In addition, Hg fish tissue concentrations can only be related to Hg deposition, and not vice versa. Hence, Figure 2-17 depicts a relationship, not a correlation, and there is simply no relationship.

Page 78. It is stated that, "Several of these limitations are briefly discussed here, but a more complete discussion is presented in section of the RIA TSD supporting this regulatory review (USEPA, 2011)." Section of the RIA is about Effects on Fish. Since the quoted sentence concerns Mercury Maps assumptions under "Establishing the U.S. EGU-attributable fraction of total exposure," the referral is incorrect.

References

Zhang L., Wright L.P., Blanchard P., 2009. A Review of Current Knowledge Concerning Dry Deposition of Atmospheric Mercury, Atmospheric Environment, 43,

G.2 Additional Comments

These comments address material contained in the Technical Support Document: National-Scale Mercury Risk Assessment Supporting Appropriate and Necessary Finding for Coal- and Oil-fired Electric Generating Units. EPA-452/D, March 2011; and in the Regulatory Impact Analysis of the Proposed Toxics Rule: Final Report, March

G.2.1 Substantial uncertainty exists in characterizing fish tissue methylmercury concentrations used to assign exposure levels to watersheds

EPA makes multiple assumptions in assigning methylmercury exposure levels to watersheds. Per the TSD and the RIA, the Agency developed a master database of fish tissue methylmercury samples compiled from three varied sources and used information in the database to assign a single fish tissue methylmercury value to each HUC watershed. In the TSD, a subset of these samples was used to develop methylmercury exposure by watershed for use in exposure modeling. However, the specific criteria for choosing this subset of samples and the basis for selecting those criteria (such as excluding fish under 7 inches in length; TSD Appendix, page 71) are not clearly delineated in either the TSD narrative or TSD Appendix B. In addition, no descriptive statistics are provided regarding the relative distribution of the fish tissue samples by location (individual watershed, state, region), inland water type (river, lake), or time (year) (see TSD, Section 2.4). TSD Appendix B states that, on average, 10 samples were available over the period for each of the 2,461 watersheds used in this analysis. However, at least one watershed had 270 measurements. Therefore, tabulating the fish tissue methylmercury levels assigned to watersheds, along with indicator and other analytical variables for descriptive and statistical review, would help provide clarity. It remains uncertain how variability in the distribution of methylmercury tissue samples would result in over- or underestimates of risk.
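The kind of tabulation suggested above could be as simple as reporting, for each watershed, the sample count alongside the mean, median, and 75th percentile of its fish tissue values. The sketch below is illustrative only: the watershed labels and concentrations are hypothetical, and the linear-interpolation percentile is an assumed convention, since the TSD's percentile method is not described here.

```python
# Minimal sketch of a per-watershed descriptive summary.
# Assumptions: hypothetical watershed IDs and MeHg values (ppm);
# percentile computed by linear interpolation between order statistics.
from statistics import mean, median


def percentile(values, q):
    """Return the q-th percentile (0-100) of a non-empty list."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]  # a single sample is its own percentile
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def summarize(samples_by_watershed):
    """Return sample count, mean, median, and 75th percentile per watershed."""
    summary = {}
    for watershed, values in samples_by_watershed.items():
        summary[watershed] = {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            "p75": percentile(values, 75),
        }
    return summary


hypothetical_samples = {
    "HUC-A": [0.40],                            # single-sample watershed
    "HUC-B": [0.10, 0.20, 0.30, 0.40, 0.50],    # five-sample watershed
}
rows = summarize(hypothetical_samples)
for watershed, stats in rows.items():
    print(watershed, stats)
```

In this hypothetical case both watersheds receive the same 75th percentile value even though one rests on a single fish and the other on five, which is the kind of difference a sample-count column would make visible.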
Also, EPA does not justify other assumptions made in its exposure modeling, including:

Applying the 75th percentile fish methylmercury level to establish exposure for each HUC12 watershed: Other than stating that subsistence fishers would possibly eat or favor larger fish with higher bioaccumulation potential (TSD, pages 22 and 71, among others), EPA presents no other underlying reasons, literature, or data support for this assumption. As discussed below, the Agency presents no discussion or quantitative analysis describing the effect of choosing the 75th percentile over the mean or median methylmercury level. Over 60% of the watershed exposure values were based on a single fish tissue sample; therefore, using the 75th percentile for watersheds lacks statistical robustness. The Agency should provide a sensitivity analysis (for example, a risk calculation using mean methylmercury levels) to illustrate the influence of this choice on the risk estimates and their variance, to aid interpretation.

Creating river and lake methylmercury fish tissue levels for an individual HUC watershed, then assigning the higher of the two resulting 75th percentile values as the watershed-specific level: Given the limited data available for many HUC watersheds, how much variability does this additional parsing of sparse data add to the watershed exposure estimates? Since it appears that a substantial proportion of watershed exposure estimates (almost 25% of lake and 40% of river values) are based on only a few data points (< 2), using the 75th percentile may not adequately represent the watershed and, by extension, the potentially exposed population. Again, the impact of such sampling choices in terms of potential uncertainty, overt influence, or bias is not clear. EPA goes on to state that using a central tendency (mean or median) estimate would bias methylmercury levels low, but presents no quantitative proof (TSD, page 72). This assertion could be tested by substituting mean fish tissue levels for 75th percentile values in the analysis, or excluding watersheds lacking a minimum number of samples.

Without conducting and presenting the results of a sensitivity analysis to determine the effect of its exposure modeling choices, or providing HUC watershed-specific exposure data to allow additional review by others, the Agency severely limits reviewers' ability to comment on the scientific appropriateness or population representativeness of its choices.

G.2.2 EPA does not provide a clear, well supported description of assumptions about fish intake in population subgroups

No summary or comparison tables describing study characteristics, population demographics, or fish intake descriptive statistics are presented by the Agency in either the TSD narrative or Appendix C. TSD Appendix Table C-1 does offer observations from papers the Agency selected for the study. However, standardized criteria describing study strengths and weaknesses, and the applicability of these three studies (as well as others not selected), would make EPA's analytical decision-making process more transparent. Examples could include summary tables as previously used by the Agency in other publications (as an example, see Tables 1 and 8 in [Moya et al., 2008]), as well as a rationale for using or not using the individual studies. Other issues of potential concern related to assumptions about fish intake include the following:

Small sample size in the three selected studies increases the likelihood of high variability among the highest consuming individuals (> 90th percentile): Small sample size in these studies [Burger, 2002; Schilling, 2010; Dellinger, 2004] could unduly influence the risk estimates, particularly at the higher intake levels focused on by EPA. For example, the Burger [2002] study offers intake estimates for one subpopulation of particular concern (black subsistence fishers) based on 39 interviewees, about 30 of whom (79%) consume fish.
Thus, the highest distributions/intakes are based on very few individuals. Several estimates of intake rate are based on very small numbers of people, yet are applied in the national scale assessment, including those for Vietnamese, Hispanic, Laotian, and Southeastern Black populations [Burger, 2002; Schilling et al., 2010; Dellinger, 2004]. However, there are no sample counts describing the highest fish consumers (> 90th percentile distribution) (see TSD, Sections 1.3, 1.4, 2.4, 2.6, and 2.7).

EPA presents no summary table of fish intake values used in the analyses, including basic descriptive statistics such as the number of individuals who had the highest consumption rates: Although TSD Appendix D refers the fish ingestion rate [FIR] back to Table 1-1 in Section 1-3, no such summary table exists in the TSD document. As the sample size decreases by intake level in the chosen study subpopulations, uncertainty in the risk estimates increases. This extrapolation of intake values is compounded by other factors, such as poor linear risk model fit (see EPRI Comments, Section G.2.5 below). Without bounds (a confidence interval) or a range of risk estimates at a given consumption rate percentile, it remains unclear how risk estimates really differ with increasing intake (e.g., mean, 90th, 95th, 99th intake rate) across the distribution of watersheds, and between temporal scenarios (2005 versus 2016).

Criteria are unclear for the inclusion/exclusion of individual fish intake studies used in the national assessment: EPA appears to use data within and between studies in an inconsistent manner. As one example, the Agency selected Laotian data from Schilling et al. [2010] based on two types of survey information (intercept angler and community selected participants), but used angler-only river data from the watersheds of California's Central Valley (see TSD, Section 1.3). EPA gives no justification for using this mixed survey sample, nor does it acknowledge the potential bias inherent in using data compiled from two survey types. Furthermore, the TSD states that intake rates from the Schilling study [2010] fall within the range observed in other studies of U.S. Asian populations (see TSD, Appendix C, Table C-1), but no citations with relevant ethnic groups are presented in the narrative or Appendix C. In a second example, the Agency selected self-reported consumption data from the Dellinger [2004] study, thereby rejecting the actual, or measured, fish intake data based on a subset of the larger study (187 versus 822) (see TSD, Appendix C, Table C-1). In this case, EPA explains exclusion of the measured intake data in TSD, Appendix C, Table C-1, on the basis of small sample size. But this subset appears to be as large as or larger than the other subsistence groups included, except for the Southeastern whites identified by Burger [2002]. The TSD narrative goes on to state that small numbers decrease the likelihood of identifying true high consumers, but does not describe the potential for underestimation presented by dietary recall over 12 months in the Dellinger [2004] study (see TSD, Section 1.3 and Appendix C, Table C-1). Also, EPA apparently selects the recall dietary data in preference to measured fish intake data because the high-end intake derived from the recall questionnaires appears closer to subsistence rates in other groups. Unfortunately, recall dietary studies, particularly one-time assessments extrapolated back for long periods, have demonstrated overestimation. For example, as noted by Dellinger [2004], a study in Quebec reported actual fish intake levels 6 times smaller than those recalled by recreational fishers.

To summarize, data issues related to estimation of fish consumption rates for specific subpopulations (based, for example, on ethnicity or sex) may lead to over- or underestimates of actual intake and preclude valid statistical comparisons. In particular, small samples may not include enough people to represent the subpopulation and may suffer from large variability. It would be prudent for EPA to carefully define target subpopulations, and to evaluate how well studies used in its risk assessment represent national target population(s), by providing supporting demographic, social, and/or community health data. If such supporting data are unavailable, then the uncertainty created by this lack must be fully characterized for risk managers.

References

Burger J., 2002. Daily Consumption of Wild Fish and Game: Exposures of High End Recreationists, International Journal of Environmental Health Research, 12 (4),

Dellinger J.A., 2004. Exposure Assessment and Initial Intervention Regarding Fish Consumption of Tribal Members of the Upper Great Lakes Region in the United States, Environmental Research, 95 (3),

Moya J., Itkin C., Selevan S.G., Rogers J.W., Clickner R.P., 2008. Estimates of Fish Consumption Rates for Consumers of Bought and Self-caught Fish in Connecticut, Florida, Minnesota, and North Dakota, Science of the Total Environment, 403 (1-3),

Schilling F., White A., Lippert L., Lubell M., 2010. Contaminated Fish Consumption in California's Central Valley Delta, Environmental Research, 110 (4),

G.2.3 EPA provides no clear description of its watershed level risk modeling or the metrics used to quantify risk

The TSD summary risk presentation lacks a transparent, easy to follow description of the modeling methods used to assign watershed based risk levels. There is no initial, clear statement that the analysis estimates risk at the watershed level based on a subset of demographic groups determined by the Agency to represent at-risk populations (see EPRI Comments, Sections 3.7 and 3.10 above). In TSD Sections 1.2 and 2.6, the Agency presents risk estimates for two different metrics: (1) the number of at-risk watersheds exceeding a threshold based on the EPA RfD (Hazard Quotient >1.5) or (2) the number of at-risk watersheds exceeding categorical levels of IQ points (change of >1 or >2 IQ points). However, the relative strengths and weaknesses of the two metrics are not stated or documented. The Agency does say that relying on IQ as the outcome of interest underestimates the potential number of watersheds at risk (see TSD, Sections 1.3, 1.4, 1.5, 2.6, and 2.7). Without supporting citations and discussion comparing the risk metrics, the reader is left with little understanding of this statement. As discussed above, the variability and uncertainty in the underlying exposure and population data inputs are not described, so their quantitative influence on the watershed level risk estimates remains unknown.

G.2.4 Assigning higher-than-usual EGU deposition values to watersheds and subsistence populations lacks quantitative support

EPA provides summary tables with risk estimates for percent watersheds at risk across a range of assumed EGU-attributable percentages of mercury deposition. However, the TSD does not clearly demonstrate why estimated EGU deposition contributions ranged up to 20% (see TSD, Section 2.6.2), compared to lower EGU deposition contributions reported in both the current TSD and previous EPA publications [Griffiths et al., 2007]. Although EPA's focus is on subsistence fishing populations (see EPRI Comments, Sections 3.7 and 3.10 above), the populations and assigned intake values used in the current calculations appear to overlap with definitions of recreational anglers. It is unclear how higher estimated EGU deposition contributions relate to higher risk watersheds and subsistence population subgroups. It appears that higher estimated EGU deposition contributions are as likely to overestimate as underestimate risk, especially given the high degree of uncertainty in the small, skewed sample sizes used to create and characterize surrogate test populations for subsistence fishing communities (see above discussion on exposure, population, and fish intake values).

For the primary subsistence population based on watershed high-end female consumers, the Agency states that no substantial change in risk (HQ estimates or total IQ loss) occurs between the 2005 and 2016 scenarios, due to "... and therefore the relatively small fraction of total mercury deposition contributed by U.S. EGUs, even substantial reductions in U.S. EGU deposition is [sic] unlikely to substantially affect total risk." [TSD, Section 2.6.1, page 53]

187 The Agency states that this is not the case for the subset of watersheds with a larger EGUattributable fraction of deposition. However, the potential number or location of these watersheds and their potential correspondence with subsistence populations are not clearly described in the TSD. Due to the lack of robust modeling in these higher level watershed/female fish intake strata (limited data are presented in TSD, Tables 2.6 and 2.7), the quantitative support for this statement remains unclear. Reference Griffiths C., McGartland A., Miller M., A Comparison of the Monetized Impact of IQ Decrements from Mercury Emissions, Environmental Health Perspectives, 115 (6), G.2.5 Given high uncertainty in the underlying data, EPA should provide more than a minimal description of risk results at the watershed level The TSD and TSD Appendices lack a full summary presentation of risk results. Only minimal summary tables are available to compare 2005 versus 2016 scenarios; there is no 2016 scenario to compare directly with the 2005 scenario summarizing percentile risk estimates for the subsistence fishing populations of interest (see TSD, Table 2-8). To aid in understanding the limited amount of data and associated uncertainty for the exposure/population data sources used, it would be helpful to have a summary presentation and discussion of the actual number of underlying observations in each intake level and risk estimate (or subset of cells). For example, in Table 2.6 (TSD, Section 2.6.1, page 51) under the RfDbased HQ metric, how many high-end female consumers were in the 99th intake percentile (e.g., 373 g/day, based on the 75th percentile of fish tissue values) to estimate the number of watersheds in the 95th risk percentile in order to give the RfD-based HQ of 3.2 for U.S. EGUs? Unfortunately, this kind of information is not provided in the TSD or TSD Appendices. 
An additional lack of robustness in the risk modeling seems to appear in the limited risk summary tables present, but is not discussed by the Agency. For example, for both the 2005 and 2016 scenarios (TSD, Tables 2-6 and 2-7), the underlying dose-response function for the risk modeling of the IQ metric appears to fall apart or lack biological plausibility (specifically in the upper intake percentiles, 173 or 373 g/day, for the 90th to 99th watershed percentiles in the high-end female consumer).⁹ In this case, EPA's risk model assigns estimates of high-end female exposure (by watershed) at methylmercury hair levels greater than those observed in the underlying high-exposure studies of high fish consumption (New Zealand) and high marine fish and whale consumption (Faroe Islands). How applicable is this risk model to U.S. populations? Although EPA dismisses the value of the IQ metric in the TSD (see TSD, Sections 1.3, 1.4, 1.5, 2.6, and 2.7), the IQ metric is later used in the RIA (see RIA, Chapters 5.7 and 5.8) to monetize U.S. risk in angler/high-end fish consumer populations.

In the other subsistence populations, based on race/ethnicity and much smaller samples for developing model parameters for estimating exposure and consumption, EPA presents no IQ loss metrics or 2016 RfD-based HQ scenario risk calculations in the TSD. The Agency states that these risks can be inferred. However, this presents an incomplete view of the risk estimation process, particularly as IQ loss metrics are used by EPA in the RIA for these populations (see discussion below).

⁹ This is the only instance in the TSD where EPA acknowledges the potential quantitative uncertainty in its risk modeling.

G.2.6 Alternative estimates of populations at risk due to methylmercury exposure using an RfD-based approach

EPA could derive estimates of the number of women of childbearing age (and children) at risk due to methylmercury exposure via other methods, or at least use such methods for comparison with the estimates presented in the TSD and RIA. Specifically, EPA could apply available dose-response data (used in the current EPA IRIS assessment) and U.S. biomonitoring data, as described below. EPA should use this methodology as a reasonable, transparent approach to compare with the methodology used in the TSD, particularly as EPA relies on substantial amounts of data and unsupported assumptions.

Gentry et al. [2011] recently proposed an alternative methodology for estimating U.S. populations at risk due to population-weighted measures of methylmercury exposure relative to the RfD (EPA IRIS: 5.8 ppb methylmercury in cord blood, based primarily on Faroe data). This work used four different modeling approaches to evaluate the potential for negative neurobehavioral outcomes related to methylmercury exposure at levels identified by EPA [EPA, 2000; EPA, 2001]. The modeling used benchmark dose (BMD) values from the Faroe Islands and Seychelles cohorts [Grandjean et al., 1997; Budtz-Jørgensen et al., 1999; Budtz-Jørgensen et al., 2000] and applied the most recent U.S.-specific mercury biomonitoring data from the National Health and Nutrition Examination Survey (NHANES) conducted in , involving women years old and children 1 19 years old. As a statistically representative U.S. sample, the data from NHANES provide population weights that allow researchers to estimate the expected number of cases for each demographic group (for example, sex and/or age).

Gentry et al. [2011] demonstrated that the modeling approach, including assumptions of zero risk at the RfD, linearity, threshold, etc., widely influenced estimates of populations at risk. In the most conservative approach (linearity between the BMD and RfD) using the limited available data from the Faroe Islands cohort (BMD only, no underlying dose-response data), the highest (upper bound) estimates of total U.S. populations at risk were 389 children and 1936 women with adverse effects. However, the lack of available dose-response data from the Faroes limited additional modeling for the final three modeling approaches. Therefore, Gentry et al. modeled data from the Seychelles cohort. They derived a BMD and an RfD (10.5 ppb methylmercury in cord blood, as estimated by the k-power model). Upper bound estimates ranged from women with adverse effects (in three of the four modeling approaches), compared to 1 9 women with adverse effects in the conservative, linear approach applied to the BMD and RfD (5.8 ppb) derived from the Faroe Islands data. No children in the NHANES sample had blood levels above the derived 10.5 ppb RfD.

Although Gentry et al. studied the general U.S. population, and not a select subgroup representing subsistence fishers, their results do suggest a baseline range of women and children at potential risk from exposure to methylmercury. The strengths of the modeling approaches they investigated include:

• the use of U.S.-specific biomonitoring data from a statistically representative national sample,
• the ability to evaluate the potential fraction of exposed individuals above or below the RfD, including a range of values and upper bounds to indicate variability, and
• the ability to estimate risk at a specific dose or biomarker concentration, relative to the RfD.

References

Budtz-Jørgensen E., Keiding N., Grandjean P., Benchmark Modeling of the Faroese Methylmercury Data. Final Report to U.S. EPA. Research Report 99/5. Department of Biostatistics, University of Copenhagen.
Budtz-Jørgensen E., Grandjean P., Keiding N., White R.F., Weihe P., Benchmark Dose Calculations of Methylmercury-Associated Neurobehavioral Deficits, Toxicology Letters.
EPA, Integrated Risk Information System. Methylmercury.
EPA, Water Quality Criterion for the Protection of Human Health: Methylmercury. Final. Office of Science and Technology. EPA-823-R January.
Gentry P.R., Van Landingham C., Aylward L., Hays S., Use of Biomarkers in the Benchmark Dose Process. Alliance for Risk Assessment.
Grandjean P., Weihe P., White R.F., Debes F., Araki S., Yokoyama K., et al., Cognitive Deficit in 7-year Old Children with Prenatal Exposure to Methylmercury, Neurotoxicology and Teratology, 19.

G.2.7 EPA provides no quantitative analyses describing the influence of variability and uncertainty on its final health risk estimates

EPA presents a limited qualitative overview of variability and uncertainty in the TSD narrative (see TSD, Section 2.7) and Appendix F (Tables F-1 and F-2). Topics covered in Appendix F include qualitative discussions of variability in the spatial distribution of mercury deposition, fish tissue methylmercury levels within specific watersheds, fish tissue methylmercury levels versus mercury deposition, distribution of subsistence fishing populations, and variation in fish intake levels.
Without supporting references, summary statistics, figures, or charts, reviewers are unable to evaluate the range of uncertainty in the underlying data and thus in the resulting risk estimates offered by the Agency. No quantitative analysis is provided, but such work is needed to characterize the metrics presented.

EPA does offer a limited number of sensitivity analyses to help identify important sources of variability and uncertainty in its risk modeling. But these analyses focus only on the MMAPs approach for handling inclusion of HUC12 watersheds with current non-air mercury sources, and for applying fresh waterbody type. EPA presents no sensitivity analyses illustrating potential changes in risk estimates for subsistence fishing populations due to uncertainty and variability in watershed mercury deposition, fish tissue levels, fish intake, or assignment of at-risk populations to particular watersheds. Sensitivity analyses directly related to decisions on using these modeling parameters would clarify uncertainties in the overall risk characterization.

Finally, TSD sensitivity analyses focus on RfD-based HQ thresholds, not IQ point thresholds. However, sensitivity analyses of the IQ metric could help inform the RIA, which relies on IQ-based benefits, not benefits assessed by using the RfD-based HQ. Potential sensitivity analyses for both the TSD (RfD-HQ risk metric) and RIA (IQ risk metric) could include the following:

• EPA assumption that subsistence or other high-end consumers eat only larger fish (>7 inches): How would inclusion of the excluded smaller-fish samples influence any summary, HUC watershed-specific exposure statistics (sample count, mean/median, 75th percentile, etc.) and final risk estimates (watershed or IQ benefits)?
• EPA selection and use of the highest 75th percentile mercury exposure value between river and lake samples for watershed level analysis: How do results that differentiate among waterbody types compare with combined watershed 75th percentile statistics?
• EPA use of a 75th percentile mercury fish tissue value for assigning watershed base exposure: How does this differ from using mean or median statistics?
• EPA selection and use of a cooking loss adjustment factor that assumes consistent concentration of mercury after cooking (factor 1.5): How does application of this factor, versus no cooking loss or a range of values, influence the risk estimates?
• EPA use of 90th to 99th percentile fish consumption estimates for subsistence populations obtained from the literature: How does using the extreme intake estimates from a few small studies (and fewer individuals at these consumption levels), versus using the mean/median levels from the same studies, influence the risk estimates?
• EPA exclusion/inclusion of specific watersheds based on exposure or demographic criteria: How would risk estimates change if these watersheds were used in the analysis (i.e., sensitivity analyses)?

G.2.8 Potential overlap between subsistence fisher and recreational angler populations leads to IQ benefit overestimation

The RIA appears to quantify monetary benefits from avoided IQ loss for all recreational freshwater anglers (baseline analysis) and for demographic populations selected to represent subsistence fishers as described in the TSD, except that the Vietnamese subgroup is dropped without explanation. Also, low socioeconomic status based on census poverty tract values is used to categorize risk in Southeastern white and black females only. The analysis estimates the number of children affected by prenatal exposure to methylmercury and the magnitude of their loss in IQ points.

The RIA baseline assessment (all U.S. states, see TSD, Sections and 5.9.1) estimates exposure in recreational anglers using fish intake levels previously recommended by EPA and derived from four studies conducted in Maine, Michigan, and Lake Ontario [EPA, 1997]. Both fish consumers and non-consumers were included in the survey. In a second RIA analysis, EPA calculates risk in potentially high-risk subpopulations (see TSD, Sections and 5.9.4) using subsistence level intake rates that were derived, as in the TSD, from Schilling et al. [2010]. However, in presenting results for subsistence populations (RIA, Tables 5-8 through 5-19), EPA reports results for "Recreational/Subsistence," making it unclear whether the Agency considers any and all fishing in these groups to be high risk. Therefore, EPA appears to apply only upper bound, subsistence level assumptions to the groups. Likewise, it is unclear whether the Agency assumes that all individuals in any demographic stratum (e.g., low-income African-Americans) living within 20 miles of any HUC12 waterbody at any time from 1995 to 2007 are subsistence fishers at equal risk for methylmercury exposure, thus justifying the application of subsistence fish consumption rates. If subsistence and recreational angler classifications overlap in EPA's risk estimates, then the IQ point benefits appear to be overestimated in these target populations. EPA should clarify the presentation of the summary risk estimate tables to include footnotes or other narrative describing the assumptions and input parameters.

EPA should reconcile discrepancies in the calculated monetary value of avoided IQ loss. Table 1-3 on page 1-5 of the RIA states that, at the 7% discount rate, the estimated monetary value of avoided IQ loss associated with methylmercury exposure from self-caught fish consumption among recreational anglers is between $ $ (in billions of 2007$). This would translate to $5,000 to $9,000.
However, Table 5-6 on page 5-79 of the RIA reports that at the 7% discount rate this estimated monetary benefit is between $456,146 and $1,000,149 [$21,806,502 - $21,350,356 and $47,813,136 - $46,812,987], translating to between $ $ in billions of 2007$ instead.

References

EPA, Exposure Factors Handbook Update. EPA May 1989, EPA 600-P August.
EPA, Regulatory Impact Analysis of the Proposed Toxics Rule: Final Report. March 20.

G.2.9 EPA could improve clarity by quantifying and presenting uncertainties in underlying assumptions used to estimate aggregate and subpopulation IQ losses

In the RIA, Figure 5.4 presents a helpful schematic on how the different data inputs and related assumptions were used to estimate the number of prenatally exposed children in angler households at the state level. However, it would clarify the variability and uncertainty (range of input exposure and demographic parameters) for both the baseline population (results presented in Table 5.5) and the high risk populations (results presented in Tables 5.8 to 5.19) if summary descriptive statistics were provided at the state level. These statistics would describe input data (number of childbearing-age females, fertility rates, number of anglers, populations > 15 years old); calculated population parameters (number of pregnant women, number of prenatally exposed children in angler populations, proportion of the population that constitutes anglers); and calculated exposure inputs for estimating the annual number of prenatally exposed children in angler households by demographic group, distance from a waterbody, and waterbody type, referred to as NPA_ijk (number of prenatally exposed children) in the RIA. (Access to these underlying data was limited, since they were released in June 2011.)

Further summary descriptive statistics for each population stratum NPA_ijk would clarify Agency calculations under different scenarios for high risk subgroups, such as those estimating benefits for "Recreational/Subsistence" fishers. These statistics would describe exposure inputs (average daily methylmercury ingestion rates, average methylmercury fish concentrations) and calculated average daily ingestion rates for each population stratum NPA_ijk.

The RIA provides a well-rounded qualitative summary of the uncertainties and limitations inherent in the assumptions and methodology EPA used in its benefits analysis (RIA, pages 5-95 to 5-106). However, there appear to be sufficient data to conduct additional uncertainty analyses on major exposure and demographic assumptions, including those previously discussed for the TSD related to fish exposure and waterbodies with limited sample numbers (e.g., < 2, < 5, etc.) (see EPRI Comments, Sections 3.7 through 3.10 above). RIA-specific assumptions that could be reviewed may include using all-race fertility rates at the state level (unclear in the RIA); using cumulative fertility rates (all women ages 15-44) versus age-specific fertility rates; choice of the cut point for defining poverty (< $50,000 income); and choice of a 20-mile versus a 100-mile distance to the nearest waterbody (subsistence versus aggregate IQ estimates). These assumptions do affect the risk estimates, and EPA should clarify how.
For example, Axelrad and Cohen [2011] recently evaluated a method to describe potential bias in the calculation of descriptive biomonitoring statistics for environmental contaminants when age-specific fertility rates (child-bearing rates in women ages 16-19) are not taken into account. Birth rates vary substantially by age and race/ethnicity, with the potential to strongly influence statistics and other related analyses of chemicals with an increasing age-related biological burden (such as PCBs and mercury). In this recent publication, alternate definitions for women of childbearing age (varied by age range, adjusted by weighting for different age-specific natality rates and/or race/ethnicity) caused a maximum 10% variation in both the 50th and 95th percentile biological measures. However, this was not consistent across race/ethnic groups; 95th percentile methylmercury blood estimates were 28-46% higher if no natality adjustment was made (the accepted practice when presenting NHANES biomonitoring data), potentially biasing related exposure statistics upward.

The RIA graphs (RIA, Figures 5-12 and 5-13) illustrating the IQ loss distribution (2016 Base Case and 2016 Toxics Rule) for high risk populations ("Recreational/Subsistence" benefits tables) help indicate the distribution of potential IQ loss. However, EPA should provide additional discussion of what the figures represent, including a discussion of underlying data influences (e.g., fish intake levels based on small samples), to help reviewers understand the risk range. Additional graphics on exposure inputs, such as HUC watershed fish sample distributions, NPA distributions by high risk group, and so forth, would also be informative.
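The natality adjustment discussed above can be pictured as reweighting each sampled woman by the birth rate for her age group before computing percentiles. The following is a minimal sketch of that idea only; the records, age-group labels, birth rates, and the simple weighted-percentile rule are all hypothetical, and it is not the procedure or data used by Axelrad and Cohen or by NHANES.

```python
def weighted_percentile(values, weights, pct):
    """Smallest value whose cumulative weight share reaches pct (0-100)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative / total >= pct / 100.0:
            return value
    return pairs[-1][0]


# Hypothetical records: (age group, blood methylmercury in ppb, survey weight).
sample = [
    ("younger", 0.6, 1.0), ("younger", 0.9, 1.0),
    ("middle", 1.2, 1.0), ("middle", 1.8, 1.0),
    ("older", 2.5, 1.0), ("older", 4.0, 1.0),
]

# Hypothetical age-specific birth rates (births per woman per year).
birth_rate = {"younger": 0.08, "middle": 0.11, "older": 0.02}

values = [blood for _, blood, _ in sample]
survey_weights = [w for _, _, w in sample]
# Natality-adjusted weight: survey weight times the age-specific birth rate.
natality_weights = [w * birth_rate[group] for group, _, w in sample]

unadjusted_p95 = weighted_percentile(values, survey_weights, 95)
adjusted_p95 = weighted_percentile(values, natality_weights, 95)
print("95th percentile, unadjusted:", unadjusted_p95)
print("95th percentile, natality-adjusted:", adjusted_p95)
```

In this toy data set the oldest group has both the highest blood levels and the lowest birth rate, so the unadjusted 95th percentile is higher than the adjusted one, which is the direction of bias described above.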

Reference

Axelrad D.A. and Cohen J., Calculating Summary Statistics for Population Chemical Biomonitoring in Women of Childbearing Age with Adjustment for Age-specific Natality, Environmental Research, 111.

G.2.10 Improve clarity in presentation of national aggregate IQ losses and high risk subgroup distribution of IQ benefit estimates

Clarity in reporting results could be improved in several portions of the RIA.

Aggregate national baseline impacts on recreational anglers presented in RIA, Table 5.5 are referenced multiple times in RIA, Section 5.9. However, it could be stated more clearly up front and in the overall conclusions (RIA, Section 5.9.6) that these NPA baseline IQ impacts represent ALL sources of mercury leading to maternal consumption of methylmercury-contaminated fish, and not just EGUs affected by the rulemaking. EPA needs to clearly present the marginal impacts. This would also clarify the state-specific data summarized in RIA, Table 5.5, where states such as California that largely lack U.S. EGU-attributable mercury deposition nonetheless have some of the highest estimated average maternal daily mercury ingestion rates (6.04 μg/day) and average IQ losses per child (0.21 points).

Explanation, illustration, and consistent application of EPA guidelines are needed for reporting significant digits and rounding numbers. Issues arise in presenting fractional IQ losses or benefits, particularly in reporting aggregate IQ loss at four to five decimal places (RIA, Tables 5-6 and 5-7), baseline aggregate IQ loss at two to three decimal places (RIA, Table 5-5), and high risk recreational/subsistence subgroup results at three decimal places (RIA, Tables 5-8 to 5-19). This interpretive point becomes more important in discussing fractional estimates of outcomes when comparing scenarios, especially considering the high variability and uncertainty in the underlying data inputs for modeling.
Greater consistency is needed in referencing or identifying recreational anglers, subsistence fishing, or freshwater self-caught fish consumption when these categories are meant to be different versus when they are meant to be the same in the RIA. This consistency of definition is important because it helps evaluators to determine which demographic and exposure data were applied to an individual analysis.

EPA should reference the source tables for the individual results called out in RIA, Section

G.2.11 Memorandum of ENVIRON to EPRI regarding ENVIRON's review of the Utility NESHAPS water modeling approach

Date: July 22, 2011

MEMORANDUM

To: Arnout Ter Schure, EPRI
From: Kristen Lohman, ENVIRON
Subject: Review of the Utility NESHAPS water modeling

I have reviewed the EPA's Regulatory Impact Analysis of the Proposed Toxics Rule (USEPA, 2011) document, as well as earlier MMAPs documents (USEPA, 2001 and USEPA, 2005), to understand the watershed mercury (Hg) modeling performed in support of the USEPA's proposed National Emission Standards for Hazardous Air Pollutants (NESHAP) From Coal- and Oil-Fired Electric Utility Steam Generating Units and Standards of Performance for Fossil-Fuel-Fired Electric Utility, Industrial-Commercial-Institutional, and Small Industrial-Commercial-Institutional Steam Generating Units.

To attempt to quantify the impact of the proposed regulation, the EPA applied modeled atmospheric deposition rates to measured fish mercury concentrations through the Office of Water's Mercury Maps (MMAPs) approach (USEPA, 2001). In the base scenario, 2005 atmospheric deposition to each watershed was simulated and compared against methylmercury (MeHg) concentrations measured in fish from 1995 to 2007 in waterbodies located within those watersheds. This was done to calculate a relationship for each watershed between deposition rates and fish MeHg concentrations. A 2016 emission inventory produced for modeling of the expected impacts of the Transport Rule was used to represent the current conditions. The 2016 atmospheric deposition was also simulated to calculate predicted MeHg in fish for that year: the 2005 fish MeHg concentrations were scaled by the ratio of 2016 deposition over 2005 deposition rates. Finally, a third scenario was calculated by modifying the 2016 emission inventory to represent reductions in utility mercury emissions should the new NESHAP become effective. The differences in modeled Hg deposition rates from 2016 to the NESHAP version of 2016 were then used in the MMAPs model to estimate the potential change to the MeHg concentration of fish in each modeled watershed.

The MMAPs approach is an attempt to determine the impact of changes in Hg emissions of U.S. utility sources without modeling individual watersheds. In fact, the impact at each lake or river will be governed by more than just the change in local atmospheric Hg deposition. It will also be governed by changes to other chemical species deposited to or already existing in the watershed, historical deposition of mercury currently stored in the soil, and any changes to the area due to development, water-borne releases, or any other human or natural disturbances. The MMAPs methodology is not complex enough to simulate at a detailed level what will happen should changes be implemented. This method would be a reasonable first or screening level study, but to give it more weight than that, given all of the limitations, is not realistic.

The EPA acknowledges a list of flaws in the MMAPs model. These include: 1) the assumption that there is a linear, steady-state relationship between concentrations of MeHg in fish and present-day air deposition mercury inputs, 2) the requirement that all other environmental conditions in the watershed remain constant over time, 3) the lack of ability to represent watersheds that have a significant input of Hg from sources other than atmospheric deposition, and 4) the lack of a time lag between changes in Hg deposition and changes in MeHg concentrations in fish. The report then goes on to say that despite all of these limitations this is the only workable way of representing the impacts of the proposed regulation on a national scale.
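The proportional scaling at the heart of the MMAPs approach, and the first limitation listed above, can be made concrete with a short sketch. It illustrates only the linear, steady-state assumption as described in this memorandum; the function name and all numbers are hypothetical, and it is not EPA's MMAPs implementation.

```python
def scale_fish_mehg(fish_mehg_base, deposition_base, deposition_scenario):
    """Scale a measured fish MeHg level by the ratio of scenario to base deposition.

    Assumes a linear, steady-state link between deposition and fish MeHg,
    with no time lag and no non-atmospheric mercury sources.
    """
    if deposition_base <= 0:
        raise ValueError("deposition_base must be positive")
    return fish_mehg_base * deposition_scenario / deposition_base


# Hypothetical watershed values (not taken from the RIA or MMAPs reports).
fish_2005_ppm = 0.40        # measured fish MeHg, base scenario
dep_2005 = 20.0             # modeled Hg deposition, base year
dep_2016 = 16.0             # modeled Hg deposition, 2016 scenario
dep_2016_neshap = 15.0      # modeled Hg deposition, 2016 with the rule

fish_2016_ppm = scale_fish_mehg(fish_2005_ppm, dep_2005, dep_2016)
fish_2016_neshap_ppm = scale_fish_mehg(fish_2005_ppm, dep_2005, dep_2016_neshap)

# Estimated change in fish MeHg attributed to the rule under these assumptions.
change_ppm = fish_2016_ppm - fish_2016_neshap_ppm
print(f"2016: {fish_2016_ppm:.3f} ppm, with rule: {fish_2016_neshap_ppm:.3f} ppm, "
      f"change: {change_ppm:.3f} ppm")
```

Nothing in this calculation depends on soil-stored mercury, waterbody chemistry, or elapsed time, which is why the memorandum treats the method as a screening tool rather than a watershed-level prediction.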

The MMAPs method assumes that steady state has been achieved when in reality Hg emissions and deposition are changing. Atmospheric deposition of mercury can enter a waterbody in one of two ways. The first is through direct deposition onto the waterbody's surface. The second is by way of deposition onto the terrestrial portion of the watershed (soils and vegetation), some of which eventually travels by way of evasion, runoff and erosion into the waterbody. Direct deposition can have a fairly rapid impact on the mercury levels of the waterbody and its fish; however, deposition to the terrestrial watershed can have a long lag time before its impact can be found in the fish living in the waterbodies (Hintelmann et al., 2002). Therefore, lag times would need to be included in the modeling and be able to vary from watershed to watershed and even sometimes from waterbody to waterbody within a watershed.

Another problem with the instantaneous steady state assumption is that the emission rates of Hg due to U.S. sources have been decreasing for more than a decade, while emissions due to sources outside the U.S. have been increasing (Streets et al., 2009). Therefore, the system is not at steady state, a basic premise of the model.

Figures included in the reports (USEPA, 2001) show that only a small fraction of the watersheds in the country are included in the study due to a combination of lack of fish MeHg data in local waterbodies and the existence of local non-atmospheric sources of mercury. The MMAPs approach fails in either of these situations, both prevalent in the western states. Because of this, the MMAPs approach is not representative for close to half the country. In fact, only 2,461 watersheds out of approximately 88,000 in the contiguous U.S. were modeled (3%), and of these the sites were heavily biased towards the eastern U.S.
As there are no data to compare the remaining watersheds to, we do not know if these watersheds are representative enough to stand in for all of the other watersheds.

The only real way to determine whether a watershed will be impacted is to individually model it, preferably with a significant amount of past and present data to use to calibrate the model. This is obviously too onerous to do for every one of the 88,000 watersheds in the U.S. However, the MMAPs method has so many flaws and caveats that its usefulness is extremely limited. It seems as though it is used because the USEPA is required to model something, so it did, even though the result does not really provide much in the way of useful information. Each HUC needed only one measurement to be included in the study. This increases the possibility of nonrepresentative values being used due to mismeasurement or measurement at a time of unusual environmental circumstances.

The MMAPs study could be useful as a method of identifying watersheds that would be of most interest for further study. Additional studies of individual watersheds would help in determining whether the trends estimated by the MMAPs method are reasonable. These studies would need to include both modeling and measurements of Hg in the watershed. For example, there are several models such as WASP or WARMF that calculate MeHg concentrations in water, or D-MCM that calculates MeHg in water and fish. These more detailed models provide the higher level of detail necessary for a thorough simulation from Hg deposition through to aquatic concentrations. It is not feasible to model all of the watersheds with these detailed models, but a comparison run for some of them would provide more confidence in the results.

References:

Hintelmann, H., R. Harris, A. Heyes, J. Hurley, C. Kelly, D. Krabbenhoft, S. Lindberg, J.W.M. Rudd, K. Scott and V. St. Louis, Reactivity and mobility of new and old mercury deposition in a boreal forest ecosystem during the first year of the METAALICUS study. Environ. Sci. Technol., 36.
Streets, D., Q. Zhang, and Y. Wu, Projections of Global Mercury Emissions in Environ. Sci. Technol., 43.
United States Environmental Protection Agency (USEPA), Regulatory Impact Analysis of the Proposed Toxics Rule: Final Report, March.
USEPA, Regulatory Impact Analysis of the Clean Air Mercury Rule: Final Report, Office of Air Quality Planning and Standards, EPA-454/R.
USEPA, A Quantitative Spatial Link Between Air Deposition and Fish Tissue: Peer Reviewed Final Report, EPA-823-R.

H APPENDIX: NATIONAL ATMOSPHERIC DEPOSITION PROGRAM MAPS OF NATIONAL TRENDS IN CHLORIDE AND SULFATE DEPOSITION,

These maps are provided by the NADP National Trends Network, which offers a long-term record of the acids, nutrients, and base cations in U.S. precipitation. The maps are available at
