Control Chart Methodology for Detecting Under-reported Emissions

Size: px
Start display at page:

Download "Control Chart Methodology for Detecting Under-reported Emissions"

Transcription

1 Control Chart Methodology for Detecting Under-reported Emissions Introduction: EPA uses the control chart methodology described in this paper to analyze the CO2 concentration data recorded at coal-fired units with certified CO2 monitors. The purpose of the data analysis, which is performed quarterly, is to identify possible sampling system in-leakage or other monitoring system problems that can result in under-reporting of emissions. Such problems are often detected only by performing relative accuracy test audits (RATAs), which are typically conducted only once per year. Control charts are an effective tool for identifying unusual shifts in data. A typical control chart consists of (1) data points representing average measurements taken over an interval of time; (2) a center line, drawn at the overall mean for the data set; and (3) upper and lower control limits that indicate the threshold at which the data are considered to be statistically unlikely. This paper describes how the Agency uses the control chart methodology to identify abnormal or suspect emissions data. Data Needed for the Analysis: Hourly CO 2 concentration data Load Bin data Method of Determination Code (MODC) data Completion date of the previous CO 2 RATA Procedure: Step1: Identify an Appropriate Load Bin to Evaluate It is necessary to evaluate data from a narrow operating band, to remove (or at least to reduce) the effects that operating conditions have on data variability. EPA focuses on the most frequently used load bin (or in some instances, the two most frequently used load bins) in a given quarter to maximize the amount of data available for the evaluation. However, this evaluation can be run at any desired load bin. Note that it may be useful to run the evaluation on several load bins to determine whether a finding is dependent on load. Figure 1, below, demonstrates a load bin analysis where the control chart evaluation would be conducted on the load bin 6 data. 1

2 Step 2: Creating the Baseline Data Set Figure 1. Load Bin Evaluation To create the baseline data set, collect approximately 30-days of hourly data from the load bin(s) of interest (generally the most frequently used load bin(s) in a given quarter) starting with the day following the completion of the most recent CO 2 RATA. For the example load bin evaluation in Figure 1, load bin 6 was clearly the most frequently used load bin. Therefore, only load bin 6 data in the 30-days immediately following the most recent CO2 RATA would be used to construct the baseline and to set the control limits of 2 and 3 standard deviations. Only quality assured CO2 data from the primary monitoring system where the MODC for the hour is equal to 1. The daily average CO2 concentration would then be calculated for each day in the baseline data set where there are at least 6 valid hourly measurements available, using Equation A, below. h C h 1 C d = h Eq. A C d = Daily average CO 2 concentration (% CO2) C h = Measured hourly CO 2 concentration (% CO2) h = Number of measured hourly CO 2 concentrations in the day EPA believes that only including days where 6 or more valid hourly measurements are available ensures that the daily averages will be more meaningful. EPA recommends that the baseline be based on at least 15 daily averages. If at least 15 daily averages are not available within the 30 days immediately following the most recent CO2 RATA, the baseline data 2

3 collection period should be extended accordingly. In Figure 2 below, the black diamonds represent the daily average CO2 concentrations for a typical baseline period, immediately following a CO2 RATA. The blue diamonds represent the daily average CO2 concentrations following the baseline period (see Step 5, below). Notice that starting in the 4 th quarter of 2006 and continuing into the 2 nd quarter of 2007, the daily averages show a significant drop in CO2 concentration. It also appears that at the time of the 2007 RATA, corrective actions appear to have been taken to restore the measured CO2 concentrations to the expected level. Figure 2. Daily CO 2 Averages Step 3: Determining the Baseline Parameters (A) The baseline mean value is determined by using Equation B to calculate the arithmetic average of the daily averages. C B d Cd = 1 Eq. B d 3

4 C d = Individual daily average CO 2 concentration, from Equation A d = Number of baseline days (between 15 and 30 is recommended) (B) The standard deviation of the daily average CO 2 concentrations during the baseline period is calculated using Equation C: d 2 ( C C ) d B 1 σ B = Eq. C ( d 1) σ B = Standard deviation of the daily average CO 2 concentration values during the baseline period. C d = Individual daily average CO 2 concentration, from Equation A d = Number of baseline days Step 4: Calculating the Control Limits Control limits are calculated by adding and subtracting multiples of the standard deviation to or from the baseline mean as follows: Upper Control Limit: Lower Control Limit: UCL = C B + 3σ B Eq. D-1 LCL = C B 3σ B Eq. D-2 UCL = Upper Control Limit LCL = Lower Control Limit σ B = Standard deviation of the daily average CO 2 concentration values during the baseline period. 4

5 Note: EPA uses the ± 3σ-level (3 times the standard deviation) for audit purposes, which corresponds to a 99.7% certainty that the data should fall within that range. EPA recommends that sources electing to use the control chart methodology should set a warning control limit at a ± 2σ-level (2 times the standard deviation) and take investigative action whenever the data are outside this warning level. Upper Warning Limit: UWL = C B + 2σ B Eq. D-3 Lower Warning Limit: LWL = C B 2σ B Eq. D-4 UWL = Upper Warning Limit LWL = Lower Warning Limit σ B = Standard deviation of the daily average CO 2 concentration values during the baseline period. Figure 3. Typical CO 2 Control Chart 5

6 Figure 3, above, shows the baseline mean CO2 concentration (the green line), the upper and lower audit control limits (the red lines), and the upper and lower warning control limits (the yellow lines). Step 5: Analyzing the Subsequent CO2 Data Once the baseline data and control limits have been set, the CO2 data for each subsequent operating day are analyzed as follows. This data analysis may either be for a single calendar quarter or may span across quarter boundaries. For each operating day following the end of the baseline period, calculate the daily average CO 2 concentration for the same load bin that was selected for the baseline data, using Equation A in Step 2, above. As described in Step 2, above, use only quality assured data from the primary monitoring system where the MODC for the hour is equal to 1 for evaluation purposes. If you have used the recommended minimum of 6 hourly values to make up the daily averages in the baseline data set; discard the data for any operating day where the count of valid hourly average CO2 concentrations is less than 6. Then, compare each daily average CO 2 concentration value to the baseline average and the calculated upper and lower control limits from Step 4. Flag any daily value that is outside the control limits and track High and Low deviations separately. Whenever 7 or more consecutive daily average values are out-of-bounds, (more than 3σ low as compared to the baseline average), EPA flags that unit as having suspect data. In Figure 4, below, the red diamonds represent the CO2 daily averages that are outside of the ± 3σ audit control limits, and are considered to be suspect data. 6

7 Figure 4. CO 2 Control Chart with Daily Average Outside of the ± 3σ Control Limits Recommendations For on-site quality control, EPA recommends that sources implement the control chart methodology on a regular basis. Constructing control chart graphs such as Figure 4 provides a simple, yet effective way to track monitoring system performance. It has come to EPA s attention that certain data acquisition and handling system (DAHS) vendors have programmed the control chart methodology and can provide this functionality to their customers. Sources electing to use the control chart methodology should investigate and document any actions taken whenever: (1) three of four consecutive daily averages fall beyond the ±2σ-level warning control limits; (2) any daily average falls beyond the ±3σ-level control limits; or (3) seven or more consecutive daily averages fall beyond the ±3σ-level control limits. 7