GADSWG Meeting. Jay Powell, 2/26-2/27/2013


1 GADSWG Meeting, Jay Powell, 2/26-2/27/2013

2 Anti-Trust Notice It is NERC's policy and practice to obey the antitrust laws and to avoid all conduct that unreasonably restrains competition. This policy requires the avoidance of any conduct that violates, or that might appear to violate, the antitrust laws. Among other things, the antitrust laws forbid any agreement between or among competitors regarding: prices; availability of service; product design; terms of sale; division of markets; allocation of customers; any other activity that unreasonably restrains competition. 2 RELIABILITY ACCOUNTABILITY

3 Anti-Trust Notice On these conference calls, we can talk about: Reliability matters relating to the bulk power system, including operation and planning matters such as establishing or revising reliability standards, special operating procedures, operating transfer capabilities, and plans for new facilities. Matters relating to the impact of reliability standards for the bulk power system on electricity markets, and the impact of electricity market operations on the reliability of the bulk power system. Proposed filings or other communications with state or federal regulatory authorities or other governmental entities.

4 Public Notice Announcement Participants are reminded that this meeting is public. Notice of the meeting was posted on the NERC website and widely distributed. Participants should keep in mind that the audience may include members of the press and representatives of various governmental authorities, in addition to the expected participation by industry stakeholders.

5 Agenda February 26th, PM 5 PM Administrative PAS Update 2013 SOR GADS idashboard GADS White Paper February 27th, AM 3 PM Administrative Cause Codes Sub Team Updates GADS Change Order #1 GADS Future Meetings

6 Icebreaker

7 Wind Subcommittee Update: Preliminary Wind Data Collection Plan. GADS Wind Data Reporting Instructions: Finalize Wind DRI; Enhance Voluntary Data Collection Effort. Mandatory Data Request: Mandatory Data Request Approval Process. webE-GADS Wind Enhancement: Draft Functional Specification for Wind Collection; Application Development. Mandatory Data Reporting (Phased Approach): Mandatory Wind Data Collection (Beginning 2016).

8 Wind Subcommittee Update: Preliminary Milestone Chart
[Milestone chart with a bimonthly timeline starting Feb-13. Tasks: GADS Wind Data Reporting Instructions (Finalize Wind DRI; Enhance Voluntary Data Collection); Mandatory Data Request (Draft Mandatory Request; Approval Process); webE-GADS Wind Enhancement (Draft webE-GADS Wind Functional Specifications; webE-GADS Development; webE-GADS Testing); Mandatory Data Reporting (Phase 1; Phase 2; Phase 3)]
Mandatory Data Collection (Phased Approach)
Phase 1 (2016): Plants 200 MW+, Commercial 1/1/2005
Phase 2 (2017): Plants 100 MW+, Commercial 1/1/2005
Phase 3 (2018): Plants 25 MW+, Commercial 1/1/
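The phased approach can be read as a simple capacity lookup. The sketch below is illustrative only: the function name and the `PHASES` table are hypothetical, and the commercial-date criterion is left out because the slide text for it is incomplete.

```python
# Hypothetical sketch of the phased mandatory wind reporting schedule.
# Thresholds and years come from the slide; the commercial-date
# criterion is deliberately omitted because its text is incomplete.
PHASES = [
    (200.0, 2016),  # Phase 1: plants 200 MW and larger
    (100.0, 2017),  # Phase 2: plants 100 MW and larger
    (25.0, 2018),   # Phase 3: plants 25 MW and larger
]


def first_mandatory_year(capacity_mw):
    """Return the first mandatory reporting year, or None if below 25 MW."""
    for minimum_mw, year in PHASES:
        if capacity_mw >= minimum_mw:
            return year
    return None
```

For example, a 150 MW plant would fall into Phase 2 under this reading.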

9 2012 webE-GADS Results
2012 Reporting Status Report (as of 2/25/2013)
763 of 821 FEs completed their data submittals (93% complete)
o Completion requires checking for any latent errors found in the data
o Completing the checklist
How do we improve these results?
o Training
o Notifications

10 Change Order #2: Regional Coordinator GO Form; Metrics at the Regional, DRE, and NERC level; Error Free Export of GADS Data (csv and txt format); Automatically generated report once data submission is complete; Reserve Shutdown reporting for Gas Turbines post 2012 Data; Additional Design Data (Working Group will define fields); Derate issue with a changing NDC value while the event crosses multiple months.

11 Retired performance records to be auto generated Notes Top 5 for a future change order List of units by user id instead of by company for validation purposes (event and performance records)

12 Future Meetings: Keep monthly phone call, 2nd Thursday each month. Notes: Evaluate work plan to decide on future face to face meeting time. Next monthly call we will decide face to face meeting location and time. Next GADS workshop August 6th-8th in Nashville at TVA offices.

13 Questions and Answers

14 GADS CO #2 Event Description Form
Event Form Fields (Modified from TADS)
GOs assign their own event id codes. An Event is a generation incident that results in the forced outage of one or more units. Entities would only enter event ids for a multiple unit forced outage or in relation to an Event Analysis event. The table below describes the data collected for the event id code:
Table 1
Field Name: Field Description
Event ID: The Event ID Code associated with one or more outages. This is assigned by the GO. For a given GO, the same Event ID Code cannot be defined more than once on the Event Description Form. The GO cannot define the same code again in any subsequent reporting period. Entities may not create event ids that begin with NMU except through a special button entitled Create Multi-Utility Event ID.
Event Analysis Tracking ID: Enter the Event Analysis tracking ID for the GADS event, if applicable. Only GADS events within the Event Analysis Process should be given Event Analysis tracking ids. Format (alphanumeric and underscores).
Disturbance report filed: This field indicates whether a disturbance report was filed that was associated with the Event or if the event is within the Event Analysis process. The choices are contained in a drop-down menu (Yes or No). If unsure for a given set of outages, GOs should contact their GADS Regional Entity Coordinator.
Event Description (Optional): Optional input: Provide a brief description of the Event's outage(s) for any Event ID Code. Please limit the description to 500 characters or less.
Outage Form/Report Changes
A new field should be added to both bulk upload and graphical user interface entitled GADS Event ID. Only event ids defined in the event form should be accepted on the outage form. Alternatively, the outage ids could be defined in the Event Form as an additional field. However, in this case, the outage detail report should give the event id for each outage (if applicable). The user would not have to input this data on the outage form.
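The field rules above could be checked along the following lines. This is a minimal sketch, not the webE-GADS implementation: the dictionary keys, the function name, and the error messages are hypothetical, the "alphanumeric and underscores" format is assumed to apply to the Event Analysis tracking ID, and the NMU prefix check is assumed to be case-sensitive.

```python
import re

# Assumption: "alphanumeric and underscores" describes the tracking ID format.
TRACKING_ID_PATTERN = re.compile(r"[A-Za-z0-9_]+")
RESERVED_PREFIX = "NMU"          # reserved for multi-utility event ids
MAX_DESCRIPTION_LENGTH = 500     # character limit stated on the slide


def validate_event_form(entry, existing_ids):
    """Return a list of rule violations for one Event Description Form entry.

    entry is a dict with hypothetical keys: event_id, ea_tracking_id,
    disturbance_report_filed, description.
    existing_ids is the set of Event ID Codes the GO has already defined.
    """
    errors = []
    event_id = entry.get("event_id", "")
    if not event_id:
        errors.append("Event ID is required")
    if event_id in existing_ids:
        errors.append("Event ID already defined by this GO")
    if event_id.startswith(RESERVED_PREFIX):
        errors.append("Event IDs beginning with NMU are reserved")
    tracking_id = entry.get("ea_tracking_id")
    if tracking_id and not TRACKING_ID_PATTERN.fullmatch(tracking_id):
        errors.append("Tracking ID must be alphanumeric or underscores")
    if entry.get("disturbance_report_filed") not in ("Yes", "No"):
        errors.append("Disturbance report filed must be Yes or No")
    if len(entry.get("description", "")) > MAX_DESCRIPTION_LENGTH:
        errors.append("Description exceeds 500 characters")
    return errors
```

An entry that passes every rule yields an empty list, so the outage form could accept only event ids whose form entry validated cleanly.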

15 Decrease Net Dep Cap from Winter to Summer is the May NDC 230 is the June NDC 180 Event 1 has Avail Cap = Event 2 has Avail Cap = 170 The reduction for Event 1 is kept constant at 55 MW, but the available capacity floats according to the Net Dep Cap. The reduction for Event 2 will be calculated using the floating available capacity of event 1. The reduction is 5 MW. May June

16 Decrease Net Dep Cap from Winter to Summer is the May NDC 230 is the June NDC Event 1 has Avail Cap = 180 Event 2 has Avail Cap = 178 In this case Event 2 should be rejected because it is claiming more capacity than it is declared by event 1. It is hard to perform this validation by simply looking at the two events without referencing the 95 Card, therefore this must be a level 2 validation. May June
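The floating-capacity rule on these two slides can be sketched as below. This is an illustration under stated assumptions: the slide values are read as a 230 MW NDC in the month where Event 2 is evaluated and a constant 55 MW reduction for Event 1, and both function names are hypothetical.

```python
def floating_available_capacity(ndc_mw, reduction_mw):
    """Available capacity of a derate whose reduction is held constant."""
    return ndc_mw - reduction_mw


def second_event_reduction(ndc_mw, event1_reduction_mw, event2_avail_mw):
    """Reduction of an overlapping second derate, measured against the
    floating available capacity of the first event.

    Raises ValueError when Event 2 claims more capacity than Event 1
    leaves available, which the slide treats as a rejected event.
    """
    event1_avail_mw = floating_available_capacity(ndc_mw, event1_reduction_mw)
    reduction_mw = event1_avail_mw - event2_avail_mw
    if reduction_mw < 0:
        raise ValueError("Event 2 claims more capacity than Event 1 allows")
    return reduction_mw
```

Under these assumptions, an Event 2 available capacity of 170 MW gives a 5 MW reduction, while 178 MW is rejected. Because the check needs the monthly NDC, it cannot be done from the two event records alone, which is why the slide places it at validation level 2.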

17 Increase Net Dep Cap from Summer to Winter In the reverse case Event 2 will have a reduction of 10 MW Sep Oct

18 GADS Data Analysis for Gas-Fired Generation, Trinh Ly, February 26, 2013

19 Introduction: Gas-Electric Interdependency Issue. Different schedules in nominating gas; increase in natural gas-fired power generation; supply and pipeline interruptions/curtailments.

20 Importance of GADS
NERC-Wide Planned Capacity Additions
Columns: Current (Capacity, Share); 2022 Planned (Capacity, Share, Change); 2022 Planned & Conceptual (Capacity, Share, Change)
Coal 307, % 290, % -16, , % -28,426
Petroleum 52, % 51, % , % -1,121
Gas 397, % 429, % 31, , % 131,863
Nuclear 113, % 123, % 9, , % 22,540
Other/Unknown % 5, % 5,569 12, % 12,412
Renewables 160, % 180, % 20, , % 21,505
TOTAL 1,031, % 1,082, % 51,015 1,190, % 158,773

21 Importance of GADS
Figure 5: Gas-Fired Generation Significantly Increases Over Next Three Years [chart; y-axis: Gigawatthours (GWh); series: Natural Gas Generation (GWh), Natural Gas Generation Mix (% of Total)]

22 G & E Interdependency Study: GADS Analysis Methods
o NERC codes used:
  9130 Lack of fuel (water from rivers or lakes, coal mines, gas lines, etc.) where the operator is not in control of contracts, supply lines, or delivery of fuels
  9131 Lack of fuel (interruptible supply of fuel part of fuel contract)
o Removed outages that appeared to be significantly skewing the analysis (e.g., units in RFC that were out for months at a time)
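The method above amounts to filtering event records by cause code and dropping very long outages before counting. The following sketch shows one way to do that; the record keys (`cause_code`, `duration_hours`), the function name, and the duration cutoff parameter are hypothetical, since the slide does not say how skewing outages were identified.

```python
# Cause codes named on the slide.
FUEL_CAUSE_CODES = {9130, 9131}


def summarize_fuel_outages(events, max_duration_hours=None):
    """Count occurrences and total outage hours per lack-of-fuel cause code.

    events is an iterable of dicts with hypothetical keys cause_code and
    duration_hours. Events longer than max_duration_hours (an assumed
    cutoff for outages that would skew the analysis) are skipped.
    """
    summary = {
        code: {"occurrences": 0, "hours": 0.0}
        for code in sorted(FUEL_CAUSE_CODES)
    }
    for event in events:
        code = event["cause_code"]
        if code not in FUEL_CAUSE_CODES:
            continue
        hours = event["duration_hours"]
        if max_duration_hours is not None and hours > max_duration_hours:
            continue
        summary[code]["occurrences"] += 1
        summary[code]["hours"] += hours
    return summary
```

Running the summary with and without a cutoff makes it easy to see how much a few months-long outages move the totals.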

23 Results [two charts: Number of Occurrences and Total Outage Duration (Hours), each plotting All Regions with a linear trend line, Linear (All Regions)]

24 Feedback. 9130 (generally defined as supply or transportation curtailments): 1901 occurrences. 9131 (generally defined as a supply or transportation interruption based on contract): 239 occurrences. Appears counterintuitive. Corrective actions as needed?

25 Potential Next Steps: Sample survey. More granular codes to define when a generator is interrupted because of contract agreements to do so vs. when there is a disruption on pipeline or supply source. GADS training.

26 2013 State of Reliability, May. Peachtree Road NE, Suite 600, North Tower, Atlanta, GA

27 Table of Contents
Executive Summary
  Introduction
  State of Reliability
  Bulk Power System Reliability Remains Adequate
  Transmission Availability Performance is High
  Generating Unit Availability Supports Adequate Reserve Margin
  Resource Mix Changes Necessitate New Metrics
  Future Advancements
  Report Organization
Chapter 1 - Key Findings and Conclusions
  Overall Reliability Performance
  Key Finding 1: Bulk Power System Reliability Remains Adequate
  Daily Performance Severity Risk Assessment
  Transmission System Availability and Metrics Have No Significant Change
  Generating Unit Availability Supports Adequate Reserve Margin
  Load Loss Events Up from Prior Years
  Key Finding 2: Frequency Response is Stable with No Deterioration
  Key Finding 3: Protection System Misoperations are a Significant Reliability Issue
  Misoperations Identified in Key Transmission Related Events
  Misoperations Found in Transmission Common and Dependent Mode Outages
  Key Finding 4: Equipment Failure Warrants Further Analysis
  Key Finding 5: Resource Mix Changes Necessitate New Metrics
  Influence of Wind on Reliability
  Demand Response Deployment
  Key Finding 6: More Data and Research is Needed
Chapter 2 Daily Performance Severity Risk Assessment
  NERC Assessment
  Eastern Interconnection Assessment
  Western Interconnection Assessment
  ERCOT Interconnection Assessment
  Québec Interconnection Assessment
Chapter 3 Top Risk Issues
  Overview
Chapter 4 Reliability Indicator Trends
  Overview
  ALR1-12 Interconnection Frequency Response
  ALR Characteristic: Contingencies
  ALR1-4 BPS Transmission Related Events Resulting in Loss of Load
  ALR Characteristic: Integrity
  ALR3-5 Interconnection Reliability Operating Limit/System Operating Limit (IROL/SOL) Exceedances
  ALR Characteristic: Protection
  ALR2-3 Activation of Under Frequency Load Shedding
  ALR4-1 Automatic AC Transmission Outages Caused by Protection System Equipment-Related Misoperations
  ALR6-2 Energy Emergency Alert 3 (EEA3)
  ALR6-3 Energy Emergency Alert 2 (EEA2)
Chapter 5 Risk Assessment of Standards Violations
  Introduction
  Conclusions
Chapter 6 Post Seasonal Assessment
  Operational Highlights
  MISO: Challenges in Wind Integration
  ERCOT: Wind Generation Decline and Emergency Energy Alerts (EEAs)
  Demand Response Contributed to Improved Reliability
  PJM: Energy Demand Response and Emergency Energy Alerts (EEAs)
  ERCOT: Energy Demand Response and Emergency Energy Alerts (EEAs)
  ISO-NE: Maintaining Operating Reserve with Demand Response
  Transmission Constraints
  SERC: Emergency Energy Alerts and Transmission Loading Relief
  SPP: Weather Related EEAs and TLRs
  Protection and Control Misoperations
  WECC: Protection System Misoperations; Generation and Load Loss Events
  ERCOT
Chapter 7 Spare Equipment Database
  Overview
Abbreviations Used in This Report
Contributions
  Acknowledgements
  NERC Industry Groups
  Regional Entity Staff
  NERC Staff

29 Executive Summary
Introduction
The North American Electric Reliability Corporation (NERC) 2012 State of Reliability report represents NERC's independent view of ongoing bulk power system reliability trends to objectively analyze the state of reliability based on metric information and provide an integrated view of reliability performance. The key findings and recommendations serve as technical input to NERC's Reliability Standards and project prioritization, compliance process improvement, event analysis, reliability assessment, and critical infrastructure protection. This analysis of bulk power system performance not only provides an industry reference for historical bulk power system reliability, it also offers analytical insights towards industry action, and enables the discovery and prioritization of specific actionable risk control steps.
The 2012 report builds on the 2011 foundational report, 2011 Risk Assessment of Reliability Performance. It was prepared by the NERC staff and NERC Performance Analysis Subcommittee (PAS) under the direction of the Operating and Planning Committees. Further, this year's report was developed in collaboration with many technical groups, including the:
Operating Committee (OC):
  Resources Subcommittee (RS)
  Frequency Working Group (FWG)
  Event Analysis Working Group (EAWG)/Event Analysis Subcommittee (EAS)
  Operating Reliability Subcommittee (ORS)
Planning Committee (PC):
  Reliability Assessment Subcommittee (RAS)
  System Protection and Control Subcommittee (SPCS)
  Transmission Availability Data System Working Group (TADSWG)
  Generating Availability Data System Working Group (GADSWG)
  Demand Response Availability Data System Working Group (DADSWG)
Since its first 2009 annual reliability metrics report, the PAS (formerly the Reliability Metrics Working Group) has advanced data collection and trend analysis for all 18 reliability indicators through NERC's voluntary or mandatory data requests. In this year's report, the metric assessment also includes frequency response, system voltage performance, and protection system misoperations.
2013 State of Reliability
Bulk power system reliability is stable since the metrics show no significant upward or downward trends for the period 2008 to 2011. The severity risk index and 18 metrics that measure characteristics of adequate level of reliability (ALR) indicate the

30 Executive Summary
bulk power system is within the defined acceptable ALR conditions. The system achieves an ALR when it possesses the following six characteristics:
1. Controlled to stay within acceptable limits during normal conditions;
2. Perform acceptably after credible contingencies;
3. Limit the impact and scope of instability and cascading outages when they occur;
4. Facilities are protected from unacceptable damage by operating them within facility ratings;
5. Integrity can be restored promptly if it is lost; and
6. Have the ability to supply the aggregate electric power and energy requirements of the electricity consumers at all times, taking into account scheduled and reasonably expected unscheduled outages of system components.
An improved ALR definition is being developed. Its trend analysis and risk control measures will continue to evolve with the understanding of what factors contribute to, or indicate the level of, reliability. Also, the cause-effect model will be expanded with deeper analysis of initiating events and their impact.
Bulk Power System Reliability Remains Adequate
From 2008 to 2011, excluding the events caused by factors other than the performance of the transmission system (e.g., weather initiated events), the number of bulk power system transmission related events resulting in loss of firm load has been relatively flat, with an average of between eight and 10 events per year. The 2011 daily severity risk index (SRI) value, measuring events resulting in the loss of transmission, generation, and load, showed the majority of the year's performance was improved compared to 2008, 2009 and 2010. However, when weather initiated events are included, 2011 had more high stress days (eight days with an SRI greater than 5.0) than had been experienced in prior years (two to five days with an SRI higher than 5.0).
Transmission Availability Performance is High
As shown from the transmission performance data, availability of the bulk transmission system continues to remain high with no statistically significant change from 2008 to 2011. The AC circuit availability is above 98 percent, and the transformer availability is above 96 percent for both 2010 and 2011. On average, roughly two percent of the events per year (92 events out of 4,185 total events) contain between three and 14 momentary and sustained automatic outages. Some of these events are beyond the NERC TPL (transmission planning) Reliability Standards, including the Category B events commonly used in planning and daily operations. Further, additional analysis is needed to improve the understanding of the assumptions used for contingency criteria which form the basis for system planning adequacy studies and daily operations. A recent survey of 133 of these events (from 2008 to 2011) shows that over 60 percent of the returned survey responses involve abnormal clearing of one or more of the element outages. Nearly half of them involved element outages initiated by protection system misoperations. Industry efforts are underway to understand the risks from protection system misoperations and to develop effective risk controls needed to reduce future misoperation occurrences.
Generating Unit Availability Supports Adequate Reserve Margin
Based on the 2011 generating unit availability data, generating units continue to be sufficiently reliable to support adequate electric capacity. Individual generating equipment outages did not have any significant impact on the reliability of the bulk
Notes:
The draft of a revised ALR definition, including socio-economic impact consideration, can be viewed at:

31 Executive Summary
power system. Transmission and weather events (tornadoes, lightning, and ice storms) resulted in more multiple unit forced outage trips, hours of down time, and power production lost during 2008 to 2011 than individual unit equipment specific outages. New generating units are replacing older ones to maintain power production capability to meet load, increase the efficiency of the existing fleet and to address environmental rules. Most new generating units are gas-fired, though new coal-fired unit capacities are also being added to the system.
Resource Mix Changes Necessitate New Metrics
Wind generation has grown in many parts of the bulk power system, and this trend is expected to continue. The variable and uncertain nature of wind generation complicates the inclusion of wind resources in the calculation of reserve margins. Improvements in wind forecasting are necessary to assure that adequate resources are in place in the operational horizon. As recommended in NERC's Accommodating High Levels of Variable Generation report, and further developed in the follow-on study Flexibility Requirements and Potential Metrics for Variable Generations: Implications for System Planning Studies, reliability metrics are needed to measure system flexibility requirements to account for large-scale wind generation integration. Considerations should include ramping requirements, minimum generation levels, required shorter scheduling intervals, and transmission interconnections. Additional data is needed to support the system flexibility and resource adequacy assessment.
As the role of demand response increases, both NERC and stakeholders need to analyze its performance to measure its benefits and impacts on reliability. In 2011, NERC collected reliability related demand response information across 133 entities within North America. This includes enrollment and actual performance data from April 1 to September 30, 2011. There were 664 deployments during this period with July (247 events, 16,283 MW, 70 percent realization rate) and August (153 deployments, 10,896 MW, 77 percent realization rate) being the most active.
Future Advancements
The 2012 State of Reliability report marks a vital step towards an overall view of the risk to bulk power system reliability. The goal is to quantify performance, highlight areas for improvement as well as reinforce and measure success in controlling these risks. To address this objective, a number of activities are ongoing:
In 2011, the Sandia National Laboratories (SNL) and U.S. Energy Information Administration (EIA) both provided technical support to identify and monitor areas for improving the value of indices such as the SRI. Based on their recommendations, the PAS will consider applying risk cluster and other statistical analysis applications to link the reliability data used in the event-driven index (EDI), standards-driven index (SDI) and condition-driven index (CDI) to identify significant initiating events and measure their reliability impact. The resulting model could be used to characterize and monitor the state of bulk power system reliability, and cause-effect relationships may emerge.
In March 2012, the EDI measure was endorsed by the Operating and Planning Committees. The SDI measure was endorsed by the Compliance and Certification Committee. The PAS's index team is finalizing the details of the CDI measure. The trend analysis of these three indices, measuring different reliability aspects, will be included in future reliability reports. Further, under the direction of the Critical Infrastructure Protection Committee (CIPC), the PAS is collaborating with the Bulk Electric System (BES) Security Metrics Working Group to develop bulk electric system security performance metrics.
The PAS recognizes the importance of detecting system reliability risks promptly to ensure risks to reliability are identified and actionable steps can be deliberated upon by industry to address these risks. However, many datasets and performance
Notes:
Special Report: Accommodating High Levels of Variable Generation
Realization Rate is defined as Actual MW achieved for the event divided by MW requested for the event.
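The realization rate definition on this page is a simple ratio. The sketch below restates it in code; the function name and the sample figures are hypothetical, and the report does not say whether the MW totals it quotes for July and August are requested or achieved values.

```python
def realization_rate(actual_mw, requested_mw):
    """Actual MW achieved for an event divided by MW requested for it."""
    if requested_mw <= 0:
        raise ValueError("requested_mw must be positive")
    return actual_mw / requested_mw


# Illustrative numbers only: 700 MW delivered against a 1,000 MW request.
rate = realization_rate(700.0, 1000.0)
print(f"{rate:.0%}")
```

A rate below 100 percent means the deployment delivered less load reduction than was requested.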

32 Executive Summary
analyses conducted in this report are still in an early stage. The PAS expects that as it continues its performance analysis and measurement of reliability risks, limitations and enhancements of data and metrics will be identified, which will require review and modification of the reported data. These enhancements will be captured in future reports. At the present time, the defined characteristics of ALR are being reviewed and refined. Once the enhanced definition becomes final, the PAS will evaluate the current ALR metrics and modify them accordingly.
Report Organization
Following the key findings and conclusions chapter, the second chapter details the severity risk index trend analysis. The third chapter presents assessments for eighteen reliability metrics. The fourth chapter provides a brief summary of reported disturbances based on event categories described in the enhanced event analysis field test process document. The fifth chapter outlines transmission system performance results that the TADSWG has endorsed using the four year history of TADS data. Reviewed by the GADSWG, the sixth chapter provides an overview of generating availability trends for over 70 percent of generators in North America. The seventh chapter is an overview of 2011 reliability related demand response information summarized by the DADSWG. It includes enrollment and actual performance data from April 1 to September 30, 2011. The final chapter highlights 2011 summer operations.


34 Chapter 1 - Key Findings and Conclusions
2011 Overall Reliability Performance
Bulk power system reliability is stable since the metrics show no significant upward or downward trends for the period 2008 to 2011. The SRI and 18 metrics that measure characteristics of ALR indicate the bulk power system is within the defined acceptable ALR conditions. Based on the data and analysis in the later chapters of this report, the following six key findings were identified:
1. Bulk power system reliability remains adequate.
2. Frequency response is stable with no deterioration.
3. Protection system misoperations are a significant reliability issue.
4. Equipment failure warrants further analysis.
5. Resource mix changes necessitate new metrics.
6. More data and research is needed.
Key Finding 1: Bulk Power System Reliability Remains Adequate
Daily Performance Severity Risk Assessment
Bulk power system reliability remains adequate. The accompanying figure captures the daily SRI from 2008 to 2011, including the historic significant events. Based on the SRI values, routine daily performance is consistent across all four years. However, including weather initiated events, 2011 had more high stress days (SRI greater than 5.0) with eight days, compared to prior years with averages between two and five days. For SRI values less than 5.0, 2011 had more improved performance days than previous years.
The SRI is a daily, blended metric where transmission loss, generation loss, and load loss events are aggregated into a single value that represents the performance of the system. Accumulated over a year, these daily performance measurements are sorted in descending order to evaluate the year-on-year performance of the system. Since there is significant disparity between normal days and events in terms of SRI values, the curve is depicted using a logarithmic scale.
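The sorted daily SRI curve and the high stress day count described above can be sketched as follows. This is illustrative only: the function names are hypothetical, the input is assumed to be one positive SRI value per day, and the 5.0 threshold is the one quoted in the text.

```python
import math


def sri_duration_curve(daily_sri):
    """Daily SRI values for one year, sorted in descending order."""
    return sorted(daily_sri, reverse=True)


def count_high_stress_days(daily_sri, threshold=5.0):
    """Number of days whose SRI is strictly greater than the threshold."""
    return sum(1 for value in daily_sri if value > threshold)


def log_scaled(curve):
    """Base-10 logarithm of each value, for plotting on a log scale.

    Assumes every SRI value is positive.
    """
    return [math.log10(value) for value in curve]
```

Comparing the sorted curves of two years point by point shows whether the stressed days on the left, the routine days in the middle, or the best days on the right changed.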

35 Chapter 1 - Key Findings and Conclusions
As the year-on-year performance in the daily SRI figure is evaluated, certain portions of the graph become relevant for specific analysis. First, the left side, where the system has been substantially stressed, should be considered in the context of the days which were noteworthy. Next, the slope of the central part of the graph may reveal year-on-year changes in performance for the majority of the days of the year and demonstrate routine system resilience. Finally, the right portion of the curve may also provide useful information about how many best days occurred during the year, contrasted to prior years.
As mentioned earlier, 2011 had more high stress days than had been experienced in prior years. However, further examination of the central part of the graph shows that 2011 had improving day-to-day performance, demonstrated by a more pronounced downward slope than occurred in previous years. Finally, examination of the right portion of the curve indicates 2011 shows improvement over prior years. Thus, while more serious days may have occurred in prior years, the majority of 2011's performance was improved compared to previous years.
Table 1 lists the highest 10 SRI event dates in 2011. Every event that occurred on these dates had a DOE OE-417 filed. Nine of these 10 events were influenced by weather. The three events that required event analysis were the February 2 cold snap event, June 30 disturbance event, and September 8 Arizona-Southern California outage event.
Table 1: 2011 NERC Top 10 SRI Event Days
Columns (General Indicators): Date; NERC Score (SRI); Weather Influenced?; Event Analysis?; Highest Event Category on Date; OE-417 Filed?
Dates: 10/29, 2/

36 Chapter 1 - Key Findings and Conclusions
Table 1 dates, continued: 8/27, 4/27, 9/8, 8/28, 6/30, 4/4, 7/11, 4/28
* Indicates partial influence
Transmission System Availability and Metrics Have No Significant Change
Reliability of the transmission system continues to remain high with no statistically significant change in performance from 2008 to 2011, as evidenced in the Reliability Indicator Trends section of this report. AC circuit availability is above 98 percent, and transformer availability is above 96 percent for both 2010 and 2011. These are the first two years for which planned outage data (an integral component in total availability) is available. Planned outages for maintenance and construction have a long-term positive impact on transmission system reliability. AC circuit and transformer unavailability was well below five percent, as shown in Figure 1. The unavailability due to automatic sustained outages was less than one percent. These relative percentages provide an indication of the overall availability of the transmission system operated at 200 kV and above.
In addition to AC circuit and transformer availability and unavailability, the performance indicators include four transmission availability related metrics which are used to measure outage rates for areas deemed important to reliability. Each of the four metrics is statistically analyzed to determine improvement or deterioration. The four metrics are:
ALR 6-11: Automatic AC transmission outages initiated by failed protection system equipment
ALR 6-12: Automatic AC transmission outages initiated by human error
ALR 6-13: Automatic AC transmission outages initiated by failed AC substation equipment, and
ALR 6-14: Automatic AC transmission outages initiated by failed AC circuit equipment.
Figure 1: NERC Transmission System Unavailability by Outage Type

37 Chapter 1 - Key Findings and Conclusions Percent Unavailability Auto Sustained Outages Operational Outages Planned Outages NERC ALR 6-16 Transmission System Unavailability 5% 4% 3% 2% 1% 0% AC Circuits Transformers Based on the study framework recommended by the United States Department of Energy s Energy Information Administration (EIA),26 the statistical significance of four transmission reliability metrics were tested from year-to-year. Results indicate there are no statistically significant changes in performance of these metrics from 2008 to Assumptions used and detailed analyses are included in the Transmission Availability Analysis and Reliability Indicator Trends chapters, respectively. tember 8 Arizona-Southern California outage event. 30 disturbance event, and September 8 Arizona- Southern California outage event. Generating Unit Availability Supports Adequate Reserve Margin Based on the 2011 generating unit availability data, generating units continue to be sufficiently reliable to support adequate electric capacity. Individual generating equipment outages did not have any significant impact on bulk power system reliability. Transmission and weather events (tornadoes, lightning, and ice storms) resulted in more multiple unit forced outage trips, hours of down time, and power production lost during 2008 to 2011 than individual unit equipment specific outages. New generating units are replacing older ones to maintain power production capability to meet load, increase the efficiency of the existing fleet and to address environmental rules. Most new generating units are gas-fired, though new coal-fired capacity is also being added to the system. Load Loss Events Up from Prior Years During 2011, the SRI evidenced slightly higher occurrence of load loss events. Further analysis of this data indicates that weather events including tornadoes, hurricanes, ice storms, extreme cold and extreme heat were the predominant causes. 
Continued investigation into these occurrences is recommended, including post-event review. Further, load loss reporting should be refined to distinguish between consequential27 and non-consequential load loss, and to identify initiating events.

38 Chapter 1 - Key Findings and Conclusions Key Finding 2: Frequency Response is Stable with No Deterioration Based on the frequency event data collected from 2009 to 2011, the Frequency Response performance of the Eastern Interconnection (EI) appears relatively stable, as shown in Figure 2. Compared with 2009 and 2010, the 2011 inter-quartile range (the difference between the first and third quartiles) is wider. However, the median performance across the time period is consistent. As evidenced in the Reliability Indicator Trends chapter of this report, the wider inter-quartile range is due to more and smaller frequency events captured in 2011. These results are not attributable to deterioration. Smaller events show less Frequency Response because of dead band settings and because fewer governors are triggered for smaller frequency deviations. Notably, with the more consistent and streamlined data collection procedures developed in 2011, smaller events in the EI were captured.28 More data is still needed to apply statistical significance tests and calculate a confidence interval to establish a trend. Additional analysis of time of year, load levels, generation on-line, and response withdrawal should also be considered when interpreting the trend. Similarly, the Western Interconnection (WI), ERCOT Interconnection, and Québec Interconnection (QI) have shown no changes in Frequency Response performance.
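The box plot comparison described for Key Finding 2 (a similar median but a wider inter-quartile range) can be illustrated with a minimal Python sketch. The sample values below are hypothetical and are not report data; the function name box_plot_summary and the choice of the "inclusive" quartile method are assumptions made only for illustration.

```python
from statistics import quantiles

def box_plot_summary(values):
    """Return the five numbers a box plot shows, plus the inter-quartile range."""
    # quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,  # difference between the third and first quartiles
    }

# Hypothetical Frequency Response samples in MW/0.1 Hz (illustrative only).
earlier_year = [1800, 2000, 2200, 2400, 2600]
later_year = [1400, 1800, 2200, 2600, 3000]

for label, data in (("earlier", earlier_year), ("later", later_year)):
    s = box_plot_summary(data)
    print(label, "median:", s["median"], "IQR:", s["iqr"])
```

In this made-up example both years share the same median while the later year has a wider inter-quartile range, which is the pattern the text attributes to capturing more small events rather than to deterioration.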
Figure 2: Box Plot Analysis for 2009 to 2011 Eastern Interconnection Data [box plot: Frequency Response (MW/0.1 Hz) by period, showing minimum, first quartile, median, third quartile, and maximum] Key Finding 3: Protection System Misoperations are a Significant Reliability Issue Misoperations Identified in Key Transmission Related Events Nine transmission related events resulted in loss of firm load in 2011. Events caused by factors other than transmission system performance (such as the cold snap event and tornadoes) were not included. These nine events were due to equipment failures, misoperations, and/or human error. Table 2 summarizes the contribution of the leading causes to these nine events. Definitions of these cause codes are in Table 3. A historic perspective of the frequency response trend calculations in the Eastern Interconnection can be found in and

39 Chapter 1 - Key Findings and Conclusions Table 2: Transmission Related Events Resulting in Load Loss from ALR1-4 Date Event Category Generation Loss (MW) Transmission Loss Load Loss (MW) Equipment Failure Protection System Misoperation Human Error 1/13/ lines 135 3/27/ lines 295 4/25/ lines 1 transformer 140 5/10/ lines 462 6/6/ lines 450 6/30/ transformers 797 7/13/ lines 1 back-to-back converter 2 transformers 665 7/22/2011 N/A 0 1 line 206 9/8/ * * * * * * Protection system misoperations remain a significant reliability issue. As described in the 2011 Post Summer Assessment chapter, on June 6, in the El Paso area of Texas local transmission system protective relaying issues contributed to a service interruption to about 162,000 customers and the loss of 106 MW of generation. On July 13, transmission system protective relaying issues resulted in a loss of 599 MW of generation and the interruption of service to approximately 93,000 customers in the Pueblo area of Colorado. In fact, every major system disturbance since the 1965 Northeast Blackout has been caused or exacerbated by protection system performance ranging from incorrect relay settings to communication failures The 9/8/2011 event analysis is currently underway. On September 8, WECC experienced a widespread power outage affecting portions of Arizona, southern California, and northern Baja California. According to the Department of Energy Form OE-417 report filed for the event, service to approximately 2.8 million customers was interrupted, and seven thousand megawatts of generation tripped offline. The final report regarding this event is not presently available, but it is expected that it will be publicly released in a joint NERC/FERC event analysis report of 66

40 Chapter 1 - Key Findings and Conclusions NERC is in the process of revising a number of Reliability Standards involving protection system misoperations.31 To increase awareness and transparency, NERC has conducted and will continue to conduct industry webinars32 on protection systems and to document success stories on how Generator Owners/Transmission Owners are achieving high protection system performance. The quarterly protection system misoperation trending by NERC and the Regional Entities can be viewed on NERC's website.33 Misoperations Found in Transmission Common and Dependent Mode Outages On a NERC-wide basis, as shown in Figure 3, an average of two percent of events per year (92 out of 4,185 total events34) contain between three and 14 momentary and sustained automatic element outages. Some of these events meet the TPL (transmission planning) standards requirements,35 while others went beyond the Category B events commonly used in planning and daily operations. Additional analysis is needed to improve the understanding of the assumptions used for contingency criteria, which form the basis for system planning adequacy studies and daily operations. A recent survey of 133 of these events (from 2008 to 2011) shows that a majority (63 percent) include abnormal clearing of one or more of the element outages, as shown in Table 4.36 Nearly half (47 percent) of these events involved element outages initiated by protection system misoperations. Table 3: Common Cause Codes Protection System Misoperations: Events caused by relay and/or control initiated operations when not desired or the failure to operate when desired. This category also includes incorrect relay or control settings that do not coordinate with other protective devices. Equipment Failure: Events caused by the failure of equipment. Use this code only when the equipment failed even though it was operated within design specifications. 
The failed equipment could be (i) a component of an Element (such as a failed insulator), or (ii) part of an AC Substation (such as a failed circuit breaker). Human Error Events caused by any incorrect action traceable to employees and/or contractors for companies operating, maintaining, and/or providing assistance to the Generator Owner/Transmission Owner are reported in this category. Also, any human failure or interpretation of normal industry practices, operating procedures, and guidelines that cause an outage are reported in this category. To understand root causes and develop reduction solutions, NERC has collected nearly a year of protection system misoperations data using a uniform misoperations reporting template across the eight Regional Entities. To increase awareness and transparency, the misoperations data aggregated at regional and NERC levels is made available on the public website. 37 Focus areas include training on digital relay specifications and settings. A deeper investigation into the root causes of protection system misoperations which contribute to dependent and common mode events is a high priority. As announced at the March 2012 NERC Planning Committee meeting, a Protection System Misoperation Task Force (PSMTF) has been formed to analyze misoperations The detailed outage information is available in the Transmission Availability Analysis chapter of this report Event Type 60 Breaker Failure: one or more automatic outages with delayed fault clearing due to a 200 kv and above circuit breaker being stuck, slow to open or failure to interrupt current. 
Event Type 61 Dependability (failure to operate): one or more automatic outages with delayed fault clearing due to failure of a single protection system (primary or secondary backup) under either of these conditions: Failure to initiate the isolation of a faulted power system Element as designed, or within its designed operating time, or In the absence of a fault, failure to operate as intended within its designed operating time. Event Type 62 Security (unintended operation): one or more automatic outages caused by improper operation (e.g. overtrip) of a protection system resulting in isolating one or more TADS elements it is not intended to isolate, either during a fault or in the absence of a fault. Event Type 90 Automatic Outage(s): event with abnormal clearing not covered by Event Type 60 through of 66

41 Chapter 1 - Key Findings and Conclusions Figure 3: TO Reported Outages per TADS Event [chart: number of events (500 out of 3,800+) by number of sustained and momentary automatic outages per TADS event, with one series per year]
Table 4: Survey Summary - Three Outages or More per TADS Event
Total common and dependent mode events with 3 or more circuit outages: 369
Survey responses (number of events): 133 of 369
Normal clearing events: 49 of 133
Abnormal clearing events: 84 of 133
Events with misoperation contribution (TADS Event Types 61+62): 63 of 133
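As a quick arithmetic check, the counts in Table 4 can be converted to percentages. The sketch below uses only the counts quoted in the table; the helper name share is a hypothetical convenience, and rounding to whole percent is an assumption about how the report presents the figures.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole percent."""
    return round(100 * part / whole)

total_events = 369        # common and dependent mode events with 3+ outages
surveyed = 133            # survey responses
normal_clearing = 49
abnormal_clearing = 84
misoperation = 63         # TADS Event Types 61 + 62

print("surveyed share:", share(surveyed, total_events), "percent")
print("normal clearing:", share(normal_clearing, surveyed), "percent")
print("abnormal clearing:", share(abnormal_clearing, surveyed), "percent")
print("misoperation contribution:", share(misoperation, surveyed), "percent")
```

The abnormal clearing and misoperation shares work out to 63 percent and 47 percent, matching the figures quoted in the text for the surveyed events.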

42 Chapter 1 - Key Findings and Conclusions Key Finding 4: Equipment Failure Warrants Further Analysis As described in Table 2 supporting the discussion of Key Finding 3, equipment failure was involved in six of the nine transmission related events resulting in loss of firm load. Excluding weather and unknown causes, the transmission availability performance data also indicates that equipment failure is a significant contributor to sustained automatic outages, as shown in Figure 4. As detailed in the Transmission Availability Analysis chapter of this report, from 2008 to 2011, nearly 20 percent of automatic sustained outages were initiated by either failed AC substation equipment or failed AC circuit equipment. The following recommendations will support further risk control analysis: Additional data is needed to investigate and analyze equipment failure. For this purpose, secondary cause codes could be developed to gather more insight into each sustained outage event. This information will provide specific causal factors and identify initiating events, which could support the development of a statistical cause-effect model. A small subject-matter-expert technical group could be formed to further understand the problem of substation and AC circuit equipment failures and provide risk control solutions to improve performance. Figure 4: NERC AC Circuit Top Ten Sustained Automatic Outage Occurrences by Initiating Cause Code 38 [pie chart: Failed AC Substation and Circuit Equipment 20% (Failed AC Substation Equipment 10.54%, Failed AC Circuit Equipment 9.44%); Weather, excluding lightning 17%; Unknown 14%; Lightning 12%; Human Error 11%; Failed Protection System Equipment 8%; Power System Condition 4%; Foreign Interference 4%; Other 4%] 38 Percent is out of total number of outages. Remaining 7.40 percent include: Fire, Contamination, Vegetation, etc.

43 Chapter 1 - Key Findings and Conclusions Key Finding 5: Resource Mix Changes Necessitate New Metrics Resources such as wind generation and demand response are non-traditional resources that perform differently than conventional generators. New metrics should be developed to determine what impact, if any, these differences have on reliability. Influence of Wind on Reliability Wind nameplate capacity has grown consistently in many parts of the bulk power system over past summers. This trend will increase due to state/provincial renewable portfolio energy mandates and potential future Federal mandates.39 The variable and uncertain nature of wind generation complicates the inclusion of wind resources in the calculation of reserve margins. Regional Entities use regionally adjusted percentages of nameplate capacity to estimate the proven capacity of wind. Improvements in wind forecasting are also necessary to assure that adequate resources are in place in the operational horizon. In MISO, wind output during the 2011 summer peak month ranged from 13 MW to 4,909 MW, as shown in Figure 5. Figure 5: MISO Average Wind Generation and Registered Wind Capacity, Summer [chart: wind generation and registered capacity in MW (0 to 10,000) by month-year, for June, July, and August of 2009, 2010, and 2011] As recommended in NERC's Accommodating High Levels of Variable Generation report,40 and further developed in the follow-on study Flexibility Requirements and Potential Metrics for Variable Generations: Implications for System Planning Studies,41 reliability metrics are needed to measure flexibility requirements of the bulk power system to address large-scale wind generation integration. Considerations should include ramping requirements, minimum generation levels, required shorter scheduling intervals, equivalent load carrying capability, and transmission interconnections. 
Additional data is needed to support the bulk power system flexibility and resource adequacy assessment. Demand Response Deployment As the role of demand response increases, both NERC and stakeholders need to develop metrics and quantify its performance in order to further measure reliability risks from substantial developments. In 2011, NERC collected the first ever reliability related demand response information across 133 entities within North America. This includes enrollment and actual performance data from April 1 to September 30, 2011. With the second warmest North American summer on record, at an average temperature of 74.5 °F,42 the months of June, July, and August had the largest number of demand response events called and also the largest number of enrolled resources and capacity.

44 Chapter 1 - Key Findings and Conclusions August had the largest enrollment out of the six months with a total of 7,360,605 resources and 56,601 MW, close to five percent of 2011 summer generation capacity. There were 664 deployments during this period with July (247 events, 16,283 MW, 70 percent realization rate) and August (153 deployments, 10,896 MW, 77 percent realization rate) being the most active. As evidenced in the 2011 Post Summer Assessment chapter of this report, PJM and ISO-NE both issued energy emergency alerts and called for use of reliability related demand response over peak load conditions. Key Finding 6: More Data and Research is Needed Many datasets and performance analyses conducted in this report are still in an early stage. A number of metrics are limited to occurrence count values; no additional details, such as duration and/or intensity are available to provide a better observation of how each occurrence impacts bulk power system reliability. The PAS recognizes the importance of producing system reliability risks promptly on an annual basis to ensure risks to reliability are identified and actionable steps can be deliberated upon by industry to address these risks. That said, the PAS expects that as it continues its performance analysis and measurement of reliability risks, limitations and enhancements of data and metrics will be identified, which will require review and modification of the reported data. These enhancements will be captured in future reports. Further, when important trends emerge, NERC will highlight concerns swiftly to ensure industry remains informed and can take actions to control risks to reliability. 19 of 66

45 Chapter 2 Daily Performance Severity Risk Assessment NERC Assessment Figure 6 captures the daily SRI43 value from 2008 to 2011. Other features are also shown on this graphic, including the historic significant events that were used to help substantiate the measurement system. Also, event categories are roughly scaled onto the secondary y-axis. A thumbnail enlarges the left side of the curve to further highlight stressed days for the year. Figure 6: NERC Annual Daily Severity Risk Index (SRI) Sorted Descending with Historic Benchmark Days As the year-to-year performance is evaluated in Figure 6, certain portions of the graph become relevant for specific analysis. First, the left side of the graph, where the system has been substantially stressed, should be considered in the context of the days which were noteworthy. Next, the slope of the central part of the graph may reveal year-on-year changes in performance for the majority of the days of the year and demonstrate routine system resilience. Finally, the right portion of the curve may also provide useful information about how many stellar days occurred during the year, contrasted with prior years. The thumbnail shown in Figure 6 indicates that 2011 had more high stress days than were experienced in prior years. Table 5 lists the 10 event dates with the highest daily SRI values in 2011. Every event date had an OE-417 filed. Nine of these 10 events were influenced by weather. The three events that required event analysis were the February 2 cold snap event, the June 30 disturbance event, and the September 8 Arizona-Southern California outage event.
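The "sorted descending" presentation described for Figure 6 can be sketched in a few lines of Python. This is only an illustration: the daily values are hypothetical, and the function names sorted_sri_curve and top_days are assumptions, not part of any NERC tool.

```python
def sorted_sri_curve(daily_sri):
    """Return (date, value) pairs ordered from the most to the least stressed day."""
    return sorted(daily_sri.items(), key=lambda item: item[1], reverse=True)

def top_days(daily_sri, n=10):
    """Return the n dates with the highest daily SRI (the left side of the curve)."""
    return [date for date, _ in sorted_sri_curve(daily_sri)[:n]]

# Hypothetical daily SRI values keyed by month/day (not report data).
example = {"1/5": 1.2, "2/2": 5.8, "3/9": 0.9, "9/8": 4.1, "10/29": 7.3}

print(top_days(example, n=3))  # the three most stressed days in this toy sample
```

The left end of the sorted list corresponds to the stressed days highlighted by the thumbnail, and the right end corresponds to the best-performing days.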

46 Chapter 2 Daily Performance Severity Risk Assessment Table 5: 2011 NERC Top 10 SRI Event Days [columns: Date; NERC SRI Score; Weather Influenced?; Event Analysis?; Highest Event Category 45 on Date; OE-417 Filed? 46. Dates listed: 10/29, 2/2, 8/27, 4/27, 9/8, 8/28, 6/30, 4/4, 7/11, 4/28. * Indicates partial influence] However, further examination of the central part of the graph shows that 2011 had improving day-to-day performance, demonstrated by a more pronounced downward slope than occurred in previous years. Finally, examination of the right portion of the 2011 curve shows improvement over prior years. Thus, while more serious days may have occurred than in prior years, the majority of the year's performance was a noticeable improvement over other years. In Figure 7 to Figure 10, additional statistical data is provided to show the trend in median and outlier performance across the years. Generally, the median of the daily SRI values ranged from 1.10 to 1.44, with evidence of more outliers occurring in 2011. Figure 7: 2008 Statistical SRI Summary

47 Chapter 2 Daily Performance Severity Risk Assessment Figure 8: 2009 Statistical SRI Summary Figure 9: 2010 Statistical SRI Summary

48 Chapter 2 Daily Performance Severity Risk Assessment Figure 10: 2011 Statistical SRI Summary In Figure 11, annual cumulative performance of the bulk power system is shown. The graph is read by looking for significant steps on any given day: a big step represents a stress day as measured by the SRI, and a smooth curve indicates routine performance. The year-end performance of the system improved over prior years but was affected by several steps, which tended to fall in the latter part of the year. However, without additional analysis and review of completed event analyses, no conclusions about trends by time of year can be drawn.
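Reading the cumulative curve for "steps" can be sketched as a running sum plus a simple threshold test. This is a minimal illustration under stated assumptions: the daily values and the step threshold are hypothetical, and the report does not define a numeric threshold for a stress day.

```python
from itertools import accumulate

def cumulative_sri(daily_values):
    """Running total of daily SRI values, as plotted in a cumulative chart."""
    return list(accumulate(daily_values))

def stress_day_indices(daily_values, step_threshold):
    """Indices of days whose contribution (step height) meets the threshold.

    step_threshold is a hypothetical cut-off chosen by the analyst.
    """
    return [i for i, value in enumerate(daily_values) if value >= step_threshold]

# Hypothetical daily SRI values for four consecutive days (not report data).
daily = [1.0, 1.5, 6.0, 1.0]

print(cumulative_sri(daily))            # running total by day
print(stress_day_indices(daily, 5.0))   # days that show up as big steps
```

A day with a large daily value appears as a visible jump in the running total, which is what the text calls a step.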

49 Chapter 2 Daily Performance Severity Risk Assessment Figure 11: NERC Cumulative SRI Performance History [chart: NERC cumulative SRI by day of year, with one curve each for 2008, 2009, 2010, and 2011] Figure 12 breaks down the cumulative performance by bulk power system segment (generation, transmission, and load loss). From 2008 to 2011, the generation component is the largest, followed by the transmission component and then the load loss component. Notably, the components of Figure 12 exhibit a choppier characteristic in 2011 compared to prior years, which can be attributed to the higher number of high stress days in 2011. Although 2011 had more high stress days, the total cumulative performance remained below 2008, 2009, and 2010. This may be due to a reduction of cumulative slope in all three components starting in mid-September. The change of cumulative slope is caused by a number of days with very good performance from mid-September until the end of 2011. Figure 12: NERC Cumulative SRI by Bulk Power System Segment

50 Chapter 2 Daily Performance Severity Risk Assessment Eastern Interconnection Assessment In Figure 13 and Figure 14, SRI performance history and annual cumulative performance of the bulk power system in the Eastern Interconnection are shown, respectively. The cumulative graph is read by looking for significant steps on any given day: a big step represents a stress day as measured by the SRI. Significant stress days for 2011 in the Eastern Interconnection include the April 27 and 28 Southeastern tornadoes, Hurricane Irene on August 27 and 28, and the Northeastern snowstorm on October 29. Figure 13: Eastern Interconnection SRI Performance History

51 Chapter 2 Daily Performance Severity Risk Assessment Figure 14: Eastern Interconnection Cumulative SRI Performance History [chart: Eastern Interconnection cumulative SRI by day of year] Figure 15 breaks the cumulative SRI performance of Figure 14 into the bulk power system components of generation, transmission, and load loss. In 2011, the ratio among the generation, transmission, and load loss components shows the load loss and transmission components being proportionally larger than in prior years. Figure 15: Eastern Interconnection Cumulative SRI by Bulk Power System Segment

52 Chapter 2 Daily Performance Severity Risk Assessment Western Interconnection Assessment In Figure 16 and Figure 17, SRI performance history and annual cumulative performance of the bulk power system in the Western Interconnection are shown, respectively. The cumulative graph is read by looking for significant steps on any given day: a big step represents a stress day as measured by the SRI. Significant stress days for 2011 in the Western Interconnection include a February 2 generation inadequacy event and a load loss event on September 8. Figure 16: Western Interconnection SRI Performance History

53 Chapter 2 Daily Performance Severity Risk Assessment Figure 17: Western Interconnection Cumulative SRI Performance History [chart: Western Interconnection cumulative SRI by day of year] Figure 18 breaks the cumulative SRI performance of Figure 17 into the bulk power system components of generation, transmission, and load loss. In 2011, the ratio among the generation, transmission, and load loss components shows the generation loss component making up a much smaller proportion, and the load loss component a larger proportion, than in previous years.

54 Chapter 2 Daily Performance Severity Risk Assessment Figure 18: Western Interconnection Cumulative SRI by Bulk Power System Segment ERCOT Interconnection Assessment In Figure 19 and Figure 20, SRI performance history and annual cumulative performance of the bulk power system in the ERCOT Interconnection are shown, respectively. The cumulative graph is read by looking for significant steps on any given day: a big step represents a stress day as measured by the SRI. Significant stress days for 2011 in the ERCOT Interconnection include a generation inadequacy event from February 2 to 4. Figure 19: ERCOT Interconnection SRI Performance History

55 Chapter 2 Daily Performance Severity Risk Assessment ERCOT Interconnection Cumulative SRI Figure 20: ERCOT Interconnection Cumulative SRI Performance History ( ) /1 2/1 3/1 4/1 5/1 6/1 7/1 8/1 9/1 10/1 11/1 12/1 Day of Year Figure 21 breaks the cumulative SRI performance of Figure 20 into the bulk power system components of generation, transmission, and load loss. In 2011, the ratio between the generation component, transmission, and load loss component shows the generation loss component to be more predominant in 2011 than Figure 21: ERCOT Interconnection Cumulative SRI by Bulk Power System Segment ( ) 30 of 66

56 Chapter 2 Daily Performance Severity Risk Assessment Québec Interconnection Assessment There is insufficient historic daily generation outage and event data for the Québec Interconnection (QI). The QI SRI assessment will be provided in a future report.

58 Chapter 3 Top Risk Issues Overview Building upon last year's metric review, the results of eighteen performance metrics continue to be assessed. Due to data availability, the performance metrics do not all address the same time periods (some metrics have just been established, while others have data over many years). At this time, the number of metrics is expected to remain constant; however, other metrics that have more merit may supplant existing ones. Each metric is designed to show a measure for a given ALR characteristic. In Table 6, each metric is placed into a grid showing which ALR characteristic that metric represents. Also, the standard objective areas are shown for each ALR metric to provide a connection between standard objectives and ALR characteristics. Table 6: Adequate Level of Reliability Characteristics 47 Standard Objectives Boundary Contingencies Integrity Protection Restoration Adequacy Reliability Planning Operating Performance and ALR1-4 ALR3-5 ALR4-1 ALR1-3 ALR6-1 ALR6-11 ALR6-12 ALR6-13 ALR6-14 ALR6-15 ALR6-16 Frequency Voltage Performance and ALR1-5 ALR1-12 ALR2-4 ALR2-5 ALR2-3 Reliability Information Emergency Preparation ALR6-2 ALR6-3 Communications and Control Personnel Wide-area View Security 47 The blank fields indicate no metrics have been developed to assess related ALR characteristics at this time.

59 Chapter 4 Reliability Indicator Trends Overview Building upon last year's metric review, the results of eighteen performance metrics continue to be assessed. Due to data availability, the performance metrics do not all address the same time periods (some metrics have just been established, while others have data over many years). At this time, the number of metrics is expected to remain constant; however, other metrics that have more merit may supplant existing ones. Each metric is designed to show a measure for a given ALR characteristic. In Table 7, each metric is placed into a grid showing which ALR characteristic that metric represents. Also, the standard objective areas are shown for each ALR metric to provide a connection between standard objectives and ALR characteristics. Table 7: Adequate Level of Reliability Characteristics 48 Standard Objectives Boundary Contingencies Integrity Protection Restoration Adequacy Reliability Planning Operating Performance and ALR1-4 ALR3-5 ALR4-1 ALR1-3 ALR6-1 ALR6-11 ALR6-12 ALR6-13 ALR6-14 ALR6-15 ALR6-16 Frequency Voltage Performance and ALR1-5 ALR1-12 ALR2-4 ALR2-5 ALR2-3 Reliability Information Emergency Preparation ALR6-2 ALR6-3 Communications and Control Personnel Wide-area View Security 48 The blank fields indicate no metrics have been developed to assess related ALR characteristics at this time.

60 Chapter 4 Reliability Indicator Trends These metrics exist within a reliability framework and, overall, the performance metrics being considered address the fundamental characteristics of an ALR. 49 The performance categories measured by the metrics should be considered in aggregate when assessing the reliability of the bulk power system; no single metric indicates exceptional or poor performance of the power system. Importantly, due to regional differences (size of the Region, operating practices, etc.), comparing the performance of one Region to another would be erroneous and inappropriate. Furthermore, depending on the Region being evaluated, one metric may be more relevant to a specific Region's performance than another, and assessments may be more subjective than a purely mathematical analysis. Finally, choosing one Region's best metric performance to define targets for other Regions is also inappropriate. Another metric reporting principle is to retain the anonymity of any individual reporting organization. Thus, granularity will be attempted only to the point that it does not compromise that anonymity. An overview of the ALR metric ratings for 2011 is provided in Table 8. Although a number of performance categories have been assessed, some do not have sufficient data to derive conclusions from the metric results. Assessment of these metrics should continue until sufficient data is available.

61 Chapter 4 Reliability Indicator Trends ALR Table 8: 2011 Metric Trend Ratings Boundary Trend Rating 1-5 System Voltage Performance * 1-12 Interconnection Frequency Response * Contingencies 1-4 BPS Transmission Related Events Resulting in Loss of Load 2-4 Average Percent Non-Recovery Disturbance Control Standard Events * 2-5 Disturbance Control Events Greater than Most Severe Single Contingency 3-5 Integrity Interconnected Reliability Operating Limit/ System Operating Limit (IROL/SOL) Exceedances Protection 2-3 Activation of Under Frequency Load Shedding 4-1 Automatic Transmission Outages Caused by Failed Protection System Equipment 1-3 Planning Reserve Margin 6-1 Transmission Constraint Mitigation 6-2 Energy Emergency Alert 3 (EEA3) 6-3 Energy Emergency Alert 2 (EEA2) Adequacy 6-11 Automatic AC Transmission Outages Initiated by Failed Protection System Equipment 6-12 Automatic AC Transmission Outages Initiated by Human Error 6-13 Automatic AC Transmission Outages Initiated by Failed AC Substation Equipment 6-14 Automatic AC Transmission Outages Initiated by Failed AC Circuit Equipment 6-15 Element Availability Percentage (APC) * 6-16 Transmission System Unavailability * Trend Rating Symbols Significant Improvement Slight Improvement No Change / Inconclusive Slight Deterioration Significant Deterioration New Data * 36 of 66

62 Chapter 4 Reliability Indicator Trends ALR1-12 Interconnection Frequency Response Background This metric is used to track and monitor Interconnection Frequency Response. Frequency Response 50 is a measure of an Interconnection's ability to stabilize frequency immediately following the sudden loss of generation or load. It is a critical component of the reliable operation of the bulk power system, particularly during disturbances and restoration. The metric measures the average Frequency Response for all events where frequency drops more than the Interconnection's defined threshold, as shown in Table 9. Assessment The following are Frequency Response calculations for the Eastern Interconnection (EI), Western Interconnection (WI), ERCOT Interconnection, and Québec Interconnection. While the calculations may show trends from year to year, no attempt has been made in this analysis to determine or state what indicates an acceptable level of Frequency Response for any of the interconnections. Rather, they show the relative performance from year to year and can be a basis for further root-cause analysis. Further, Frequency Response should not be compared between interconnections, as their bulk power system characteristics differ significantly in terms of number of facilities, miles of line, operating principles, and simple physical, geographic, and climatic conditions. Some annualized Frequency Responses are higher due to the large number of disturbances in the dataset where frequency changes were greater than the generator dead-bands. Also, in earlier studies, the capacity of the unit or gross output at the time of the unit trip was reported, rather than the net generation 51 MW loss to the interconnection. No conclusions as to the absolute value of any of these calculations can be drawn at this time. Figure 22 shows the criteria for calculating average values A and B. The event starts at time t = 0. 
Value A is the average from t-16 to t-2 and Value B is the average from t+20 to t+52 (in seconds relative to the event start). The difference between Value A and Value B is the change in frequency [52] used for calculating Frequency Response. The lengths of these averaging windows account for the variability in Supervisory Control and Data Acquisition (SCADA) scan rates, which vary from two to six seconds in the multiple-Balancing Authority interconnections.
Figure 22: Criteria for Calculating Value A and Value B
The frequency events used to calculate the Interconnection Frequency Response are the same events to be used by Balancing Authorities for the calculation of their Frequency Bias and, eventually, for compliance with the proposed Frequency Response Standard (FRS) BAL-003. The monthly frequency event candidate lists are posted on the NERC Resources Subcommittee [53] webpage.
[50] Frequency Response is in fact a negative value. However, to reduce confusion for the reader, Frequency Response is expressed in this report as positive values (the absolute value of the actual calculated value).
[51] There could be a coincident loss of load also.
[52] Definitions of Value A and Value B are on Slide 18 of Presentation 1 in:

63 Chapter 4 Reliability Indicator Trends
The data collection process is described in the BAL-003 Frequency Response Standard Supporting Document. [54] The triggers [55] for significant frequency events are shown in Table 9.
Table 9: Frequency Event Triggers
Columns: Interconnection, Frequency (mHz), MW Loss Threshold, Rolling Windows (seconds)
Rows: Eastern, Western, ERCOT, Québec
The actual MW loss for the flagged frequency events is determined jointly by NERC and Regional Entity situation awareness staff. Both the change in frequency and the MW loss determine whether the event qualifies for further consideration in the monthly frequency event candidate list. Table 10 shows the number of frequency events per year for each interconnection. In recent years, better tools have been put in place to detect frequency events and their underlying causes. There are also more systematic procedures to document and verify these events. Given that the generation fleet in the EI has not changed much and Frequency Response has been consistent, the number of frequency events that occur would be expected to have remained about the same over the last several years. The increase in sample size for the EI in 2011, as shown in Table 10, arises because more frequency events are captured with the better tools and procedures developed in 2011. This increased sample size leads to increased confidence in the statistics generated by the data.
Table 10: Sample Sizes of Yearly Events
Columns: Interconnection, then one column per year
Rows: Western, Eastern, ERCOT, Québec
The Frequency Response box plots for each Interconnection are shown below in Figure 23 to Figure 26. The Québec Interconnection started the data collection process in [year lost in transcription]. Figure 27 to Figure 29 are box plots of the sizes of events as loss in MW. These show that, with the more consistent and streamlined data collection procedures developed in 2011, smaller events in the Eastern Interconnection are being captured.
Smaller frequency deviations trigger fewer governors because of the dead band, so these events show less Frequency Response.
[54] %20Frequency%20Response%20Standard%20Supporting%20Document%20-%20RL-%202011%2007%2011.pdf
[55] The frequency thresholds were changed to ensure that an appropriate number of significant events are captured for all interconnections. In September 2011, the Québec threshold was changed to 200 mHz. At the beginning of 2012, the Québec threshold was again increased to 300 mHz and the Eastern Interconnection threshold was increased to 40 mHz.
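The Value A / Value B averaging described for this metric can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the sample layout (a list of time and frequency pairs), the function names, and the choice to express the result in MW per 0.1 Hz are illustrative and are not taken from the report. The result is returned as a positive number, in line with the report's convention of quoting the absolute value.

```python
from statistics import mean

# Averaging windows, in seconds relative to the event start t0,
# as described in the text: A over [t0-16, t0-2], B over [t0+20, t0+52].
A_WINDOW = (-16, -2)
B_WINDOW = (20, 52)


def value_a_b(samples, t0):
    """Return (Value A, Value B) in Hz.

    samples: iterable of (time_s, frequency_hz) pairs (assumed layout).
    t0: event start time in the same time base as the samples.
    """
    pre = [f for t, f in samples if t0 + A_WINDOW[0] <= t <= t0 + A_WINDOW[1]]
    post = [f for t, f in samples if t0 + B_WINDOW[0] <= t <= t0 + B_WINDOW[1]]
    if not pre or not post:
        raise ValueError("no samples fall inside one of the averaging windows")
    return mean(pre), mean(post)


def frequency_response(samples, t0, mw_loss):
    """Frequency Response as a positive value in MW per 0.1 Hz.

    The MW/0.1 Hz unit is an assumption for this sketch; mw_loss is the
    net MW lost to the interconnection for the event.
    """
    value_a, value_b = value_a_b(samples, t0)
    delta_f = value_b - value_a  # negative after a loss of generation
    if delta_f == 0:
        raise ValueError("no change in frequency between Value A and Value B")
    return abs(mw_loss / (delta_f * 10.0))
```

Averaging over multi-second windows, rather than reading single points, is what lets data scanned every two to six seconds be used consistently: any scan rate in that range places several samples inside each window.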

64 Chapter 4 Reliability Indicator Trends
ALR Characteristic: Contingencies
ALR1-4 BPS Transmission Related Events Resulting in Loss of Load
Background
This metric measures bulk power system transmission-related events resulting in the loss of load, excluding weather-related outages. Planners and operators can use this metric to validate their design and operating criteria by identifying the number of instances when loss of load occurs. For the purposes of this metric, an event is an unplanned transmission disturbance that produces an abnormal system condition due to equipment failures or system operational actions, and results in the loss of firm system demand for more than 15 minutes. The reporting criteria for such events are outlined below: [56]
- Entities with a previous year recorded peak demand of more than 3,000 MW are required to report all such losses of firm demand totaling more than 300 MW.
- All other entities are required to report all such losses of firm demand totaling more than 200 MW or 50 percent of the total customers being supplied immediately prior to the incident, whichever is less.
- Firm load shedding of 100 MW or more used to maintain the continuity of bulk power system reliability.
Assessment
Figure 30 illustrates that the number of bulk power system transmission-related events resulting in loss of firm load from 2002 to 2011 is relatively constant. On average, eight to ten events are experienced per year. The top three years in terms of load loss are 2003, 2008, and 2011, as shown in Figure 31. In 2003 and 2011, one event accounted for over two-thirds of the total load loss, while in 2008 a single event accounted for over one-third of the total load loss. Further analysis and continued assessment of the trends over time is recommended.
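The reporting criteria listed in the Background above can be expressed as a small decision function. The Python sketch below is one possible reading, not an official rule set: the function and parameter names are hypothetical, the 15-minute duration from the event definition is assumed to apply to every criterion, and "200 MW or 50 percent of the total customers, whichever is less" is interpreted as "either threshold being reached makes the event reportable".

```python
def is_reportable(prev_year_peak_mw, firm_load_lost_mw, customers_lost_fraction,
                  outage_minutes, load_shed_mw=0.0):
    """Return True if a transmission-related loss-of-load event looks reportable.

    prev_year_peak_mw: entity's recorded peak demand in the previous year.
    firm_load_lost_mw: firm demand lost in the event.
    customers_lost_fraction: share (0.0 to 1.0) of customers supplied
        immediately before the incident that lost service.
    outage_minutes: how long the firm demand was lost.
    load_shed_mw: firm load deliberately shed to maintain reliability.
    """
    # Event definition: firm demand must be lost for more than 15 minutes
    # (assumed here to gate all three criteria).
    if outage_minutes <= 15:
        return False

    # Firm load shedding of 100 MW or more.
    if load_shed_mw >= 100:
        return True

    # Larger entities: previous-year peak above 3,000 MW, loss above 300 MW.
    if prev_year_peak_mw > 3000:
        return firm_load_lost_mw > 300

    # All other entities: more than 200 MW, or half of the customers
    # (interpretation of "whichever is less").
    return firm_load_lost_mw > 200 or customers_lost_fraction >= 0.5
```

For example, under these assumptions a 150 MW loss lasting 30 minutes would be reportable for a 1,000 MW entity only if at least half of its customers were affected.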
Figure 23: ALR1-4 BPS Transmission Related Events Resulting in Load Loss (2002-2011) (axes: Number of Events by Year)
Figure 24: Bulk Power System Transmission Related Event Related Loss of Load (2002-2011)
[56] Details of event definitions are available at:

65 Chapter 4 Reliability Indicator Trends
[Chart: Total Annual Load Loss (MW) by Year, vertical axis 0 to 70,000 MW. Colors represent individual events. Data labels as extracted: 64,850; 7,085; 4,950; 8,942; 3,763; 2,249; 11,045; 4,432; 4,078; 10,]
Special Considerations
The collected data does not indicate whether load loss during an event occurred as designed or not as designed. The usefulness of separating as-designed load loss from unexpected load loss should be investigated further. Also, differentiating between load loss that is a direct consequence of an outage and load loss that results from operator action to mitigate an IROL/SOL exceedance should be investigated.
ALR Characteristic: Integrity
ALR3-5 Interconnection Reliability Operating Limit/System Operating Limit (IROL/SOL) Exceedances
Background
This metric measures the number of times that a defined Interconnection Reliability Operating Limit (IROL) or System Operating Limit (SOL) was exceeded and the duration of these events. Exceeding IROL/SOLs could lead to outages if prompt operator control actions are not taken to return the system to within normal operating limits. Also, exceeding the limits may not directly lead to an outage, but it puts the system at unacceptable risk if the operating limits are exceeded beyond T_v. [57] With the technical support of the Operating Reliability Subcommittee and the former Reliability Coordinator Working Group, this metric was endorsed by NERC's Operating and Planning Committees in June 2010, and a data request was subsequently issued in August [year lost in transcription]. The reporting of IROL/SOL exceedances became mandatory in [year lost in transcription]. While the IROL is used in the Eastern, ERCOT and Québec Interconnections, there are no pre-identified IROLs in the Western Interconnection.
[57] T_v is the maximum time that an Interconnection Reliability Operating Limit can be violated before the risk to the interconnection or other Reliability Coordinator Area(s) becomes greater than acceptable. Each Interconnection Reliability Operating Limit's T_v shall be less than or equal to 30 minutes.

66 Chapter 4 Reliability Indicator Trends
Assessment
Figure 34 and Figure 35 show the number of IROL/SOL exceedances separated by quarter and duration for the Eastern and Western Interconnections, respectively. The second quarter of 2011 shows the most exceedances for the Eastern Interconnection. For the Western Interconnection, the third and fourth quarters of 2011 show a marked increase in the number of SOL exceedances that lasted between 10 seconds and 10 minutes. There is insufficient data to assess any trends in IROL/SOL exceedances.
Figure 25: 2011 Eastern Interconnection ALR3-5 IROL Exceedances by Quarter and Duration (axes: Number of IROL Exceedances by Quarter, 2011Q1 to 2011Q4)
Figure 26: Western Interconnection ALR3-5 SOL Exceedances by Quarter and Duration (axes: Number of SOL Exceedances by Quarter, 2011Q1 to 2011Q4)
Legend for both figures: 10 secs < Duration ≤ 10 mins; 10 mins < Duration ≤ 20 mins; 20 mins < Duration ≤ 30 mins; Duration > 30 mins
Figure 27: ERCOT Interconnection ALR3-5 IROL Exceedances by Quarter and Duration (2011)
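The by-quarter, by-duration grouping used in these exceedance charts can be reproduced with a simple tally. The Python sketch below assumes each exceedance is recorded as a start date and a duration in seconds; that record layout, the function names, and the treatment of bin edges as inclusive upper bounds are assumptions for illustration. Exceedances of 10 seconds or less are left out, since the charts start at durations above 10 seconds.

```python
from collections import Counter
from datetime import date

# Upper bounds in seconds (inclusive, by assumption) and their legend labels.
DURATION_BINS = [
    (10 * 60, "10 secs < Duration <= 10 mins"),
    (20 * 60, "10 mins < Duration <= 20 mins"),
    (30 * 60, "20 mins < Duration <= 30 mins"),
]
LONGEST_BIN = "Duration > 30 mins"


def duration_bin(seconds):
    """Map a duration in seconds to its legend label, or None if <= 10 s."""
    if seconds <= 10:
        return None
    for upper, label in DURATION_BINS:
        if seconds <= upper:
            return label
    return LONGEST_BIN


def quarter_label(day):
    """Format a date as a quarter label such as '2011Q2'."""
    return f"{day.year}Q{(day.month - 1) // 3 + 1}"


def tally_exceedances(events):
    """Count exceedances per (quarter, duration bin).

    events: iterable of (start_date, duration_seconds) pairs.
    """
    counts = Counter()
    for day, seconds in events:
        label = duration_bin(seconds)
        if label is not None:
            counts[(quarter_label(day), label)] += 1
    return counts
```

Each key of the returned counter corresponds to one colored segment of one bar in a chart of this kind.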

67 Chapter 4 Reliability Indicator Trends
[Chart: Number of IROL Exceedances by Quarter, 2011Q1 to 2011Q4. Legend: 10 secs < Duration ≤ 10 mins; 10 mins < Duration ≤ 20 mins; 20 mins < Duration ≤ 30 mins; Duration > 30 mins]
ALR Characteristic: Protection
ALR2-3 Activation of Under Frequency Load Shedding
Background
The purpose of Under Frequency Load Shedding (UFLS) is to balance generation and load when an event causes a significant drop in the frequency of an interconnection or islanded area. The UFLS activation metric measures the number of times UFLS is activated and the total MW of load interrupted in each Regional Entity and NERC-wide.
Assessment
Table 11 illustrates a history of UFLS events from 2001 through [year lost in transcription]. Notably, the load shed in a single event ranged from 24 MW to 17,644 MW. The activation of UFLS is the last automated reliability measure associated with a decline in frequency in order to rebalance the system. Further assessment of the MW loss for these activations is recommended. Because of the large range of load lost in UFLS events, the need to establish a UFLS total load loss threshold is under evaluation. The significance of a UFLS activation compared to the total activation load loss will be assessed over time.

68 Chapter 4 Reliability Indicator Trends
Table 11: ALR2-3 Under Frequency Load Shedding MW Loss
Values as extracted by Regional Entity (year columns lost in transcription; some values truncated):
FRCC: 1,273
MRO: 486
NPCC: 17,
RFC: 6,
SPP: 673 (2 Events)
SERC:
TRE: 1,549
WECC:
ALR4-1 Automatic AC Transmission Outages Caused by Protection System Equipment-Related Misoperations
Background
Originally titled Correct Protection System Operations, this metric has undergone a number of changes since its initial development. To ensure that it best portrays how misoperations affect transmission outages, it was necessary to establish a common understanding of misoperations and the data needed to support the metric. NERC's ERO-Reliability Assessment and Performance Analysis (ERO-RAPA) group evaluated several options for transitioning from existing procedures for the collection of misoperations data to a consistent approach, which was introduced at the beginning of [year lost in transcription]. With the technical guidance of the NERC System Protection and Control Subcommittee (SPCS), NERC and the Regional Entities have agreed upon a set of specifications for misoperations reporting, including format, categories, event cause codes, and reporting period, to arrive at a final, consistent reporting template. [59] Based on the standardized misoperations reporting, two quarters of data in 2011 have been collected. To calculate the metric, the Transmission Availability Data System (TADS) is used to find the total number of automatic outages. Currently, TADS only collects data for transmission outages on elements of 200 kV and above. Therefore, only automatic transmission outages at 200 kV and above, including AC circuits and transformers, will be used in the calculation of this metric.
Assessment
Currently, efforts are underway to reconcile misoperation reporting with TADS event reporting. Therefore, the values for this metric are based on preliminary, not final, data. However, based on the preliminary misoperations reporting, some observations can be made.
With less than a full year's data, a full assessment of protection system misoperation trends is not possible. The decrease in misoperations from the second to the third quarter, as shown in Table 12, is notable, and a year-to-year quarterly comparison would provide better insight into how the number of misoperations is changing over time.
[58] MW loss due to Under Frequency Load Shedding Misoperations has been removed from the Table. Except for SPP in 2008, all MW loss was in a single UFLS event.
[59] The current Protection System Misoperations template is available at:

69 Chapter 4 Reliability Indicator Trends
Table 12: Protection System Misoperations by Regional Entity and Quarter
Columns: Regional Entity, 2011Q2 Misoperations, 2011Q3 Misoperations
Rows: FRCC, MRO, NPCC, RFC, SERC, SPP RE, TRE, WECC
Although it is too early to trend misoperation counts, there are other interesting observations. The vast majority of misoperations in both quarters, shown in Figure 37, are categorized as unnecessary trips. Also, Figure 38 illustrates the top three cause codes assigned to misoperations in both quarters: incorrect setting, logic, or design error; relay failures/malfunction; and communication failure. To further investigate the root causes of protection system misoperations, NERC's Planning Committee has formed a task force known as the Protection System Misoperation Task Force (PSMTF). The PSMTF will investigate protection system misoperations in order to improve performance and mitigate risks due to misoperations.
Figure 28: NERC-wide Misoperations by Category (2011Q2 and 2011Q3)
[Chart categories: Unnecessary Trip during Fault; Unnecessary Trip without Fault; Failure to Trip; Slow Trip. Data labels as extracted: 63%, 32%, 1%, 4%]
Figure 29: NERC-wide Misoperations by Cause Code (2011Q2 and 2011Q3)

70 Chapter 4 Reliability Indicator Trends
[Chart cause codes: Incorrect setting, logic, or design errors; Relay Failures / Malfunctions; Communication Failures; Unknown / Unexplainable; As-left Personnel Error; AC System; DC System. Data labels as extracted: 12%, 7%, 7%, 3%, 16%, 32%, 23%]
ALR6-2 Energy Emergency Alert 3 (EEA3)
Background
This metric identifies the number of times Energy Emergency Alert Level 3 (EEA3) is issued. An EEA3 indicates that firm-load interruptions are imminent or in progress due to capacity and/or energy deficiencies. EEA3 events are currently reported, collected and maintained in NERC's Reliability Coordinator Information System (RCIS), defined in the NERC Standard EOP-002. [60] The number of EEA3s per year provides a relative indication of performance measured at a Balancing Authority or Interconnection level. As historical data is gathered, trends provide an indication of either decreasing or increasing use of EEA3s, signaling real-time adequacy of the electric supply system.
[60] The latest version of EOP-002 is available at:

71 Chapter 4 Reliability Indicator Trends
This metric can also be considered in the context of the Planning Reserve Margin. Significant increases or decreases in EEA3 events with relatively constant ALR1-3 Planning Reserve Margins could indicate changes in the adequacy of the bulk power system that require a review of resources. However, lack of fuel and dependence on transmission for imports into constrained areas can also contribute to increased EEA3 events.
Assessment
Figure 41 shows the number of EEA3 events from 2006 to 2011 at a Regional Entity level. An interactive presentation is available at the Reliability Indicators page. [61] Due to the large number of EEA3s in the Acadiana Load Pocket (ALP) region, the SPP RC coordinated an operating agreement in 2009 with the five operating companies in the ALP to improve reliability performance as an interim step while a $200 million transmission construction program was initiated by Cleco Power, Entergy, and Lafayette Utilities. The addition of two 230 kV lines, one in spring 2011 and another in fall 2011, alleviated a portion of the most limiting elements. An additional 230 kV line was energized in February 2012, and two other 230 kV lines are expected to be energized by the summer of [year lost in transcription]. Completion of these projects should further alleviate transmission congestion in this area and will conclude the planned ALP projects. SPP RTO continues to coordinate operating plans with the operating entities in this area. Mitigation plans and local operating guides in place are expected to provide sufficient flexibility should issues arise. NERC will continue to monitor this area for reliability impacts and coordinate any actions with the SPP Reliability Coordinator and SPP Regional Entity.
Figure 30: ALR6-2 Energy Emergency Alert 3 (EEA3) by Regional Entity and Year (2006-2011) (axes: Number of EEA3s by Regional Entity and Year; Regional Entities: FRCC, MRO, NPCC, RFC, SERC, SPP, TRE, WECC)
[61] The EEA3 interactive presentation is available on the NERC website at:
Special Considerations
The need to include the magnitude and duration of the EEA3 declarations in this metric is under evaluation.
ALR6-3 Energy Emergency Alert 2 (EEA2)
Background
The Energy Emergency Alert 2 (EEA2) metric measures the number of events declared for deficient capacity and energy during peak load periods. The number of EEA2 events, and any trends in their reporting, indicates how robust the system is in supplying aggregate load requirements. The EEA2 declarations may also serve as a leading indicator of energy and capacity shortfall in the adequacy of the electric supply system. EEA2 declarations provide