Tracking performance of Software Security Assurance programs: five essential KPIs


Technical white paper

Table of contents

Background
The difference between metrics and KPIs
KPIs
Key KPI #1: WRT
Key KPI #2: DRW
Key KPI #3: RDR
Key KPI #4: SCM
Key KPI #5: SQR
Conclusion

Background

Every day, technology organizations amass terabytes, even petabytes, of data from various sources about the performance of their technology. Information security teams are no different in their obsession with gathering security data. There is, however, a vast difference between data, metrics, and key performance indicators (KPIs). While information security has been gathering volumes of vulnerability data and consolidating that data into metrics for almost two decades, the inability to provide business-centric context to those metrics alienates business leaders from the knowledge they need to make informed IT risk decisions. KPIs take information security metrics to the next level, providing critical business alignment and bridging the language barrier between technology and business.

Making the leap from data points to contextualized, business-oriented KPIs is not a trivial endeavor. Many mature IT organizations struggle with how to make the data they are collecting work to help achieve their goals. A pivotal moment takes place when the information security organization comes to grips with the fact that the business doesn't always relate to its metrics. IT security must take great care to maintain a dialogue with the business to ensure transparency and a clear understanding of business goals, so that security's wins are properly expressed through KPIs that serve both technology and the business risk strategy.

IT security teams potentially collect terabytes of data when it comes to application security, at the center of which are vulnerability numbers. Security teams showcase the number of critical security vulnerabilities that they have uncovered and ultimately remediated. While an application security test may produce tens or even hundreds of critical security vulnerabilities, what do these issues mean to the business? Why, and which, critical vulnerabilities require action and spend? More important, how can spending business resources on remediating these issues or implementing compensating controls create a positive business impact? These are the KPIs that need to be measured by IT security.

While the IT security team may tend to focus on the number of critical vulnerabilities, the business thinks in terms of total risk. When analyzed carefully, nearly every business application can have critical security issues that can lead to some negative business impact. Business resources, including capital, human talent, and time, are not infinite and must be allocated carefully to address truly business-critical issues, but how can an IT department tell if this is being done? Today's metrics simply do not give us the answer, and they leave IT security managers struggling to provide business context to their security programs.

The secret to making key business decisions is translating the volumes of information security-related data into language the business can understand. The same goes for providing hard evidence on the success or failure of a Software Security Assurance (SSA) program. This paper reveals the five essential SSA program KPIs, their methods of collection, their meaning and importance to the organization, and how to present them in a way that demonstrates measurable success of your security strategy. Combining knowledge and experience in building complex, successful SSA programs, this paper sets the groundwork to advance beyond simple metrics.

The difference between metrics and KPIs

Concrete KPIs are realistically the only way to prove that a particular SSA program is succeeding.
This paper focuses on the five essential KPIs that can transform vulnerability data and security metrics into proven success. The key to making the leap from numbers to concrete metrics is in the way that data is framed against the business goals of risk reduction. Again, when it comes to information security, simply stating "we have not been hacked" is insufficient. It is up to the IT security team to provide context that goes beyond mere data.

The major difference between a metric and a KPI lies in long-term strategic business goals. Metrics tend to aggregate context-less data, while KPIs align metrics to strategic business goals. Aggregating vast numbers of data points into metrics is merely the first step in demonstrating success or failure. Pulling information from business analysts and providing business context to those metrics produces distilled KPIs. KPIs can be thought of as a super-metric: many individual metrics can feed a single KPI to contextualize the various numbers and link them together to derive deeper meaning.

KPIs can be a very powerful tool when implemented correctly and in accordance with the phased maturity of an enterprise IT risk mitigation program. Harnessing the potential of the five KPIs presented here requires increasing levels of organizational maturity, which builds over time and involves a cross-functional approach to SSA. This approach may initially feel foreign but is based on time-tested methods across software quality disciplines.

KPIs

There are five KPIs that are useful in defining and proving the success or failure of an SSA program. They are Weighted Risk Trend (WRT), Defect Remediation Window (DRW), Rate of Defect Recurrence (RDR), Specific Coverage Metric (SCM), and Security to Quality defect Ratio (SQR). These five KPIs are not trivial to gather; it is unlikely that a young organization will be able to assemble all five. As organizational maturity grows, so will that organization's ability to gather incrementally more metrics. And as the KPIs become incrementally more difficult to attain, the payback becomes greater as well. Collecting and aggregating KPIs alone cannot make a security team or program successful; rather, the KPIs more clearly illustrate the success or failure of the implemented program effort.

The KPIs are defined as follows:
- WRT: the weighted risk score over time and iterations of development
- DRW: how long a defect takes to remediate to a fixed state
- RDR: the rate at which a defect is re-introduced over the life of an application
- SCM: the total addressable attack surface of the application's functionality tested
- SQR: the number of defects in a testing cycle logged as security defects, as a ratio to all quality defects

This paper explores how each key KPI is gathered and used to validate the SSA program. While this paper focuses on building KPIs for Web applications, it is a realistic extension to compile these same KPIs for any type of software development endeavor.

Key KPI #1: WRT

Definition: The first step in an organization's commitment to security is creating a repository of the totality of its Web applications. The repository should contain metadata on each application's owner, business purpose, and other data; among these should be a business criticality metric that provides a clear and straightforward ranking of applications by their criticality to the business. The formalization of this metric is out of scope for this document, but it should be expressed numerically. The WRT takes the business criticality metric and combines it, in various mathematical formulas (one such formula is suggested here), with the vetted defect criticalities themselves to obtain a numerical representation of application risk with business criticality in mind. WRT provides business context to security vulnerability metrics.

Organizational importance: The WRT is the single most important metric early on in the maturity of a secure organization. It tracks application weight, defect count, trends over time, and revisions of the application, giving a holistic, long-term picture of whether the identified security issues are being mitigated strategically or tactically. The WRT provides a business-weighted view of application security risk over the life of the development process, best measured from sunrise to sunset of an application (where possible). Weighted risk trending allows an organization to assess whether resources are being properly allocated to mitigate the greatest organizational risk and can be used to gauge the impact of a new program, initiative, or expenditure. The application weight should be pulled not from an IT security exercise but from an existing repository. The disaster recovery plan is a fantastic source for application weighting information.
Because the business must rank each application in order of criticality for the purposes of disaster recovery, and because each level of uptime requirement has an associated monetary value, it is more likely that the business will realistically stratify applications in terms of real business criticality. Furthermore, this data should be readily obtainable in most businesses and provides concrete artifacts from the business on application importance.
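The gathering method that follows spells out a baseline formula for the weighted risk score. As a preview, here is a minimal sketch in Python of how that calculation might be automated; the function name, severity weights, and defect counts are illustrative values taken from the example below, not a prescribed implementation.

```python
# Minimal sketch of the WRT risk-score calculation (see the baseline formula
# in the gathering method below). Severity weights and defect counts are
# illustrative, not prescribed.

SEVERITY_WEIGHTS = {"critical": 10, "high": 5, "medium": 2, "low": 1}  # nonlinear on purpose

def weighted_risk_score(defect_counts: dict, application_weight: float) -> float:
    """Risk score = sum(severity weight x defect count) x application weight."""
    raw_score = sum(SEVERITY_WEIGHTS[severity] * count
                    for severity, count in defect_counts.items())
    return raw_score * application_weight

# Example figures from this paper: 10 critical, 7 high, 30 medium, 39 low; weight 0.75
score = weighted_risk_score(
    {"critical": 10, "high": 7, "medium": 30, "low": 39},
    application_weight=0.75,
)
print(score)  # 175.5 -- plotted per application and per release to form the WRT trend
```

Plotting this score per application over successive releases produces the WRT trend line illustrated in Figure 1.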

Gathering method: The WRT is a plot of weighted risk over time. The WRT formula is not static, meaning it is not the same for every organization. A baseline for measuring is the following:

([defect criticality] x [number of defects])* x [application weight] = Risk score

*Summed for each defect class: critical, high, medium, low

Example:
Application weight (0-1.0): 0.75
Severity scoring**: critical = 10, high = 5, medium = 2, low = 1
Defects discovered: 10 critical, 7 high, 30 medium, 39 low

((10 x 10) + (5 x 7) + (2 x 30) + (1 x 39)) x 0.75 = 175.5

**The severity scoring used here can be anything that the business determines is appropriate; these numbers should be nonlinear to demonstrate nonlinear risk progression.

Figure 1. Sample graphed WRT: vulnerability reduction, with business context (example applications: ERP, Retail, Marketing)

Key KPI #2: DRW

Definition: The DRW is a period-of-time metric spanning from the identification of a defect (more specifically, the validation that the defect is real by either a developer or a qualified application security professional) to the time when it is verified to be remediated (again, in the same way it was validated as real). The DRW measures how long an organization takes to fix a documented, verified defect that may be impacting the application right now. It is important to note that the main goal of the DRW is to measure the amount of developer impact and, therefore, it is best measured as the number of man-hours a developer or team takes to remediate the defect. The DRW measures an organization's responsiveness to defect remediation and can serve as a good measure of organizational maturity.

The defect remediation window clearly demonstrates the impact a holistic security program has on development by showing the decrease in the time it takes to fix application security defects. As security programs mature, and development teams accept and establish security best practices, they can stop fixing each issue individually and can instead address application security defects with large-scale, one-time architectural fixes. This approach reduces the time it takes to fix, for example, 50 cross-site scripting (XSS) vulnerabilities from 50 individual fixes to possibly two or three more precise fixes, which is a remarkable difference in time spent in development making applications more secure.
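As a rough sketch of how the DRW might be derived in practice, the fragment below sums developer work hours per remediated defect and averages them for a release; the defect-record field names are hypothetical and would normally come from a formal defect tracking system (see the gathering method that follows).

```python
# Minimal sketch: DRW as developer man-hours per remediated defect, averaged
# per release. Defect-record field names are hypothetical.
from statistics import mean

defects = [
    # hours logged across each remediation/retest cycle for a verified defect
    {"id": "XSS-101", "work_hours_per_cycle": [6.0, 2.5]},
    {"id": "SQLI-032", "work_hours_per_cycle": [12.0]},
]

def drw_hours(defect: dict) -> float:
    """Total man-hours spent remediating one verified defect, across all cycles."""
    return sum(defect["work_hours_per_cycle"])

release_drw = mean(drw_hours(d) for d in defects)
print(f"Average DRW this release: {release_drw:.1f} man-hours")  # should trend downward over releases
```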

This metric should not be confused with the exposure window, a metric that identifies how long a defect potentially existed in the wild before it was closed. The exposure window is much more difficult to attain and less valuable than the DRW.

Organizational importance: The DRW serves to identify an organization's responsiveness to security defects in its Web applications. How quickly an organization responds to a real defect speaks volumes about how serious that organization is about its security and about the availability of resources (developers, time, funding). The shorter the DRW is, consistently across the enterprise, the more mature an organization is considered to be with respect to its SSA program.

Gathering method: The most common approach for measuring the DRW is to use a formal defect tracking system that can store defect data such as when the defect was identified, validated, retested, and then closed. Ideally this metric is plotted over time to show a decrease in the amount of time needed to respond to and close an identified defect. The most important components to capture when tracking defects for this purpose are the following:
- Defect discovery date/time
- Defect validation date/time
- Defect work time: how long the defect takes to be worked on (for each cycle)
- Defect retest iterations (cycles)
- Defect close date/time

Figure 2. Graphed DRW (man-hours) over iterations of development

As an SSA program matures over time, the DRW should naturally start to drop in stages. Because security is initially a new concept for developers, the remediation window can be long at first. As developers start to understand security concepts, the number of hours per defect remediated drops, slowly at first and then more dramatically, as common modules and code re-use are introduced. The organization should expect several small dips in the DRW metric in the initial stages of security program implementation, then a dramatic drop as the entire development organization begins a mature cycle of code re-use and a security-infused software development lifecycle (SDLC).

Key KPI #3: RDR

Definition: The RDR is the rate at which previously closed (security) defects are re-inserted into the application. The recurring defects can be re-inserted in the subsequent release cycle or in later development cycles; what counts is that the same defect is reintroduced into the same place, in the same manner, in the same application. To measure the RDR successfully, an organization needs a way to track defects clearly with respect to specific defect type (a specific label such as SQL injection, stored cross-site scripting, and so on) and location within the application (line of source code, specific URL, or another way of identifying the specific location). Note that the specific execution instance of the defect is irrelevant. This means that reflected cross-site scripting (RXSS) using one character string is indistinguishable (not separate) from the same attack using a slightly modified attack string. This prevents developers from fixing a specific instance of a defect rather than eliminating the defect and all permutations of it.
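To make the idea of "the same defect" concrete, here is a minimal sketch of recurrence detection that keys a defect on application, type, and location while ignoring the specific attack string; the field names and values are hypothetical.

```python
# Minimal sketch: flagging recurred defects across release cycles. A defect's
# identity is (application, type, location); the attack payload is deliberately
# excluded, since specific execution instances are irrelevant. Field names are
# hypothetical.

def defect_key(defect: dict) -> tuple:
    """Identity used to decide whether a newly found defect is a recurrence."""
    return (defect["application"], defect["type"], defect["location"])

closed_last_release = {
    defect_key({"application": "retail-web", "type": "reflected XSS",
                "location": "/search (q parameter)"}),
}

found_this_release = [
    {"application": "retail-web", "type": "reflected XSS",
     "location": "/search (q parameter)"},
    {"application": "retail-web", "type": "SQL injection", "location": "/login"},
]

recurred = [d for d in found_this_release if defect_key(d) in closed_last_release]
print(f"Recurred defects this release: {len(recurred)}")  # plotted per application, per release cycle
```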

Organizational importance: RDR measures an organization's ability to close defects permanently. This does not necessarily mean that the same type of defect will not re-occur in another application or elsewhere in the same code at a later cycle; only that the same defect (same placement, same type) does not recur once it has been successfully tested as closed. Initially, development organizations have a tendency to bandage code, which makes the code susceptible to defect recurrence if the bandage is removed or tampered with; but as the organization truly learns security concepts and secure coding practices, defects can be remediated systemically with a lessened chance of recurrence. The RDR KPI demonstrates a development organization's maturity in absorbing information security concepts, adopting secure coding practices, reusing secure code, and other mature practices. RDR is plotted over time and should decrease, trending as close to zero as possible.

Gathering method: RDR is best gathered using a formal defect tracking tool with the capability of tracking a defect across multiple release cycles. Defect tracking of this nature is accomplished best through tools, although it can be done manually. An organization reaching this level of maturity typically has the tools to perform this type of defect tracking without the need for manual maintenance of spreadsheets, databases, and so on. The RDR is plotted as the number of recurred defects (defects recurred from previous releases) per application over each release cycle.

Key KPI #4: SCM

Definition: SCM defines the flow-based or component-based coverage that security testing has achieved in the application. The SCM measures the percentage of flows or components covered by security testing against the total flows or components of the application under review. It is important to note that the SCM is specific to the portion of the total components currently under review and thus does not necessarily take into account the entire application. The SCM is best used to quantify, in a definitive way, how much of the application (surface area) was tested without resorting to guesswork.

Organizational importance: Achieving measurement of the SCM is a milestone in enterprise maturity, as it means that the organization is now able to quantify testing coverage with respect to security testing. While functional testing teams have been providing these types of metrics for quite some time, this is a relatively new and complex concept for security teams. Measuring the SCM requires advanced levels of maturity in both process and tools. Accurately reporting the SCM requires both a complete understanding of the application attack surface under test and a way to measure the testing being performed against that attack surface. This means a clear understanding and mapping of the application's flows or components, along with a way to measure security testing coverage accurately against those flows. Reporting the SCM provides a definitive way of noting that an application was fully tested, or it provides a direct pointer to which flows or components were missed and, more important, why they were missed. Organizationally, this KPI is important because it provides direct, concrete proof of coverage. This KPI not only gives the business peace of mind but also typically serves as a way of demonstrating compliance more concretely. Historically, security testing teams have struggled with providing a real answer to the question "how much was tested?"
Achieving a maturity state where the SCM can be accurately reported not only gives the security testing teams more credibility but also more confidence that their processes and testing methods are creating a more secure application, not just randomly testing findable features.

Gathering method: Gathering methods vary by organization, but the SCM is the percentage of covered application flows/components (Fc) against the whole of the application's flows/components under test (Ft). Obtaining the total surface area of the application under test, whether flow-based or component-based, is possible through data extraction from functional specifications or functional testing. This metric is generally collected by the functional testing organization and passed to the security testing organization, and it is always based upon the functional specifications for the application under test. Gathering the percentage of the application covered in security testing requires advanced security tools and processes. This metric may be gathered by hand (if the application is being tested by hand) or through the use of automation, if the tool has a way of accurately demonstrating which flows/components have been tested completely.
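A minimal sketch of the SCM calculation follows, assuming the flow inventory (Ft) comes from functional specifications and the tested flows (Fc) come from the security testing tools or manual test logs; the flow names are hypothetical.

```python
# Minimal sketch: SCM = Fc / Ft, expressed as a percentage. Flow names are
# hypothetical; Ft would come from functional specifications, Fc from the
# security testing tools or manual test logs.

total_flows = {"login", "checkout", "search", "profile-update", "admin-report"}  # Ft
tested_flows = {"login", "checkout", "search"}                                   # Fc

scm = len(tested_flows & total_flows) / len(total_flows) * 100
missed = sorted(total_flows - tested_flows)

print(f"SCM: {scm:.0f}% of flows covered")  # 60%
print(f"Flows missed: {missed}")            # direct pointer to which flows were not tested
```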

Key KPI #5: SQR

Definition: SQR measures the number of security-specific defects against the overall defects the application testing team uncovers during testing cycles. The SQR metric provides deeper insight into the quality of the application's development in terms that quantify the impact security defects are having on the application. The SQR is mapped against a business risk tolerance to provide a way to assist the business with making go/no-go decisions. In organizations that are successfully tackling information security risk reduction systemically, the SQR is more heavily skewed toward quality defects. This metric can be compared with similar metrics mapped to the various other defect types, creating an at-a-glance view of the types of defects an organization is creating and where focus should be given with respect to education, mitigation, and process. This metric is best represented as a fraction (security defects/quality defects), where the term quality defects is used to describe the totality of defects in the testing organization's test cases.

Figure 3. Graphical SQR (example series: Functional 75%, Performance 60%, Security 35%)

Organizational importance: The SQR identifies the organization's security posture with respect to minimizing risk from security defects in the application. This metric clearly demonstrates where an organization's priorities and effort are being spent and whether efforts to minimize various risks (specifically information security risks) are effective. Organizations with a very low SQR produce code with few security defects, which means that they are producing more risk-averse applications.

Gathering method: The SQR is best gathered through automated methods from quality and security testing tools. The simplest approach to gathering this metric is to utilize a single testing suite that can gather and return all the various types of defect metrics in detail and in aggregate. Gathering and reporting the SQR metric involves being able to look at the totality of testing defects, quantify the security defects against that whole, and then report that ratio as a fraction (security/quality aggregate) over the various testing cycles (time).
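As a minimal sketch of the SQR calculation, assuming per-category defect counts are already available from the testing tools (the category names and counts here are hypothetical):

```python
# Minimal sketch: SQR as the fraction of security defects over the totality of
# quality defects found in a testing cycle. Categories and counts are hypothetical.

defects_this_cycle = {"functional": 42, "performance": 11, "security": 7}

total_quality_defects = sum(defects_this_cycle.values())  # the whole of testing defects
security_defects = defects_this_cycle["security"]

sqr = security_defects / total_quality_defects
print(f"SQR: {security_defects}/{total_quality_defects} = {sqr:.2f}")  # reported per testing cycle
```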

Conclusion

Over time, these KPIs can provide vastly improved business value. These KPIs contextualize the value of an SSA program and provide a means to have a more meaningful conversation with the business you support. You are encouraged to work through these KPIs. While few organizations will be able to gather the necessary metrics to feed all five KPIs right from the start, your organization will realize the long-term benefits of having a more meaningful and more cross-functional approach to your SSA program.

HP Software offers a portfolio of software quality, performance, and security technologies and services foundational to the maintenance of a business-grounded SSA program. When your organization decides it's time to build security into your software, look to HP. We offer a full suite of integrated technologies that can guide you to attain the critical KPIs discussed in this paper. We also deliver services to make sure you've got the experience and knowledge to build the right security. If you're ready to have a better conversation with your business about the security state of your applications, come talk to HP Software at (U.S.) or visit.

Copyright 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

4AA3-3079ENW, Created February 2011; Updated September 2011, Rev. 1