Introduction to Software Metrics


1 Introduction to Software Metrics Outline Today we begin looking at measurement of software quality using software metrics We'll look at: What are software quality metrics? Some basic measurement theory A framework for software measurement We'll also focus on several examples of product metrics: External product metrics (defect metrics) Internal product metrics (size metrics, complexity metrics) CSCI 3060U Lecture 19 Slide 1

2 Software Quality Metrics Applying Measurement to Software Software metrics are measurable properties of software systems, their development and use Includes wide range of different measures, of: properties of the software product itself the process of producing and maintaining it its source code, design, tests, etc. Examples are: number of failures time to build number of lines of code number of failures per 1,000 lines of code number of lines of code per programmer per month number of decisions per 1,000 lines of code CSCI 3060U Lecture 19 Slide 2

3 What are Metrics Good for? Reliability and Quality Control Metrics help us to predict and control the quality of our software Example: By measuring relative effectiveness of defect detection and removal of various testing or inspection methods, we can choose best one for our software products Cost Estimation and Productivity Improvement Metrics help us predict effort to produce or maintain our software, and to improve our scheduling and productivity Example: By measuring code production using different languages or tools, we can choose those that give the best results Quality Improvement Metrics help us to improve code quality and maintainability Example: By measuring complexity of our program code, we can identify sections of code most likely to fail or difficult to maintain CSCI 3060U Lecture 19 Slide 3

4 Kinds of Metrics Three Basic Kinds There are three kinds of software quality metrics: product metrics, process metrics and project metrics Product Metrics Product metrics are those that describe the internal and external characteristics of the product itself Examples: size, complexity, features, performance, reliability, quality level Most common software metrics are of this kind CSCI 3060U Lecture 19 Slide 4

5 Kinds of Metrics Process Metrics Process metrics measure the process of software development and maintenance, in order to improve it Examples: effectiveness of defect removal during development, pattern of defect arrival during testing, response time for fix Project Metrics Project metrics are those that describe the project characteristics Examples: number of developers, development cost, schedule, productivity CSCI 3060U Lecture 19 Slide 5

6 Measurement Basics If You Want to Know, Measure... "When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager kind." - William Thomson (Lord Kelvin), Physicist ...But Make Sure You Know What You Are Measuring "In truth, a good case could be made that if your knowledge is meager and unsatisfactory, the last thing in the world you should do is make measurements. The chance is negligible that you will measure the right things accidentally." - George A. Miller, Psychologist CSCI 3060U Lecture 19 Slide 6

7 Measurement Basics Definition of Measurement Measurement is the process of empirical, objective assignment of numbers to entities, in order to characterize an attribute What Does That Mean? An entity is an object or event, such as a source program An attribute is a feature or property of an entity, such as the size of the program Objective means measurement must be based on a well-defined rule whose results are repeatable, such as counting the number of source lines in the program In Other Words... Each entity is given a number, which tells you about its attribute Example: Each source program has a source line count, which tells you about its size CSCI 3060U Lecture 19 Slide 7

8 Measurement Basics Example Measurements Source: Fenton, Agena Corp CSCI 3060U Lecture 19 Slide 8

9 Measurement Basics Common Mistakes in Software Measurement It's easy to make mistakes in choosing what or how to measure software characteristics To avoid mistakes, stick to the definition of measurement (1) You must specify both an entity and an attribute, not just one or the other Example: Don't just say you are measuring a program, say what property of the program you are measuring Example: Don't just say you are measuring the size of the software, say what artifact of the software you are measuring the size of (e.g., source programs) (2) You must define the entity precisely Example: Don't just say program, say program source code CSCI 3060U Lecture 19 Slide 9

10 Measurement Basics Common Mistakes in Software Measurement (continued) (3) You must have a good intuitive understanding of the attribute before you propose a measure for it Example: We have good evidence that size is related to number of source lines It is a mistake to propose a measure if there is no clear consensus on what attribute it is characterizing Example: Number of defects per KLOC (1000 lines of code) - characterizes quality of code, or quality of testing? It is a mistake to redefine an attribute to fit an existing measure Example: If we've measured #defects found this month, don't mistake that as an indicator of code quality CSCI 3060U Lecture 19 Slide 10

11 Kinds and Uses of Software Measurement Kinds of Measurement Two distinct kinds of measurement, 1. direct measurement, and 2. indirect measurement Uses of Measurement Two distinct uses for measurement, 1. assessment (the way things are now), and 2. prediction (the way things are likely to be in future) Measurement for prediction requires a prediction system CSCI 3060U Lecture 19 Slide 11

12 Direct Measurement Some Direct Software Measures Direct measures are numbers that can be derived directly from the entity without other information Examples: Length of source code (measured by number of lines) Duration of testing process (measured in elapsed hours) Number of defects discovered during the testing process (measured by counting defects) Effort of a programmer on a project (measured by person-months worked) CSCI 3060U Lecture 19 Slide 12

13 Indirect Measurement Some Indirect Software Measures Indirect measures are numbers that are derived by combining two or more direct measures to characterize an attribute Examples: CSCI 3060U Lecture 19 Slide 13

14 Predictive Measurement Prediction Systems Measurement for prediction requires a prediction system A prediction system consists of: A mathematical model Example: E = a S^b, where E is the effort (to be predicted), S is the estimated size in lines of code, and a and b are constant parameters A procedure for determining the model parameters Example: Analyze past project data to determine a and b A procedure for interpreting the results Example: Use Bayesian probability analysis to determine the likelihood that our prediction is accurate within 10% CSCI 3060U Lecture 19 Slide 14

15 A Framework for Software Measurement Products, Processes and Resources Process A software-related activity or event (e.g., designing, coding, testing,...) Source: Fenton, Agena Corp Product An item that results from a process (e.g., test plans, source code, design and specification documents, inspection reports,...) Resource An item that is input to a process (e.g., people, hardware, software,...) CSCI 3060U Lecture 19 Slide 15

16 A Framework for Software Measurement Internal and External Attributes Let X represent any product, process or resource The external attributes of X are those attributes which can only be measured by how X interacts with its environment Example (product): Mean time to failure of a program Example (product): Maintainability of source code The internal attributes of X are those attributes which can be measured purely in terms of X itself Example (product): Length of source code Example (product): Complexity of source code These are related to direct and indirect measurements (but are not the same - e.g., because indirect measurements can be made involving only internal attributes) CSCI 3060U Lecture 19 Slide 16

17 A Framework for Software Measurement Applying the Framework Source: Fenton, Agena Corp CSCI 3060U Lecture 19 Slide 17

18 Product Metrics Lecture 19 Slide 18

19 External Product Metrics Measures of the Software in its Environment External metrics are those we can apply only by observing the software product in its environment (e.g., by running it) Includes many measures, but particularly: CSCI 3060U Lecture 19 Slide 19

20 Reliability Definition of Reliability Reliability is the probability that a system will execute without failure in a given environment over a given period of time Implications: no single number for a given program - depends on how the program is used (its environment) use probability to model our uncertainty time dependent CSCI 3060U Lecture 19 Slide 20

21 Reliability Definition of Failure Formal view: Any deviation from specified behaviour (by environment) Engineering view: Any deviation from required, specified or expected behaviour (by user) CSCI 3060U Lecture 19 Slide 21

22 Errors, Faults and Failures Definitions A (human) error is a mistake or oversight on the part of a designer or programmer, which may cause... A fault, which is a mistake in the software, which in turn may cause... A failure when the program is run in certain situations Defects A defect is usually defined as a fault or a failure: Defects = Faults + Failures (or sometimes just Faults or just Failures) CSCI 3060U Lecture 19 Slide 22

23 Defect Density Metric Defect Density Defect density is a standard reliability metric: Defect Density = Number of defects found / System Size Size is normally measured in KLOC (1000s of lines of code), so units of defect density are defects found per 1000 lines Widely used as an indicator of software quality CSCI 3060U Lecture 19 Slide 23
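As a minimal sketch, the defect density calculation above is just a ratio; the helper function below is hypothetical, not part of the lecture's tooling:

```python
def defect_density(defects_found: int, size_kloc: float) -> float:
    """Defect density in defects found per KLOC (1000 lines of code)."""
    return defects_found / size_kloc

# e.g., 120 defects found in a 40 KLOC system
print(defect_density(120, 40.0))  # 3.0 defects per KLOC
```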

24 Predictive Power of Defect Density However... Unfortunately, faults are not a good predictor of failures, and vice versa (Adams 1984) Source: Fenton, Agena Corp The great majority of faults cause only about 1% of failures, and 35% of failures are caused by only about 2% of faults This finding makes historical defect density look like a poor predictor of quality CSCI 3060U Lecture 19 Slide 24

25 (ASIDE) A Process Metric Using Defects Effectiveness of Defect Detection & Defect Removal Defect statistics can also be used for evaluating and improving software process Source: Fenton, Agena Corp Grady (1992) used defect metrics to show effectiveness of inspection vs. testing CSCI 3060U Lecture 19 Slide 25

26 Internal Product Metrics Measures of the Product Itself The vast majority of metrics in practice are internal product metrics, measures of the software code, design or functionality itself, independent of its environment The U.S. military lists literally hundreds of measures of code alone These measures are easy to make and easy to automate, but it's not always clear which attributes of the program they characterize (if any) CSCI 3060U Lecture 19 Slide 26

27 Code Metrics Software Size The simplest and most enduring product metric is the size of the product, measured using a count of the number of lines of source code (LOC), most often quoted in 1000s (KLOC) It is used in a number of other indirect measures, such as productivity ( LOC / effort ) effort / cost estimation ( Effort = f(LOC) ) quality assessment / estimation ( defects / LOC ) Many similar measures are also used KDSI (1000s of delivered source instructions) NLOC (non-comment lines of code) number of characters of source or bytes of object code CSCI 3060U Lecture 19 Slide 27
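To make the LOC / NLOC distinction concrete, here is a sketch of two naive line counters (hypothetical helpers; real counting tools have per-language rules for comments, continuations, and generated code):

```python
def count_loc(source: str) -> int:
    """Total non-blank lines of code (LOC)."""
    return sum(1 for line in source.splitlines() if line.strip())

def count_ncloc(source: str, comment_prefix: str = "#") -> int:
    """Non-comment, non-blank lines of code (NLOC)."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith(comment_prefix)
    )

program = """# spell checker driver
def check(words, dictionary):
    # report unknown words
    return [w for w in words if w not in dictionary]
"""
print(count_loc(program))    # 4
print(count_ncloc(program))  # 2
```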

28 Using Code Metrics Example: Toshiba Productivity Source: Fenton, Agena Corp CSCI 3060U Lecture 19 Slide 28

29 Problems with LOC Measures What Attribute is Measured? LOC really measures length of program (a physical characteristic), not size (a logical characteristic) Mistakenly used as a surrogate for measures of what we're really interested in - effort, complexity, functionality Does not take into account redundancy, reuse (e.g., XXXX Bank MLOC, only about 100 MLOC unique) Cannot be compared across different programming languages Can only be measured at end of development cycle CSCI 3060U Lecture 19 Slide 29

30 Better Size Measures Fundamental Size Attributes of Software Length - the physical size of the software (LOC will do as measure) Functionality - the capabilities provided to the user by the software (how big / rich is the set of functions provided) Complexity - how complex is this software? Problem complexity - measures the complexity of the underlying problem Algorithmic complexity - measures the complexity / efficiency of the solution implemented by the software Structural complexity - measures the structure of the program used to implement the algorithm (includes control structure, modular structure, data flow structure and architectural structure) Cognitive complexity - measures the effort to understand the software CSCI 3060U Lecture 19 Slide 30

31 Code Complexity Metrics Better Measures of Source Code Realization that we need something better approaching cognitive complexity led to work on complexity metrics for code Early explorations measured characteristics such as: number / density of decision (if) statements number / depth of blocks / loops number / average length of methods / classes and many more Best known and accepted source code complexity measures are Halstead's Software Science metrics McCabe's Cyclomatic Complexity and Data Complexity metrics CSCI 3060U Lecture 19 Slide 31

32 Halstead's Software Science Metric Operators and Operands Program source code considered as a sequence of tokens, each of which is either an operator or an operand n1 = number of unique (different) operators n2 = number of unique (different) operands N1 = total number of operator uses N2 = total number of operand uses Length of program N = N1 + N2 Vocabulary of program n = n1 + n2 CSCI 3060U Lecture 19 Slide 32
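The four Halstead counts can be sketched with a token tally. The operator set and the pre-tokenized input below are assumptions for a tiny expression language; a real tool would tokenize and classify using the programming language's grammar:

```python
from collections import Counter

# Hypothetical operator set for a tiny expression language
OPERATORS = {"=", "+", "*", "(", ")"}

def halstead_counts(tokens):
    """Return (n1, n2, N1, N2) for a token sequence."""
    ops = Counter(t for t in tokens if t in OPERATORS)
    opnds = Counter(t for t in tokens if t not in OPERATORS)
    n1, n2 = len(ops), len(opnds)                    # unique operators / operands
    N1, N2 = sum(ops.values()), sum(opnds.values())  # total uses
    return n1, n2, N1, N2

# Tokens of the statement: z = x + x * y
tokens = ["z", "=", "x", "+", "x", "*", "y"]
n1, n2, N1, N2 = halstead_counts(tokens)
print("length N =", N1 + N2)      # 7
print("vocabulary n =", n1 + n2)  # 6
```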

33 Halstead's Software Science Metric The Software Science Predictive Theory Using n1, n2, N1 and N2 as a basis, Halstead formulated a theory of software complexity and effort CSCI 3060U Lecture 19 Slide 33

34 McCabe's Cyclomatic Complexity Metric Flow Graphs Again If the control flow graph G of program P has e edges and n nodes, then the cyclomatic complexity v of P is v(P) = e - n + 2 v(P) is the number of linearly independent paths in G Example: e = 16, n = 13, v(P) = 16 - 13 + 2 = 5 More simply, if d is the number of decision nodes in G then v(P) = d + 1 McCabe proposed that for each module P, v(P) < 10 Source: Fenton, Agena Corp CSCI 3060U Lecture 19 Slide 34
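Both formulations of cyclomatic complexity reduce to one-line arithmetic once the flow graph is known (a minimal sketch; extracting e, n, or d from real source code is the hard part a metrics tool does for you):

```python
def cyclomatic_complexity(edges: int, nodes: int) -> int:
    """v(P) = e - n + 2 for a connected control flow graph."""
    return edges - nodes + 2

def cyclomatic_from_decisions(decisions: int) -> int:
    """Equivalent decision-count form: v(P) = d + 1."""
    return decisions + 1

# The slide's example graph: 16 edges, 13 nodes
print(cyclomatic_complexity(16, 13))   # 5
# The same graph must therefore contain 4 decision nodes
print(cyclomatic_from_decisions(4))    # 5
```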

35 Other Flowgraph Metrics Flowgraph Complexity of Software McCabe is just one of many flowgraph-based complexity metrics, all with the advantage that they are independent of language Others measure things like: maximum path length number / interaction of cycles maximum number of alternative paths (width or fan out ) lots of others Can all be automatically computed once flowgraph is known (and it can be automatically computed too) Flowgraph decomposition, which partitions flowgraphs into independent one-node-in one-node-out subgraphs, provides a rigorous general theory of structured programming And it can also be automatically computed CSCI 3060U Lecture 19 Slide 35

36 Introduction to Software Metrics Summary Measurement is about characterizing the attributes of entities can be direct or indirect can be for either assessment or prediction Framework for software measurement is based on: classifying entities as products, processes and resources classifying attributes as internal and external determining whether we want assessment or prediction Product metrics include both external metrics (e.g., defect) and internal metrics (e.g., size, complexity) CSCI 3060U Lecture 19 Slide 36

37 Metrics II - Process Metrics Outline Today's topic is process metrics, the measurements we can make related to the software development and maintenance process We'll look at: Predictive process metrics - effort and cost Specification metrics CSCI 3060U Lecture 19 Slide 37

38 Process Metrics Lecture 19 Slide 38

39 COCOMO Software Cost Estimation - COCOMO COCOMO stands for COnstructive COst Model A method for modeling software development to yield estimates of project effort and cost before undertaking the project Based on a mathematical model of effort plus empirical constants to parameterize the model CSCI 3060U Lecture 19 Slide 39

40 Estimating Effort with COCOMO Simple COCOMO Effort Prediction The simplest COCOMO model uses the estimate Effort = a (Size)^b Where Effort is measured in person-months, Size is the predicted size of the software in KDSI* And a and b are empirically derived constants depending on the kind of software: organic (stand-alone in-house data processing systems) a = 2.4 b = 1.05 embedded (real-time or hardware complicated systems) a = 3.6 b = 1.2 semi-detached (in between organic and embedded) a = 3.0 b = 1.12 * Recall: KDSI means thousands of delivered source instructions CSCI 3060U Lecture 19 Slide 40
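The effort model above can be sketched directly from the slide's constants (the 32 KDSI example project is made up for illustration):

```python
# Basic COCOMO: Effort = a * Size**b, Size in KDSI, Effort in person-months
COCOMO_EFFORT = {
    "organic":       (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded":      (3.6, 1.20),
}

def cocomo_effort(size_kdsi: float, mode: str = "organic") -> float:
    a, b = COCOMO_EFFORT[mode]
    return a * size_kdsi ** b

# A hypothetical 32 KDSI project under two modes
print(round(cocomo_effort(32, "organic"), 1))   # ~91.3 person-months
print(round(cocomo_effort(32, "embedded"), 1))  # ~230.4 person-months
```

Note how the embedded constants penalize the same size estimate with roughly 2.5x the effort.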

41 Problems Estimating Size The Downside of COCOMO The simple COCOMO model is claimed to give good order of magnitude estimates of required effort But it depends on a size estimate - which some say is just as hard to predict as effort! Example: In one experiment, managers were asked to estimate software size given the complete specifications The average deviation from the actual size was 64%, and only 25% of the estimates were within 25% of the actual size CSCI 3060U Lecture 19 Slide 41

42 Estimating Time with COCOMO Simple COCOMO Development Time Prediction COCOMO uses a similar model to estimate time given effort Time = a (Effort)^b Where Time is measured in months, and Effort is in person-months Again, a and b are (quite different) empirically derived constants depending on the kind of software: organic (stand-alone in-house data processing systems) a = 2.5 b = 0.38 embedded (real-time or hardware complicated systems) a = 2.5 b = 0.32 semi-detached (in between organic and embedded) a = 2.5 b = 0.35 CSCI 3060U Lecture 19 Slide 42
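The time model is the same power-law shape with the slide's constants; as a sketch (the 91 person-month input is an assumed example figure):

```python
# Basic COCOMO schedule: Time = a * Effort**b, Time in months
COCOMO_TIME = {
    "organic":       (2.5, 0.38),
    "semi-detached": (2.5, 0.35),
    "embedded":      (2.5, 0.32),
}

def cocomo_time(effort_pm: float, mode: str = "organic") -> float:
    a, b = COCOMO_TIME[mode]
    return a * effort_pm ** b

# e.g., an organic project estimated at 91 person-months of effort
print(round(cocomo_time(91.0, "organic"), 1))  # ~13.9 months
```

Because b < 1, doubling effort lengthens the schedule by much less than double, reflecting that extra effort is absorbed by a larger team rather than a longer calendar.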

43 Regression Based Cost Estimation Where Does the COCOMO Model Come From? It is based on empirical measurements of the cost of actual software projects as a function of software size, and the derivation of a regression equation to explain them CSCI 3060U Lecture 19 Slide 43

44 Regression Based Cost Estimation Where Does the COCOMO Model Come From? Analysis of the historical data indicates that the logarithm of the effort required to produce a software system has a linear relationship with the logarithm of the size of the software, that is log (Effort) = log (a) + b log (Size) where log (a) is the y-intercept of the line and b is the slope of the line Solving for Effort yields the COCOMO effort estimation model Effort = a (Size) b A similar empirical observation of the historical relationship between Time and Effort yields the COCOMO model for estimating time required CSCI 3060U Lecture 19 Slide 44
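The derivation above can be replayed as a small computation: fit the line log(Effort) = log(a) + b log(Size) by ordinary least squares, then exponentiate to recover a. The four project data points below are invented purely for illustration, not real measurements:

```python
import math

# Hypothetical historical projects: (size in KDSI, effort in person-months)
sizes   = [10.0, 20.0, 50.0, 100.0]
efforts = [26.0, 55.0, 145.0, 302.0]

# Least-squares fit in log-log space
xs = [math.log(s) for s in sizes]
ys = [math.log(e) for e in efforts]
k = len(xs)
mx, my = sum(xs) / k, sum(ys) / k
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = math.exp(my - b * mx)          # log(a) is the fitted y-intercept

print(round(a, 2), round(b, 2))    # close to the organic constants (2.4, 1.05)
```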

45 Specification-Based Size Metrics Measuring Product Size Independent of Code Predictions of effort, cost and time based on code size have two inherent difficulties: Prediction based on KDSI or KLOC just involves replacing one difficult prediction problem (effort, cost or time) with another difficult prediction problem (code size) KDSI and KLOC are actually measures of length, not size (which must take into account functionality) Code complexity size measures, which would be better, can't be predicted any more easily than code length If our size measures are based on the specification of the functionality of the software rather than its eventual code, we should be able to more accurately and deterministically estimate size once the specification is known CSCI 3060U Lecture 19 Slide 45

46 Function Point Analysis Analyzing the Functional Specification Function Points (Albrecht 1979) are currently the most popular and widely used size metric in both North America and Europe FPs are supposed to reflect the size of the functionality of a piece of software from the end user's point of view, independent of the code that implements it Computed from the detailed system specification (available early in the development cycle) using the equation FP = UFC x TCF where UFC is the unadjusted function count, a simple count of the number of different user-visible functions required by the specification, and TCF is the technical complexity factor, a constant between 0.65 and 1.35, determined by a set of 14 questions about the system CSCI 3060U Lecture 19 Slide 46

47 Function Point Analysis Counting Functions The UFC is obtained by summing weighted counts of the number of inputs, outputs, logical master files, interface files and queries visible to the system user, where: an input is a user or control data element entering an application; an output is a user or control data element leaving an application; a logical master file is a logical data store acted on by the application user (an internal file); an interface file is a file or input/output data that is used by another application (an external file); a query is an input-output combination (i.e. an input that results in an immediate output). An International Function Point Users Group formalizes rules for Function Point counting to ensure that counts are comparable across different systems and organizations CSCI 3060U Lecture 19 Slide 47

48 Use of Function Points A Better Size Metric FPs are used extensively as a size metric in preference to KLOC, for example in the equations for productivity, defect density and cost / effort prediction Advantages: language independent can be computed early in a project do not have to be predicted - derived directly from the specification Disadvantages: unnecessarily complex: evidence is that TCF adds little; effort prediction using the UFC is often no worse than when the TCF is added difficult to compute, contain a large degree of subjectivity some doubt they do actually measure functionality But: Common, popular, better than KLOC, and apparently works CSCI 3060U Lecture 19 Slide 48

49 Function Points: An Example Spell Checker Specification The spell checker accepts as input a document file, a dictionary file and an optional user dictionary file. The checker lists all words in the document file not contained in either of the dictionary files. The user can query the number of words processed and the number of spelling errors found at any stage in the process. A = #inputs = 2 B = #outputs = 3 C = #internal files = 1 D = #external files = 2 E = #queries = 2 UFC = 4 A + 5 B + 10 C + 7 D + 4 E = 55 CSCI 3060U Lecture 19 Slide 49
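The spell checker count can be sketched with the standard average weights (inputs x4, outputs x5, internal files x10, interfaces x7, queries x4); the function names are illustrative, not from any FP tool:

```python
# Unadjusted Function Count with the standard average weights
def ufc(inputs, outputs, internal_files, interface_files, queries):
    return (4 * inputs + 5 * outputs + 10 * internal_files
            + 7 * interface_files + 4 * queries)

def function_points(ufc_value, tcf):
    """FP = UFC x TCF, where TCF lies between 0.65 and 1.35."""
    assert 0.65 <= tcf <= 1.35
    return ufc_value * tcf

# Spell checker: A=2 inputs, B=3 outputs, C=1 internal file,
# D=2 external files, E=2 queries
spell_checker_ufc = ufc(inputs=2, outputs=3, internal_files=1,
                        interface_files=2, queries=2)
print(spell_checker_ufc)                        # 55
print(function_points(spell_checker_ufc, 1.0))  # 55.0 (neutral TCF assumed)
```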

50 Function Points vs. Program Length CSCI 3060U Lecture 19 Slide 50

51 Metrics II - Product and Process Metrics Summary Process Metrics Attempt to predict properties of the software process, such as effort, time and cost To be useful, process predictions need good estimates of size Function points provide a good code-independent way to estimate the size of a software problem CSCI 3060U Lecture 19 Slide 51