Prediction. Basic concepts


1 Prediction. Basic concepts

2 Scope. Prediction of: resources; calendar time; quality (or lack of quality); change impact; process performance. Prediction is often confounded with the decision process.

3 Historical data. [Figure: Y (dependent, observed, response variable) plotted against X (independent, prediction variable), showing the explained variance of the observed Y_i and the prediction interval of a new observation Y_0 at x_0; axis annotations: known, x_0, unknown.]
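The two quantities named on this slide can be written out for the simplest case. The slide does not state which model it has in mind, so the block below assumes an ordinary least-squares line fitted to n historical pairs (x_i, y_i); the symbols b_0, b_1, s, S_xx and alpha are introduced here for illustration only.

```latex
% Assumed model: simple linear regression fitted to historical data
\hat{y}_i = b_0 + b_1 x_i, \qquad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
s^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

% Explained variance of the observed Y_i
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}

% Prediction interval for a new observation Y_0 at x_0
\hat{y}_0 \pm t_{1-\alpha/2,\; n-2} \; s \,
\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
```

The interval widens as x_0 moves away from the mean of the historical x values, which is one reason predictions outside the range of the historical data deserve extra caution.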

4 Methods for building prediction models. Statistical, parametric: make assumptions about the distribution of the variables; good tools for automation; linear regression, variance analysis, ... Statistical, non-parametric (robust): no assumptions about distribution; less powerful, low degree of automation; rank-sum methods, Pareto diagrams, ... Causal models: link elements with semantic links or numerical equations; simulation models, connectionist models, genetic models, ... Judgemental: organise human expertise; Delphi method, pair-wise comparison, Lichtenberg method.

5 The Lichtenberg method. Process: staff the analysis group; describe the work to be estimated; define general constraints and assumptions; define the structure; individual judgement of MIN, MAX, LIKELY; calculate the common result; find work packages with large variance; subdivide them and rework. 5-20 participants. Never influence each other's judgements. MIN and MAX should be extreme (1% of the cases).
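The "calculate common result" and "find work packages with large variance" steps can be sketched as follows. The slide gives no formulas, so this sketch assumes a commonly quoted three-point approximation (mean = (MIN + 3·LIKELY + MAX) / 5, standard deviation = (MAX − MIN) / 5), assumes that individual judgements are combined by simple averaging, and assumes work packages are independent when totals are summed. All names are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Estimate:
    """One participant's three-point judgement for a work package."""
    minimum: float
    likely: float
    maximum: float


def combine(judgements: list[Estimate]) -> Estimate:
    """Average the individual judgements (assumed combination rule)."""
    n = len(judgements)
    return Estimate(
        minimum=sum(j.minimum for j in judgements) / n,
        likely=sum(j.likely for j in judgements) / n,
        maximum=sum(j.maximum for j in judgements) / n,
    )


def mean_and_variance(e: Estimate) -> tuple[float, float]:
    """Assumed three-point approximation for 1% extreme MIN and MAX."""
    mean = (e.minimum + 3 * e.likely + e.maximum) / 5
    std = (e.maximum - e.minimum) / 5
    return mean, std ** 2


def summarize(packages: dict[str, list[Estimate]]):
    """Return total mean, total standard deviation, and the packages
    ranked by variance (largest first), i.e. candidates to subdivide."""
    rows = []
    for name, judgements in packages.items():
        mean, variance = mean_and_variance(combine(judgements))
        rows.append((name, mean, variance))
    total_mean = sum(mean for _, mean, _ in rows)
    # Variances add only if the packages are independent (assumption).
    total_std = sqrt(sum(variance for _, _, variance in rows))
    ranked = sorted(rows, key=lambda row: row[2], reverse=True)
    return total_mean, total_std, ranked
```

The package at the top of the ranking is the one to subdivide and re-estimate in the next round.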

6 Common SE predictions. Detecting fault-prone modules; project effort estimation; change impact analysis; ripple effect analysis; process improvement models; model checking; consistency checking.

7 Introduction. There are many faults in software. Faults are costly to find and repair. The later we find faults, the more costly they are. We want to find faults early. We want automated ways of finding faults. Our approach: automatic measurements on models; use metrics to predict fault-prone modules.

8 Related work. Niclas Ohlsson, PhD work 1993: AXE, fault prediction, introduced Pareto diagrams; predictor: number of new and changed signals. Lionel Briand, Khaled El Emam, et al.: numerous contributions exploring relations between fault-proneness and object-oriented metrics. Piotr Tomaszewski, PhD, Karlskrona 2006: studies fault density; comparison of statistical methods and expert judgement. Jeanette Heidenberg, Andreas Nåls: discover weak design and propose changes.

9 Approach. Find metrics (independent variables): number of model elements (size); number of changed methods (change); transitions per state (complexity); changed operations * transitions per state (combinations); ... Use metrics to predict (dependent variable): number of TRs.
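As an illustration of using one metric to predict the number of TRs, the sketch below fits a least-squares line to the combined metric (changed operations * transitions per state). The slides do not show the actual fitting procedure or data, so the function names and the numbers in the example are made up for illustration only.

```python
def combined_metric(changed_operations: float, transitions_per_state: float) -> float:
    """Combination metric named on the slide: change times complexity."""
    return changed_operations * transitions_per_state


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs: list[float], ys: list[float], intercept: float, slope: float) -> float:
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up modules: (changed operations, transitions per state, observed TRs).
modules = [(2, 1.5, 4), (5, 2.0, 11), (1, 1.0, 2), (8, 2.5, 19)]
metric = [combined_metric(ops, tps) for ops, tps, _ in modules]
trs = [float(count) for _, _, count in modules]
intercept, slope = fit_line(metric, trs)
explained = r_squared(metric, trs, intercept, slope)
```

A predictor is worth keeping only if the explained share stays high on data that was not used for fitting, for example on a later release.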

10 Capsules

11 State charts

12 Data model. Package, capsule, class, port, protocol, attribute, operation, signal, state machine, state, transition.
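A minimal sketch of how part of this data model could be represented so that a metric such as transitions per state can be computed by a script. The slide only lists the element types; the containment relationships and field names below are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    source: str   # name of the state the transition leaves
    target: str   # name of the state the transition enters
    signal: str   # signal that triggers the transition


@dataclass
class StateMachine:
    states: list[str] = field(default_factory=list)
    transitions: list[Transition] = field(default_factory=list)

    def transitions_per_state(self) -> float:
        """Complexity metric: average number of transitions per state."""
        if not self.states:
            return 0.0
        return len(self.transitions) / len(self.states)


@dataclass
class Capsule:
    # Assumed structure: a capsule owns ports and one state machine.
    name: str
    ports: list[str] = field(default_factory=list)
    state_machine: StateMachine = field(default_factory=StateMachine)

    def element_count(self) -> int:
        """Size metric: number of model elements owned by this capsule."""
        sm = self.state_machine
        return len(self.ports) + len(sm.states) + len(sm.transitions)
```

With a structure like this, each metric becomes a small function over the model, which is what makes automatic measurement at build time feasible.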

13 Our project: modelmet. RNC application, three releases; about 7000 model elements; TR statistics database (2000 TRs). Find metrics: existing metrics (done at standard daily build); run scripts on models. Statistical analysis: linear regression, principal component analysis, discriminant analysis, robust methods; neural networks, Bayesian belief networks.

14 Size, change, complexity, combined

15 Metrics based on change, system A

16 Metrics based on change, system B

17 Complexity and size metrics, system A

18 Complexity and size metrics, system B

19 Other metrics, system A TRD = C states protocols modelelements

20 Other metrics, system B

21

22 How to use predictions. Uneven distribution of faults is common (80/20 rule). Perform special treatment on selected parts: select experienced designers; provide good working conditions; parallel teams; inspections; static and dynamic analysis tools; ... Perform root-cause analysis and make corrections.
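Selecting the parts that get special treatment can be sketched as a Pareto-style ranking: sort modules by predicted faults, take the top fraction, and check what share of all predicted faults they cover. The function name, the 20% default, and the example numbers are assumptions for illustration.

```python
def select_fault_prone(predicted: dict[str, float], fraction: float = 0.2):
    """Return the top `fraction` of modules by predicted faults and the
    share of all predicted faults that those modules account for."""
    ranked = sorted(predicted.items(), key=lambda item: item[1], reverse=True)
    # Always select at least one module, even for very small systems.
    count = max(1, round(len(ranked) * fraction))
    chosen = ranked[:count]
    total = sum(predicted.values())
    share = sum(value for _, value in chosen) / total if total else 0.0
    return [name for name, _ in chosen], share


# Made-up predictions for five modules.
predicted_faults = {"a": 40, "b": 5, "c": 3, "d": 1, "e": 1}
selected, covered = select_fault_prone(predicted_faults)
```

If the covered share is far below what the 80/20 rule suggests, the faults are spread more evenly and targeted treatment of a few modules will pay off less.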

23 Results. Contributions: valid statistical material (large models, large number of TRs, two change projects); two highly explanatory predictors were found; state chart metrics are as good as OO metrics. Problems: some problems matching modules in models and TRs; effort to collect change data.