
1 Assessment of software quality metrics Elisabetta Ronchieri, INFN CNAF Maria Grazia Pia, INFN - Genova CERN, 11/23/2016

2 Why should we measure the quality of software used by researchers?
- Determine and improve quality
- Reduce the effort needed to maintain code
- Enable more analysis and research
- Simplify reproducibility of outcomes over time

As direct advantages:
- To identify and predict the code portions that tend to be error-prone, hard to understand and difficult to maintain
- To get an idea of the complexity of the code that penalizes software maintainability
- To reduce the time and cost spent in fixing and debugging

As indirect consequences:
- To allow researchers to make more analyses and possibly new scientific discoveries
- To simplify the reproducibility of analyses

3 The scope of our metrics assessment
To guarantee the maintainability of software used in scientific research where critical use cases are addressed [1]:
- by using existing standards that identify software characteristics;
- by exploiting a set of product metrics that allows understanding the state of the code [2, 3, 4];
- by assessing metric measures with statistical methods to objectively quantify their meaning [5, 6].

4 Background

5 Quality
There are many views of software quality, from IEEE [7] and ISO [8] up to individual experts. However, a good definition must let us measure quality meaningfully.

Quality standards
Software quality standards describe software quality models, categorizing software quality into a set of characteristics. Regarding software attributes, the ISO/IEC 25010:2011 standard defines eight software factors, each subdivided into subcharacteristics (or criteria) [9].

Measure and metric
A measure is a quantitative indication of the extent, amount, dimension, capacity or size of some software attribute of a product or process. Example: number of errors.
A metric is a quantitative measure of the degree to which a system, component or process possesses a given software attribute related to quality factors (also called characteristics) [7]. Examples: maintenance needs; number of errors found per person-hour.

Metric classification
Product metrics are the explicit results of software development activities: deliverables, documentation, by-products. They include size, complexity and performance.
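The measure/metric distinction above can be made concrete: a measure is a raw quantity, while a metric normalises it against another quantity to say something about quality. A minimal sketch, with purely illustrative numbers:

```python
# Hypothetical inspection data: a measure is a raw quantity,
# while a metric relates measures to a quality factor.
errors_found = 14      # measure: number of errors (raw count)
person_hours = 40.0    # measure: review effort spent

# Metric: errors found per person-hour
error_rate = errors_found / person_hours
print(f"{error_rate:.2f} errors per person-hour")  # 0.35 errors per person-hour
```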

6 Software characteristics

7 Product metrics classification
A subset of product metrics:
- Program Size (PS)
- Code Distribution (CD)
- Control Flow Complexity (CFC)
- Object-Orientation (OO)

8 Software metrics tools
They measure metrics for different programming languages and are available under free or commercial licenses. What we have noticed so far:
- Different interpretations of the same metric definition exist, leading to different implementations of the same metric.
- These tools typically provide a subset of metrics with respect to the software factor or criteria they address.
- They use different names to refer to the same metric.
To perform a comprehensive analysis, it is necessary to use several tools and combine their data.
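The naming problem above can be handled with an alias table that maps each tool's label to a canonical metric name before merging results. A minimal sketch; the labels below are illustrative, not the tools' actual output names:

```python
# Alias table: canonical metric name -> tool-specific spellings
# (the variants listed here are hypothetical examples).
ALIASES = {
    "CYCLOMATIC_COMPLEXITY": {"ccn", "vg", "mvg", "cyclomatic"},
}

def canonical(name: str) -> str:
    """Map a tool-specific metric name to its canonical name."""
    n = name.lower()
    for canon, variants in ALIASES.items():
        if n == canon.lower() or n in variants:
            return canon
    return name  # unknown names pass through unchanged

# Merging per-function results from two hypothetical tools:
tool_a = {"vg": 12}    # one tool's label for cyclomatic complexity
tool_b = {"ccn": 12}   # another tool's label for the same metric
merged = {canonical(k): v for t in (tool_a, tool_b) for k, v in t.items()}
print(merged)  # {'CYCLOMATIC_COMPLEXITY': 12}
```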

9 Metrics goodness thresholds
The thresholds adopted in some software metrics tools are not always well documented, and goodness thresholds are not univocally established:
- existing thresholds are often based on researchers' experience and are pertinent to the context of the software considered;
- they must be re-evaluated in different contexts.

Metrics data analysis
The tools calculate metrics but do not provide instruments to analyze the data.
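Because thresholds are context-dependent, it helps to keep them as parameters rather than hard-coding them. A minimal sketch of flagging functions against a configurable threshold; the value 10 is the limit McCabe originally proposed for cyclomatic complexity, and the function names are hypothetical:

```python
def flag_functions(measures, threshold=10):
    """Return the functions whose cyclomatic complexity exceeds the
    given goodness threshold (context-dependent, hence a parameter)."""
    return [name for name, ccn in measures.items() if ccn > threshold]

# Hypothetical per-function measures
measures = {"G4Track::ComputeStep": 23, "G4Track::GetMass": 1}
print(flag_functions(measures))  # ['G4Track::ComputeStep']
```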

10 Work status with Geant4

11 Assessment of software quality tools (free and under licenses)
What done so far:

Tool                                    Version  License
CCCC (C and C++ Code Counter)                    Free
CLOC (Count Lines of Code)              1.60     Free
Metrics                                          Free
Pmccabe                                 2.6      Free
SLOCCount (Source Lines of Code Count)  2.26     Free
Understand                                       Commercial

Selected Imagix4D to measure several product metrics at different levels, under an open evaluation license, extended twice.

Level      #Metrics
Variable   4
File       22
Class      24
Directory  21
Namespace  4
Function   21

12 What done so far
- Produced a large collection of measures over the whole Geant4 lifetime (Dec 1998 - Dec 2015)
- Evaluated statistical methods to analyze the data (see Maria Grazia's presentation)

13 The guinea pig: Geant4
- Considered several Geant4 releases, from version 0.0 up to 10.2
- Downloaded (wget) the data from the public source repository
- Calculated metrics by using Imagix4D
- Came across bugs in Imagix4D; received bug fixes under the open evaluation license
- Verified the correctness of the produced data
- Met some issues in measuring metrics from 0.0 up to 3.2 due to the lack of some external dependencies

14 List of collected metrics at class level
- Class Hierarchy and Local Vars Coupling
- CK Coupling Between Objects, CK Depth of Inheritance Tree, CK Lack of Cohesion of Methods, CK Number of Children, CK Response For a Class, CK Weighted Methods per Class
- Class, Location
- Depth of Derived Classes
- External Methods and Coupling
- Fan In of Inherited Classes
- Halstead Intelligent Content, Halstead Mental Effort, Halstead Program Difficulty, Halstead Program Volume
- McCabe Average Cyclomatic Complexity, McCabe Maximum Cyclomatic Complexity
- Methods Called - External, Methods Called - Internal
- Number of Member Attributes, Number of Member Classes, Number of Member Methods, Number of Member Private Methods, Number of Member Types, Number of Total Members
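As an example of the CK suite listed above, Weighted Methods per Class (WMC) is defined as the sum of the complexities of a class's methods (with unit weights it reduces to the method count). A minimal sketch with hypothetical method names and complexity values:

```python
def wmc(method_complexities):
    """CK Weighted Methods per Class: sum of per-method complexities
    (e.g. cyclomatic complexity of each method)."""
    return sum(method_complexities.values())

# Hypothetical complexities for the methods of one class
example_class = {"Initialize": 4, "Step": 11, "Clear": 1}
print(wmc(example_class))  # 16
```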

15 List of collected metrics at directory level
- Comment Ratio
- Directory, Location
- Halstead Intelligent Content, Halstead Mental Effort, Halstead Program Difficulty, Halstead Program Volume
- McCabe Average Cyclomatic Complexity, McCabe Maximum Cyclomatic Complexity, McCabe Total Cyclomatic Complexity
- Maintainability Index
- Number of Declarations in Directory, Number of Files - C, Number of Files - C++, Number of Files - Header, Number of Files - Total, Number of Functions in Directory, Number of Variables in Directory
- Size - Bytes, Size - Lines of Comments, Size - Lines in Directory, Size - Lines of Source Code, Size - Number of Statements

16 List of collected metrics at file level
- Comment Ratio
- File, Location
- Halstead Intelligent Content, Halstead Mental Effort, Halstead Program Difficulty, Halstead Program Volume
- Inclusion - Directly Included Files, Inclusion - Files Where Directly Included, Inclusion - Files Where Transitively Included, Inclusion - Transitively Included Files, Inclusion - Transitively Included Lines
- McCabe Average Cyclomatic Complexity, McCabe Maximum Cyclomatic Complexity, McCabe Total Cyclomatic Complexity
- Maintainability Index
- Number of Declarations in File, Number of Functions in File, Number of Variables in File
- Size - Bytes, Size - Lines of Comments, Size - Lines in File, Size - Lines of Source Code, Size - Number of Statements
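The Maintainability Index listed above is a composite of other metrics in this table. A minimal sketch of one widely cited variant (Oman/Hagemeister); Imagix4D's exact formula may differ, and the input values below are hypothetical:

```python
import math

def maintainability_index(halstead_volume, cyclomatic, sloc):
    """One common Maintainability Index variant:
    MI = 171 - 5.2*ln(V) - 0.23*CC - 16.2*ln(SLOC)"""
    return (171.0
            - 5.2 * math.log(halstead_volume)
            - 0.23 * cyclomatic
            - 16.2 * math.log(sloc))

# Hypothetical per-file measures
mi = maintainability_index(halstead_volume=1500.0, cyclomatic=12, sloc=300)
print(round(mi, 1))  # 37.8
```

Lower values indicate code that is harder to maintain, which is why the index is tracked per file and per directory.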

17 List of collected metrics at function level
- Decision Depth
- Fan In of Calling Functions - Direct, Fan In of Calling Functions - Transitive, Fan Out of Functions Called - Direct, Fan Out of Functions Called - Transitive
- Function, Location
- Halstead Intelligent Content, Halstead Mental Effort, Halstead Program Difficulty, Halstead Program Volume
- Knots
- McCabe Cyclomatic Complexity, McCabe Decision (Cyclomatic) Density, McCabe Essential Complexity, McCabe Essential Density
- Number of Parameters, Number of Returns, Number of Variables in Function
- Profile Data - Coverage, Profile Data - Frequency, Profile Data - Time
- Size - Lines in Function, Size - Lines of Source Code
- Variables Used - Global, Variables Used - Static

18 List of collected metrics at namespace level
- Location, Namespace
- Number of Member Attributes, Number of Member Classes, Number of Member Methods, Number of Total Members

19 List of collected metrics at variable level
- Functions Reading Variable, Functions Setting Variable, Functions Using Variable
- Number of Sets of Variable
- Variable, Location

20 Further measures
The current production has been obtained by using the default configuration of Imagix4D. The other 11 configurations are under evaluation on release :
- to assess their impact on the analysis;
- to repeat them on the other releases, if needed.

21 Conclusion
- The use of metrics can contribute to monitoring the internal quality of software.
- Statistical methods are valuable to analyse metrics data.
- Further investigation is in progress to identify methods to evaluate the quality of scientific software with metrics data.

22 References
[1] Elisabetta Ronchieri, Maria Grazia Pia, Francesco Giacomini, "Software Quality Metrics for Geant4: An Initial Assessment", Proceedings of ANS RPSD 2014, Knoxville, TN, September 14-18, 2014.
[2] Elisabetta Ronchieri, Maria Grazia Pia, Francesco Giacomini, "First statistical analysis of Geant4 quality software metrics", Journal of Physics: Conference Series, Volume 664, 2015.
[3] Elisabetta Ronchieri, Maria Grazia Pia, Marco Canaparo, "Assessment of Geant4 maintainability with respect to software engineering references", poster at CHEP 2016.
[4] Elisabetta Ronchieri, Maria Grazia Pia, Tullio Basaglia, Marco Canaparo, "Geant4 maintainability assessed with respect to software engineering references", poster at IEEE NSS/MIC 2016.
[5] Elisabetta Ronchieri, Mincheol Han, Sung Hun Kim, Maria Grazia Pia, Gabriela Hoff, Chan Hyeong Kim, "Application of econometric and ecology analysis methods in physics software", presented at CHEP 2016.
[6] Maria Grazia Pia, Elisabetta Ronchieri, "Application of econometric data analysis methods to physics software", presented at IEEE NSS/MIC 2016.
[7] IEEE, IEEE Standard Glossary of Software Engineering Terminology.
[8] ISO 9000.
[9] ISO/IEC 25010:2011.