Did We Test the Right Thing? Experiences with Test Gap Analysis in Practice

Elmar Jürgens, CQSE GmbH, juergens@cqse.eu
Dennis Pagano, CQSE GmbH, pagano@cqse.eu

Abstract

In long-lived systems, errors typically occur in code areas that have been changed frequently. As a consequence, test managers put great emphasis on thoroughly testing modified code. However, our studies show that even in well-structured testing processes, code changes often accidentally remain untested. Test Gap analysis finds untested changes and gives testers the chance to test them in a timely manner, which allows test managers to effectively monitor their testing processes. After an introduction to Test Gap analysis, we describe our experiences with it over the last few years, both at our customers and in our own development. We also describe in detail how Test Gap analysis can be applied in different phases of the testing process.

HOW WELL ARE CODE CHANGES ACTUALLY COVERED BY TESTS IN PRACTICE?

In many systems, manual test cases still make up the majority of all tests. In a large system, it is anything but trivial to choose the manual test cases that actually run through the changes made since the last test phase, which presumably contain most bugs. To better understand whether tests actually reach those changes, we conducted a scientific study [1] on an enterprise information system.

The examined system comprises approximately 340,000 lines of C# code. We conducted the study over a period of 14 months of development and analyzed two consecutive releases. Using static analyses, we determined which parts of the source code were newly developed or changed for each release; for both releases, about 15% of the source code was modified. Furthermore, we analyzed all testing activities by tracking the test coverage of all automated and manual tests over the course of several months. When we combined the modification and test data, it became apparent that approximately half the changes went into production untested, despite a systematically planned and executed testing process.

WHAT ARE THE CONSEQUENCES OF UNTESTED CHANGES?

To quantify the consequences of untested changes for users of the system, we retrospectively analyzed all errors that occurred in the months following the releases. This brought to light that changed, untested code is five times more error-prone than unchanged code (and also more error-prone than changed and tested code). Our study makes plain that, in practice, changes very frequently reach production untested and that they cause the majority of field bugs. It also gives us a concrete starting point for systematically improving test quality: testing changes more reliably.

HOW DOES CHANGED CODE ESCAPE THE TEST?

The amount of untested production code surprised us when we first conducted this study. In the meantime, comparable analyses have been performed on several systems, in several programming languages, and in different companies, and the results are often similar. The cause of these untested changes is, however, contrary to what you might assume, not a lack of discipline or commitment on the testers' part, but rather the fact that, without suitable analyses, it is extremely hard to reliably catch changed code when testing large systems.
When selecting test cases, test managers often rely on the changes documented in the issue tracker (Jira, TFS, Redmine, Bugzilla, etc.). In our experience, this works well for changes made for functional reasons. Test cases for manual tests usually describe interaction sequences via the user interface that exercise specific functionality. If the issue tracker records changes to a functionality, the corresponding functional test cases are selected for execution.

However, we have learned that there are two reasons why issue trackers are not a suitable source of information for finding changes consistently. Firstly, changes are often technically motivated, for example clean-up operations or adaptations to new versions of libraries or of interfaces to external systems. Such technical changes do not make it clear to the tester which functional test cases need to be executed to run through the changed code. Secondly, and more importantly, issue trackers often miss essential changes, be it due to time pressure or for policy reasons, which leads to gaps in the issue-tracker data. To find changes consistently and without gaps, we need reliable information about which changes were tested and which were not.

WHAT CAN BE DONE?

Test Gap analysis is an approach that combines static and dynamic analysis techniques to identify changed but untested code. It comprises the following steps:

Figure 1. Treemap with the components of the system under test (e.g., GUI.Dialogs, Authentication, UI Controls, GUI.Base, Data Validation). Each rectangle represents one component; its surface area corresponds to the size of the component in lines of code. As an example, the primary function of individual components is indicated.

Figure 2. Changes within the system under test since the previous release. Each small rectangle represents one method of the source code. Unchanged methods are depicted in gray, new methods in red, and changed methods in orange.

Static analysis. A static analysis compares the current state of the source code of the system under test with that of the previous release in order to determine new and changed code areas. In doing so, the analysis is intelligent enough to differentiate between different kinds of changes. Refactorings that do not modify the behavior of the source code (e.g., changes to documentation, renaming of methods, or code moves) cannot cause errors and can therefore be filtered out. This focuses our attention on those changes that alter the behavior of the system. The changes to one of the systems we analyzed are shown in Figure 2; Figure 1 illustrates the components into which the underlying system is divided.

Figure 3. Test coverage in the system under test at the end of the test phase. Untested methods are depicted in gray, tested methods in green.

Dynamic analysis. In addition, test coverage is determined with the help of dynamic analyses. The crucial factor here is that all tests are recorded, that is, both automated and manually executed test cases. The executed methods are depicted in Figure 3.

Combination. Test Gap analysis then detects untested changes by combining the results of the static and dynamic analyses (a simplified sketch of this step follows below). Figure 4 shows a treemap containing the results of a Test Gap analysis for this system. Here, the small rectangles in the different components represent the contained methods, and their surface area corresponds to the length of the method in lines of code. The colors of the rectangles represent the following: gray methods have remained unchanged since the previous release; green methods were changed (or added) and were executed in at least one test; orange and red methods were changed or newly added, respectively, and were not executed in any test. It is plain to see that, on the right side of the treemap, whole components containing new or changed code were not executed in the testing process. No errors contained in this area can have been found.

HOW TO USE TEST GAP ANALYSIS

Test Gap analysis is most useful when executed continuously, for example every night, to gain insight each morning into the changes made and tests executed up to the previous evening. For this purpose, dashboards containing information about test gaps are created, as illustrated in Figure 5. These dashboards allow test managers to decide in a timely manner whether further test cases are necessary to run through the remaining changes while the test phase is still ongoing. If the effort was successful, this becomes evident on the newly calculated dashboards the next day.
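The combination step described above can be pictured as a simple set operation over method identifiers. The following sketch is only an illustration of the idea, not the authors' tooling: the function and variable names are hypothetical, and it assumes that the static analysis and the coverage profiler both report their results at method granularity (with behavior-neutral refactorings already filtered out of the change set).

    def classify_methods(all_methods, changed_methods, covered_methods):
        """Assign each method a Test Gap color.

        all_methods:      set of method identifiers in the current release
        changed_methods:  methods added or modified since the reference release
        covered_methods:  methods executed by at least one automated or manual test
        """
        classification = {}
        for method in all_methods:
            if method not in changed_methods:
                classification[method] = "gray"       # unchanged since the reference release
            elif method in covered_methods:
                classification[method] = "green"      # changed and executed in some test
            else:
                classification[method] = "test gap"   # changed but never executed (orange/red)
        return classification

    def test_gap_ratio(classification):
        """Share of changed methods that were never executed in any test."""
        changed = [color for color in classification.values() if color != "gray"]
        gaps = [color for color in changed if color == "test gap"]
        return len(gaps) / len(changed) if changed else 0.0

    # Hypothetical example:
    result = classify_methods(
        all_methods={"A.foo", "A.bar", "B.baz"},
        changed_methods={"A.foo", "B.baz"},   # from the static analysis
        covered_methods={"A.foo"},            # from the dynamic analysis
    )
    # result == {"A.foo": "green", "A.bar": "gray", "B.baz": "test gap"}

The treemaps in Figures 2 to 4 are essentially visualizations of such a classification, with rectangle sizes taken from the methods' lengths in lines of code.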

Figure 4. Test gaps at the end of the test phase. Unchanged methods are depicted in gray. Changed methods that were tested are depicted in green. New untested methods are depicted in red, changed untested methods in orange.

Figure 5. Usage of Test Gap analysis in the testing process.

Figure 6. Several dashboards are used for a detailed analysis.

If multiple test environments are in use simultaneously, one dashboard should be created for each individual environment so that test coverage can be attributed to the environment in which it was achieved. In addition, there is a dashboard that combines the information from all environments. Figure 6 depicts an example with three different test environments:

Test. The environment in which testers carry out manual tests.

Dev. The environment in which automated test cases are executed.

UAT. The user acceptance test environment, in which end users carry out exploratory tests with the system under test.

All. Combines the results of all three test environments (see the sketch below).
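To make the aggregated "All" dashboard concrete, the following sketch merges per-environment coverage into an overall view. It is a simplified illustration under the assumption that each environment reports the set of methods it executed; the environment names mirror Figure 6, while the function name, data layout, and the example method identifiers are invented for illustration.

    from typing import Dict, Set

    def merge_coverage(per_environment: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
        """Return per-environment coverage plus an 'All' entry holding the union."""
        merged = dict(per_environment)
        merged["All"] = set().union(*per_environment.values()) if per_environment else set()
        return merged

    # Hypothetical coverage data for the three environments from Figure 6:
    coverage = {
        "Test": {"Orders.Create", "Orders.Cancel"},    # manual tests
        "Dev":  {"Orders.Create", "Invoices.Export"},  # automated tests
        "UAT":  {"Reports.Print"},                     # exploratory tests
    }
    dashboards = merge_coverage(coverage)
    # dashboards["All"] now contains every method executed in any environment,
    # while the individual entries still allow coverage to be attributed per environment.

Keeping the individual environments separate is what makes the dashboards diagnostic: a gap that only appears in the manual-test environment calls for different action than one that is open everywhere.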

Figure 7. Methods changed during a hotfix.

Figure 8. Tested and untested methods during confirmation testing of a hotfix.

WHICH PROJECTS BENEFIT FROM THE USE OF TEST GAP ANALYSIS?

We have used Test Gap analysis on a wide range of projects: from enterprise information systems to embedded software, from C/C++ to Java, C#, Python, and even ABAP. Factors that affect the complexity of the introduction include, among others:

Execution environment. Virtual machines (e.g., Java, C#, ABAP) facilitate the collection of test coverage data.

Architecture. For server-based applications, test coverage data has to be collected on fewer machines than for fat-client applications.

Testing process. Clearly defined test phases facilitate planning and monitoring.

WHAT ARE THE ADVANTAGES OF TEST GAP ANALYSIS DURING HOTFIX TESTING?

Usually there is very little time to test hotfixes. The objectives of hotfix testing are to ensure that the fixed error does not re-occur and that no new errors have been introduced. For the latter, there should at least be a guarantee that all changes made in the course of the hotfix were tested. For this purpose, we define the last release as the reference version for Test Gap analysis and detect all changes made for the hotfix (for example, on its own branch), as shown in Figure 7 (a sketch of this setup appears after this section). With the help of Test Gap analysis we then determine whether all changes were actually exercised during confirmation testing; the example in Figure 8 shows that some of the methods are still untested. Our experience shows that Test Gap analysis specifically helps to avoid new errors being introduced through changes made during hotfixes.

WHAT ARE THE ADVANTAGES OF TEST GAP ANALYSIS DURING A RELEASE TEST?

In this article, a release test is defined as the test phase prior to a major release, which usually involves both testing newly implemented functionality and executing regression tests. Different kinds of tests are frequently used for this purpose. Our experience shows that Test Gap analysis significantly reduces the amount of untested changes that reach production.

Figure 9. Fewer test gaps when Test Gap analysis is used continuously.

Figure 9 depicts a test gap treemap of the same system shown in Figure 4. While Figure 4 was determined retrospectively, Figure 9 is a snapshot of an iteration in which Test Gap analysis was applied continuously and in which both manual and automated tests were taken into account. It is plain to see that it contains far fewer test gaps. We have also observed that in many cases some test gaps are consciously accepted, for example when the corresponding source code is not yet reachable via the user interface. The key, however, is that these are conscious and well-founded decisions with predictable consequences.
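Real Test Gap tooling works at method level and filters out behavior-neutral refactorings; the sketch below only approximates the first step of the hotfix scenario at file level, using git to enumerate the changes made on a hotfix branch relative to the release that serves as reference version. The tag and branch names are invented for illustration.

    import subprocess

    def changed_files(reference: str, hotfix_branch: str):
        """List files touched on the hotfix branch since it diverged from the reference release.

        'reference' and 'hotfix_branch' are hypothetical names, e.g. a release tag
        'v2.4.0' and a branch 'hotfix/issue-4711'. The three-dot form compares the
        branch tip against its merge base with the reference, i.e. it lists only
        the hotfix changes themselves.
        """
        result = subprocess.run(
            ["git", "diff", "--name-only", f"{reference}...{hotfix_branch}"],
            capture_output=True, text=True, check=True,
        )
        return [line for line in result.stdout.splitlines() if line.strip()]

    # A file-level approximation of Figure 7: everything listed here remains a
    # candidate test gap until confirmation testing has executed code in it.
    for path in changed_files("v2.4.0", "hotfix/issue-4711"):
        print(path)

Combining such a change list with the coverage recorded during confirmation testing then yields the picture in Figure 8: changes that were exercised, and changes that still need attention before the hotfix ships.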

WHAT ARE THE LIMITATIONS OF TEST GAP ANALYSIS?

Like any analysis method, Test Gap analysis has its limitations, and knowing them is crucial to making the best use of the analysis. One limitation concerns changes made at the configuration level without changing the code itself; such changes remain hidden from the analysis. Another limitation is the significance of covered code. Test Gap analysis determines which code was executed during testing, but it cannot determine how thoroughly that code was tested. This can lead to undetected errors even though the analysis depicts the executed code in green, and the effect grows with the coarseness of the coverage measurement. The reverse, however, does hold: red and orange code was not executed at all, so no errors contained in it can have been found. Our experience in practice shows that the gaps brought to light by Test Gap analysis are usually so large that they yield substantial insights into weaknesses in the testing process; for gaps of this size, the limitations mentioned above are not significant.

OUTLOOK

Another exciting field for applying the described analysis methods is the production environment. In this case, the recorded executions are no longer executed test cases but system interactions carried out by end users. This helps to determine which of the features added in the previous release are actually used, and in our experience the results are frequently surprising. Figure 10 illustrates the usage of an enterprise information system in the style of a Gantt chart [2]. Each line represents one of the application's features; the x-axis shows the measurement period. On weekends as well as during the Christmas season the system is, as expected, used less than on other days. However, the figure only depicts those features that were actually used. The analysis showed that 28% of the application's features were not used at all, which came as a surprise to all stakeholders involved. In this case, the analysis led to the deletion of a quarter of the application's source code, so that in the following releases less testing effort was necessary, or the effort could be applied more effectively. (1)

Figure 10. Usage of the features of an enterprise information system in production.

REFERENCES

[1] Sebastian Eder, Benedikt Hauptmann, Maximilian Junker, Elmar Juergens, Rudolf Vaas, and Karl-Heinz Prommer. Did we test our changes? Assessing alignment between tests and development in practice. In Proceedings of the Eighth International Workshop on Automation of Software Test (AST '13), 2013.

[2] Elmar Juergens, Martin Feilkas, Markus Herrmannsdoerfer, Florian Deissenboeck, Rudolf Vaas, and Karl-Heinz Prommer. Feature profiling for evolving systems. In Proceedings of the 19th IEEE International Conference on Program Comprehension (ICPC 2011), pages 171-180. IEEE, 2011.

FURTHER INFORMATION

At www.testgap.io we have compiled further materials on Test Gap analysis, such as research projects, blog posts, and tool support. Furthermore, as the authors of this article, we are always happy to receive e-mails with questions and feedback (including negative feedback) on this text or on Test Gap analysis in general.

(1) Usage analysis at feature level requires several methods that go beyond the ones described in this article; they are described in the referenced paper [2].