Improving Data Quality: A Beginner's Guide


Executive Summary

Today, more than ever, organizations realize the importance of data. They are realizing that simply collecting data is not enough and that they need to invest actively and continuously in keeping the quality of their data high.

The past decade has witnessed exponential growth in the volume and complexity of data. The growth of social media and telecom services, and the introduction of new services such as 3G/4G/LTE, IPTV and mobile apps, have resulted in an enormous amount of data, both structured and unstructured, being generated every day. This has in turn affected the IT strategy of organizations, mandating newer technologies and newer ways of dealing with their most valuable asset: data.

For any organization, the data warehouse remains the key component of the IT infrastructure, serving varied business requirements including Operations, Business Intelligence, Master Data Management, Information Governance, and Analytics. With key decisions being made on the basis of reports from these systems, the underlying data needs to be 100% reliable.

The reality, however, is quite different. There are multiple inconsistencies, redundancies and anomalies in the data that organizations store through their current systems and practices. These anomalies cascade into reports and can have a potentially disastrous impact on business decisions and outlook.

As highlighted in the 2015 Global Data Quality Research by Experian Data Quality:
- 91% of organizations are using data and data quality to optimise their customer experience
- 83% of organizations say revenue is affected by inaccurate or incomplete data
- 63% of organizations lack a centralised approach to managing data quality

How Do We Improve Data?

So, how do we improve data? As mentioned earlier, it is not just about improving the quality of the data, but also about empowering users to derive value from it. A few of the recommended measures are listed below.

- Measuring current state: By measuring the defined data critical to all operational support functions and baselining its as-is state.
- Increasing Quality: By establishing value-add projects, together with the business, to uplift the data so that it stays clean.
- Reducing data corruption: By establishing housekeeping tasks, mandating Keep-It-Clean reports, and Stopping the Rot.
- Making it Easier to Use: By creating cross-system data views and a how-to knowledge base for each data area.
- Making it Self Service: By creating online reports, showing dashboards and discrepancies, for use by in-life data managers.
- Making it One Truth: By rationalizing, towards a target architecture, the ways in which the same data can be stored and managed.

KPIs for data improvement

An effective data improvement strategy is the result of careful planning and analysis combined with continuous monitoring and improvement. To come up with such a strategy, it is recommended to follow a methodical, structured approach that balances organizational challenges with industry best practices. In the following section, we discuss a few KPIs that are generally recommended for consideration when defining the organization's Data Quality Management strategy.

M - Measure: Measure of defined data critical to all organizational functions
Q - Quality: Improvement of quality status for defined data
C - Corruption: Measure of ongoing data inconsistencies
C - Capability: Measure of ease of use, self-serviceability and ability to provide one truth

Let's have a look at each of these KPIs in detail, with the activities and actions involved in each of them.

Measure

This is the first step in an organization's data improvement journey. In this phase we identify key business data, define functional requirements and business rules, and then measure the current state of data quality for benchmarking.

Key Activities:
- Take in source data, compare it across sources, and define technical data quality measures
- Provide a business description for the identified data quality measures in terms of people, process and system
- Present an overall data quality measure for that area so that it is scalable across the organization, i.e. business intelligence for data

Measuring Parameters:
- Accuracy: The extent to which the data is free of identifiable errors and reflects the real-world state.
- Completeness: Whether all the requisite information is available, and whether the data values required for business functions are present and in a usable state.
- Correctness: At times data can be complete but inaccurate. Correctness describes how reliable the available data is.
- Consistency: The absence of apparent contradictions in a database.
- Integrity: The state of the same data stored in multiple places, with valid references to each other.
- Accessibility: Data items that are easily obtainable and legal to access, with strong protections and controls built into the process.
- Usability: The extent to which related data are useful for the purposes for which they were collected.

- Agree on the coverage model
- Weightage on volumes of data
- Baseline quality of defined data entities
- Rollout plan dependent on agreed programs with the business

Quality

The Quality phase is about improving the quality of the data based on the issues identified in the Measure phase.
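The issues that the Quality phase works from are the ones surfaced by the Measure-phase parameters. As a minimal sketch of how two of those parameters, completeness and consistency, might be scored into a baseline, consider the following. The record layout, the required fields and the business rule are all assumptions made for illustration; they are not taken from the whitepaper.

```python
# Hypothetical customer records; field names and values are illustrative only.
records = [
    {"id": 1, "name": "Asha", "email": "asha@example.com", "status": "active", "closed_on": None},
    {"id": 2, "name": "Ben", "email": "", "status": "active", "closed_on": None},
    {"id": 3, "name": "Chen", "email": "chen@example.com", "status": "active", "closed_on": "2015-03-01"},
    {"id": 4, "name": None, "email": "dee@example.com", "status": "closed", "closed_on": "2015-01-15"},
]

# Assumed list of fields the business needs to be populated.
REQUIRED_FIELDS = ("name", "email", "status")


def is_complete(record):
    """A record is complete when every required field holds a non-empty value."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def is_consistent(record):
    """Assumed business rule: a closure date is present if and only if the account is closed."""
    if record["status"] == "active":
        return record["closed_on"] is None
    return record["closed_on"] is not None


def score(rows, check):
    """Return the percentage of rows that pass the given check."""
    if not rows:
        return 0.0
    passed = sum(1 for row in rows if check(row))
    return 100.0 * passed / len(rows)


baseline = {
    "completeness": score(records, is_complete),
    "consistency": score(records, is_consistent),
}
print(baseline)
```

Each parameter is expressed as a pass/fail rule per record, so the same `score` helper can be reused when further rules are agreed with the business, and the resulting percentages can be tracked over time against the baseline.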
Key Activities:
- Perform one-time data cleansing
- Housekeeping to maintain data quality
- Ad hoc/bulk data cleansing
- Establish data quality gates at key places and functional checkpoints
- Keep-It-Clean reports deployed to in-life users so that they can self-correct identified issues

- Develop in-house dashboard
- Improved data quality
- Online dashboard presence
- Keep-It-Clean processes in place for continuous data improvement

Corruption

This stage is important to ensure that ongoing data corruption is prevented and that the required data quality levels are maintained for efficient business function.

Key Activities:
- Establish DQ Issue Management processes
- Interlock with the business so that data stays clean and is monitored on an ongoing basis
- Create value propositions with the business to execute data uplift proactively
- Baseline and monitor the effect
- Keep a log of each data defect

RCA & Improvement:
- Analyze recurring data issues
- Problem management: root cause analysis, prioritized by impact/cost
- Categorize the cause of data issues by Data, System, Process, People
- Stop the Rot: drive fixes or user stories

- Root cause analysis (RCA) for key data quality corruption issues
- Keep a record of every data problem identified, perform RCA and drive them to strategic fixes
- Monitor data quality on a regular basis

Capability

Building capabilities with the help of improved data, to drive the business and get value out of data, is an extremely important phase. It is divided into three key aspects as explained below.
Easier to Use:
- Greater flexibility when using this data
- Making data available to business users anytime, anywhere
- Centralized data management reports
- Centralized knowledge about data, as well as reference information to understand data

Self-Service:
- Enabling operations and business users to view and interact with data using a Portal / Inventory viewer
- Data viewing capability for a consistent, consolidated, supported end-to-end view of data
- Ability to identify discrepancies and improve data quality

One Truth:
- Creating holistic views
- Empowering people with management information
- Executive dashboard to reflect data integrity and discrepancy reports
- Support the operations team with tactical solutions for data correction

- Getting data only once
- Retaining and sharing knowledge
- Cross-platform views
- Rapid deployment of business-critical data views
- Industrializing the tool
- Enabler for One Truth, Easy to Use and Self Service
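The Corruption-phase activities of keeping a log of each data defect, categorizing causes by Data, System, Process and People, and prioritizing root cause analysis by impact/cost can be illustrated with a small sketch. The log entries, identifiers and cost figures below are invented for the example, and ranking by summed cost is only one possible way to prioritize.

```python
from collections import defaultdict

# Cause categories named in the Corruption phase.
CATEGORIES = ("Data", "System", "Process", "People")

# Hypothetical defect log; ids, causes and costs are made up for illustration.
defect_log = [
    {"id": "DQ-1", "category": "System", "cause": "nightly sync drops records", "cost": 12000},
    {"id": "DQ-2", "category": "People", "cause": "manual keying error", "cost": 1500},
    {"id": "DQ-3", "category": "System", "cause": "nightly sync drops records", "cost": 8000},
    {"id": "DQ-4", "category": "Process", "cause": "no closure step on order cancel", "cost": 5000},
    {"id": "DQ-5", "category": "Data", "cause": "duplicate reference codes", "cost": 700},
]


def prioritise(log):
    """Group defects by (category, cause) and rank the groups by total cost, highest first."""
    totals = defaultdict(lambda: {"count": 0, "cost": 0})
    for defect in log:
        if defect["category"] not in CATEGORIES:
            raise ValueError(f"unknown category: {defect['category']}")
        key = (defect["category"], defect["cause"])
        totals[key]["count"] += 1
        totals[key]["cost"] += defect["cost"]
    return sorted(totals.items(), key=lambda item: item[1]["cost"], reverse=True)


ranking = prioritise(defect_log)
for (category, cause), stats in ranking:
    print(f"{category:8} {stats['cost']:>7} {stats['count']}x  {cause}")
```

Grouping by cause rather than by individual defect is what makes recurring issues visible: two separate log entries with the same root cause surface as a single, higher-cost candidate for a strategic fix.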

Let us now look at a few examples of how effective Data Quality Management practices help organizations become more efficient by reducing revenue leakage and improving data quality.

Case Study 1: Data Quality Management helps Europe's City Councils achieve multi-million Pound savings over two years

Following good data quality management practices is beneficial not only in avoiding reporting errors, but also in establishing an efficient system of processes that reduces the cost of procuring, operating and maintaining organizational information. This ultimately leads to better business decisions, more efficient organizations, and significant revenue realization, as demonstrated by three of Europe's City Councils through their jointly initiated Data Quality Management project.

The project aimed at combining specific areas of service delivery by migrating from existing legacy ERP applications to a common ERP platform. Following the approach outlined in this whitepaper, the City Councils, with the help of Tech Mahindra's DQM team, were able to transform and cleanse their HR & Finance data (to the tune of 5.1 million records) in three of their existing ERP applications and realize an overall business benefit worth millions of Pounds over two years.
[Figure: Case Study 1 migration overview]

Where we were (Initial Scope, 2013):
- City Councils: multiple legacy OS (Oracle Apps/JDE/EDAR/i-Trent)
- 14 HR & Payroll entities per council
- 16 Finance entities per council
- 550 attributes

Migration layers:
- Extraction Layer: sources 1 to N, with SMEs
- Transformation Layer: staging area (profile, cleanse, transform); data mapping and transformation; automatic validation of data mapping and business rules; data migration risks and mitigation plan; analyse, log, error correction and re-load
- Loading Layer: upload engine with log, error correction and re-load; target (Agresso), with SMEs

Where we are today (2015):
- City Councils: 1 managed service system (Agresso)
- 14 HR & Payroll entities per council, plus an additional 15 entities per council
- 16 Finance entities per council
- 550 attributes, plus an additional 430 attributes
- 5M data volume, plus an additional 8M data volume

Process and controls:
- Receive inputs: legacy data, Agresso templates
- Entity finalization
- Analyse legacy data and attribute mapping
- Define standardization rules and business inputs
- Transformation and consolidation
- Issue/query resolution
- Validation, cleansing, mapping, reporting and feedback mechanism
- Centralized view of DQ
- Centralized control of mapping sheet and business rules
- Summary report of input data file load, summary report of transformation, detailed report of data load, notification

Case Study 2: Good DQM practices enable multi-million Dollar cost saving for UK Telco

In the case of one of the UK's largest telecom operators, good Data Quality Management practices enabled the identification of incorrect billing practices followed by a supplier for unused network elements, and subsequent cost savings of millions of dollars. A detailed quality analysis of the customer's billing system revealed that the customer was being charged by the supplier for decommissioned equipment because of incorrect reconciliation logic built into the system. Further investigation revealed many similar anomalies, whose rectification enabled the customer to save millions of dollars from supplier overcharge and unused network assets.
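The kind of reconciliation check described in Case Study 2 can be sketched as a comparison between an asset inventory and a supplier invoice. This is a hypothetical illustration only: the whitepaper does not describe the actual reconciliation logic, and the asset identifiers, statuses and charges below are invented.

```python
# Hypothetical inventory extract: asset id -> lifecycle status.
inventory = {
    "NE-001": "active",
    "NE-002": "decommissioned",
    "NE-003": "active",
    "NE-004": "decommissioned",
}

# Hypothetical supplier invoice lines for the same period.
supplier_invoice = [
    {"asset_id": "NE-001", "monthly_charge": 400.0},
    {"asset_id": "NE-002", "monthly_charge": 250.0},
    {"asset_id": "NE-004", "monthly_charge": 300.0},
    {"asset_id": "NE-999", "monthly_charge": 120.0},
]


def reconcile(assets, invoice_lines):
    """Split invoice lines into valid charges, charges for decommissioned
    assets, and charges for assets missing from the inventory."""
    result = {"valid": [], "decommissioned": [], "unknown": []}
    for line in invoice_lines:
        status = assets.get(line["asset_id"])
        if status is None:
            result["unknown"].append(line)
        elif status == "decommissioned":
            result["decommissioned"].append(line)
        else:
            result["valid"].append(line)
    return result


report = reconcile(inventory, supplier_invoice)
overcharge = sum(line["monthly_charge"] for line in report["decommissioned"])
print("Billed for decommissioned assets:", [line["asset_id"] for line in report["decommissioned"]])
print("Monthly overcharge:", overcharge)
```

Keeping a separate "unknown" bucket matters in practice: a charge for an asset that does not appear in the inventory at all may point to a gap in data coverage rather than to a supplier error, and the two call for different follow-up.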
[Figure: Case Study 2 outcomes. Investment from the customer compared with return on investment (financial saving, operational efficiency, productivity), and Data Coverage, Quality and Revenue Leakage compared between "Where We Started" and "Where we are today" (2015). Values as extracted, without a recoverable mapping to the labels: 200m; %, 60%, 20%, 43%, 88%, 11%.]

Summary

In essence:

"Data is here to stay. According to recent figures, over 90% of all the data in the world has been created in the last two years. It's an increasingly important asset for all organisations, regardless of industry or size. And those businesses that can extract the value from their data will reap the greatest reward." (Global Data Quality Research, Experian Data Quality)

Organizations are fast realizing the importance of data quality and are actively taking measures to correct and improve their Data Quality Management practices. These measures are often a multi-pronged exercise, including changes and improvements to the approaches, organizational models, techniques and technologies used to store, measure and use data for business processes across the organization. If done right, the benefits are both long-lasting and far-reaching.

ABOUT TECH MAHINDRA:

Tech Mahindra is a specialist in digital transformation, consulting and business re-engineering solutions. We are a USD 3.5 billion company with 98,000+ professionals across 51 countries. We provide services to 674 global customers including Fortune 500 companies. Our innovative platforms and reusable assets connect across a number of technologies to deliver tangible business value to all our stakeholders. Tech Mahindra is also amongst the Fab 50 companies in Asia as per the Forbes 2014 List.

We are part of the USD 16.5 billion Mahindra Group that employs more than 200,000 people in over 100 countries. Mahindra operates in the key industries that drive economic growth, enjoying a leadership position in tractors, utility vehicles, information technology, financial services and vacation ownership.

CONNECT WITH US: dmscommunications@techmahindra.com