Twice as Long to Do It Wrong

1 Twice as Long to Do It Wrong
Presented By: John H. Ossege and Scott E. Robinson, ExxonMobil Technical Computing Company
Information and Quality Conference (IDQ), November 4-7, 2013, University of Arkansas at Little Rock, Arkansas
This presentation includes forward-looking statements. Actual future conditions (including economic conditions, energy demand, and energy supply) could differ materially due to changes in technology, the development of new supply sources, political events, demographic changes, and other factors discussed herein (and in Item 1 of ExxonMobil's latest report on Form 10-K). This material is not to be reproduced without the permission of Exxon Mobil Corporation.
provided to attendees of the Information and Quality

2 INTRODUCTION
UTC/DMPO (Data Management Practices & Operations): charged with designing standards and best practices for managing data and information in ExxonMobil Upstream
Objective of the study: develop a process to quantify the cost of poor-quality data in terms of ROI (Return on Investment)
Presentation will focus on the methodology and results:
- Pioneering effort; invented metrics processes in the course of the study
- Named it Lost Quality Observation (LQO), parallel to LPO (Loss Prevention Observation) in the ExxonMobil safety culture, to emphasize the parallel between quality and safety as cultural values

3 DEVELOPING THE CONCEPT
Initial idea was direct observation:
- Conduct a direct observation of a person doing their work
- Encouraged to engage the Upstream for ROI analysis
Multi-year Initiative: project designed to clean up hardcopy records
Goal:
- Identify and retrieve all potential items of interest
- Upgrade / update metadata
- Populate predefined key attributes
- Scan the high-value data
- Eliminate duplicate copies
- Create ArcGIS geodatabase indexing the scanned images
- Demonstrate the feasibility and business benefits of cleaning legacy data
- Locate rediscovered proprietary data that provides competitive advantage

4 The Initiative
- 25 million archived documents
- 1.2 million wells
The Challenge
- Identify, capture and correctly catalog geoscience data
- Millions of poorly or inconsistently archived items that the business is either unaware of or unable to rapidly access
- Transition from a project to a systematic change in how we manage documents
- Proactive document capture

5 Dual objectives of the LQO study
- UTC: Document cost of poor quality in terms of ROI
- DI Project: Document project benefits and processes for use by future Initiatives
Comparisons:
- Loading Cost: It is more cost-efficient to load the data correctly than to clean it up afterwards.
- Finding Cost: Cost to locate hardcopy data in the existing system vs. the cost to locate the same data after cleanup.
- Finding X: Represents the number of times a set of data must be retrieved from the current system to equal or exceed the cost of cleanup. Compares cost of clean-up to incremental cost of finding data now vs. after clean-up.
  Small X = Cost of clean-up is justified
  Large X = Cost of data clean-up will not be recovered

6 The High Cost of Poor Quality: Finding X (Hypothetical Example)
[Chart: dollars ($K) vs. number of data searches. Clean-Up Cost: $15K. Search Incremental Cost: one search $5K, two searches $10K, three searches $15K, four searches $20K, five searches $25K. X = 3, the number of data searches required to equal the cost of clean-up.]
For this comparison, the cost to clean up the data is compared to the incremental cost savings due to faster data searches after clean-up. It costs $15K to clean up the data. The time saved doing a data search after clean-up vs. before clean-up is worth $5K. Three data searches equal the cost of clean-up in this example. If you plan to query the database more than three times, clean-up is a worthwhile investment in this example.
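The break-even arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration, not the presenters' tooling: the function name `finding_x` and its parameter names are assumptions, and the dollar figures are the slide's own hypothetical example.

```python
import math


def finding_x(cleanup_cost, search_saving):
    """Number of searches needed to equal or exceed the clean-up cost.

    cleanup_cost: one-time cost of cleaning the data (here in $K).
    search_saving: incremental saving per search after clean-up (same units).
    Both names are hypothetical; they do not come from the presentation.
    """
    if search_saving <= 0:
        raise ValueError("clean-up never pays back if searches get no cheaper")
    return math.ceil(cleanup_cost / search_saving)


# Hypothetical example from the slide: $15K clean-up, $5K saved per search.
print(finding_x(15, 5))  # 3
```

A small result means the clean-up pays for itself after a few searches; a large one means the cost is unlikely to be recovered through search savings alone.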

7 LQO (Lost Quality Observation): a Three-Step Process
Step 1: Learn and document the project's data search and clean-up process
Step 2: Identify where in the process to record metrics (install the meters)
Step 3: Work with the staff performing the process to develop the best way for them to record the measures we wanted

8 DESIGNING THE MEASUREMENT METHODOLOGY
Step 1: Understanding the process to be measured
Taxonomy chart
- Illustrates all the different paths the data can take once it leaves the box
- Many ways a document can be cataloged incorrectly / incompletely
- Created a taxonomy chart to ensure we captured all the cases
Detailed flowchart
- Developed a thorough understanding of the data cleanup process
- The process was long, complicated and not easy to understand
- Streamlined the detailed flowchart down to a linear process flow

9 Hardcopy Definitions and Concepts
- Document Types: Report, Log, Map
- Duplicates: redundant copies of the same document
- Rediscovered: misfiled records found during clean-up
- Item: a document, or a set of documents physically bound together
- Barcode: used in Records DB to track items. Sometimes attached to folder instead.
- Folder: often used to group items together
- Box: storage and transport vessel

10 CLASSIFICATION OF CORRECT AND INCORRECT RECORDS
Taxonomy Chart: DATA INITIATIVE NUMBERS
- Challenge to understand the data cleanup process
- Multiple ways the data can be cataloged incorrectly and incompletely
- Chart captures all the cases
[Taxonomy chart for a Records Box. Classification labels: In Box, Not In Box, Signed Out, Not Signed Out, Barcode, No Barcode, In Rec database, Not In Rec database, Listed as in the Box, Not listed as in the Box, Signee has data, Signee does not have data, Keeps data, Returns data, Good Record, Duplicate, Non-Technical, Discovered, Re-discovered, Discover Later, Can not be Located, Lost in Place. Annotation: 25% of the data. Ultimate dispositions: Catalog, Return to storage, Destroy, Check into Records, Do nothing, Update Rec database.]
DEFINITIONS
Catalog: ensures that an item meets the minimum requirements:
- Barcode
- Entry in Rec database
- Associated with a box
LEGEND: Black = Classification, Blue = Ultimate Disposition

11 Cost Associated with a Box Retrieval
- Load the box on the truck; transport and delivery: $8.00/box
- Box Check-In (scan box code, box items): $40.00/box
- Box Check-Out (scan box code, box items): $40.00/box
- Return the box (load the box on the truck; transport and delivery): $8.00/box
Total cost per box: $96.00
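As a quick check of the per-box total, the handling steps can be summed in Python. This is a sketch under stated assumptions: the dictionary keys "outbound" and "return" are labels chosen here for illustration, and the 250-box request is a made-up figure, not a number from the study.

```python
# Per-box handling costs in dollars, as listed on the slide.
# The "(outbound)" / "(return)" qualifiers are illustrative labels only.
BOX_COSTS = {
    "transport and delivery (outbound)": 8.00,
    "box check-in": 40.00,
    "box check-out": 40.00,
    "transport and delivery (return)": 8.00,
}


def retrieval_cost(num_boxes):
    """Total handling cost for retrieving and returning `num_boxes` boxes."""
    return num_boxes * sum(BOX_COSTS.values())


print(retrieval_cost(1))    # 96.0
print(retrieval_cost(250))  # 24000.0 (hypothetical 250-box request)
```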

12 Clean-up Process
1. Retrieval: search initiated based upon business needs
2. Check-In (to the project): and all items are checked to onsite through corporate repository
3. : The review all high priority items
4. Triage: Classify items as in or out of scope. Regroup items by data type. Label documents to be scanned
5. Vendor Check-Out:
   - Documents replaced in box
   - Update box item report
   - checked out to vendor
6. Quality Control: Only a representative number of boxes were QC'd. This was later expanded to all boxes
7. Box Transport: are shipped to vendor
8. Scan:
   - Prepare and scan data
   - Send images back to ExxonMobil
   - Index images
9. Metadata: Create metadata for all documents and load data into repository
10. Checkout to Central Storage:
   - Send data pick up order
   - Verify box content
   - Send boxes back to storage
11. Destroy Duplicate

13 DESIGNING THE MEASUREMENT METHODOLOGY
Step 2: Installing the meters
[Diagram: DATA INITIATIVE PROJECT WORK FLOW, shown as a SINGLE LINE PROCESS WORK FLOW (Process 1 - Clean-Up)]
Stages: Data Retrieval, Check-in, Triage, Vendor Checkout, Quality Check, Transport, Scan, Metadata, Update, Storage
Steps: Search; Identified; Request; Confirm Delivery; Checked-in; Documents: Keep in RCD? (yes: continue; no: Non Target or Destroy); Regroup Items By Type; Label Items To Be Scanned; Affix RCD Barcode; QC'd for Accuracy; Transport; Scan Original Item; Store Scanned Images; Capture MDRs; Enter MDRs & Metadata; Update RCD; Send Original to RC

14 Parallel Process Diagram
[Diagram: the data cleanup process compared with the current and future storage processes]
- Process 1 - Clean-Up (DATA CLEANUP PROCESS DIAGRAM): 29 Minutes. Stages: Data Retrieval, Check-in, Triage, Vendor Checkout, Quality Check, Transport, Scan, Metadata, Update, Storage. Steps: Search; Identified; Request; Confirm Delivery; Checked-in; Documents: Keep in RCD? (no: Non Target or Destroy); Regroup Items by Type; Label Items to be Scanned; Affix RCD Barcode; QC'd for Accuracy (Box Content); Transport; Scan Original Item; Store Scanned Images; Capture MDRs; Enter MDRs & Metadata; Update RCD; Send Original to RC
- Process 3 - Storing Currently (CURRENT PROCESS): 20 Minutes. Lanes: Business Unit, Information Management System. Steps: Store in RCD? (no: Non Target or Destroy); Affix RCD Barcode; Enter MDRs & Metadata; Send Original to RC
- Process 5 - Storing in the Future (FUTURE PROCESS): 27 Minutes. Lanes: Business Unit, Information Management System. Steps: Documents: Keep in RCD? (no: Non Target or Destroy); Capture MDRs; Label Items to be Scanned; Affix RCD Barcode; Scan Original Item; Store Scanned Images; Enter MDRs & Metadata; Send Original to RC
IMPROVEMENTS TO FUTURE PROCESS: review the documents; capture MDRs; label items to be scanned; scan original item; store scanned images
Allows us to estimate the time for processes that do not exist
2012 ExxonMobil Technical Computing Company. May be provided to attendees at the 16th PNEC International Conference on Petroleum. All other rights reserved.

15 DESIGNING THE MEASUREMENT METHODOLOGY
Step 3: Developing the measuring process
First half of measuring process performed at ExxonMobil. Capture metrics for:
- Search and Retrieval
- Check-in (to the project)
- Check-out (back to storage)
Second half of measuring process performed at Vendor location. Capture metrics for:
- Triage / Prep (direct measurements)
- Scanning / Metadata capture (average times)
DI Project ended as metrics capture began

16 Comparing times and costs for Processes 1-5
[Chart: Process 1, Process 3, Process 5; Paper vs. Paperless]
Notes:
- Current and proposed data storage costs are similar to data cleanup cost (Processes 1, 3 and 5)
- Paperless retrieval cost is significantly less than paper retrieval (Process 4)
- Paperless retrieval has additional advantages discussed later
ExxonMobil Technical Computing Company. May be provided to attendees of the Information and Quality Conference. All other rights reserved.

17 Calculating X from costs for Processes 1-5
$$X = \frac{\text{Cost of Time of Clean-Up}}{\text{Cost of Time for Current Search} - \text{Cost of Time for Future Search}}$$
A paperless search retrieves digitized documents only, no hardcopy. It cuts out ALL intermediate steps in Process 4, and so takes significantly less time than a search that retrieves paper documents.

18 RESULTS: X = 35
We were expecting it to be closer to 3. The chance of checking out all the documents in the library 35 times is extremely remote.
Other motivators for cleaning up data:
- Proprietary data often cannot be repurchased
- Discovered data may provide an irreplaceable competitive edge
- Repurchasing lost data is wasted money

19 CURRENT PROCESS: 20 Minutes; DATA CLEAN-UP PROCESS: 29 Minutes; Combined Time: 49 Minutes
Current Storage Process (no standardized metadata; document repository):
- With no changes to the data storage process, the data clean-up investment will be lost over time
- X = 35: never recover the cost of cleanup from time savings (X = Clean-up Time / Search Time)
- Takes twice as long to do it wrong
FUTURE PROCESS: 27 Minutes
New Storage Process (standardized metadata; document repository):
- Load High Quality the first time
Case for Improved Storage Process:
- Load time increases by 7 mins
- Saves 29 minutes in clean-up
- Loading the data correctly takes only ¼ of the time that a data cleanup does
[Diagram: data quality (High Quality to Low Quality) over time, with and without a Cleanup Effort]
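The timing comparison on this slide reduces to a few ratios. The sketch below recomputes them from the three measured times; the variable names are chosen here for illustration and are not from the study.

```python
# Measured per-item times from the slide, in minutes.
CURRENT_LOAD_MIN = 20  # current storage process
CLEANUP_MIN = 29       # data clean-up process
FUTURE_LOAD_MIN = 27   # future storage process

# Storing the current way and cleaning up later.
combined = CURRENT_LOAD_MIN + CLEANUP_MIN
# Extra loading time the future process needs per item.
extra_load = FUTURE_LOAD_MIN - CURRENT_LOAD_MIN

print(combined)                               # 49
print(extra_load)                             # 7
print(round(combined / FUTURE_LOAD_MIN, 2))   # 1.81, roughly "twice as long"
print(round(extra_load / CLEANUP_MIN, 2))     # 0.24, roughly one quarter
```

Read this way, "twice as long" compares 49 minutes against 27, and the ¼ figure compares the 7 extra loading minutes against the 29 minutes of clean-up they avoid.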

20 RESULTS
- The time saved in future records searches does not justify the time expended to clean the records up
- But the time to clean the records up far exceeds the time to catalog them right the first time ("Twice as long to do it wrong"); therefore:
  - Storing documents correctly avoids big clean-up costs later
  - Standardized metadata on initial catalog entry is cost-effective

21 CONCLUSIONS
Clean legacy data once, load new data correctly.
- Since it takes four times as long to clean a record up as to store it correctly, avoiding the cleanup of ¼ of the data covers the time cost of storing all of it correctly.
- Historically, we can expect all our data to be re-used at least once in the future (example: new shale plays).
- Doing the process as an ongoing effort doesn't require Project Management (cheaper by 10-30%, based on Initiative experience).
- A small amount of standardized metadata (very quick to add) makes data significantly easier to find.
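The break-even claim in the first conclusion can be illustrated with a small calculation. This is a sketch under assumptions: it takes the 7-minute extra loading time and the 29-minute clean-up time from the earlier timing slide as defaults, and the function name, parameter names and the 1,000-record batch are hypothetical.

```python
def net_minutes_saved(num_records, fraction_needing_cleanup,
                      extra_load_min=7, cleanup_min=29):
    """Minutes saved by loading every record correctly up front.

    Compares the extra loading time spent on all records with the
    clean-up time avoided on the fraction that would otherwise need it.
    A positive result means correct loading has paid for itself.
    """
    spent = num_records * extra_load_min
    avoided = num_records * fraction_needing_cleanup * cleanup_min
    return avoided - spent


# Hypothetical batch of 1,000 records.
print(net_minutes_saved(1000, 0.25))  # 250.0, just past break-even
print(net_minutes_saved(1000, 0.5))   # 7500.0
```

With these defaults the break-even point sits at 7/29 of the records, which is close to the one-quarter figure quoted on the slide.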

22 Conclusions (continued)
Advantages of going digital:
- Reduced risk of mishandling documents
- Shorter wait for document to be retrieved (milliseconds vs. days)
- Many people can check a document out at once
- Original not lost when checked out
- Even with standardized metadata, hardcopy documents will get Lost in Place
- Eliminates a source of waste in the system: by our observations, 20-50% of time is spent triaging out-of-scope items in boxes

23 Benefits of Lost Quality Observation Study for DI
Results: metrics justified modifications to the proposed future process:
- Metrics supported the cost-effectiveness of data loading using standardized metadata
- Metrics supported Initiative's proposal to take the out of the future data loading process
- Metrics supported shift to digital document storage instead of paper
Products from LQO study:
- Overall process flow documented in flowchart form
- Taxonomy chart
- Future process documented
- Value of digital vs. paper drive culture change

24 KEY LEARNINGS
- Thoroughly understand the process (vital): before defining and capturing metrics
- Tailor length of study: pace of organizational change
- Reliable measurements: gathering metrics takes time
- Better to directly observe the process: more efficient than iterative design
- Real-time study difficult: capturing metrics vs. keeping the business running

25 OBSERVATIONS
Digital storage is superior to paper data storage:
- Reduced risk of mishandling documents
- Shorter wait for document to be retrieved (milliseconds vs. days)
- Many people can check a document out at once
- Original not lost when checked out
Box-handling costs: larger than expected
Waste in the system: 20-50% of time spent triaging out-of-scope items
Hardcopy documents Lost in Place: without improved clean-up or storage process

26 SUMMARY
We were the first in ExxonMobil to measure an ongoing data-cleanup process to test some widely held theories/beliefs about data management:
- Time spent cleaning the data would be recovered by the time saved finding data in the future: FALSE (X = 35).
- It is more costly to clean data up after the fact than it is to store it correctly the first time: TRUE ("Twice as Long to Do It Wrong").
Potential future applications of LQO methodology:
- Use as a quicklook tool to prioritize a set of data cleanup efforts
- Use to justify metadata capture
- Estimate how much effort should be expended to improve a particular data set's quality
- With modifications, potentially quantify ROI of data loading efforts

27 Q & A

28 ENTRANCE
- Management is more willing to take time for data quality if the cost benefits can be quantified
- This type of information is hard to obtain
- Without qualified cost-benefit data, it is difficult to determine how much $ it will save
POSITION
- We conducted a study to help identify the cost of poor quality data
- Listen and compare our experiences to yours
- You may be able to apply our techniques to help justify your data quality expenditures
PREVIEW
- The Methodology
- The Results: Twice as Long to Do It Wrong
CLOSING
- Discover that some of our techniques may be beneficial to your data quality situation

29 Step 2: Deciding where to place the metrics
PARALLEL PROCESS DIAGRAM
- Process 1 - Clean-Up. Stages: Data Retrieval, Check-in, Triage, Vendor Checkout, Quality Check, Transport, Scan, Metadata, Update, Storage. Steps: Search; Identified; Request; Confirm Delivery; Checked-in; Documents: Keep in RCD? (yes: continue; no: Non Target or Destroy); Regroup Items By Type; Label Items To Be Scanned; Affix RCD Barcode; QC'd for Accuracy (Box Content); Transport; Scan Original Item; Store Scanned Images; Capture MDRs; Enter MDRs & Metadata; Update RCD; Send Original to RC
- Process 2 - Finding Now. Steps: Search; Identified; Request; Confirm Delivery; Checked-in Documents
- Process 3 - Storing Currently (CURRENT PROCESS). Lanes: Business Unit, Information Management System. Steps: Store in RCD? (yes: continue; no: Non Target or Destroy); Affix RCD Barcode; Enter MDRs & Metadata; Send Original to RC
- Process 4 - Future Search. Steps: Search; Identified; Request; Confirm Delivery; Checked-in Documents
- Process 5 - Storing in the Future (FUTURE PROCESS). Lanes: Business Unit, Information Management System. Steps: Documents: Keep in RCD? (yes: continue; no: Non Target or Destroy); Capture MDRs; Label Items To Be Scanned; Affix RCD Barcode; Scan Original Item; Store Scanned Images; Enter MDRs & Metadata; Send Original to RC

30 DESIGNING THE MEASUREMENT METHODOLOGY - STEP 2
COMPARING THE COST OF DATA SEARCH
- Process 1A - Clean-Up (Data Retrieval, Check-in): Search; Identified; Request; Confirm Delivery; Checked in
- Process 2 - Current Search (Business Unit): Search; Identified; Request; Confirm Delivery; Checked in
- Process 4 - Future Search (Business Unit): Search; Identified; Request; Confirm Delivery; Checked in
Observations:
- Working with fewer boxes
- Fewer boxes and items retrieved mean less time as compared to Process 1A

31 DESIGNING THE MEASUREMENT METHODOLOGY - STEP 2
COMPARING THE COST OF DATA STORAGE
- Process 1B - Clean-Up (DATA CLEAN-UP PROCESS). Stages: Triage, Vendor Checkout, Quality Check, Transport, Scan, Metadata, Update, Storage. Steps: Documents: Keep in RCD? (yes: continue; no: Non Target or Destroy); Regroup Items By Type; Label Items To Be Scanned; Affix RCD Barcode; QC'd for Accuracy (Box Content); Transport; Scan Original Item; Store Scanned Images; Capture MDRs; Enter MDRs & Metadata; Update RCD; Send Original to RC
- Process 3 - Storing Currently (CURRENT PROCESS). Lanes: Business Unit, Information Management System. Steps: Store in RCD? (yes: continue; no: Non Target or Destroy); Affix RCD Barcode; Enter MDRs & Metadata; Send Original to RC
- Process 5 - Storing in the Future (FUTURE PROCESS). Lanes: Business Unit, Information Management System. Steps: Documents: Keep in RCD? (yes: continue; no: Non Target or Destroy); Capture MDRs; Label Items To Be Scanned; Affix RCD Barcode; Scan Original Item; Store Scanned Images; Enter MDRs & Metadata; Send Original to RC

32 CURRENT PROCESS: 20 Minutes; DATA CLEAN-UP PROCESS: 29 Minutes; Combined Time: 49 Minutes
Current Storage Process (no standardized metadata; document repository):
- With no changes to the data storage process, the data clean-up investment will be lost over time
- X = 35: never recover the cost of cleanup from time savings (X = Clean-up Time / Search Time)
- Takes twice as long to do it wrong
FUTURE PROCESS: 27 Minutes
New Storage Process (standardized metadata; document repository):
- Load High Quality the first time
Case for Improved Storage Process:
- Load time increases by 7 mins
- Saves 29 minutes in clean-up
- 1:4 ratio: loading the data correctly takes only ¼ of the time that a data cleanup does
[Diagram: data quality (High Quality to Low Quality) over time, with and without a Cleanup Effort]