Why Data Governance is Necessary Even for Big Data Platforms. Glen Sartain, Practice Lead, Big Data Analytics, Noah Consulting

1 Why Data Governance is Necessary Even for Big Data Platforms. Glen Sartain, Practice Lead, Big Data Analytics, Noah Consulting

2 Purpose-Built Approach

3 Maturity Stages of Analytics
[Figure: analytics stages plotted as value against maturity]
- REPORTING: What happened? Primarily batch and some ad hoc reports.
- ANALYZING: Why did it happen? Increase in ad hoc analysis.
- PREDICTING: What will happen? Analytical modeling grows.
- OPERATIONALIZING: What is happening? Continuous update and time-sensitive queries become important.
- ACTIVATING: Make it happen! Event-based triggering takes hold.
Workloads along the maturity axis: Batch, Ad Hoc, Analytics, Continuous Update/Short Queries, Event-Based Triggering.
Noah Consulting, LLC 2015

4 Maturity Stages of Analytics
[Figure: analytics stages plotted as value against maturity, labeled by type of analytics]
- REPORTING (Descriptive): What happened? Primarily batch and some ad hoc reports.
- ANALYZING (Diagnostic): Why did it happen? Increase in ad hoc analysis.
- PREDICTING (Predictive): What will happen? Analytical modeling grows.
- OPERATIONALIZING: What is happening? Continuous update and time-sensitive queries become important.
- ACTIVATING (Prescriptive): Make it happen! Event-based triggering takes hold.
Workloads along the maturity axis: Batch, Ad Hoc, Analytics, Continuous Update/Short Queries, Event-Based Triggering.

5 Is Big Data trying to replace Traditional Data Management?
- The variation in data and the computational complexity of the required analytics raise questions that the EDW was not set up to answer.
- The velocity at which the data arrives (e.g. streaming data, SCADA) places an additional burden on loading data into EDWs.
- The sheer volume of data is filling current warehouse and storage environments.

6 Is Traditional Data Management Still Relevant?

7 Perspective #1: Paul, the Data Expert
- Data is an enterprise asset.
- Data Management is a specific discipline.
- Traditional methods still work and provide value.
- Data Quality still matters.
- Don't throw out the old just for the shiny new object (new technology).

9 Data Warehousing: Thirty Years Old and Still Delivering Value

10 Perspective #2: Michelle, the Tech Expert
- New Big Data and Industrial Internet technologies have come from other industries.
- We have new data sources that need to be evaluated (sensors, social media, documents, etc.).
- We have to start adopting these technologies into our environment or we will fall behind.

12 Perspective #3: Marjory, the Analytics Expert
- Need for deeper insight and faster decisions
- DOF becomes PD2A
- Data Visualization
- Physics vs Statistics vs Heuristics
- Reporting vs data mining
- Data Science and Data Story Telling

15 Are Algorithms Taking Over the World?

16 Perspective #4: Sean, the Enterprise Architect
- Hybrid Architecture
- Analytics Platform
- Search and Discovery
- Data Modeling and data structure
- Data Lakes (ETL vs ELT)
- New tech / new skills

18 Domain: Data Types / Categories
- Area / Spatial: Geographical Area, Geopolitical Area, Geological Area, Business Area, Land Rights & Ownership, Transportation, Environment, Remote Sensing
- Facility: Pipeline, Well Head, Offshore Platform, Onshore Facility, Sites
- Well: Well Header, Directional Survey, Borehole Geophysics, Well Logs, Well Tests, Samples, Stratigraphy, Well Operations
- Geophysics: General Survey Data, Seismic Navigation, Seismic Trace, Velocity Models, Other Geophysical Measurements, Seismic Reports
- Production Reservoir: Reservoir Engineering, Operations, Production Engineering
- Interpretation: Geologic Interpretation, Geophysical Interpretation, Multi-Disciplinary Interpretation

20 Survey Results
1. The major disadvantages of traditional data management practices lie in the data environment and data architecture.
2. Current methods are not meeting business requirements for:
   - providing insight into production profiles and opportunities;
   - understanding where to reduce costs;
   - providing greater data transparency along the value chain;
   - addressing the growing concern about cybersecurity and data theft; and
   - creating a more holistic view of asset operations and the data lifecycle.
3. Business units and functional groups are driving current requirements for data management, while corporate standards and external partners are still not that influential.
4. New approaches are being introduced, especially self-service BI and analytics centers of excellence.

21 Survey Results

22 Survey Results
5. Challenges with traditional methods include: managing growing volumes of data and the speed at which the data is received; integrating the variety of data; maintaining adequate governance; problems with data quality and with using data from the approved source; lack of standards; and master data management.
6. New digital technologies are being adopted now, especially Big Data/Data Lake (Hadoop, NoSQL), cloud-based solutions, new analytics platforms, and streaming data (real-time) applications.
7. The top two challenges in adopting these technologies are change management (getting people to do things differently) and lack of resources (budget and manpower), not lack of skills.
8. The current low oil and gas price environment is affecting current work: new projects and technologies have been curtailed and current practices are still maintained, but we have to do more with fewer resources.

23 Conclusion: Is Traditional Data Management Still Relevant? Yes, But
- Keep your existing data environment.
- Focus on Data Governance (who owns data? A successful data governance strategy is all about change. It will change the way your business works for the better.)
- Implement improvements in Data Mgmt. tools (but don't forget about people and process).

25 Big Data Best Practices
- Data management strategy should include Big Data (Hadoop), not be replaced by it.
- Hadoop is a complementary technology to existing data storage (e.g. RDBMS, ECM).
- Utilize historians for real-time data.
- Create sandboxes to allow development of proofs of concept (POCs).
- Utilize commercial tools built to support the Hadoop ecosystem to help reduce the cost of development.
- These technologies are open source and, at this point in their evolution, should be proven out through POCs.

26 Best Practices: the Big Data Road Map
- Keep it simple (KISS): get started by developing a basic application.
- Data governance is a comprehensive approach for your entire data management enterprise.
- Data quality techniques should be applied cautiously so as not to impede adoption.
- Hadoop's forte is combining unstructured data/data for analytics.

27 Preparing IT systems and organizations for the future
- Factor connectivity issues into design: retooling for the Industrial Internet.
- Actively participate in setting industry standards and deploying them!
- Embrace a continuous delivery model for software and services: consider retrofitting existing products and services.
- Update cyber security strategy and privacy protocols.
- Explore modularity, interoperability.
- Consider different organizational structures.

28 Thanks for your time!
Glen Sartain, Noah Consulting
Noah Consulting, An Infosys Company. All Rights Reserved 2016