MACHINE LEARNING IN THE OIL AND GAS INDUSTRY
Transforming Data Into Performance
Mehrzad Mahdavi, Vice President, Digital Solutions
19 April 2017
1 WHAT IS MACHINE LEARNING?
A method of data analysis that automates analytical model building. Using algorithms that iteratively learn from data, machine learning allows computers to find hidden insights without being explicitly programmed where to look.
- SAS
Advanced machine learning algorithms are composed of many technologies (such as deep learning, neural networks and natural-language processing), used in unsupervised and supervised learning, that operate guided by lessons from existing information.
- Gartner
A scientific field is best defined by the central question it studies. The field of machine learning seeks to answer the question: How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?
- Tom Mitchell, The Discipline of Machine Learning, Carnegie Mellon University School of Computer Science
2 MACHINE LEARNING DEPLOYMENTS HAVE ACCELERATED
As late as 1985 there were almost no commercial applications of machine learning.
- Tom Mitchell, The Discipline of Machine Learning
Today, machine learning is being deployed by almost every industry.
Source: Tata Consultancy Services, Using Big Data for Machine Learning Analytics in Manufacturing
3 TECHNOLOGICAL INNOVATIONS HAVE FUELED THIS GROWTH
Most machine learning scenarios require a large amount of data for the models to be properly trained, as well as a variety of data. This appetite for data requires significant resources that were scarce until the last two decades.
Compute - Increases in computational processing power
Storage - Cheaper, more powerful storage capabilities
Massive Databases - Development of scale-out NoSQL database clusters, like Cassandra
Networking - Faster, more pervasive networking and cellular communications technologies
Sensor Technology - Advances in sensor and sensor packaging technologies
IoT - The emergence of the Internet of Things
Big Data Analytics - Development of Big Data Analytics stacks, like Hadoop
Co-processor Technology - Emergence of powerful co-processor technologies, FPGAs and GPUs
Combined, these major technological innovations allow massive amounts of data to be gathered, stored and processed by the machine learning algorithms. The result is better model training with a step change in accuracy and usefulness.
4 THE ROLE OF IoT AND BIG DATA IN MACHINE LEARNING
IoT
IoT offers a way to gather that data from anywhere on the planet using commodity off-the-shelf hardware and software
Technology becoming standardized and hardened through consortia, such as the Industrial Internet Consortium (IIC) and the OpenFog Consortium
Forms a critical, foundational piece of the Industry 4.0 construct
BIG DATA
Big Data is critical because massive amounts of data must be housed and enriched before it can be consumed to train models, or for analysis by production ML systems.
IoT AND BIG DATA
IoT and Big Data architectures would not be possible without the technological innovations listed on the previous slide.
5 WEATHERFORD EXPERIENCE MTBF / TTF ANALYTICS
OBJECTIVE
To identify potential component failures
SCOPE
Use data from thousands of wells to predict failure probability of rod pump well components (rod, pump, tubing, etc.)
GOALS
Identify best fit model and relevant data (features) to predict component failure
Design the operational process and identify opportunities to further improve model performance
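The slide title uses MTBF (mean time between failures) and TTF (time to failure) without defining them. The following is a minimal sketch of how the two quantities are commonly computed for a single component, assuming repair downtime is ignored. The dates and the function names `time_to_failure_days` and `mtbf_days` are hypothetical illustrations, not Weatherford data or code.

```python
from datetime import date


def time_to_failure_days(install, failures):
    """Days from install (or from the previous failure) to each failure."""
    intervals = []
    previous = install
    for failed_on in sorted(failures):
        intervals.append((failed_on - previous).days)
        previous = failed_on
    return intervals


def mtbf_days(intervals):
    """Mean time between failures: average of the observed TTF intervals."""
    if not intervals:
        raise ValueError("at least one failure interval is required")
    return sum(intervals) / len(intervals)


# Hypothetical rod pump history (assumption, for illustration only).
install = date(2015, 1, 1)
failures = [date(2015, 7, 1), date(2016, 3, 1)]

intervals = time_to_failure_days(install, failures)
print(intervals)             # [181, 244]
print(mtbf_days(intervals))  # 212.5
```

A production analysis would also need to handle components that have not failed yet (censored observations), which this sketch leaves out.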
6 OVERVIEW OF THE PROCESS
Prepare - Business and Data Knowledge; Data Preparation
Model - Select Models, Generate Test Data; Build, Run, Assess Model
Evaluate - Evaluate Model Results; Review Modeling Process, Output, Insights
Finalize - Finalize Model; Integration with Production Optimization Platform
Operationalize - Model Implementation; Test and Deploy
7 WEATHERFORD ML PROCESS
[Process diagram] Labels: Well Dataset w/ features, Down Sampling, Model Execution, Trained Model, Model Evaluation, Model Promotion, Run-time Model Execution, Classification
Input - Daily well data with rolling averages of sensor observations
Data - 18 months of data for evaluation; 6 months of data for validation
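Two of the steps named on this slide, rolling averages of sensor observations and down sampling, can be sketched in a few lines. This is an illustration under stated assumptions, not the Weatherford implementation: the deck does not give the window length or the class ratio, "down sampling" is read here as undersampling the majority (non-failure) class, and the names `rolling_mean`, `down_sample`, and the `failed` field are hypothetical.

```python
import random
from collections import deque


def rolling_mean(values, window):
    """Trailing mean over up to `window` most recent observations."""
    recent = deque(maxlen=window)
    total = 0.0
    means = []
    for value in values:
        if len(recent) == window:
            total -= recent[0]  # this value is evicted by the append below
        recent.append(value)
        total += value
        means.append(total / len(recent))
    return means


def down_sample(rows, label_key="failed", ratio=1.0, seed=0):
    """Keep every minority-class row and a random subset of the majority.

    `ratio` is the number of majority rows kept per minority row
    (an assumed parameter; the slide does not state one).
    """
    positives = [r for r in rows if r[label_key]]
    negatives = [r for r in rows if not r[label_key]]
    minority, majority = sorted((positives, negatives), key=len)
    keep = min(len(majority), int(len(minority) * ratio))
    rng = random.Random(seed)
    return minority + rng.sample(majority, keep)


# Hypothetical daily sensor readings for one well.
smoothed = rolling_mean([1, 2, 3, 4], window=2)
print(smoothed)  # [1.0, 1.5, 2.5, 3.5]

# Hypothetical labelled rows: 2 failures among 100 well-days.
rows = [{"well_day": i, "failed": i < 2} for i in range(100)]
balanced = down_sample(rows)
print(len(balanced))  # 4
```

Down sampling would normally be applied to the training data only, so that evaluation still reflects the real class balance.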
8 WEATHERFORD ML ROD LIFT PREDICTION RESULTS
Big data ingestion and analytics using various machine learning techniques
Data - Thousands of wells; multiple years of history; 7 million records; 10+ GB
Accuracy 98% = (True Positives + True Negatives) / Total Population
Random Forest Binary Classification Model selected from 5 techniques to assess capabilities for component level failure prediction
Techniques were applied to address noisy and extremely unbalanced sensor data
These impediments need to be tackled at the source for efficient large-scale analytics
Results show significant opportunity to predict component failure
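The accuracy formula on this slide can be computed directly from the four confusion-matrix counts. The sketch below also computes recall, since with extremely unbalanced data accuracy is usually read alongside a per-class metric. The labels are made up for illustration and are not the Weatherford results; the function names are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Return (TP, TN, FP, FN) for boolean failure labels."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1
        elif not a and not p:
            tn += 1
        elif p:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """(True Positives + True Negatives) / Total Population."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Share of actual failures that the model caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical data: 50 failures in 1000 well-days, and a trivial
# baseline that never predicts a failure.
actual = [True] * 50 + [False] * 950
never_fails = [False] * 1000

tp, tn, fp, fn = confusion_counts(actual, never_fails)
baseline_accuracy = accuracy(tp, tn, fp, fn)
baseline_recall = recall(tp, fn)
print(baseline_accuracy, baseline_recall)  # 0.95 0.0
```

In this made-up case the baseline scores high on accuracy while catching no failures, which is why unbalanced data needs the kind of treatment the slide mentions.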