The Internet of Things and Machine Learning


1 The Internet of Things and Machine Learning: Making Wind Energy Cost Competitive. Case Study: Fluitec, with Cross-Industry Applications. Oracle BIWA Summit: Jan. 27, 2016

2 How This IoT Presentation is Different: No IoT market predictions. No Gartner Hype Cycle. Practical, hands-on lessons learned. Open Source technologies. Old-school low-tech shortcuts.

3 Introductions Robert Liekar: Director, Data Science Founder: Data Science Professionals (datascipros.com) Jay Mason: Associate Partner, Business Intelligence M&S Consulting Big Data Architecture Hybrid Cloud Machine Learning Systems Integration Data Wrangling (ETL) Oracle Gold Partner IoT Strategy and Platform Selection

4 Internet of Things Devices embedded with electronics, software and sensors... network-connected... enabling communication and data transfer... incorporating automated analytics

5 Big Data and IoT: Today we will see how Fluitec solved real-world challenges. Data wrangling on a Big Data scale. Predictive modeling and data mining of complex, heterogeneous data sources. Utilization of cloud resources. Integration of Big Data tools such as Hadoop with MySQL and Java. Use of network-connected devices to manage complex machinery.

6 Key Technologies: MySQL, Hadoop, Java, Python, R, Cloud (AWS)... a hand-rolled IoT platform!

7 Tribo-Analytics and IoT: Automated feeds from sensors on wind turbine components. Incorporated into central repository. Automated analytics produce warnings and recommendations, e.g. "WTG 89172: Add 1000 mg P to GB oil" and "WTG 00811: Yaw motor likely to fail < 3 mos".
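A minimal sketch of how a predicted time to failure might be turned into a warning line like the ones on this slide. The `ComponentForecast` type, the `warnings_for` function, the 3-month horizon, and the sample values are all assumptions made for illustration; the slides do not describe how Fluitec generates its messages.

```python
from dataclasses import dataclass


@dataclass
class ComponentForecast:
    wtg_id: str               # turbine identifier, e.g. "00811"
    component: str            # e.g. "Yaw motor"
    months_to_failure: float  # hypothetical model output


def warnings_for(forecasts, horizon_months=3.0):
    """Return one warning line per component predicted to fail within the horizon."""
    lines = []
    for f in forecasts:
        if f.months_to_failure < horizon_months:
            lines.append(
                f"WTG {f.wtg_id}: {f.component} likely to fail < {horizon_months:g} mos"
            )
    return lines


# Made-up forecasts: only the first falls inside the 3-month horizon.
forecasts = [
    ComponentForecast("00811", "Yaw motor", 2.1),
    ComponentForecast("89172", "Gearbox", 40.0),
]
for line in warnings_for(forecasts):
    print(line)
```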

8 Fluitec International

9 Fluitec. Fluitec International: Extending the life of industrial assets. Fluitec Wind: Decreasing the cost of wind energy through advanced data analytics.

10 Fluitec Wind

11 Fluitec Wind: Tribo-Analytics SaaS platform. Helps turbine operators utilize existing data to reduce costs and guide decision making. Aggregates data from thousands of turbines globally. Recommends maintenance plans. Identifies at-risk components.

12 Value Proposition for Customers: Using data that is already being collected, predictive analytics can identify which components will fail and when. Customers can then either take preventive action or at least plan for replacement. Enables condition-based oil changes.

13 Why Wind Turbines? Expensive to build: $2 MM/MW on-shore, $6 MM/MW off-shore. Expensive to maintain: $50,000 / WTG per year; $7,500 / WTG oil change (every 6 years); $100,000 gearbox replacement cost. Often remote = difficult logistics. Huge amounts of data already being collected.

14 Often Remote

15 Big and Expensive

16 Really Big...

17 Exposed to the Elements

18 Maintenance can be Tricky

19 Distracted Staff

20 Not So Simple - Lots of Connected "Things"

21 Predict Component Failure: Initial focus on gearboxes. Very expensive. Difficult to replace. One of the most frequent causes of failure in wind turbines.

22 Gearboxes

23 Damaged Bearing

24 Gear Micropitting

25 Types of Data Used: Sensors on gearbox major components. Ambient conditions. Oil analysis. Technician's reports.

26 The Data: 5,000 WTGs (Wind Turbine Generators). 30,000 oil samples. Make, model, rated capacity, etc. Billions of rows of sensor data (to start): bearing temperature, shaft RPM, vibration alerts, many more...

27 Collecting and Preparing Data: Collect data: historical + ongoing feeds. Traditional ETL on low-volume data. MDM (master data management) is important. Preprocessed high-volume data ETL. Derived data (statistics).

28 High-Volume Data Flow

29 Data Collection and Pre-Processing Systems: Landing Zone: hybrid internal and AWS-based Linux servers. Secure, encrypted transfer to Hadoop cluster in cloud. Pre-processing in Hadoop: pre-aggregated values and statistics. Sqoop'ed to MySQL.
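The pre-aggregation step can be pictured as a group-and-reduce over raw sensor rows. The sketch below does this in plain Python on a few made-up readings; in the presentation the work ran in Hadoop, so this only illustrates the shape of the reduction. The row layout, the hourly grouping, and the `hourly_stats` name are assumptions.

```python
from collections import defaultdict
from statistics import mean

# (wtg_id, ISO timestamp, bearing temperature in deg C): made-up sample rows
readings = [
    ("00811", "2016-01-27T10:05:00", 61.0),
    ("00811", "2016-01-27T10:35:00", 63.0),
    ("00811", "2016-01-27T11:10:00", 70.0),
    ("89172", "2016-01-27T10:20:00", 55.0),
]


def hourly_stats(rows):
    """Group readings by (turbine, hour) and reduce each group to summary statistics."""
    groups = defaultdict(list)
    for wtg_id, timestamp, value in rows:
        hour = timestamp[:13]  # keep "YYYY-MM-DDTHH"
        groups[(wtg_id, hour)].append(value)
    return {
        key: {"n": len(vals), "mean": mean(vals), "min": min(vals), "max": max(vals)}
        for key, vals in groups.items()
    }


stats = hourly_stats(readings)
for key, summary in sorted(stats.items()):
    print(key, summary)
```

Only the small summary rows, not the raw readings, would then need to be exported to the relational database.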

30 Hybrid Cloud Architecture

31 Machine Learning Methodology: Identify patterns associated with gearbox failure modes using a training set (create model). Validate the model using N-fold cross-validation. Using the model based on the training set, predict TTF (time to failure) for: new wind turbines; existing turbines using updated data.
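A minimal sketch of N-fold cross-validation, assuming a single numeric feature and a straight-line fit as a stand-in for the real model (the slides do not name the model type). The data is synthetic, and the helper names `k_fold_indices`, `fit_line`, and `cross_validate` are illustrative.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs; every sample lands in exactly one test fold."""
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the real model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def cross_validate(xs, ys, n_folds=5):
    """Return the mean absolute error on the held-out samples of each fold."""
    errors = []
    for train, test in k_fold_indices(len(xs), n_folds):
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        fold_err = sum(abs(ys[i] - (a + b * xs[i])) for i in test) / len(test)
        errors.append(fold_err)
    return errors


# Synthetic data: a hypothetical wear indicator vs. months to failure
xs = [float(i) for i in range(20)]
ys = [60.0 - 2.5 * x for x in xs]
errors = cross_validate(xs, ys, n_folds=5)
print(errors)
```

Because the synthetic data is exactly linear, the per-fold errors are close to zero; on real data the spread across folds indicates how stable the model is.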

32 Anomaly Detection Sensor Data
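The slides do not say which anomaly-detection method was applied to the sensor data. As one simple possibility, the sketch below flags a reading when it deviates from the mean of a trailing window by more than a chosen number of standard deviations. The window size, threshold, and temperature values are assumptions.

```python
from statistics import mean, stdev


def zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices of values far from the mean of the preceding `window` values."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Made-up bearing temperatures with one spike at index 6
temps = [60.0, 61.0, 60.5, 61.5, 60.0, 61.0, 75.0, 61.0, 60.5]
print(zscore_anomalies(temps))  # [6]
```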

33 Oil Anomalies

34 Oil Contamination

35 Side Note... Big Data tools are not just for big data. Powerful, open-source software. Data Mining. Exploring data in new ways. Create KPIs for automated monitoring.

36 Predictive Model Selection: Ran millions of scenarios. Used automated model generator. 128 scenarios run in parallel on multiple cloud servers. Cloud servers shut down when not running scenarios = $0 cost.
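A rough local sketch of running a batch of 128 scenarios in parallel and picking the best one. A thread pool stands in for the multiple cloud servers mentioned on the slide. The parameter grid, the `run_scenario` placeholder, and its toy score are all assumptions; a real version would train and evaluate a model per scenario.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical scenario grid: 4 * 4 * 8 = 128 combinations
WINDOWS = [1, 7, 30, 90]
FEATURE_SETS = ["oil", "sensor", "ambient", "all"]
MODEL_IDS = range(8)


def run_scenario(scenario):
    """Placeholder for 'train and score one model'; returns (scenario, score)."""
    window, features, model_id = scenario
    # Toy deterministic score in [0, 1]; not a real model metric.
    score = (window * 31 + len(features) * 17 + model_id * 7) % 101 / 100.0
    return scenario, score


scenarios = list(product(WINDOWS, FEATURE_SETS, MODEL_IDS))
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(run_scenario, scenarios))  # results keep input order

best_scenario, best_score = max(results, key=lambda r: r[1])
print(len(results), best_scenario, best_score)
```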

37 Challenges and Lessons Learned: Data Quality. Semantics. Data Volume. Identifying Distinct Modes of Failure. Identifying Failures. Validate Peer Groups. Should be clear that model works or doesn't.

38 Key Features Repository for Data Mining Predictive Model Data Catalog Model Logs Automated Model Generation and Assessment No Lost Information

39 Led to New Lines of Business! Condition-based oil changes: with monitoring and Fluitec expert analysis, only change oil when needed. Breeze Oil: never change the oil, but need monitoring to establish credibility. Additional additives sales: top up additives instead of complete oil change.
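The idea of condition-based oil maintenance can be sketched as a small decision rule over oil-sample measurements. Everything specific here is hypothetical: the two measurements, the threshold values, and the rule itself are illustrative only and are not taken from Fluitec's analysis.

```python
def oil_action(additive_ppm, particle_count,
               min_additive_ppm=500.0, max_particle_count=2000):
    """Return 'change oil', 'top up additives', or 'no action' for one oil sample.

    All thresholds are hypothetical placeholders.
    """
    if particle_count > max_particle_count:
        return "change oil"        # hypothetical rule: contaminated oil is replaced
    if additive_ppm < min_additive_ppm:
        return "top up additives"  # hypothetical rule: clean oil, depleted additives
    return "no action"


print(oil_action(800.0, 100))   # no action
print(oil_action(300.0, 100))   # top up additives
print(oil_action(800.0, 5000))  # change oil
```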

40 Use Case - Wafer Metrology. IoT Value: Reduce cost of machine failures and maintenance. Reduce cost of out-of-spec product and re-work. Challenges: Volume: millions of measurements per hour. Variety: many different machines, standards & metrics. Complex M2M workflows: combines, splits, etc.

41 Use Case - Sports Performance. IoT Value: What else? Improve chance of winning! Challenges: Time accuracy, to thousandths of a second. Synchronizing events with video. Disparate data streams: athlete location and body position, timing system, heart rate & other vitals, weather, etc. Real-time feedback to support decision-making.

42 Evermore Technology Options. Previously: MySQL; Hadoop MR; Java, Python; R and Shiny; Cloud (AWS); Custom ETL. Today we also use: Multitenant Secure DB; Spark and Spark Streaming; Scala; H2O Deep Learning, Sparkling Water; Cloud (AWS, Oracle, Microsoft, ...); ELT: Talend, Oracle Data Integrator. Custom Events; MQTT Publish/Subscribe; Custom IoT platform; Packaged IoT Platform.
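To illustrate the MQTT publish/subscribe pattern listed above, here is a small in-memory sketch of topic-filter matching with the two standard MQTT wildcards: `+` for exactly one topic level and `#` for all remaining levels. It is not a real broker or client library; the `InMemoryBroker` class and the `wtg/...` topic names are assumptions, and the special handling of topics that start with `$` is left out.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT-style topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard (last level of a valid filter)
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    return len(filter_levels) == len(topic_levels)


class InMemoryBroker:
    """Toy stand-in for a broker: delivers each message to matching subscribers."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, topic_filter, callback):
        self._subscriptions.append((topic_filter, callback))

    def publish(self, topic, payload):
        for topic_filter, callback in self._subscriptions:
            if topic_matches(topic_filter, topic):
                callback(topic, payload)


received = []
broker = InMemoryBroker()
# Subscribe to every gearbox measurement from every turbine (hypothetical topics).
broker.subscribe("wtg/+/gearbox/#", lambda topic, payload: received.append((topic, payload)))
broker.publish("wtg/00811/gearbox/bearing_temp", 71.5)
broker.publish("wtg/00811/yaw/motor_current", 12.0)  # no matching subscriber
print(received)
```

The appeal of this pattern for sensor feeds is that publishers and consumers only need to agree on topic names, not on each other's addresses.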

43 IoT Architecture Framework Source: Internet of Things: Role of Oracle Fusion Middleware

44 Complete Data Flow (generic)

45 M&S Consulting Process and Technology Consulting National clientele

46 Contact Us Robert Liekar Jay Mason