IoT for Lunch (and other critical workplace activities)

Size: px
Start display at page:

Download "IoT for Lunch (and other critical workplace activities)"

Transcription

1 IoT for Lunch (and other critical workplace activities) Kirk Principal Data Scientist Booz Allen Hamilton

2 OUTLINE IoT (and IoAT): The Coming Hyper-Data World Steps to Discovery: Data Science & Analytics Case Study: Mars Rovers (metaphor) Case Study: Environmental Monitoring Summary: A Buffet of Analytics for the IoT 2 2

3 Internet of Things

4 The Internet of Things (IoT) will be an interconnected universe of Dynamic Data-Driven Application Systems (DDDAS)

5 IoAT * = Internet of Anything, where Everything is Interconnected *

6 Ubiquitous Big Data Science in the IoT: Applications and Use Cases are everywhere 6

7 Ubiquitous Big Data Science in the IoT: Applications and Use Cases are everywhere Smart Apps Predictive Retail Precision Marketing Smart Highways Precision Traffic Smart Cities Predictive Law Enforcement Personalized Learning Smart Health Precision Medicine Precision Farming Personalized Financial Services Smart Organizations Predictive Maintenance Prescriptive Maintenance Smart Grid Smart, Predictive, Precision, Personalized XYZ 7

8 Edge Analytics in the IoT: Intelligence at the edge of the network (at the point of data collection upstream and downstream) Smart rigs (context-aware) Predictive maintenance Precision operations & controls Personalized workforce deployment Smart manufacturing & inventory Predictive supply chain & pricing Precision logistics (people, things) Personalized customer engagement 8

9 A Connected Data Platform for IoAT: the Connected World of Dynamic Data Modified Newton s 1 st Law of Motion ( Data Inertia ) = Data at Rest stay at rest and Data in Motion stay in motion unless acted upon by a force! => Perform analytics on dynamic IoT data while the data are moving <= 9

10 OUTLINE IoT (and IoAT): The Coming Hyper-Data World Steps to Discovery: Data Science & Analytics Case Study: Mars Rovers (metaphor) Case Study: Environmental Monitoring Summary: A Buffet of Analytics for the IoT 14 10

11 Machine Learning: 4 Types of Discovery (Graphic by S. G. Djorgovski, Caltech) 1) Class Discovery: Finding new classes of objects (population segments), events, and behaviors. This includes: learning the rules that constrain the class boundaries. 2) Correlation (Predictive and Prescriptive Power) Discovery: Finding patterns and dependencies, which reveal new governing principles or behavioral patterns (the customer DNA ). 3) Novelty (Surprise!) Discovery: Finding new, rare, one-in-a-[million / billion / trillion] objects and events. 4) Association (or Link) Discovery: Finding unusual (improbable) co-occurring associations. 11

12 5 Levels of Analytics Maturity 1) Descriptive Analytics Hindsight (What happened?) 2) Diagnostic Analytics Oversight (real-time / What is happening? Why did it happen?) 3) Predictive Analytics Foresight (What will happen?) 4) Prescriptive Analytics Insight (How can we optimize what happens?) 5) Cognitive Analytics Right Sight (the 360 view, what is the right question to ask for this set of data in this context = Game of Jeopardy) Finds the right insight, the right action, the right decision, right now! Moves beyond simply providing answers, to generating new questions and new hypotheses, for your next-best move 12

13 OUTLINE IoT (and IoAT): The Coming Hyper-Data World Steps to Discovery: Data Science & Analytics Case Study: Mars Rovers (metaphor) Case Study: Environmental Monitoring Summary: A Buffet of Analytics for the IoT 8 13

14 Case Study: Mars Rovers (metaphor) 14

15 Mars Rover: intelligent data-gatherer, mobile data mining agent, and autonomous decision-support system 1) Drive around surface of Mars take samples of items (Mars rocks): experimental data type = mass spectroscopy = data histogram = feature vector 2) Perform Intelligent Data Operations (Data Mining in Action): Supervised Learning Search for items with known characteristics, and assign items to known classes Unsupervised Learning Discover what types of items are present, without preconceived biases; find the set of unique classes of items; discover unusual associations; discover trends and new directions (new leads) to follow Semi-supervised Learning Find the rare, one-of-kind, most interesting items, behaviors, contexts, or events 3) Enact On-board Intelligent Data Understanding & Decision Support = Science Goal (or Business Goal) Monitoring: stay here and do more ; or else follow trend to a more interesting location send discoveries to human analysts immediately ; or send informational results later 15

16 Smart Sensors & Sentinels for Data-Driven Sense-Making and Decision Support From Sensors to Sentinels to Sense (for any application domain with streaming data from sensors) New knowledge and insights are acquired by monitoring and mining actionable data from all digital inputs (Sensors!) Alerts are triggered autonomously, without intervention (if permitted), applying machine learning and actionable business decision rules for pattern detection and diagnosis. (Sentinels! = embedded machine learning / data science algorithms) Smart Sensors (powered by Machine Learning-enabled sentinels) deliver actionable intelligence (Sense!) 16

17 Dynamic Data-Driven Application Systems (DDDAS) 4 steps from data to action in IoT = MIPS: Measurement Inference Prediction Steering This applies to any Network of Sensors: Web user interactions & actions (web analytics data), Cyber network usage logs, Customer Service interactions, Social network sentiment, Machine logs (of any kind), Manufacturing sensors, Health & Epidemic monitoring systems, Financial transactions, National Security, Utilities and Energy, Remote Sensing, Tsunami warnings, Weather/Climate events, Astronomical sky events, Machine Learning enables the IP part of MIPS: Pattern (Segment) Discovery Correlation (Trend) Discovery Novelty (Anomaly) Discovery Association (Link) Discovery Alert & Response systems: Actionable insights from streaming sensor data Automation of any datadriven operational system 17

18 Dynamic Data-Driven Application Systems (DDDAS) 4 steps from data to action in IoT = MIPS: Measurement Inference Prediction Steering This applies to any Network of Sensors: Web user interactions & actions (web analytics data), Cyber network usage logs, Customer Service interactions, Social network sentiment, Machine logs (of any kind), Manufacturing sensors, Health & Epidemic monitoring systems, Financial transactions, National Security, Utilities and Energy, Remote Sensing, Tsunami warnings, Weather/Climate events, Astronomical sky events, Machine Learning enables the IP part of MIPS: Pattern (Segment) Discovery Correlation (Trend) Discovery Novelty (Anomaly) Discovery Association (Link) Discovery Alert & Response systems: Actionable insights from streaming sensor data Automation of any datadriven operational system 18

19 OUTLINE IoT (and IoAT): The Coming Hyper-Data World Steps to Discovery: Data Science & Analytics Case Study: Mars Rovers (metaphor) Case Study: Environmental Monitoring Summary: A Buffet of Analytics for the IoT 21 19

20 Early Warning and Monitoring Systems for Geospatial Event Discovery Data Source #1: Satellite (LANDSAT) Data Source #2: Aerial photos Data Source #3: in situ sensors Data Source #4: Models Information Extracted: Regional events (drought) Information Extracted: Local events (land use) Information Extracted: Situational data (development activities) Information Extracted: Predictions & Forecasts (e.g., changes in climate, forestation, agriculture, ) Association & Link Analysis Correlation Discovery Anomaly/Novelty Discovery Clustering Analysis Principal Components Unsupervised methods KDD tools Supervised methods Understanding: Develop new knowledge on causal connections and interdependencies between geospatial events at the Human-Earth system interface Neural Networks / Deep Learning Support Vector Machines Bayesian Networks Markov Models Decision Trees (Random Forests) 20

21 Early Warning and Monitoring Systems for Geospatial Event Discovery From Sensors to Sentinels to Sense Data Source #1: Satellite (LANDSAT) Data Source #2: Aerial photos Data Source #3: in situ sensors Data Source #4: Models Information Extracted: Regional events (drought) Information Extracted: Local events (land use) Information Extracted: Situational data (development activities) Information Extracted: Predictions & Forecasts (e.g., changes in climate, forestation, agriculture, ) Association & Link Analysis Correlation Discovery Anomaly/Novelty Discovery Clustering Analysis Principal Components Unsupervised methods KDD tools Supervised methods Understanding: Develop new knowledge on causal connections and interdependencies between geospatial events at the Human-Earth system interface Neural Networks / Deep Learning Support Vector Machines Bayesian Networks Markov Models Decision Trees (Random Forests) From Data to Information to Knowledge to Understanding 21

22 The ASK pipeline = Applications, Services, Knowledge delivery from IoT data! Algorithms: Machine Learning Data Mining Visual Analytics Machine Vision Data Characterization Standards & Frameworks: BPEL, PMML, SIEM, VOevent, DDDAS Applications: Change-monitoring (Land Use, Drought, Environmental), Carbon Tracking, Vegetation Stress, Emergency Response to Natural Hazard Events, Human Needs, Education, Planning & Development Data Collections: Landsat, EO-1, MODIS, AVHRR, ASTER, SRTM, NEON, NLIP, Aerial, Satellite, Maps, LIDAR, Sensor Networks, Multi-mission, Elevation Datasets & Models, INFORMATION PRODUCTS Data Services: Visual Analytics Predictive & Prescriptive Analytics Ad hoc data mining & analysis Knowledge Discovery from Data Queryable information products Enhanced imagery products Data product recommendations Scientific Expertise: User-generated content User-annotations data User groups Academic researchers Professional societies New Technologies: Data Lakes Cloud / PaaS Linked Data (RDF) Data-as-a-Service SQL-on-Hadoop Knowledge: Essential Climate Variables Land Science Agriculture Forecasting Climate Change & Variability Environmental Science Famine, Forestation, Energy 22

23 Hortonworks DataFlow on HDP HDF HDP HDF is powered by Apache NiFi* for Data Flow Orchestration = to process Data in Motion! e.g., Cyber, IoT, IIoT, IoAT data. Stream processing tools include Apache Storm and Spark Streaming. (*NiFi = NiagraFiles = part of the Booz Allen Hortonworks strategic partnership) Hortonworks DataFlow (HDF ) collects, curates, analyzes, and delivers real-time data from the Internet of Anything (IoAT): devices, sensors, clickstreams, log files,... Hortonworks Data Platform (HDP ) enables the creation of a secure enterprise data lake and delivers the analytics you need to innovate fast and power real-time business insights. 23

24 Environmental Monitoring with IoT data Check out and participate in the EPA Smart City Air Quality Challenge: EPA is challenging communities to deploy hundreds of air quality sensors and to make the data public! Submissions due October 28, 2016 $100,000 in prizes 24

25 OUTLINE IoT (and IoAT): The Coming Hyper-Data World Steps to Discovery: Data Science & Analytics Case Study: Mars Rovers (metaphor) Case Study: Environmental Monitoring Summary: A Buffet of Analytics for the IoT 25 25

26 Summary: A Buffet of Analytics for the IoT Triage General example of IoT Analytics Triage: IoT Event Mining in Big Data Collections for Actionable Intelligence: Behavior modeling (anomaly & trend detection) and ad hoc inquiry for Discovery Identifying, characterizing, & responding to events for data-driven Decisions Deciding which events need immediate investigation and/or intervention = Action! for people, products, and processes! Many other examples: Predictive Maintenance alerts (from ubiquitous machine / engine sensors) Prescriptive efficiency modeling and implementation (from process mining) Supply chain monitoring (from manufacturing & shipping sensors) Predictive personnel & asset placement (from RFID tracking data & trajectory modeling) Cybersecurity alerts (from network logs) Preventive Fraud alerts (from financial applications) Health alerts (from EHRs and national health systems) Tsunami alerts (from geo sensors everywhere) Social event alerts or early warnings (from social media) Precision Logistics (Who is where? What are the doing? How much lunch to prepare?) 26