Data Science Strategies for Real-Time Analytics


1 Data Science Strategies for Real-Time Analytics Kirk Borne, Principal Data Scientist, Booz Allen Hamilton


5 Can our electric grid be more resilient?

6 Real Time Data Analytics for the Resilient Electric Grid
So, what is Resilience?
1. The capacity to recover quickly (bounce back) from difficulties; toughness.
2. The ability of a substance or object to spring back into shape; elasticity.

7 Real Time Data Analytics for the Resilient Electric Grid
So, what is Resilience?
1. The capacity to recover quickly (bounce back) from difficulties; toughness.
2. The ability of a substance or object to spring back into shape; elasticity.
e.g., Resilient Communities have the sustained ability to utilize available resources to respond to, withstand, and recover from adverse situations = resources + data + analytics = insights + decisions!

8 Enhancing Resilience in Infrastructure

9 Smart Electric Grid Use Cases: Spatiotemporal insight; Situational / Context awareness; Fast diagnosis & response; Anomaly / Fraud / Loss detection; Predictive maintenance; Digital twins / Prescriptive action; System performance optimization; Resiliency; Load balancing; Predictive demand forecasting; Real-time pricing; New products; Customer nudge; Targeted offerings; Smart contracts (Blockchain); Regulatory compliance

10 Emerging and Disruptive Digital Technologies in the Energy Industry: This is what Digital Disruption looks like!


12 The Data Science Revolution = Moving from data to insight to action! Manage the Digital Disruption with a Data Science Strategy. Data Science enables the "art of the possible": the "easy button" for real-time analytics!

13 Data needs a Transformer (like Electricity) to make it accessible to all.

14 Massive data collections unlock deeper insights into hard problems and complex systems. Be careful what you wish for!!!!

15 Adding more data doesn't necessarily help. Unless we can combine and integrate the different signals into a single view of the thing, there will continue to be many possible interpretations of what the source is! Combining, connecting, and linking diverse data makes data smart! Think of data not as information, but as facts that encode knowledge.

16 Environmental Analytics example: Transforming Data to Information to Knowledge to Understanding

17 Environmental Analytics example

18 4 Types of Discovery from Data Science: What is your data analytics use case? (Graphic by S. G. Djorgovski, Caltech)
1) Class Discovery: Finding the categories of objects (population segments), events, and behaviors in your data, and learning the rules that constrain the class boundaries (that uniquely distinguish them).
2) Correlation (Predictive and Prescriptive Power) Discovery: Finding trends, patterns, and dependencies in data, which reveal the governing principles or behavioral patterns (the object's "DNA").
3) Novelty (Surprise!) Discovery: Finding new, rare, one-in-a-[million / billion / trillion] objects, events, or behaviors.
4) Association (or Link) Discovery (Graph and Network Analytics): Finding the unexpected (unusual) co-occurring associations / links / connections among the entities in your domain.
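As an illustration of Novelty Discovery, the following is a minimal sketch that flags rare values with a robust (median-based) score. The slides do not name a specific method; the modified z-score, the 3.5 threshold, the function names, and the sample readings are all assumptions chosen for illustration.

```python
from statistics import median

def novelty_scores(values):
    """Return a modified z-score for each value, based on the median and the
    median absolute deviation (MAD), which are less distorted by outliers
    than the mean and standard deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case: more than half the values are identical, so this
        # simple sketch cannot scale the deviations and reports no novelty.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so that scores are comparable to z-scores
    # when the bulk of the data is roughly normal.
    return [0.6745 * (v - med) / mad for v in values]

def find_novelties(values, threshold=3.5):
    """Return (index, value) pairs whose absolute score exceeds the threshold."""
    scores = novelty_scores(values)
    return [(i, v) for i, (v, s) in enumerate(zip(values, scores))
            if abs(s) > threshold]

# Hypothetical sensor readings with one surprising value.
readings = [50.1, 49.8, 50.3, 50.0, 49.9, 50.2, 61.7, 50.1, 49.7, 50.0]
print(find_novelties(readings))
```

The threshold controls how rare a value must be before it counts as a surprise; raising it moves the detector toward the "one-in-a-million" end of the scale.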

19 5 Levels of Analytics Maturity in Data-Driven Applications
1) Descriptive Analytics: Hindsight (What happened?)
2) Diagnostic Analytics: Oversight (real-time / What is happening? Why did it happen?)
3) Predictive Analytics: Foresight (What will happen?)

20 5 Levels of Analytics Maturity in Data-Driven Applications
1) Descriptive Analytics: Hindsight (What happened?)
2) Diagnostic Analytics: Oversight (real-time / What is happening? Why did it happen?)
3) Predictive Analytics: Foresight (What will happen?)
4) Prescriptive Analytics: Insight (How can we optimize what happens?) (Follow the dots / connections in the graph!)
5) Cognitive Analytics: Right Sight (the 360° view: what is the right question to ask for this set of data in this context = Game of Jeopardy). Finds the right insight, the right action, the right decision, right now! Moves beyond simply providing answers, to generating new questions and hypotheses.

21 3 Examples of Analytics 1) Descriptive 2) Predictive 3) Cognitive


23 All of the features in the data histogram convey valuable (actionable) information (the long tail, outliers, multi-modal peaks, ...).

24 Mixture Models = Statistical Clustering. Each of these data histograms can be represented by the mixture (i.e., weighted sum) of several Gaussian (normal) distributions, such as the 3 Gaussian distributions shown in the lower right. Each Gaussian statistically represents (characterizes) one cluster of data values within the full set of data values. Comprehensive web resource for Mixture Models for clustering and unsupervised learning in Data Mining:

25 Statistical Clustering tags (characterizes) the data, enabling discovery: making the data smart! Each Gaussian in the mixture can be characterized by various parameters, such as the mean, the variance (or its square root, the standard deviation), and the amplitude (i.e., the strength of that particular Gaussian component within the mixture). These parameters can be plotted as a function of some independent (treatment) variable, to discover trends and correlations in the effects across the different segments of the population.
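The mixture-model idea can be sketched in a few lines. The example below is an illustration only, not the method behind the slides' figures: it fits a one-dimensional Gaussian mixture with a plain expectation-maximization (EM) loop and reports the amplitude, mean, and standard deviation of each component. The function names, the quantile-based initialization, the iteration count, and the synthetic data are all assumptions.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, k, iterations=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM.
    Returns a list of (amplitude, mean, variance) tuples sorted by mean."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialization: means at evenly spaced quantiles, equal
    # amplitudes, and the overall variance shared by every component.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: probability that each point belongs to each component.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: re-estimate amplitude, mean, and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component has no support; leave it unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2
                        for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # keep the variance positive
    return sorted(zip(weights, means, variances), key=lambda c: c[1])

# Synthetic data: two hypothetical population segments.
random.seed(0)
data = ([random.gauss(10, 1) for _ in range(300)]
        + [random.gauss(20, 2) for _ in range(200)])
components = fit_gmm_1d(data, k=2)
for amplitude, mean, variance in components:
    print(f"amplitude={amplitude:.2f} mean={mean:.2f} "
          f"std={math.sqrt(variance):.2f}")
```

Repeating the fit for each value of an independent variable and plotting the resulting means, standard deviations, and amplitudes gives the trend view described above. In practice a library implementation would normally replace this hand-written loop.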

26 3 Examples of Analytics 1) Descriptive 2) Predictive 3) Cognitive

27 Association Discovery Example #1: Classic Textbook Example of Data Mining (Legend?): Data mining of grocery store logs indicated that men who buy diapers also tend to buy beer at the same time.
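A co-occurrence claim of this kind is usually quantified with three standard association-rule measures: support, confidence, and lift. The sketch below computes them for the rule {diapers} -> {beer}. The baskets are invented for illustration and are not data from the study the slide refers to; the function name is also hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    support    = fraction of transactions containing both item sets
    confidence = fraction of antecedent transactions that also contain
                 the consequent
    lift       = confidence divided by the consequent's overall frequency
                 (values above 1 mean the items co-occur more often than
                 independence would predict)
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n
    confidence = n_both / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

# Hypothetical market baskets (made-up data).
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
    {"bread", "chips"},
    {"milk", "bread", "chips"},
]
support, confidence, lift = rule_metrics(baskets, {"diapers"}, {"beer"})
print(f"support={support} confidence={confidence} lift={lift}")
```

In this toy data, beer appears in half of all baskets but in three quarters of the baskets that contain diapers, so the lift is above 1.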

28 Association Discovery Example #2: Wal-Mart studied product sales in their Florida stores in 2004 when several hurricanes passed through Florida. Wal-Mart found that, before the hurricanes arrived, people purchased 7 times as many of {one particular product} compared to everything else.

29 Association Discovery Example #2: Wal-Mart studied product sales in their Florida stores in 2004 when several hurricanes passed through Florida. Wal-Mart found that, before the hurricanes arrived, people purchased 7 times as many strawberry pop tarts compared to everything else.

30 Strawberry pop tarts???

31 Association Rule Discovery for Hurricane Intensification Forecasting: Research by GMU geoscientists. Predict the final strength of a hurricane at landfall. Find co-occurrences of final hurricane strength with specific values of measured physical properties of the hurricane while it is still over the ocean. Result: the association rule discovery prediction is better than the National Hurricane Center prediction! Research Paper by GMU scientists:

32 3 Examples of Analytics 1) Descriptive 2) Predictive 3) Cognitive

33 You can see a lot by just looking (and you can see around corners!). Cognitive, Contextual, Insightful, Forecastful.

34 Final Thoughts

35 Data Science Strategies for Real-time Analytics
1) Design Patterns for Streaming Data Analytics:
   - Detecting POI (Pattern, Product, Process, Person, or any Point Of Interest)
   - Detecting BOI (Behavior Of Interest from any dynamic "actor")
   - Precomputed scenarios and their responses (to speed up "best action")
   - "Design Thinking": UX, CX, EX (User / Customer / Employee experience)
2) Edge Analytics (move the algorithms to the sensor: intelligence at the point of data collection): Locality in Time
3) Near-field Analytics (what else is local to my asset?): Locality in Geospace
4) Related-entity Analytics (what else is similar to this event / entity?): Locality in Feature Space
5) Agile Analytics:
   - DataOps
   - Culture of Experimentation
   - Fail-fast / Learn-fast
   - Build and deploy Learning Systems / Resilient Systems
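To make the Edge Analytics and streaming-detection ideas concrete, here is a minimal sketch of a detector small enough to run next to a sensor. It keeps only a running mean and variance (Welford's online algorithm) and flags readings that deviate strongly from them. The class name, the threshold of 4 standard deviations, the warm-up length, and the voltage readings are all assumptions made for illustration.

```python
import math

class StreamingDetector:
    """Flags readings far from the running mean, using constant memory."""

    def __init__(self, threshold=4.0, warmup=10):
        self.n = 0          # number of readings absorbed into the baseline
        self.mean = 0.0     # running mean of the baseline
        self.m2 = 0.0       # running sum of squared deviations (Welford)
        self.threshold = threshold
        self.warmup = warmup

    def update(self, x):
        """Score x against the baseline seen so far, then absorb it if it
        looks normal. Returns True when x is flagged as anomalous."""
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True
        if not is_anomaly:
            # Welford's update; flagged readings are kept out of the baseline.
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)
        return is_anomaly

# Hypothetical line-voltage readings with one spike at index 12.
readings = [120.0, 120.4, 119.8, 120.1, 119.9, 120.2, 120.0, 119.7,
            120.3, 120.1, 119.9, 120.2, 135.0, 120.0, 119.8]
detector = StreamingDetector()
flagged = [i for i, r in enumerate(readings) if detector.update(r)]
print(flagged)
```

Excluding flagged readings from the baseline is a design choice: it stops a single spike from inflating the variance, but a genuine, lasting shift in level would keep being flagged until the detector is reset or given a decay mechanism.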

36 Big Data + the IoT + Citizen Data Scientists = Partners in Sustainability
The Internet of Things (IoT): Knowing the knowable via deep, wide, and fast data from ubiquitous sensors!
Sustainable Development Goals
Big Data: In the Big Data era, "Everything is Quantified and Monitored": Populations & Persons; Smart Cities, Energy, Grids, Farms, Highways; Environmental Sensors. IoE = Internet of Everything!
Discovery through Machine Learning and Data Science: Class Discovery, Correlation Discovery, Novelty Discovery, and Association Discovery: Find interesting cases where condition X is associated with event Y with time shift Z.
17 SDGs are KPIs for the World! (currently, the SDGs have 229 Key Performance Indicators) (SDG: Sustainable Development Goal)
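The pattern "condition X is associated with event Y with time shift Z" can be searched for directly. The sketch below is one simple way to do it, under the assumption that X and Y are recorded as aligned 0/1 series: for each candidate shift z it estimates how often Y occurs z steps after X and returns the shift with the highest rate. The function name and the two series are hypothetical.

```python
def best_time_shift(x_events, y_events, max_shift):
    """Return (z, confidence) for the shift z in 0..max_shift that maximizes
    the estimated probability of Y at time t + z given X at time t.
    Ties are resolved in favor of the smallest shift."""
    n = len(x_events)
    best = (0, 0.0)
    for z in range(max_shift + 1):
        x_count = 0
        hits = 0
        # Only times t for which t + z is still inside the series count.
        for t in range(n - z):
            if x_events[t]:
                x_count += 1
                if y_events[t + z]:
                    hits += 1
        confidence = hits / x_count if x_count else 0.0
        if confidence > best[1]:
            best = (z, confidence)
    return best

# Hypothetical aligned series: 1 marks an occurrence in that time step.
x = [0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
shift, confidence = best_time_shift(x, y, max_shift=5)
print(shift, confidence)
```

This sketch ranks shifts by confidence alone. With real sensor data the result should also be compared against the base rate of Y, for example with lift, because a frequent event Y will follow almost any X at almost any shift.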

37 Thank you! Contact information, for further questions or inquiries: Dr. Kirk Borne, Principal Data Scientist, Booz Allen Hamilton