Getting Started with Predictive Analytics


1 Getting Started with Predictive Analytics

2 What is Predictive Analytics?
Predictive analytics uses techniques from data mining, statistics, modeling, machine learning, and artificial intelligence to analyze current data and make predictions about the future. In short: using historical information to make predictions about the future.

3 How can it help my organization?
- Risk Management
- Quality Management
- Outcomes Management
- Case Management
- Population Management
- Informed Decision Making
- Forecasting
- Proper Resource Allocation
- Utilization Patterns
- Cost Reduction
- Intervention Opportunities
- Identifying Inefficiencies

4 Informed Decision Making
Decision theory is concerned with identifying the best decision to make given a set of assumptions. Our assumptions are that the model's predictions are correct and that we will act on them. Many strategies exist in decision theory; ours is to perform some sort of intervention for each person the model predicts will be a high utilizer in the coming year. A combination of predictive modeling, decision theory, and some math can even tell us how much money we can afford to spend on that intervention.
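The "how much money" question above can be made concrete with a small expected-value calculation. This is a minimal sketch with made-up numbers (probability, cost, and effectiveness are all hypothetical), not figures from the presentation:

```python
# Hypothetical numbers: an expected-value sketch of "how much can we
# afford to spend on the intervention for this person?"
p_high = 0.30          # model's predicted probability of high utilization
cost_if_high = 50_000  # assumed average cost of a high utilizer
effectiveness = 0.40   # assumed fraction of that cost the intervention avoids

# Expected savings from intervening on this one person; intervening is
# worthwhile (in expectation) if it costs less than this amount.
expected_savings = p_high * cost_if_high * effectiveness
print(expected_savings)  # 6000.0
```

Summing this quantity across everyone flagged by the model gives a rough ceiling on the total intervention budget.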

5 What do I need to get started?
- Access to data
- Partner with (or hire/develop) someone who has predictive analytics experience
- Software: R (free), Python (free), SPSS
- Clinical expertise
- Comfort with uncertainty

6 What does the process look like?
- Come up with a question
- Generate a dataset built around that question
- Develop a model using the dataset
- Analyze the results of the model
- Make changes to the model based on the analysis, until you are satisfied with its accuracy
- Generate predictions from the model
- Operationalize the predictions
This process is incredibly iterative.

7 How do I implement this?
Implementation is very flexible.
- Find a way to get the predictions into the hands of clinicians: adding predictions to EHRs, identifying risky patients in morning meetings
- Develop action plans/interventions based on the predictions
- Operationalize the decision-making process: "We intervene when the prediction goes above X"; "These scores are elevated, but not enough to intervene, so let's keep an eye on these people"
- Measure results
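The "intervene above X, watch below" rule above can be sketched as a simple triage function. The cutoff values here are hypothetical, not thresholds from the presentation:

```python
# A minimal sketch of the operational decision rule: intervene above one
# score cutoff, watch-list above a lower one. Both cutoffs are assumed.
INTERVENE_AT = 0.7   # hypothetical cutoff for triggering an intervention
WATCH_AT = 0.5       # hypothetical cutoff for the "keep an eye on" list

def triage(score):
    """Map a model score in [0, 1] to an operational action."""
    if score >= INTERVENE_AT:
        return "intervene"
    if score >= WATCH_AT:
        return "watch"
    return "no action"

print(triage(0.82))  # intervene
print(triage(0.55))  # watch
```

Writing the rule down explicitly like this is what "operationalize the decision-making process" amounts to: the cutoffs become something the team can discuss, document, and revise.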

8 Measuring results
- Create criteria by which to determine success/failure
  - IPMH: cost, readmissions, total cost, critical incidents
  - OPMH: engagement
- Keep track of whether or not an intervention occurred
- Collect the RIGHT information, as frequently as possible
- Perform analysis on those variables based on predictions and interventions
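One simple analysis of the kind described above is to compare an outcome measure between people who received an intervention and people who did not. This is a sketch with fabricated records; the field names and dollar amounts are hypothetical:

```python
# Sketch: group a measured outcome (cost) by whether an intervention
# occurred, then compare group averages. All data here is made up.
from statistics import mean

records = [
    {"predicted_high": True,  "intervened": True,  "cost": 18_000},
    {"predicted_high": True,  "intervened": False, "cost": 42_000},
    {"predicted_high": True,  "intervened": True,  "cost": 22_000},
    {"predicted_high": True,  "intervened": False, "cost": 51_000},
]

by_group = {}
for r in records:
    by_group.setdefault(r["intervened"], []).append(r["cost"])

for intervened, costs in sorted(by_group.items()):
    print("intervened:", intervened, "mean cost:", mean(costs))
```

This is why tracking whether an intervention occurred matters: without that flag, there is no comparison group and no way to attribute a change in outcomes to the intervention.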

9 Through some analysis, we found that a small group of consumers was using a vastly disproportionate share of funds. We decided to try to predict when someone would become one of those high utilizers. Here is what that process looked like.

10 Step 1: The question
Is this person going to be a high utilizer in the next X months?

11 Step 2: Build the dataset
We built a dataset with a record for each person in our system, based on that question. Each record chose a starting point, then looked back in time X months and forward in time X months.

12 For the historical window, we looked at claim records to determine:
- Demographic information
- Diagnostic information
- Services utilized
- Prescription drug fills
- Critical incidents
- A variety of surveys (PHQ9, CANS/ANSA, etc.)

13 For the forward-looking window, we looked at claim records to determine:
- Cost
- Whether or not that person was a high utilizer in that window (a simple Y/N classification)
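The windowing described in Steps 11-13 can be sketched as follows: for each person and starting point, sum claims in a lookback window (a feature) and in a forward window (the outcome), then apply the Y/N high-utilizer label. The claims data, the dollar cutoff, and the 8-month window length are all hypothetical:

```python
# Sketch of the lookback/forward windowing. Toy claims, an assumed
# dollar cutoff, and assumed ~8-month windows.
from datetime import date, timedelta

HIGH_UTILIZER_CUTOFF = 30_000    # hypothetical dollar threshold
WINDOW = timedelta(days=8 * 30)  # roughly 8 months

claims = [  # (person_id, service_date, amount) -- fabricated data
    ("p1", date(2017, 3, 10), 12_000),
    ("p1", date(2017, 11, 5), 25_000),
    ("p2", date(2017, 4, 2), 1_500),
]

def build_record(person_id, start):
    """One dataset row: lookback spend as a feature, forward spend as the label."""
    back = sum(amt for pid, d, amt in claims
               if pid == person_id and start - WINDOW <= d < start)
    fwd = sum(amt for pid, d, amt in claims
              if pid == person_id and start <= d < start + WINDOW)
    return {"person": person_id,
            "lookback_cost": back,
            "high_utilizer": "Y" if fwd >= HIGH_UTILIZER_CUTOFF else "N"}

print(build_record("p1", date(2017, 9, 1)))
```

A real record would carry many more lookback features (demographics, diagnoses, services, prescriptions, incidents, surveys), but the window logic is the same for each.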

14 Step 3: Develop a model
There is a wide variety of model types, and some are better at specific goals than others. Based on your data types and your goals, you can narrow the model types down to a select few. We ran our dataset through the model type we selected (Random Forest) using software called R, then analyzed the results. They were OK, but not great, so we tweaked some existing variables, created some new ones, and ran the model again.
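The presentation ran its Random Forest in R; the equivalent step in Python would use a library such as scikit-learn (an assumption on my part, and the toy features and labels below are fabricated):

```python
# A hedged sketch of the model-fitting step using scikit-learn's
# RandomForestClassifier. Features and labels are made up.
from sklearn.ensemble import RandomForestClassifier

# Toy training rows: [lookback_cost, num_diagnoses, critical_incidents]
X = [[5_000, 2, 0], [45_000, 7, 3], [2_000, 1, 0],
     [60_000, 9, 2], [8_000, 3, 1], [52_000, 6, 4]]
y = [0, 1, 0, 1, 0, 1]  # 1 = high utilizer in the forward window

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Predicted probability that a new person becomes a high utilizer:
prob = model.predict_proba([[40_000, 5, 2]])[0][1]
print(prob)
```

Random Forests suit this problem because they handle mixed numeric and categorical features well and expose variable-importance scores, which feed directly into the "which variables had the most predictive value" analysis in Step 5.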

15 Step 4: Analyze the results of the model
Not all projects have the same goals when it comes to analysis. Suicide prediction would err on the side of caution to catch EVERY instance of attempted suicide; high-utilization prediction may instead want to minimize false positives, for efficiency's sake. Our results were OK, but not great, so we decided to tweak some variables and run the model again.
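The two goals above correspond to the standard recall-versus-precision tradeoff: suicide prediction optimizes recall (miss no true case), while high-utilization prediction may optimize precision (few false alarms). A minimal sketch with made-up confusion counts:

```python
# Precision/recall from confusion counts. The counts are fabricated to
# illustrate the tradeoff, not results from the presentation.
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)  # of those flagged, how many were right
    recall = tp / (tp + fn)     # of the true cases, how many we caught
    return precision, recall

# A low threshold flags more people: better recall, worse precision.
print(precision_recall(tp=45, fp=40, fn=5))
# A high threshold flags fewer people: better precision, worse recall.
print(precision_recall(tp=30, fp=5, fn=20))
```

Deciding which side of this tradeoff to favor is a clinical and organizational decision, not a purely technical one, which is why the clinical team belongs in the analysis step.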

16 Step 5: Make changes to the model
During the analysis step, we noted which variables had the most predictive value, then tweaked some existing variables and created some new ones. Collaboration between the technical team and the clinical team is essential here. Steps 2 through 5 were repeated until we stopped seeing gains in predictive ability.

17 Step 6: Generate predictions
Once the model was as good as we could get it, we generated predictions for every person in the system over a NEW 8-month period for which we did not know the outcomes. New predictions for each person are generated every month, then passed along to a piloting partner who acts on them.


19 Step 7: Operationalize predictions
For the partner to act on those predictions, we perform varied analyses on them to help the partner maximize the number of interventions given certain criteria:
- Available human resources
- Available financial resources
- Goals of the project (e.g., suicide vs. high utilization)
- Organizational goals (e.g., cost reduction, productivity)
- Long-term tracking of predictions and their trends
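When human and financial resources are the binding constraint, one common operationalization is to rank everyone by predicted risk and intervene on the top N the team can actually handle. A minimal sketch with hypothetical names, scores, and capacity:

```python
# Sketch: pick the highest-risk people up to the team's capacity.
# Names, scores, and the capacity number are all made up.
predictions = {"ann": 0.91, "bob": 0.34, "cid": 0.77, "dee": 0.62}
CAPACITY = 2  # assumed number of interventions staff can run this month

ranked = sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)
to_intervene = [person for person, score in ranked[:CAPACITY]]
print(to_intervene)  # ['ann', 'cid']
```

The same ranking generalizes to financial constraints: instead of a headcount cap, keep taking people from the top of the list until the per-intervention cost exhausts the budget.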

20 Things to be aware of
- This is just a tool; it is not a replacement for clinical expertise
- Accuracy is related to the timeliness of the original data
- It's all about probability, not certainty