From Theory to Data Product
Applying Data Science Methods to Effect Business Change
KDD 2017 - August 13
Advanced Analytics Entry Points
Components of an Advanced Analytics Project: Strategy; Organization; Policies, Procedures & Standards; Corporate Drivers; Insight Integration; Data Strategy; Change Management; Data Driven Decision Making; Architecture; Business Value Planning; Exploratory Analysis
What is an entry point?
[Quadrant diagram: IT vs. Business, Bottom-Up vs. Top-Down]
Entry point: The Technology Directive (IT, Top-Down)
Results were questioned and not aligned with current processes
Our vendor is really confident with this technology!
We do need Data Governance and we'll do that once we've shown value
I'm not sure who will look at it but it will be interesting
It's not that easy
The data is pretty self-evident
We didn't schedule time with [the SMEs] for that
Entry point: The Field of Dreams (IT, Bottom-Up)
The right business people were not identified as project resources
No executive support to move forward
We have easy access to the data
If we build it, they will come
I can't tell you what that means, you need to talk to ...
I can't get time with [SME] to ...
What happens if this information gets out?
Why wasn't I involved?
Entry point: The Ambitious Executive (Business, Top-Down)
End users saw no value in, or use of, the results
We don't need to talk to the end users
It doesn't matter what our current metrics are
We only use standard industry terms
We are just implementing what the business tells us
Are we sure this is what the users want?
Entry point: The Smart Competitor (Business, Bottom-Up)
We don't know what we don't know
The data is self-evident
That's third party software. We don't have a data dictionary
Doesn't seem like they are very interested
The Technology Directive (IT, Top-Down)
  Project Start: Identify Business SMEs; Identify Business Drivers; Discuss business SME resource gaps with PM/Project Sponsor
  Midstream: Gather results to date; Engage business SMEs in workshop to focus on actionable insights
The Field of Dreams (IT, Bottom-Up)
  Project Start: Identify Business SMEs; Identify Business Drivers; Talk to business counterpart to garner support and participation
  Midstream: Identify proper data stewards; Engage business SMEs in workshop to focus on actionable insights
The Ambitious Executive (Business, Top-Down)
  Project Start: Identify Actionable Business Question; Identify Business Drivers
  Midstream: Talk to end business users to best understand process and fit
The Smart Competitor (Business, Bottom-Up)
  Project Start: Engage C-Suite early
  Midstream: Check in with business drivers and strategy; Realign if necessary
Advanced Analytics Entry Points
[Quadrant diagram: IT vs. Business, Bottom-Up vs. Top-Down]
Group Exercise
Are you asking the right questions?
Here's what we're trying to avoid
Here's what we'll talk about
  What is a valuable business question?
  How do you define criteria specific to your business to identify valuable business questions?
  How do you frame those questions and support prioritization?
  Before and After examples of questions that have gone through this process
What is a Valuable Business Question?
When answered, a specific action can be taken which provides a measurable result in line with strategic business objectives.
Our Method - Activities
  Discovery: Project Kick-Off; Driver Definition Workshop; Business Value Workshop; Data Asset Inventory; Prioritization Map
  Release Planning & Foundation: Create Project Backlog; Define Project Schedule; Define Architecture; Install & Config Platform; Initial Data Ingest
  Sprint 1 to Sprint N-1: Experiment Design; Experiment Implementation; Experiment Testing; Analysis
  Sprint N: Present Final Results; Next Steps Coaching
Driver Definition Workshop
Business Value Workshop
  Who to invite?
  What do they need to know before they arrive (prepping them in advance)?
  What do they bring?
The First Hour
Non-Starters
  Can we show value with tool X?
  Data science helps businesses; how can it help ours?
  How can we establish KPIs to more clearly indicate when high-level actions should be taken?
Preconceived
  We need to use deep learning to do some stuff
  Produce an algorithm that can be used, season to season, to set prices when new products become available in the market from us or competitors
  Create a full list of subscribers and their respective churn probabilities, indicating probability to churn within 4 weeks
Requires Refinement
  Who are the biggest users of healthcare in ...?
  How can we save $ on our preventive maintenance?
  When will customers leave?
  What products / features should we build?
Hour 2 and Beyond
Refining: Brainstorm (Workshop!), Evaluate (Map!), Refine (Key Questions!)
Scenario 1: Telecommunications
In the business value workshop
STARTED HERE: We want a machine learning model that predicts customer churn four weeks in advance of the event, with an accuracy of at least 80% and lift curves graphed to demonstrate why you picked this model.
We want to know when our customers will leave.
Well, it doesn't always. Only some of the customers leaving actually matter.
High-value customers matter: when they leave, they take a lot of business with them.
Um... Predicted lifetime value = $x/time unit/y years
Well, we'll make sure customer service knows so they're careful with people who call in with a complaint. And maybe we'll offer them something to make staying more appealing.
A special offer: we'd need at least two months to put something like that out.
And how long do we need to change customer service approaches before someone leaves?
Can we predict which of our high-value customers will leave in two months?
Candidate Business Question: Can we predict which of our high-value customers will leave in two months?
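The refined question implies two concrete modeling choices: restrict the population to high-value customers, and label churn over a two-month horizon. A minimal sketch of that target definition follows. The `Customer` fields, the reading of "$x/time unit/y years" as revenue per month over an expected number of years, the `value_threshold` cutoff, and the 60-day horizon are all illustrative assumptions, not part of the original slides.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional


@dataclass
class Customer:
    customer_id: str
    revenue_per_month: float           # assumed stand-in for "$x per time unit"
    expected_years: float              # assumed stand-in for "y years"
    churn_date: Optional[date] = None  # None while the customer is still active


def predicted_lifetime_value(c: Customer) -> float:
    """Assumed reading of the slide's formula: monthly revenue over the expected tenure."""
    return c.revenue_per_month * 12 * c.expected_years


def build_targets(customers: List[Customer], as_of: date,
                  value_threshold: float, horizon_days: int = 60) -> Dict[str, bool]:
    """Map each active high-value customer to True if they churn within the horizon."""
    horizon_end = as_of + timedelta(days=horizon_days)
    targets: Dict[str, bool] = {}
    for c in customers:
        if predicted_lifetime_value(c) < value_threshold:
            continue  # not high-value: out of scope for this question
        if c.churn_date is not None and c.churn_date <= as_of:
            continue  # already gone at the reference date
        targets[c.customer_id] = (
            c.churn_date is not None and c.churn_date <= horizon_end
        )
    return targets
```

Calling `build_targets(customers, as_of=date(2017, 1, 15), value_threshold=1000.0)` on historical data would yield the labels a churn model could be trained against; the threshold itself is a business decision that the workshop would need to settle.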
Scenario 2: Healthcare
In the business value workshop
STARTED WITH: Which of our company's three options is the most effective treatment approach for a given patient?
Why? Who? What? Where? Do we have it?
Candidate Business Question: How do you predict the optimal treatment for any given patient before treatment is prescribed?
Evaluating and Prioritizing
Let's dive into one example
  1. How do you predict the optimal treatment for any given patient before treatment is prescribed?
  2. What other products should we develop as part of our existing product line? (predict market needs)
  3. Can we compile data from our healthcare providers to understand optimal practices in installing our products based on outcomes (recovery time)?
  4. Can we determine poor performing products before they leave the factory?
Prioritization Criteria: Value, Feasibility, Alignment
Value, Feasibility, Alignment
[Radar chart, shown once per criterion with that criterion's axes highlighted. Axes: Value Score, Value Timeframe, Feasibility Score, Feasibility Timeframe, Alignment Score, Alignment Timeframe. Scale: 0 to 10.]
Prioritization for our scenario
[Radar chart: Question 1 through Question 4 plotted on the same six axes, scale 0 to 10.]
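The prioritization map scores each candidate question on six axes from 0 to 10. A minimal sketch of how such scores could be recorded and ranked follows. The slides do not say how the axes are combined or how timeframe is scored, so the unweighted mean, the convention that higher is better on every axis, and the names `Scores` and `prioritize` are assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Scores:
    """One question's position on the six axes of the prioritization map (0 to 10)."""
    value_score: float
    value_timeframe: float
    feasibility_score: float
    feasibility_timeframe: float
    alignment_score: float
    alignment_timeframe: float

    def __post_init__(self) -> None:
        # Reject anything outside the 0-10 scale used on the chart.
        for f in fields(self):
            v = getattr(self, f.name)
            if not 0 <= v <= 10:
                raise ValueError(f"{f.name} must be between 0 and 10, got {v}")

    def overall(self) -> float:
        # Assumed combination rule: unweighted mean of all six axes.
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(values) / len(values)


def prioritize(questions: Dict[str, Scores]) -> List[Tuple[str, float]]:
    """Return (question, overall score) pairs, highest overall score first."""
    ranked = sorted(questions.items(), key=lambda kv: kv[1].overall(), reverse=True)
    return [(name, round(s.overall(), 2)) for name, s in ranked]


# Placeholder scores for illustration only; they are NOT the values on the slide.
example = {
    "Question 1": Scores(8, 4, 5, 3, 9, 6),
    "Question 2": Scores(6, 6, 7, 7, 5, 5),
}
print(prioritize(example))
```

A single number hides the trade-offs the radar chart makes visible, so a ranking like this is probably best treated as a conversation starter in the workshop rather than a decision rule.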
Valuable Business Question: When answered, a specific action can be taken which provides a measurable result in line with strategic business objectives.
Brainstorm (Workshop!), Evaluate (Map!), Refine (Key Questions!)
Group Exercise
Agile Approach to Data Driven Decision Making
What is Agile?
  An iterative and incremental approach to project management and delivery
  Originated for software development
  Adapted for data driven decision making projects
  Allows project teams to adapt quickly to changing requirements and/or new insights
Agile Principles
Source: http://agilemanifesto.org/principles.html
  Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
  Welcome changing requirements, even late in development. Agile processes harness change for the customer's competitive advantage.
  Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
  Business people and implementation team must work together daily throughout the project.
  Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
  The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
  Working software is the primary measure of progress.
  Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
  Continuous attention to technical excellence and good design enhances agility.
  Simplicity--the art of maximizing the amount of work not done--is essential.
  The best architectures, requirements, and designs emerge from self-organizing teams.
  At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.
Managing Uncertainty
  Experiment design is focused on constant progress toward the stated goal
  By their nature, experiments may not yield expected results
  Project strategy should be to adapt subsequent experiments, based on findings, to continue progressing toward the goal
  Using this approach mitigates risk
  Key activities early in the project validate feasibility of moving forward
  Goal is to fail fast: reassess throughout to avoid unproductive activity
Agile Approach
  Charter: Captures business drivers, guiding principles, scope, approach, who is involved
  Themes: Prioritized Business Questions for investigation
  Epics: Collections of Stories that comprise an investigation Theme
  Stories: Units of work that can reasonably be completed in a Sprint
Other Agile / Scrum Terms
  Product Backlog: An ordered list of everything that might be needed in the project. Single source of requirements for any experiments to be conducted.
  Sprint Backlog: The set of Product Backlog items selected for the Sprint, plus a plan for delivery. A forecast about what experiments will be addressed in this Sprint and the work needed to deliver these experiments.
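The relationship between the two backlogs can be sketched as a small data structure: the Product Backlog is an ordered list of stories, and the Sprint Backlog is the subset selected for one sprint. The `points` estimate, the `capacity` limit, and the greedy in-order selection below are illustrative assumptions; the slides do not prescribe how items are chosen.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Story:
    title: str
    theme: str   # the prioritized business question this story serves
    points: int  # hypothetical effort estimate


def plan_sprint(product_backlog: List[Story],
                capacity: int) -> Tuple[List[Story], List[Story]]:
    """Walk the ordered backlog and pull each story that still fits the capacity.

    Returns (sprint_backlog, remaining_product_backlog), both in backlog order.
    """
    sprint: List[Story] = []
    remaining: List[Story] = []
    used = 0
    for story in product_backlog:
        if used + story.points <= capacity:
            sprint.append(story)
            used += story.points
        else:
            remaining.append(story)  # stays on the product backlog for a later sprint
    return sprint, remaining


backlog = [
    Story("Profile churn-related features", "High-value churn", 3),
    Story("Build baseline churn model", "High-value churn", 5),
    Story("Document data dictionary gaps", "High-value churn", 2),
]
sprint_backlog, product_backlog = plan_sprint(backlog, capacity=6)
```

Because experiments may change priorities, the product backlog would be reordered before each planning session, and the selection rerun.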
Types of Experiments
  Focused on data within each sprint (e.g., feature analysis)
  Focused on testing results of a model (e.g., cross validation, lift, gain)
  Focused on testing the insight (e.g., A/B testing, focus groups)
  Focused on data product: checking for model drift, model maintenance (e.g., recommendations engine)
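As one example of an experiment that tests the results of a model, lift compares the positive rate among the highest-scored cases with the overall positive rate. The sketch below is a plain-Python illustration under stated assumptions: the function name `lift_at_fraction`, the default 10% cutoff, and the toy data are hypothetical and not taken from the slides.

```python
from typing import Sequence


def lift_at_fraction(scores: Sequence[float], labels: Sequence[int],
                     fraction: float = 0.1) -> float:
    """Lift in the top `fraction` of cases ranked by model score.

    labels are 1 for a positive outcome (e.g. churned) and 0 otherwise.
    """
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("scores and labels must be non-empty and the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    overall_rate = sum(labels) / len(labels)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no positive labels")

    # Rank cases from highest to lowest score and keep the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:k]) / k
    return top_rate / overall_rate


# Toy data for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(lift_at_fraction(toy_scores, toy_labels, fraction=0.2))
```

A lift above 1 means the model concentrates positives in its top-ranked cases better than random selection would; evaluating it on held-out folds ties it to the cross validation mentioned above.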
Each Sprint Includes:
  Sprint Planning Workshop
  Updates to Product and Sprint Backlogs
  Development Activities
  Additional data ingestion (optional)
  Analysis & Review of Progress
  Check-in with Strategy & Drivers
Our Method - Deliverables
  Discovery: Business Questions Document; Prioritization Map; Data Dictionary; Data Quality Assessment; Project Charter
  Release Planning & Foundation: Architecture Document; Development Environment Setup; Project Schedule; Backlog
  Sprint 1 to Sprint N-1: Test Dataset; Analysis Findings
  Sprint N: Final Results; Recommended Next Steps
Our Method - Team Members
Please note: an individual may play more than one role.
Phases: Discovery; Release Planning & Foundation; Sprint 1 to Sprint N-1; Sprint N
  Project Manager (all four phases)
  Data Science Strategist (three phases)
  Product Owner (three phases)
  Data Quality Analyst (two phases)
  Data Architect / Developer (two phases)
  Workshop Facilitator (one phase)
  Data Science Team (one phase)
Responsible Accountable Consulted Informed (RACI)
Roles: Architect, Data Science Strategist, Data Quality Analyst, Data Science Team, Designer / Developer, Product Owner, Project Manager
Deliverable: assignments
  Business Questions Doc: I, RA, C, I, I, I
  Prioritization Map: RA, I, I, I
  Data Dictionary: A, R, I, I, I
  Data Quality Assessment: A, R
  Project Charter: C, C, C, C, C, C, RA
  Architecture Document: RA, C, C
  Development Environment Setup: A, I, R, I, I
  Project Schedule: C, C, C, C, C, C, RA
  Backlog: C, C, C, C, RA, I
  Test Dataset: C, C, RA, I, I, I
  Analysis Findings: A, R, I, I
  Final Results: C, RA, C, I, I
  Recommended Next Steps: R, RA, C, C, I
Group Exercise
Thank You!
Janet Forbes, janet.forbes@t4g.com
Lindsay Brin, lindsay.brin@t4g.com
Danielle Leighton, danielle.leighton@t4g.com
www.t4g.com