Azure ML Studio. Overview for Data Engineers & Data Scientists
|
|
- Ilene Henry
- 5 years ago
- Views:
Transcription
1 Azure ML Studio Overview for Data Engineers & Data Scientists
2 Rakesh Soni, Big Data Practice Director Randi R. Ludwig, Ph.D., Data Scientist Daniel Lai, Data Scientist
3 Intersys Company Summary Overview Privately held IT services firm 150+ consultants spanning the full IT space Leverage a local, national and/or global model as appropriate for each customer and engagement Key Industries Retail and Hospitality Financial Services Healthcare High Tech Manufacturing Media and Advertising Core Values Be Accountable Bring Excellence Be Authentic Be in Service to Others Phoenix Austin NYC 3
4 Core Practice Capabilities Big Data & Analytics Big Data Analytics & Data Science Enterprise Search Business Intelligence Information Management Application Services Application Development Modern Web Mobility Cloud DevOps / Agile Technology Staffing Infrastructure Project Management Packaged Solutions Quality Assurance Cloud Project & Program Management Quality Assurance Assessment Strategy & Roadmap 4
5 Agenda Intersys Overview Machine Learning Azure Machine Learning Studio Main Features of Azure Machine Learning Studio Demo 1 Predicting Income Category Demo 2 Predicting Patient Readmission Q&A 5
6 Machine Learning What Is It?
7 Machine Learning All Around Us Credit: Google Deep Mind, Strategic Game Play Reinforced learning to take actions with the highest reward Credit: BBC News, 7 Autonomous Technology Training computers to be make intelligent and intuitive decisions
8 Machine Learning in Real Life Optimize Business Decisions Real time insights into customer behavior Credit: UiBS Microsoft Partner 8
9 Machine Learning How Does it Work? Credit: Google Research, "Machine Learning 101" Reported at: Data: examples used to train and validate model Model: the system that makes predictions or classifications Parameters: the signals or factors used by the model to form its decisions Learner: the system that adjusts the parameters and in turn the model by looking at differences in predictions versus actual outcome. 9
10 Azure Machine Learning Studio
11 Azure ML Studio What is it? Collaborative, drag-anddrop, fully managed cloud platform Build, test, and deploy predictive analytics Publish models as web services to be consumed by custom apps or BI tools. 11
12 Azure ML Studio Public Gallery Don t want to build a machine learning model from scratch? Avoid reinventing the wheel by browsing the public gallery for existing models that meet your needs. Add picture of gallery 12
13 Azure ML Studio Workspace To have full control, open your own Experiment. For any part of the process you can use built in features. Find these via the search bar or browse the categories shown. Once you ve found the module you need, just drag and drop onto the workspace. 13
14 Data Import And Export Options Bring in data in many forms. Export wherever you need it. 14
15 Easy To Create And Visualize Workflow Easy to follow workflow for creating and sharing. As you re creating your model, you can see dependencies in your process. When sharing your results, the workflow tells your story. 15
16 Dataset Statistics At Every Step Includes quick column statistics on your whole dataset Once your data is loaded into Azure ML, you can explore overall distributions of each feature. You can also quickly identify columns with many missing values. 16
17 Algorithm Selection Accuracy Training time Linearity Parameters Logistic Regression 5 Support Vector Machine * 5 Neural Network 9 Boosted decision tree 6 Decision forest 6 17 Classification Regression Accuracy Training time Linearity Parameters Logistic Regression 4 Bayesian Linear Regression 2 Neural Network 9 Boosted decision tree 5 Decision forest 6 Anomaly Detection Accuracy Training time Linearity Parameters Principal Component Analysis 3 K-means Clustering 4 Good performance Moderate performance * Good for large feature sets Large memory footprint Credit: Brandon Rohrer, azure.microsoft.com
18 Publish Output As A Web Service and BI Visualization 18
19 Azure Machine Learning Demo
20 Demo 1 - Predicting Income Category Azure ML Tour 20
21 Demo 2 Predicting Patient Readmission 21
22 Azure Machine Learning Studio Summary
23 Azure ML Summary Interactive & visual workspace Various data sources supported: SQL Server, HIVE tables, CSV file etc. Dataset statistics and easy exploration Data cleaning & transformation Many modeling algorithms included out of the box SQL/R/Python code can be included in workflow Many built-in ways to evaluate and compare models using standard performance metrics 23
24 Resources Microsoft's provided documentation is quite thorough and helpful: 24
25 25 Thank you. Any questions?
26 Q&A Please submit your questions into the chat field.
27 Demo Resource Slides
28 Demo 1: This is model that predicts whether a person earns > $50k. Here we show how to build a model from scratch, including training the model, predicting values for the test set, and evaluating results. 28
29 Input your data 29
30 Summary statistics 30
31 How to Clean Missing Data 31
32 How to Limit the Number of Features in Your Model 32
33 Splitting into Training/Test Sets 33
34 Choosing a Model 34
35 Training the Model 35
36 Which Features are Most Predictive? 36
37 Which Features are Most Predictive? 37
38 Predicting Values for Test Data 38
39 Check Predictions (Scored Labels) 39
40 Evaluate Model 40
41 Check Model Accuracy, Etc. 41
42 End of Demo 1 42
43 Demo 2: This model predicts whether a patient will be readmitted to a hospital for further treatment. This model considers a variety of strong machine learning algorithms. It then tunes the strongest model to be more efficient, evaluation of prediction results, and implements custom code in from R, Python, and SQL for further visualization and examination of data. 43
44 Explore the Dataset 44
45 Impute Missing Values 45
46 Compare Models: Decision Jungle 46
47 Compare Models: Boosted Decision Tree 47
48 Compare Models: Logistic Regression 48
49 Compare Models: Neural Network 49
50 Cross Validation Results: Variation in Accuracy for All Algorithms 50
51 ROC Chart comparison for models 51
52 Move Forward with Most Accurate Model 52
53 How to Cross Validate a Tuned Model 53
54 Assign Folds 54
55 Set Model Parameter Ranges 55
56 Tune Model Parameters 56
57 Use Model to Predict Values for Test Set 57
58 Visualize Predicted Values 58
59 Permute Features to Find Most Important 59
60 Find Features Most Influential to the Model 60
61 Can Export Results for Multiple Uses 61
62 Export CSV for debugging in notebooks 62
63 Sample R Notebook 63
64 SQL scripts 64
65 65
66 Find optimal cutoff in R script 66
67 R result 67
68 Python script for further visualizations 68
69 Python Visualization 69
70 Evaluate Final Model 70
71 Results 71