Lecture 6: Decision Tree, Random Forest, and Boosting
1 Lecture 6: Decision Tree, Random Forest, and Boosting. Tuo Zhao, Schools of ISyE and CSE, Georgia Tech
2 CS764/ISYE/CSE 674: Machine Learning/Computational Data Analysis. [Title-slide figure: Tree?] Tuo Zhao, Lecture 6: Decision Tree, Random Forest, and Boosting, 2/42
3 Decision Trees
4 Decision Trees
- Decision trees have a long history in machine learning
- The first popular algorithm dates back to 1979
- Very popular in many real-world problems
- Intuitive to understand
- Easy to build
5 History
- EPAM (Elementary Perceiver and Memorizer), 1961: cognitive simulation model of human concept learning
- 1966: CLS, an early algorithm for decision tree construction
- 1979: ID3, based on information theory
- 1993: C4.5, improved over ID3
- Also has a history in statistics as CART (Classification and Regression Trees)
6 Motivation
How do people make decisions?
- Consider a variety of factors
- Follow a logical path of checks
Should I eat at this restaurant?
- If there is no wait: Yes
- If there is a short wait and I am hungry: Yes
- Else: No
7 Decision Flowchart
8 Example: Should We Play Tennis?
- If temperature is not hot: Play
- If outlook is overcast: Play tennis
- Otherwise: Don't play tennis
9 Decision Tree
10 Structure
A decision tree consists of:
- Nodes: tests for variables
- Branches: results of tests
- Leaves: classifications
11 Function Class
What functions can decision trees model?
- Non-linear: a very powerful function class
- A decision tree can encode any Boolean function
Proof:
- Given a truth table for a function, construct a path in the tree for each row of the table
- Given a row as input, follow that path to the desired leaf (output)
Problem: exponentially large trees!
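The proof above is constructive, and the construction can be sketched directly. The following Python (my own naming, not from the lecture) builds the full tree for an arbitrary Boolean function, one root-to-leaf path per truth-table row:

```python
from itertools import product

def truth_table_tree(f, n):
    """Build a full decision tree encoding an arbitrary Boolean function f
    of n inputs: one root-to-leaf path per truth-table row."""
    def build(depth, assignment):
        if depth == n:                    # leaf: the function's output
            return f(*assignment)
        return {0: build(depth + 1, assignment + (0,)),   # branch: x_depth = 0
                1: build(depth + 1, assignment + (1,))}   # branch: x_depth = 1
    return build(0, ())

def predict(tree, x):
    """Follow the path selected by input x down to a leaf."""
    node = tree
    for xi in x:
        node = node[xi]
    return node

# 3-input parity (XOR): every truth-table row is reproduced exactly,
# but the tree has 2^3 = 8 leaves -- exponential in n in general.
xor3 = lambda a, b, c: a ^ b ^ c
tree = truth_table_tree(xor3, 3)
```

Note the tradeoff the slide flags: the generic construction is exact but has 2^n leaves, which is exactly why we look for smaller trees next.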
12 Smaller Trees
Can we produce smaller decision trees for functions?
- Yes (possible), when there are key decision factors
Counterexamples:
- Parity function: return 1 on inputs with an even number of 1s, 0 on odd inputs
- Majority function: return 1 if more than half of the inputs are 1
Bias-Variance Tradeoff:
- Bias: representation power of decision trees
- Variance: requires a sample size exponential in depth
13 Fitting Decision Trees
14 What Makes a Good Tree?
Small tree:
- Occam's razor: simpler is better
- Avoids over-fitting
A decision tree may be human-readable, but may not use human logic!
How do we build small trees that accurately capture the data?
- Learning an optimal decision tree is computationally intractable
15 Greedy Algorithm
We can get good trees with simple greedy algorithms; adjustments are usually made to fix greedy selection problems.
Recursive procedure:
- Select the best variable, and generate child nodes: one for each possible value;
- Partition the samples using the possible values, and assign these subsets of samples to the child nodes;
- Repeat for each child node until all samples associated with a node are either all positive or all negative.
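The recursive steps above can be sketched in a few lines of Python. This is my own sketch, not the lecture's code; `choose` is a placeholder for the variable-selection rule defined on the following slides:

```python
def build_tree(samples, variables, choose):
    """Greedy recursive tree construction following the steps above.
    samples  : list of (feature_dict, label) pairs
    variables: variables still available for splitting
    choose   : callback selecting the 'best' variable (e.g. by the
               information gain defined on the next slides)"""
    labels = [y for _, y in samples]
    if len(set(labels)) == 1:                 # all positive or all negative
        return labels[0]
    if not variables:                         # nothing left to split on
        return max(set(labels), key=labels.count)
    best = choose(samples, variables)         # select the best variable
    rest = [v for v in variables if v != best]
    children = {val: build_tree([(x, y) for x, y in samples if x[best] == val],
                                rest, choose)
                for val in {x[best] for x, _ in samples}}  # one child per value
    return (best, children)

# Tiny demo with a placeholder rule that just picks the first variable
data = [({"wait": "none"}, "yes"), ({"wait": "long"}, "no")]
tree = build_tree(data, ["wait"], lambda s, vs: vs[0])
```

Internal nodes are `(variable, children)` pairs and leaves are labels, mirroring the node/branch/leaf structure from slide 10.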
16 Variable Selection
The best variable for a partition is the most informative variable:
- Select the variable that is most informative about the labels
- Classification error?
Example: P(Y = 1 | X1 = 1) = 0.75 vs. P(Y = 1 | X2 = 1) = 0.55. Which to choose?
17 Information Theory
The quantification of information, founded by Claude Shannon.
Simple concepts:
- Entropy: H(X) = -Σ_x P(X = x) log P(X = x)
- Conditional Entropy: H(Y | X) = Σ_x P(X = x) H(Y | X = x)
- Information Gain: IG(Y; X) = H(Y) - H(Y | X)
Select the variable with the highest information gain.
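These quantities are easy to estimate from label counts. A sketch in Python (stdlib only; function names are my own), checked against the tennis numbers worked out on the next slide:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H(Y) = -sum_y P(Y = y) log2 P(Y = y), estimated from counts."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(xs, ys):
    """IG(Y; X) = H(Y) - H(Y | X) for one candidate splitting variable."""
    n = len(ys)
    h_cond = 0.0
    for v in set(xs):                         # H(Y | X) = sum_x P(x) H(Y | X = x)
        subset = [y for x, y in zip(xs, ys) if x == v]
        h_cond += (len(subset) / n) * entropy(subset)
    return entropy(ys) - h_cond

# The 5-sample tennis data used on the next slide
outlook = ["sunny", "sunny", "overcast", "rainy", "rainy"]
tennis  = ["no",    "no",    "yes",      "yes",   "yes"]
```

Plugging in a variable with perfectly pure partitions (like Outlook here) gives H(Y | X) = 0, so the information gain equals the full entropy H(Y).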
18 Example: Should We Play Tennis?
H(Tennis) = -3/5 log2(3/5) - 2/5 log2(2/5) = 0.971
H(Tennis | Out. = Sunny) = -2/2 log2(2/2) - 0/2 log2(0/2) = 0
H(Tennis | Out. = Overcast) = -1/1 log2(1/1) - 0/1 log2(0/1) = 0
H(Tennis | Out. = Rainy) = -0/2 log2(0/2) - 2/2 log2(2/2) = 0
H(Tennis | Out.) = 2/5 · 0 + 1/5 · 0 + 2/5 · 0 = 0
19 Example: Should We Play Tennis?
IG(Tennis; Out.) = 0.971 - 0 = 0.971
If we knew the Outlook, we'd be able to predict Tennis!
Outlook is the variable to pick for our decision tree.
20 When to Stop Training?
- All data have the same label: return that label
- No examples: return the majority label of all data
- No further splits possible: return the majority label of the passed data
- What if max IG = 0?
21 Example: No Information Gain?
Consider a dataset over Y, X1, X2 in which:
- Both features give IG = 0
- Yet once we divide the data, perfect classification!
- We sometimes need a little exploration (unstable equilibrium)
22 Decision Tree with Continuous Features (Decision Stump)
- Threshold for binary classification: check X_j >= delta_j or X_j < delta_j.
- More than two classes? Threshold for three classes: check X_j >= delta_j, gamma_j <= X_j < delta_j, or X_j < gamma_j.
- Decompose one node into two nodes:
  First node: check X_j >= delta_j or X_j < delta_j;
  Second node: if X_j < delta_j, check X_j >= gamma_j or X_j < gamma_j.
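For a single continuous feature, the stump threshold delta_j can be found by exhaustive search over the observed values. A minimal sketch (my own naming; 0/1 labels, with a sign flag for the two orientations of the rule):

```python
def best_stump(xs, ys):
    """Exhaustively choose the threshold delta (and orientation sign) for
    the rule 'predict 1 if X >= delta else 0', minimizing training errors."""
    best = (None, 0, len(ys) + 1)                 # (delta, sign, #errors)
    for delta in sorted(set(xs)):                 # candidate thresholds
        for sign in (1, -1):                      # rule or its flip
            pred = [1 if (x >= delta) == (sign > 0) else 0 for x in xs]
            errors = sum(p != y for p, y in zip(pred, ys))
            if errors < best[2]:
                best = (delta, sign, errors)
    return best
```

Only the observed values need to be tried: moving delta between two adjacent sample values cannot change any prediction.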
23 Decision Tree for Spam Classification
[Figure from Trevor Hastie's slides: a large decision tree fit to the SPAM data, splitting on features such as ch$, remove, hp, ch!, george, CAPAVE, free, CAPMAX, business, receive, edu, and our.]
24 Decision Tree for Spam Classification
[ROC curve for the pruned tree on the SPAM data, from Trevor Hastie's slides.]
- Overall error rate on test data: 8.7%.
- The ROC curve is obtained by varying the threshold c of the classifier: C(X) = +1 if P-hat(+1 | X) > c.
- Sensitivity: proportion of true spam identified.
- Specificity: proportion of true non-spam identified.
- We may want specificity to be high and suffer some spam: Specificity 95% => Sensitivity 79%.
25 Decision Tree vs. SVM
[ROC curves for TREE vs. SVM on the SPAM data: SVM error 6.7%, TREE error 8.7%.]
- Comparing ROC curves on the test data is a good way to compare classifiers: SVM dominates TREE here.
- SVM outperforms the decision tree.
- AUC: Area Under Curve.
26 Bagging and Random Forest
27 Toy Example
[Scatter plot of the toy data in (X1, X2).]
- Deterministic problem; the noise comes from the sampling distribution.
- Use a training sample of size 200.
- Here the Bayes error rate is 0%.
- Nonlinearly separable data; the optimal decision boundary is X1^2 + X2^2 = 1.
28 Toy Example: Decision Tree
[Figure from Trevor Hastie's slides: the fitted classification tree and its decision boundary on the toy data.]
- Sample size: 200.
- 7 branching nodes; 6 layers.
- Classification error: 7.3% when d = 2; > 30% when d = 10.
29 Model Averaging
- Decision trees can be simple, but often produce noisy (bushy) or weak (stunted) classifiers.
- Bagging (Breiman, 1996): Fit many large trees to bootstrap-resampled versions of the training data, and classify by majority vote.
- Boosting (Freund & Schapire, 1996): Fit many large or small trees to reweighted versions of the training data; classify by weighted majority vote.
- Random Forests (Breiman, 1999): A fancier version of bagging.
- In general: Boosting > Random Forests > Bagging > Single Tree.
30 Bagging/Bootstrap Aggregation
- Motivation: average a given procedure over many samples to reduce the variance.
- Given a classifier C(S, x), based on our training data S, producing a predicted class label at input point x.
- To bag C, we draw bootstrap samples S_1, ..., S_B, each of size N, with replacement from the training data. Then C_Bag(x) = Majority Vote {C(S_b, x)}, b = 1, ..., B.
- Bagging can dramatically reduce the variance of unstable procedures (like trees), leading to improved prediction.
- However, all simple structures in a tree are lost.
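The C_Bag rule above is just bootstrap resampling plus a majority vote over base classifiers. A stdlib-only sketch (my own naming; `fit` stands for any base learner that returns a classifier, e.g. a tree-growing routine):

```python
import random
from collections import Counter

def bag_predict(train, x, fit, B=25, seed=0):
    """Bag the classifier returned by `fit`: draw B bootstrap samples S_b
    (each of size N, with replacement), fit on each, majority-vote at x."""
    rng = random.Random(seed)
    votes = []
    for _ in range(B):
        boot = [rng.choice(train) for _ in train]   # bootstrap sample S_b
        votes.append(fit(boot)(x))                  # C(S_b, x)
    return Counter(votes).most_common(1)[0][0]      # majority vote

# A trivial base learner for illustration: always predict the majority class
majority_fit = lambda s: (lambda x: Counter(y for _, y in s).most_common(1)[0][0])
```

The variance reduction comes from averaging over the bootstrap draws; a stable base learner like `majority_fit` gains little, while unstable ones like deep trees gain a lot.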
31 Toy Example: Bagging Trees
[Figure from Trevor Hastie's slides: the original tree and five trees fit to bootstrap resamples (Bootstrap Trees 1-5), each with 2 branching nodes and 2 layers.]
- 5 dependent trees: each bootstrap tree differs from the original and from the others.
32 Toy Example: Bagging Trees
[Decision boundary of the bagged trees on the toy data.]
- Bagging averages many trees, and produces smoother decision boundaries.
- Classification error: 3.2% (single deeper tree: 7.3%).
33 Random Forest
Bagging features and samples simultaneously:
- At each tree split, a random sample of m features is drawn, and only those m features are considered for splitting.
- Typically m = sqrt(d) or log2(d), where d is the number of features.
- For each tree grown on a bootstrap sample, the error rate for observations left out of the bootstrap sample is monitored. This is called the out-of-bag error rate.
- Random forests try to improve on bagging by de-correlating the trees. Each tree has the same expectation.
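The per-split feature subsampling can be sketched as follows (my own naming; `rule` selects between the two common defaults for m):

```python
import random
from math import log2, sqrt

def split_candidates(features, rng, rule="sqrt"):
    """At each split, draw the random subset of m features to consider;
    the common defaults are m = sqrt(d) and m = log2(d)."""
    d = len(features)
    m = max(1, round(sqrt(d) if rule == "sqrt" else log2(d)))
    return rng.sample(features, m)    # sample without replacement
```

Because each split only sees a small random subset of features, strong features cannot dominate every tree, which is what de-correlates the trees relative to plain bagging.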
34 Random Forest for Spam Classification
[ROC curves for TREE, SVM, and Random Forest on the SPAM data: Random Forest error 5.0%, SVM error 6.7%, TREE error 8.7%.]
- Random Forest dominates both other methods on the SPAM data: 5.0% error.
- Used 500 trees with default settings for the randomForest package in R.
- RF outperforms SVM.
35 Random Forest: Variable Importance Scores
[Figure from Trevor Hastie's slides: relative importance of SPAM features; the most important include !, $, hp, remove, free, CAPAVE, your, CAPMAX, george, CAPTOT, edu, and our.]
- Each bootstrap sample leaves out roughly 1/3 of the samples (out-of-bag).
- Permute the variables (on the out-of-bag samples) to check how important they are.
36 Boosting
37 Boosting
[Schematic from Trevor Hastie's slides: weighted samples feeding classifiers C_1(x), C_2(x), C_3(x), ..., C_M(x).]
- Average many trees, each grown on re-weighted versions of the training data (iterative).
- The final classifier is a weighted average of classifiers: C(x) = sign[Σ_{m=1}^M α_m C_m(x)].
38 AdaBoost vs. Bagging
[Test error curves for bagging vs. AdaBoost with single-node trees, from Trevor Hastie's slides.]
- 2000 points from Nested Spheres in R^10; the Bayes error rate is 0%.
- Trees are grown best-first without pruning; the leftmost point is a single tree.
- Each classifier is a tree with a single node (a decision stump).
39 AdaBoost (Freund & Schapire, 1996)
1. Initialize the observation weights w_i = 1/N, i = 1, 2, ..., N.
2. For m = 1 to M repeat steps (a)-(d):
   (a) Fit a classifier C_m(x) to the training data using weights w_i.
   (b) Compute the weighted error of the newest tree: err_m = Σ_{i=1}^N w_i I(y_i ≠ C_m(x_i)) / Σ_{i=1}^N w_i.
   (c) Compute α_m = log[(1 - err_m)/err_m].
   (d) Update the weights for i = 1, ..., N: w_i ← w_i exp[α_m I(y_i ≠ C_m(x_i))], and renormalize so the w_i sum to 1.
3. Output C(x) = sign[Σ_{m=1}^M α_m C_m(x)].
The w_i are the weights of the samples; err_m is the weighted training error.
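Steps 1-3 translate almost line for line into code. A hedged, self-contained sketch with a weighted 1-D decision stump as the weak learner (all names are my own; labels must be in {-1, +1}, and the early stop on a perfect weak learner is my addition):

```python
from math import exp, log

def stump_fit(X, y, w):
    """Weighted 1-D decision stump: predicts s if x >= delta else -s,
    minimizing the weighted error (step (a) with a stump learner)."""
    best, best_err = (X[0], 1), float("inf")
    for delta in set(X):
        for s in (1, -1):
            err = sum(wi for wi, xi, yi in zip(w, X, y)
                      if (s if xi >= delta else -s) != yi)
            if err < best_err:
                best, best_err = (delta, s), err
    delta, s = best
    return lambda x, d=delta, sg=s: sg if x >= d else -sg

def adaboost(X, y, weak_fit, M=10):
    """AdaBoost (Freund & Schapire, 1996) following steps 1-3 above."""
    N = len(y)
    w = [1.0 / N] * N                       # 1. w_i = 1/N
    ensemble = []
    for _ in range(M):                      # 2. for m = 1, ..., M
        h = weak_fit(X, y, w)               # (a) fit using weights w_i
        err = sum(wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi) / sum(w)
        if err == 0:                        # perfect weak learner: stop early
            ensemble.append((1.0, h))
            break
        alpha = log((1 - err) / err)        # (c) alpha_m
        ensemble.append((alpha, h))
        w = [wi * (exp(alpha) if h(xi) != yi else 1.0)   # (d) up-weight mistakes
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]        # renormalize to sum to 1
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1  # 3.
```

Note how step (d) multiplies only the misclassified weights by exp(α_m) > 1, so each round the weak learner is forced to focus on the points the ensemble still gets wrong.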
40 AdaBoost: Boosting Stumps
[Test error curves for a single stump, a 400-node tree, and boosted stumps, from Trevor Hastie's slides.]
- A stump is a two-node tree, made by a single split.
- Boosting stumps works remarkably well on the nested-spheres problem.
- The ensemble of classifiers is much more efficient than a simple combination of classifiers.
41 Overfitting of AdaBoost
[Training and test error curves: nested spheres in R^10; the Bayes error is 0%.]
- Boosting drives the training error to zero.
- Further iterations continue to improve the test error in many examples.
- AdaBoost is often observed to be robust to overfitting.
42 Overfitting of AdaBoost: Noisy Problems
[Training and test error curves with the Bayes error marked: nested Gaussians in R^10, boosting with stumps.]
- The optimal Bayes classification error is 25%.
- Here the test error does increase, but quite slowly.
More informationActive Learning for Logistic Regression
Active Learning for Logistic Regression Andrew I. Schein The University of Pennsylvania Department of Computer and Information Science Philadelphia, PA 19104-6389 USA ais@cis.upenn.edu April 21, 2005 What
More informationApplying Regression Techniques For Predictive Analytics Paviya George Chemparathy
Applying Regression Techniques For Predictive Analytics Paviya George Chemparathy AGENDA 1. Introduction 2. Use Cases 3. Popular Algorithms 4. Typical Approach 5. Case Study 2016 SAPIENT GLOBAL MARKETS
More informationECONOMIC MODELLING & MACHINE LEARNING
ECONOMIC MODELLING & MACHINE LEARNING A PROOF OF CONCEPT NICOLAS WOLOSZKO, OECD TECHNOLOGY POLICY INSTITUTE FEB 22 2017 Economic forecasting with Adaptive Trees 1 2 Motivation Adaptive Trees 3 Proof of
More informationFrom Ordinal Ranking to Binary Classification
From Ordinal Ranking to Binary Classification Hsuan-Tien Lin Learning Systems Group, California Institute of Technology Talk at CS Department, National Tsing-Hua University March 27, 2008 Benefited from
More informationDetecting gene gene interactions that underlie human diseases
Genome-wide association studies Detecting gene gene interactions that underlie human diseases Heather J. Cordell Abstract Following the identification of several disease-associated polymorphisms by genome-wide
More informationDetecting gene gene interactions that underlie human diseases
REVIEWS Genome-wide association studies Detecting gene gene interactions that underlie human diseases Heather J. Cordell Abstract Following the identification of several disease-associated polymorphisms
More informationPredicting ratings of peer-generated content with personalized metrics
Predicting ratings of peer-generated content with personalized metrics Project report Tyler Casey tyler.casey09@gmail.com Marius Lazer mlazer@stanford.edu [Group #40] Ashish Mathew amathew9@stanford.edu
More informationIn silico prediction of novel therapeutic targets using gene disease association data
In silico prediction of novel therapeutic targets using gene disease association data, PhD, Associate GSK Fellow Scientific Leader, Computational Biology and Stats, Target Sciences GSK Big Data in Medicine
More informationMachine Learning Models for Sales Time Series Forecasting
Article Machine Learning Models for Sales Time Series Forecasting Bohdan M. Pavlyshenko SoftServe, Inc., Ivan Franko National University of Lviv * Correspondence: bpavl@softserveinc.com, b.pavlyshenko@gmail.com
More informationFrom Ordinal Ranking to Binary Classification
From Ordinal Ranking to Binary Classification Hsuan-Tien Lin Learning Systems Group, California Institute of Technology Talk at CS Department, National Chiao Tung University March 26, 2008 Benefited from
More informationStrength in numbers? Modelling the impact of businesses on each other
Strength in numbers? Modelling the impact of businesses on each other Amir Abbas Sadeghian amirabs@stanford.edu Hakan Inan inanh@stanford.edu Andres Nötzli noetzli@stanford.edu. INTRODUCTION In many cities,
More informationWhen to Book: Predicting Flight Pricing
When to Book: Predicting Flight Pricing Qiqi Ren Stanford University qiqiren@stanford.edu Abstract When is the best time to purchase a flight? Flight prices fluctuate constantly, so purchasing at different
More informationEnabling Foresight. Skills for Predictive Analytics
Enabling Foresight Skills for Predictive Analytics Topic Outline INTRODUCTION Predictive Analytics Definition & Core Concepts Terms Used in Today s Analytics Environment Artificial Intelligence Deep Learning
More informationPrediction from Blog Data
Prediction from Blog Data Aditya Parameswaran Eldar Sadikov Petros Venetis 1. INTRODUCTION We have approximately one year s worth of blog posts from [1] with over 12 million web blogs tracked. On average
More informationPredictive Analytics
Predictive Analytics Mani Janakiram, PhD Director, Supply Chain Intelligence & Analytics, Intel Corp. Adjunct Professor of Supply Chain, ASU October 2017 "Prediction is very difficult, especially if it's
More informationAdvances in Machine Learning for Credit Card Fraud Detection
Advances in Machine Learning for Credit Card Fraud Detection May 14, 2014 Alejandro Correa Bahnsen Introduction Europe fraud evolution Internet transactions (millions of euros) 800 700 600 500 2007 2008
More informationDynamic Advisor-Based Ensemble (dynabe): Case Study in Stock Trend Prediction of Critical Metal Companies
Dynamic Advisor-Based Ensemble (dynabe): Case Study in Stock Trend Prediction of Critical Metal Companies Zhengyang Dong Middlesex School, Concord, MA, USA Corresponding Author: Zhengyang Dong Middlesex
More informationStudy on Customer Loyalty Prediction Based on RF Algorithm
2134 JOURNAL OF COMPUTERS, VOL. 8, NO. 8, AUGUST 213 Study on Customer Loyalty Prediction Based on RF Algorithm Jiong Mu Email: scmjmj@yahoo.com.cn Lijia Xu* Email: lijiaxu13@163.com, Xuliang Duan and
More informationProject Category: Life Sciences SUNet ID: bentyeh
Project Category: Life Sciences SUNet ID: bentyeh Predicting Protein Interactions of Intrinsically Disordered Protein Regions Benjamin Yeh, Department of Computer Science, Stanford University CS 229 Project,
More informationBig Data. Methodological issues in using Big Data for Official Statistics
Giulio Barcaroli Istat (barcarol@istat.it) Big Data Effective Processing and Analysis of Very Large and Unstructured data for Official Statistics. Methodological issues in using Big Data for Official Statistics
More informationAccurate Campaign Targeting Using Classification Algorithms
Accurate Campaign Targeting Using Classification Algorithms Jieming Wei Sharon Zhang Introduction Many organizations prospect for loyal supporters and donors by sending direct mail appeals. This is an
More informationDetermining NDMA Formation During Disinfection Using Treatment Parameters Introduction Water disinfection was one of the biggest turning points for
Determining NDMA Formation During Disinfection Using Treatment Parameters Introduction Water disinfection was one of the biggest turning points for human health in the past two centuries. Adding chlorine
More informationE-Commerce Sales Prediction Using Listing Keywords
E-Commerce Sales Prediction Using Listing Keywords Stephanie Chen (asksteph@stanford.edu) 1 Introduction Small online retailers usually set themselves apart from brick and mortar stores, traditional brand
More informationThree lessons learned from building a production machine learning system. Michael Manapat
Three lessons learned from building a production machine learning system Michael Manapat Stripe @mlmanapat Fraud Card numbers are stolen by hacking, malware, etc. Dumps are sold in carding forums Fraudsters
More informationStock Price Prediction with Daily News
Stock Price Prediction with Daily News GU Jinshan MA Mingyu Derek MA Zhenyuan ZHOU Huakang 14110914D 14110562D 14111439D 15050698D 1 Contents 1. Work flow of the prediction tool 2. Model performance evaluation
More informationPredicting prokaryotic incubation times from genomic features Maeva Fincker - Final report
Predicting prokaryotic incubation times from genomic features Maeva Fincker - mfincker@stanford.edu Final report Introduction We have barely scratched the surface when it comes to microbial diversity.
More informationC-14 FINDING THE RIGHT SYNERGY FROM GLMS AND MACHINE LEARNING. CAS Annual Meeting November 7-10
1 C-14 FINDING THE RIGHT SYNERGY FROM GLMS AND MACHINE LEARNING CAS Annual Meeting November 7-10 GLM Process 2 Data Prep Model Form Validation Reduction Simplification Interactions GLM Process 3 Opportunities
More informationFINAL PROJECT REPORT IME672. Group Number 6
FINAL PROJECT REPORT IME672 Group Number 6 Ayushya Agarwal 14168 Rishabh Vaish 14553 Rohit Bansal 14564 Abhinav Sharma 14015 Dil Bag Singh 14222 Introduction Cell2Cell, The Churn Game. The cellular telephone
More informationAcknowledgments. Data Mining. Examples. Overview. Colleagues
Data Mining A regression modeler s view on where it is and where it s likely to go. Bob Stine Department of Statistics The School of the Univ of Pennsylvania March 30, 2006 Acknowledgments Colleagues Dean
More informationChapter 3 Top Scoring Pair Decision Tree for Gene Expression Data Analysis
Chapter 3 Top Scoring Pair Decision Tree for Gene Expression Data Analysis Marcin Czajkowski and Marek Krȩtowski Abstract Classification problems of microarray data may be successfully performed with approaches
More informationPredicting Corporate Influence Cascades In Health Care Communities
Predicting Corporate Influence Cascades In Health Care Communities Shouzhong Shi, Chaudary Zeeshan Arif, Sarah Tran December 11, 2015 Part A Introduction The standard model of drug prescription choice
More informationApplication of Decision Trees in Mining High-Value Credit Card Customers
Application of Decision Trees in Mining High-Value Credit Card Customers Jian Wang Bo Yuan Wenhuang Liu Graduate School at Shenzhen, Tsinghua University, Shenzhen 8, P.R. China E-mail: gregret24@gmail.com,
More informationCSE 255 Lecture 3. Data Mining and Predictive Analytics. Supervised learning Classification
CSE 255 Lecture 3 Data Mining and Predictive Analytics Supervised learning Classification Last week Last week we started looking at supervised learning problems Last week We studied linear regression,
More informationIntro Logistic Regression Gradient Descent + SGD
Case Study 1: Estimating Click Probabilities Intro Logistic Regression Gradient Descent + SGD Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade March 29, 2016 1 Ad Placement
More informationCredit Card Marketing Classification Trees
Credit Card Marketing Classification Trees From Building Better Models With JMP Pro, Chapter 6, SAS Press (2015). Grayson, Gardner and Stephens. Used with permission. For additional information, see community.jmp.com/docs/doc-7562.
More informationPredictive Modeling Using SAS Visual Statistics: Beyond the Prediction
Paper SAS1774-2015 Predictive Modeling Using SAS Visual Statistics: Beyond the Prediction ABSTRACT Xiangxiang Meng, Wayne Thompson, and Jennifer Ames, SAS Institute Inc. Predictions, including regressions
More informationPredictive Modeling using SAS. Principles and Best Practices CAROLYN OLSEN & DANIEL FUHRMANN
Predictive Modeling using SAS Enterprise Miner and SAS/STAT : Principles and Best Practices CAROLYN OLSEN & DANIEL FUHRMANN 1 Overview This presentation will: Provide a brief introduction of how to set
More informationAdvanced Analytics through the credit cycle
Advanced Analytics through the credit cycle Alejandro Correa B. Andrés Gonzalez M. Introduction PRE ORIGINATION Credit Cycle POST ORIGINATION ORIGINATION Pre-Origination Propensity Models What is it?
More informationAirbnb Price Estimation. Hoormazd Rezaei SUNet ID: hoormazd. Project Category: General Machine Learning gitlab.com/hoorir/cs229-project.
Airbnb Price Estimation Liubov Nikolenko SUNet ID: liubov Hoormazd Rezaei SUNet ID: hoormazd Pouya Rezazadeh SUNet ID: pouyar Project Category: General Machine Learning gitlab.com/hoorir/cs229-project.git
More informationWatts App: An Energy Analytics and Demand-Response Advisor Tool
Watts App: An Energy Analytics and Demand-Response Advisor Tool Santiago Gonzalez, Case Western Reserve University, Electrical Engineering, SUNFEST Fellow Dr. Rahul Mangharam, Electrical and Systems Engineering
More informationGene Expression Classification: Decision Trees vs. SVMs
Gene Expression Classification: Decision Trees vs. SVMs Xiaojing Yuan and Xiaohui Yuan and Fan Yang and Jing Peng and Bill P. Buckles Electrical Engineering and Computer Science Department Tulane University
More information