Preface to the third edition Preface to the first edition Acknowledgments

Contents

Foreword
Preface to the third edition
Preface to the first edition
Acknowledgments

Part I PRELIMINARIES

CHAPTER 1 Introduction
  What Is Business Analytics?
  What Is Data Mining?
  Data Mining and Related Terms
  Big Data
  Data Science
  Why Are There So Many Different Methods?
  Terminology and Notation
  Road Maps to This Book
  Order of Topics

CHAPTER 2 Overview of the Data Mining Process
  Introduction
  Core Ideas in Data Mining
  Classification
  Prediction
  Association Rules and Recommendation Systems
  Predictive Analytics
  Data Reduction and Dimension Reduction
  Data Exploration and Visualization
  Supervised and Unsupervised Learning
  The Steps in Data Mining
  Preliminary Steps
  Organization of Datasets
  Sampling from a Database
  Oversampling Rare Events in Classification Tasks
  Preprocessing and Cleaning the Data
  Predictive Power and Overfitting
  Creation and Use of Data Partitions
  Overfitting
  Building a Predictive Model with XLMiner
  Predicting Home Values in the West Roxbury Neighborhood
  Modeling Process
  Using Excel for Data Mining
  Automating Data Mining Solutions
  Data Mining Software Tools: the State of the Market (by Herb Edelstein)
  Problems

Part II DATA EXPLORATION AND DIMENSION REDUCTION

CHAPTER 3 Data Visualization
  Uses of Data Visualization
  Data Examples
  Example 1: Boston Housing Data
  Example 2: Ridership on Amtrak Trains
  Basic Charts: Bar Charts, Line Graphs, and Scatter Plots
  Distribution Plots: Boxplots and Histograms
  Heatmaps: Visualizing Correlations and Missing Values
  Multidimensional Visualization
  Adding Variables: Color, Size, Shape, Multiple Panels, and Animation
  Manipulations: Re-scaling, Aggregation and Hierarchies, Zooming, Filtering
  Reference: Trend Line and Labels
  Scaling up to Large Datasets
  Multivariate Plot: Parallel Coordinates Plot
  Interactive Visualization
  Specialized Visualizations
  Visualizing Networked Data
  Visualizing Hierarchical Data: Treemaps
  Visualizing Geographical Data: Map Charts
  Summary: Major Visualizations and Operations, by Data Mining Goal
  Prediction
  Classification
  Time Series Forecasting
  Unsupervised Learning
  Problems

CHAPTER 4 Dimension Reduction
  Introduction
  Curse of Dimensionality
  Practical Considerations
  Example 1: House Prices in Boston
  Data Summaries
  Summary Statistics
  Pivot Tables
  Correlation Analysis
  Reducing the Number of Categories in Categorical Variables
  Converting a Categorical Variable to a Numerical Variable
  Principal Components Analysis
  Example 2: Breakfast Cereals
  Principal Components
  Normalizing the Data
  Using Principal Components for Classification and Prediction
  Dimension Reduction Using Regression Models
  Dimension Reduction Using Classification and Regression Trees
  Problems

Part III PERFORMANCE EVALUATION

CHAPTER 5 Evaluating Predictive Performance
  Introduction
  Evaluating Predictive Performance
  Benchmark: The Average
  Prediction Accuracy Measures
  Comparing Training and Validation Performance
  Lift Chart
  Judging Classifier Performance
  Benchmark: The Naive Rule
  Class Separation
  The Classification Matrix
  Using the Validation Data
  Accuracy Measures
  Propensities and Cutoff for Classification
  Performance in Unequal Importance of Classes
  Asymmetric Misclassification Costs
  Generalization to More Than Two Classes
  Judging Ranking Performance
  Lift Charts for Binary Data
  Decile Lift Charts
  Beyond Two Classes
  Lift Charts Incorporating Costs and Benefits
  Lift as Function of Cutoff
  Oversampling
  Oversampling the Training Set
  Evaluating Model Performance Using a Non-oversampled Validation Set
  Evaluating Model Performance If Only Oversampled Validation Set Exists
  Problems

Part IV PREDICTION AND CLASSIFICATION METHODS

CHAPTER 6 Multiple Linear Regression
  Introduction
  Explanatory vs. Predictive Modeling
  Estimating the Regression Equation and Prediction
  Example: Predicting the Price of Used Toyota Corolla Cars
  Variable Selection in Linear Regression
  Reducing the Number of Predictors
  How to Reduce the Number of Predictors
  Problems

CHAPTER 7 k-Nearest-Neighbors (k-NN)
  The k-NN Classifier (categorical outcome)
  Determining Neighbors
  Classification Rule
  Example: Riding Mowers
  Choosing k
  Setting the Cutoff Value
  k-NN with More Than Two Classes
  Converting Categorical Variables to Binary Dummies
  k-NN for a Numerical Response
  Advantages and Shortcomings of k-NN Algorithms
  Problems

CHAPTER 8 The Naive Bayes Classifier
  Introduction
  Cutoff Probability Method
  Conditional Probability
  Example 1: Predicting Fraudulent Financial Reporting
  Applying the Full (Exact) Bayesian Classifier
  Using the Assign to the Most Probable Class Method
  Using the Cutoff Probability Method
  Practical Difficulty with the Complete (Exact) Bayes Procedure
  Solution: Naive Bayes
  Example 2: Predicting Fraudulent Financial Reports, Two Predictors
  Example 3: Predicting Delayed Flights
  Advantages and Shortcomings of the Naive Bayes Classifier
  Problems

CHAPTER 9 Classification and Regression Trees
  Introduction
  Classification Trees
  Recursive Partitioning
  Example 1: Riding Mowers
  Measures of Impurity
  Tree Structure
  Classifying a New Observation
  Evaluating the Performance of a Classification Tree
  Example 2: Acceptance of Personal Loan
  Avoiding Overfitting
  Stopping Tree Growth: CHAID
  Pruning the Tree
  Classification Rules from Trees
  Classification Trees for More Than Two Classes
  Regression Trees
  Prediction
  Measuring Impurity
  Evaluating Performance
  Advantages, Weaknesses and Extensions
  Improving Prediction: Multiple Trees
  Problems

CHAPTER 10 Logistic Regression
  Introduction
  The Logistic Regression Model
  Example: Acceptance of Personal Loan
  Model with a Single Predictor
  Estimating the Logistic Model from Data: Computing Parameter Estimates
  Interpreting Results in Terms of Odds (for a Profiling Goal)
  Evaluating Classification Performance
  Variable Selection
  10.4 Example of Complete Analysis: Predicting Delayed Flights
  Data Preprocessing
  Model Fitting and Estimation
  Model Interpretation
  Model Performance
  Variable Selection
  Appendix: Logistic Regression for Profiling
  Appendix A: Why Linear Regression Is Problematic for a Categorical Response
  Appendix B: Evaluating Explanatory Power
  Appendix C: Logistic Regression for More Than Two Classes
  Problems

CHAPTER 11 Neural Nets
  Introduction
  Concept and Structure of a Neural Network
  Fitting a Network to Data
  Example 1: Tiny Dataset
  Computing Output of Nodes
  Preprocessing the Data
  Training the Model
  Example 2: Classifying Accident Severity
  Avoiding Overfitting
  Using the Output for Prediction and Classification
  Required User Input
  Exploring the Relationship Between Predictors and Response
  Unsupervised Feature Extraction and Deep Learning
  Advantages and Weaknesses of Neural Networks
  Problems

CHAPTER 12 Discriminant Analysis
  Introduction
  Example 1: Riding Mowers
  Example 2: Personal Loan Acceptance
  Distance of an Observation from a Class
  Fisher's Linear Classification Functions
  Classification Performance of Discriminant Analysis
  Prior Probabilities
  Unequal Misclassification Costs
  Classifying More Than Two Classes
  Example 3: Medical Dispatch to Accident Scenes
  Advantages and Weaknesses
  Problems

CHAPTER 13 Combining Methods: Ensembles and Uplift Modeling
  Ensembles
  Why Ensembles Can Improve Predictive Power
  Simple Averaging
  Bagging
  Boosting
  Advantages and Weaknesses of Ensembles
  Uplift (Persuasion) Modeling
  A-B Testing
  Uplift
  Gathering the Data
  A Simple Model
  Modeling Individual Uplift
  Using the Results of an Uplift Model
  Summary
  Problems

Part V MINING RELATIONSHIPS AMONG RECORDS

CHAPTER 14 Association Rules and Collaborative Filtering
  Association Rules
  Discovering Association Rules in Transaction Databases
  Example 1: Synthetic Data on Purchases of Phone Faceplates
  Generating Candidate Rules
  The Apriori Algorithm
  Selecting Strong Rules
  Data Format
  The Process of Rule Selection
  Interpreting the Results
  Rules and Chance
  Example 2: Rules for Similar Book Purchases
  Collaborative Filtering
  Data Type and Format
  Example 3: Netflix Prize Contest
  User-Based Collaborative Filtering: People Like You
  Item-Based Collaborative Filtering
  Advantages and Weaknesses of Collaborative Filtering
  Collaborative Filtering vs. Association Rules
  Summary
  Problems

CHAPTER 15 Cluster Analysis
  Introduction
  Example: Public Utilities
  Measuring Distance Between Two Observations
  Euclidean Distance
  Normalizing Numerical Measurements
  Other Distance Measures for Numerical Data
  Distance Measures for Categorical Data
  Distance Measures for Mixed Data
  Measuring Distance Between Two Clusters
  Minimum Distance
  Maximum Distance
  Average Distance
  Centroid Distance
  Hierarchical (Agglomerative) Clustering
  Single Linkage
  Complete Linkage
  Average Linkage (in XLMiner: Group Average Linkage)
  Centroid Linkage
  Ward's Method
  Dendrograms: Displaying Clustering Process and Results
  Validating Clusters
  Limitations of Hierarchical Clustering
  Non-hierarchical Clustering: The k-means Algorithm
  Initial Partition into k Clusters
  Problems

Part VI FORECASTING TIME SERIES

CHAPTER 16 Handling Time Series
  Introduction
  Descriptive vs. Predictive Modeling
  Popular Forecasting Methods in Business
  Combining Methods
  Time Series Components
  Example: Ridership on Amtrak Trains
  Data Partitioning and Performance Evaluation
  Benchmark Performance: Naive Forecasts
  Generating Future Forecasts
  Problems

CHAPTER 17 Regression-Based Forecasting
  A Model with Trend
  Linear Trend
  Exponential Trend
  Polynomial Trend
  A Model with Seasonality
  A Model with Trend and Seasonality
  Autocorrelation and ARIMA Models
  Computing Autocorrelation
  Improving Forecasts by Integrating Autocorrelation Information
  Evaluating Predictability
  Problems

CHAPTER 18 Smoothing Methods
  Introduction
  Moving Average
  Centered Moving Average for Visualization
  Trailing Moving Average for Forecasting
  Choosing Window Width (w)
  Simple Exponential Smoothing
  Choosing Smoothing Parameter α
  Relation between Moving Average and Simple Exponential Smoothing
  Advanced Exponential Smoothing
  Series with a Trend
  Series with a Trend and Seasonality
  Series with Seasonality (No Trend)
  Problems

Part VII DATA ANALYTICS

CHAPTER 19 Social Network Analytics
  Introduction
  Directed vs. Undirected Networks
  Visualizing and Analyzing Networks
  Graph Layout
  Adjacency List
  Adjacency Matrix
  Using Network Data in Classification and Prediction
  Social Data Metrics and Taxonomy
  Node-Level Centrality Metrics
  Egocentric Network
  Network Metrics
  Using Network Metrics in Prediction and Classification
  Link Prediction
  Entity Resolution
  Collaborative Filtering
  Advantages and Disadvantages
  Problems

CHAPTER 20 Text Mining
  Introduction
  The Spreadsheet Representation of Text: Bag-of-Words
  Bag-of-Words vs. Meaning Extraction at Document Level
  Preprocessing the Text
  Tokenization
  Text Reduction
  Presence/Absence vs. Frequency
  Term Frequency-Inverse Document Frequency (TF-IDF)
  From Terms to Concepts: Latent Semantic Indexing
  Extracting Meaning
  Implementing Data Mining Methods
  Example: Online Discussions on Autos and Electronics
  Importing and Labeling the Records
  Tokenization
  Text Processing and Reduction
  Producing a Concept Matrix
  Labeling the Documents
  Fitting a Model
  Prediction
  Summary
  Problems

Part VIII CASES

CHAPTER 21 Cases
  Charles Book Club
    The Book Industry
    Database Marketing at Charles
    Data Mining Techniques
    Assignment
  German Credit
    Background
    Data
    Assignment
  Tayko Software Cataloger
    Background
    The Mailing Experiment
    Data
    Assignment
  Political Persuasion
    Background
    Predictive Analytics Arrives in US Politics
    Political Targeting
    Uplift
    Data
    Assignment
  Taxi Cancellations
    Business Situation
    Assignment
  Segmenting Consumers of Bath Soap
    Business Situation
    Key Problems
    Data
    Measuring Brand Loyalty
    Assignment
    Appendix
  Direct-Mail Fundraising
    Background
    Data
    Assignment
  Catalog Cross-Selling
    Background
    Assignment
  Predicting Bankruptcy
    Predicting Corporate Bankruptcy
    Assignment
  Time Series Case: Forecasting Public Transportation Demand
    Background
    Problem Description
    Available Data
    Assignment Goal
    Assignment
    Tips and Suggested Steps

References
Data Files Used in the Book
Index

From Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques. Full book available for purchase here.

Contents List of Figures xv Foreword xxiii Preface xxv Acknowledgments xxix Chapter

From Profit Driven Business Analytics. Full book available for purchase here.

Contents Foreword xv Acknowledgments xvii Chapter 1 A Value-Centric Perspective Towards Analytics 1 Introduction 1 Business

Effective CRM Using Predictive Analytics. Antonios Chorianopoulos

WILEY. Contents Preface Acknowledgments xiii xv 1 An overview of data mining: The applications, the methodology, the algorithms, and the

Data Mining Applications with R

Yanchang Zhao, Senior Data Miner, RDataMining.com, Australia. Yonghua Cen, Associate Professor, Nanjing University of Science and Technology, China. AMSTERDAM BOSTON HEIDELBERG

Data Mining In Excel: Lecture Notes and Cases

Article: https://www.researchgate.net/publication/242384346

DATA MINING AND BUSINESS ANALYTICS WITH R

Johannes Ledolter, Department of Management Sciences, Tippie College of Business, University of Iowa, Iowa City, Iowa. Copyright

Effective CRM Using Predictive Analytics

Antonios Chorianopoulos. This edition first published 2016. © 2016 John Wiley & Sons, Ltd. Registered Office: John Wiley & Sons,

2015 The MathWorks, Inc. 1

Machine Learning with MATLAB (Basics). Senior Application Engineer 엄준상, Manager. Machine Learning is Everywhere. Solution is too complex for hand written rules or equations

Copyright 2012, SAS Institute Inc. All rights reserved. ENTERPRISE MINER: ANALYTICAL MODEL DEVELOPMENT

AGENDA Enterprise Miner: Analytical Model Development. The session looks at: - Supervised and Unsupervised Modelling - Classification

DATA ANALYTICS WITH R, EXCEL & TABLEAU

Learn. Do. Earn. COURSE DETAILS centers@acadgild.com www.acadgild.com 90360 10796 Brief About this Course: Data is the foundation for the technology-driven digital age.

Chapter 5 Evaluating Classification & Predictive Performance

Data Mining for Business Intelligence, Shmueli, Patel & Bruce. Galit Shmueli and Peter Bruce 2010. Why Evaluate? Multiple methods are available

Software Metrics: A Rigorous and Practical Approach, Third Edition. Norman Fenton, James Bieman. CRC Press, Chapman & Hall/CRC Innovations in Software Engineering and Software Development

Norman Fenton, Queen Mary University of London, UK. James

Applying Regression Techniques For Predictive Analytics Paviya George Chemparathy

AGENDA 1. Introduction 2. Use Cases 3. Popular Algorithms 4. Typical Approach 5. Case Study 2016 SAPIENT GLOBAL MARKETS

COPYRIGHTED MATERIAL. Contents. Part One Requirements, Realities, and Architecture 1. Acknowledgments Introduction

Contents ix Foreword xix Preface xxi Acknowledgments xxiii Introduction xxv Part One Requirements, Realities, and Architecture 1 Chapter 1 Defining Business Requirements 3 The Most Important Determinant

IBM SPSS Modeler Personal

Make better decisions with predictive intelligence from the desktop. Highlights: Helps you identify hidden patterns and trends in your data to predict and improve outcomes. Enables you

Predictive Analytics

Mani Janakiram, PhD, Director, Supply Chain Intelligence & Analytics, Intel Corp.; Adjunct Professor of Supply Chain, ASU. October 2017. "Prediction is very difficult, especially if it's

Predictive Modeling Using SAS Visual Statistics: Beyond the Prediction

Paper SAS1774-2015. Xiangxiang Meng, Wayne Thompson, and Jennifer Ames, SAS Institute Inc. ABSTRACT Predictions, including regressions

KnowledgeSTUDIO. Advanced Modeling for Better Decisions. Data Preparation, Data Profiling and Exploration

Companies that compete with analytics are looking for advanced analytical technologies that accelerate decision making and identify opportunities

Implementing Instant-Book and Improving Customer Service Satisfaction. Arturo Heyner Cano Bejar, Nick Danks Kellan Nguyen, Tonny Kuo

Business Problem. Problem statement. Problem: high rejection rate (15%)

Business Analytics & Data Mining Modeling Using R Dr. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee

Lecture - 02 Data Mining Process. Welcome to the lecture 2 of

TABLE OF CONTENTS ix

Page Certification Declaration Acknowledgement Research Publications Table of Contents Abbreviations List of Figures List of Tables List of Keywords Abstract i ii

Data Science Training Course

About Intellipaat: Intellipaat is a fast-growing professional training provider that offers training in over 150 of the most sought-after tools and technologies. We have a learner base of 600,000 in over

Salford Predictive Modeler. Powerful machine learning software for developing predictive, descriptive, and analytical models.

The Company: Minitab helps companies and institutions to spot trends, solve problems and discover valuable

Predicting Customer Purchase to Improve Bank Marketing Effectiveness

Business Analytics Using Data Mining (2017 Fall). Final Report. Group 6: Sandy Wu, Andy Hsu, Wei-Zhu Chen, Samantha Chien. Instructor: Galit

SolidQ Data Science Services Fraud Detection

www.solidq.com Agenda: Introduction, The Continuous Learning Cycle, The Structure of the POC, The Benefits. Initial Situation: Attempts at fraud happen every day!

Workflow Administration of PTC Windchill 11.1

Overview. Course Code: TRN-5266-T. Course Length: 16 Hours. In this course, you will learn about Windchill workflow features and how to design, configure, and test

PREDICTING EMPLOYEE ATTRITION THROUGH DATA MINING

Abbas Heiat, College of Business, Montana State University, Billings, MT 59102, aheiat@msubillings.edu ABSTRACT The purpose of this study is to investigate

Data Analytics with MATLAB Adam Filion Application Engineer MathWorks

© 2015 The MathWorks, Inc. Case Study: Day-Ahead Load Forecasting. Goal: Implement a tool for easy and accurate computation of day-ahead system

EFFECTIVE PREMIUM-CUSTOMER TARGETING USING CLASSIFICATION METHODS

Increase number of purchases of high margin products using classification methods

Predictive Modeling using SAS. Principles and Best Practices CAROLYN OLSEN & DANIEL FUHRMANN

Predictive Modeling using SAS Enterprise Miner and SAS/STAT: Principles and Best Practices. Overview. This presentation will: Provide a brief introduction of how to set

Copyright 2013, SAS Institute Inc. All rights reserved.

IMPROVING PREDICTION OF CYBER ATTACKS USING ENSEMBLE MODELING. June 17, 2014, 82nd MORSS, Alexandria, VA. Tom Donnelly, PhD, Systems Engineer & Co-insurrectionist, JMP Federal Government Team. ABSTRACT Improving

Segmentation and Targeting

Outline The segmentation-targeting-positioning (STP) framework Segmentation The concept of market segmentation Managing the segmentation process Deriving market segments and

2016 INFORMS International The Analytics Tool Kit: A Case Study with JMP Pro

Mia Stephens mia.stephens@jmp.com http://bit.ly/1uygw57 Copyright 2010 SAS Institute Inc. All rights reserved. Background TQM

M.Tech. IN ADVANCED INFORMATION TECHNOLOGY - SOFTWARE TECHNOLOGY (MTECHST)

No. of Printed Pages: 8. MINE-022. Term-End Examination, December, 2014. Time: 3 hours. Note: (i) (ii) (iii) (iv) (v)

POST GRADUATE PROGRAM IN DATA SCIENCE & MACHINE LEARNING (PGPDM)

Outline. Module Subject Topics Learning outcomes Delivered by Exploratory & Visualization Framework Exploratory Data Collection and

Text Analysis of American Airlines Customer Reviews

SESUG 2016 Paper EPO-281. Rajesh Tolety, Oklahoma State University; Saurabh Kumar Choudhary, Oklahoma State University. ABSTRACT Which airline should I

IBM SPSS Decision Trees

IBM SPSS Decision Trees 20. Easily identify groups and predict outcomes. Highlights: With SPSS Decision Trees you can: Identify groups, segments, and patterns in a highly visual manner

SAS Enterprise Miner 5.3 for Desktop

Fact Sheet. A fast, powerful data mining workbench delivered to your desktop. What does SAS Enterprise Miner for Desktop do? SAS Enterprise Miner for Desktop is a complete

Big Data. Methodological issues in using Big Data for Official Statistics

Giulio Barcaroli, Istat (barcarol@istat.it). Big Data: Effective Processing and Analysis of Very Large and Unstructured data for Official Statistics.

PROVEN PRACTICES FOR PREDICTIVE MODELING

BROUGHT TO YOU BY SAS CUSTOMER LOYALTY. CONTRIBUTIONS FROM: DARIUS BAER, DAVID OGDEN, DOUG WIELENGA, MARY-ELIZABETH ("M-E") EDDLESTONE, PRINCIPAL SYSTEMS ENGINEER, ANALYTICS

Data Mining and Applications in Genomics

Lecture Notes in Electrical Engineering, Volume 25. For other titles published in this series, go to www.springer.com/series/7818. Sio-Iong Ao, Data Mining and Applications

3 Ways to Improve Your Targeted Marketing with Analytics

Introduction: Targeted marketing is a simple concept, but a key element in a marketing strategy. The goal is to identify the potential customers

Text Mining. Theory and Applications Anurag Nagar

Topics Introduction What is Text Mining Features of Text Document Representation Vector Space Model Document Similarities Document Classification and Clustering

Statistics, Data Analysis, and Decision Modeling

THIRD EDITION. James R. Evans, University of Cincinnati. PEARSON Prentice Hall, Upper Saddle River, New Jersey 07458. CONTENTS Preface xv

advanced analysis of gene expression microarray data aidong zhang World Scientific State University of New York at Buffalo, USA

aidong zhang, State University of New York at Buffalo, USA. World Scientific. NEW JERSEY LONDON SINGAPORE BEIJING SHANGHAI HONG KONG TAIPEI CHENNAI. Contents

Methods and Applications of Statistics in Business, Finance, and Management Science

N. Balakrishnan, McMaster University, Department of Statistics, Hamilton, Ontario, Canada. WILEY, A JOHN WILEY & SONS, INC.,

From Killer Analytics. Full book available for purchase here. Introduction: What Are Predictive Analytics? 1

Contents Foreword Preface xix xv Acknowledgments xxvii Introduction: What Are Predictive Analytics? 1 Learning from Past Mistakes 1 Organizational

SPM 8.2. Salford Predictive Modeler

The SPM Salford Predictive Modeler software suite is a highly accurate and ultra-fast platform for developing predictive, descriptive, and analytical models from

SAS Visual Statistics 8.1: The New Self-Service Easy Analytics Experience Xiangxiang Meng, Cheryl LeSaint, Don Chapman, SAS Institute Inc.

Paper SAS5780-2016. ABSTRACT In today's Business Intelligence world,

New Customer Acquisition Strategy

Based on Customer Profiling Segmentation and Scoring Model. Introduction: A customer profile is a snapshot of who your customers are, how to reach them, and

Applications of Machine Learning to Predict Yelp Ratings

Kyle Carbon, Aeronautics and Astronautics, kcarbon@stanford.edu; Kacyn Fujii, Electrical Engineering, khfujii@stanford.edu; Prasanth Veerina, Computer

The PTC Windchill PDMLink 11.1 MCAD Data Management Process for PTC Creo Parametric

Overview. Course Code: TRN-5230-T. Course Length: 8 Hours. In this Process-based course, you will learn about the Windchill

Analytics for Banks. September 19, 2017

Outline: About AlgoAnalytics, Problems we can solve for banks, Our experience, Technology. About AlgoAnalytics: Analytics Consultancy. Work at the intersection of

Enterprise Performance Intelligence and Decision Patterns. Vivek Kale

CRC Press, Taylor & Francis Group, Boca Raton London New York. CRC Press is an imprint of the Taylor & Francis Group, an

SAS BIG DATA ANALYTICS INCREASING YOUR COMPETITIVE EDGE

SAS VISUAL ANALYTICS: STATE OF THE ART SOLUTION FOR FASTER, SMARTER DECISIONS, AIMED AT THE MASSES. Data visualization; Approachable analytics; Robust

FINAL PROJECT REPORT IME672. Group Number 6

Ayushya Agarwal 14168, Rishabh Vaish 14553, Rohit Bansal 14564, Abhinav Sharma 14015, Dil Bag Singh 14222. Introduction: Cell2Cell, The Churn Game. The cellular telephone

MACHINE LEARNING OPPORTUNITIES IN FREIGHT TRANSPORTATION OPERATIONS. NORTHWESTERN NUTC/CCIT October 26, Ted Gifford

NORTHWESTERN NUTC/CCIT, October 26, 2016. Ted Gifford. SCHNEIDER IS A TRANSPORTATION AND LOGISTICS LEADER WITH A BROAD PORTFOLIO OF SERVICES.

GSAW 2018 Machine Learning

Space Ground System Working Group. Move the Algorithms; Not the Data! Dan Brennan, Sr. Director, Mission Solutions, daniel.p.brennan@oracle.com. Feb, 2018. Copyright 2018, Oracle and/or

Forecasting Seasonal Footwear Demand Using Machine Learning. By Majd Kharfan & Vicky Chan, SCM 2018 Advisor: Tugba Efendigil

Agenda: The State Of Fashion Industry; Research Objectives; AI In

Learn What's New. Statistical Software

Upgrade now to access new and improved statistical features and other enhancements that make it even easier to analyze your data. The Assistant: Let Minitab's Assistant

Building the In-Demand Skills for Analytics and Data Science Course Outline

Day 1. Module 1 - Predictive Analytics Concepts. What and Why of Predictive Analytics: Predictive Analytics Defined; Business Value of Predictive Analytics. The Foundation for Predictive Analytics: Statistical

Process-based Strategic Planning

Rudolf Grünig, Richard Kühn. Translated by Anthony Clark. Third Edition with 137 Figures. Springer. Professor Dr. Rudolf Grünig, University

Managing PTC Creo Parametric Data with PTC Windchill PDMLink 11.0

Overview. Course Code: TRN-4770-T. Course Length: 8 Hours. In this process-based course, you will learn about the Windchill PDMLink Creo Parametric

LEED Reference Guide for Green Building Operations and Maintenance For the Operations and Maintenance of Commercial and Institutional Buildings 2009

2009 Edition. COPYRIGHT, DISCLAIMER. LEED REFERENCE GUIDE

Gene Expression Data Analysis

Bing Zhang, Department of Biomedical Informatics, Vanderbilt University, bing.zhang@vanderbilt.edu. BMIF 310, Fall 2009. Gene expression technologies (summary): Hybridization-based

Approaching an Analytical Project. Tuba Islam, Analytics CoE, SAS UK

Starting with questions: What is the problem you would like to solve? Why do you need analytics? Which

Software Data Analytics. Nevena Lazarević

Selected Literature: Perspectives on Data Science for Software Engineering, 1st Edition, Tim Menzies, Laurie Williams, Thomas Zimmermann; The Art and Science of

Targeting, valuing, segmenting and loyalty techniques

MIKE GRIGSBY. ADVANCED CUSTOMER ANALYTICS. MARKETING SCIENCE SERIES. KoganPage. CONTENTS: 01 Overview; What is retail?; What is analytics?; Who is this

Achieve Better Insight and Prediction with Data Mining

Clementine 12.0 Specifications. Data mining provides organizations with a clearer view of current conditions and deeper insight into future events.

A FORMALIZATION AND EXTENSION OF THE PURDUE ENTERPRISE REFERENCE ARCHITECTURE AND THE PURDUE METHODOLOGY REPORT NUMBER 158

Purdue Laboratory for Applied Industrial Control. Prepared by Hong Li, Theodore

Predictive Modelling for Customer Targeting A Banking Example

Pedro Ecija Serrano, 11 September 2017. Customer Targeting: What is it? Why should I care? How do I do it? What Is Customer

Professor Dr. Gholamreza Nakhaeizadeh

Statistic Methods in Mining: Business Understanding, Understanding, Preparation, Deployment, Modelling, Evaluation. Mining Process (Part 3)

DASI: Analytics in Practice and Academic Analytics Preparation

Mia Stephens, mia.stephens@jmp.com. Copyright 2010 SAS Institute Inc. All rights reserved. Background: TQM Coordinator/Six Sigma MBB; Founding

Smart Rating for Electronic gadgets

BUSINESS ANALYTICS USING DATA MINING. Deepanshu Saini 61310308, Rachna Lalwani 61310845, Saurabh Thaman 61310113, Vikramadith Raman 61310387, Jeevan Murthy 61310542, Karthik

PROJECT MANAGEMENT. Systems, Principles, and Applications. Taylor & Francis Group Boca Raton London New York

Adedeji B. Badiru. CRC Press, Taylor & Francis Group. Boca Raton, London, New York. CRC Press is an imprint of the Taylor & Francis Group, an informa

Machine Learning Models for Sales Time Series Forecasting

Article. Bohdan M. Pavlyshenko, SoftServe, Inc., Ivan Franko National University of Lviv. Correspondence: bpavl@softserveinc.com, b.pavlyshenko@gmail.com

1 Introduction. 2 Forecasting and Demand Modeling. 3 Deterministic Inventory Models. 4 Stochastic Inventory Models

CONTENTS IN BRIEF: 1 Introduction; 2 Forecasting and Demand Modeling; 3 Deterministic Inventory Models; 4 Stochastic Inventory Models; 5 Multi Echelon Inventory Models; 6 Dealing with Uncertainty

Neural Connection's four powerful neural networks give you better performing models

Neural Connection 2.1. Build better models for more effective classification, prediction, time series forecasting and

Credit Scoring, Response Modelling and Insurance Rating

Also by Steven Finlay: THE MANAGEMENT OF CONSUMER CREDIT; CONSUMER CREDIT FUNDAMENTALS. Credit Scoring, Response Modelling and Insurance Rating: A Practical

Master Assessment Plan: Analytics

Outcomes Analysis Years: 2014 2015. Biennial Report Year/Semester: 2015/Spring. Program(s): All programs. Objective: Data mining and machine learning. Graduates should be

TNM033 Data Mining Practical Final Project Deadline: 17 of January, 2011

1. Develop Models for Customers Likely to Churn. Churn is a term used to indicate a customer leaving the service of one company in

Case studies in Data Mining & Knowledge Discovery

Knowledge Discovery is a process. Data Mining is just a step of a (potentially) complex sequence of tasks. KDD Process. Data Mining & Knowledge Discovery

Chapter 8 Analytical Procedures

Principles of Auditing: An Introduction to International Standards on Auditing. Rick Hayes, Hans Gortemaker and Philip Wallage. Analytical procedures: Analytical

Windchill PDMLink Curriculum Guide

Windchill PDMLink 10.2 Curriculum Guide. Live Classroom Curriculum Guide: Update to Windchill PDMLink 10.2 from Windchill PDMLink 9.0/9.1 for the End User; Introduction to Windchill PDMLink 10.2 for Light

Ensemble Modeling. Toronto Data Mining Forum November 2017 Helen Ngo

Agenda: Introductions; Why Ensemble Models?; Simple & Complex ensembles; Thoughts: Post-real-life Experimentation; Downsides of Ensembles

Contents. Part One: Setting Up QuickBooks. The Missing Credits. Introduction

The Missing Credits. Introduction: What's New in QuickBooks 2015; When QuickBooks May Not Be the Answer; Choosing the Right Edition; Accounting Basics: The Important Stuff

Cryptocurrency Price Prediction Using News and Social Media Sentiment

Connor Lamon, Eric Nielsen, Eric Redondo. Abstract: This project analyzes the ability of news and social media data to predict price

Supervised and Unsupervised Learning

Kwok-Leung Tsui, Industrial & Systems Engineering, Georgia Institute of Technology. 1/7/2009. Data Mining (KDD) Process: Determine Business Objectives; Data Preparation

Text Categorization. Hongning Wang

Hongning Wang, CS@UVa. Today's lecture: Bayes decision theory; Supervised text categorization; General steps for text categorization; Feature selection methods; Evaluation metrics.

Churn Prevention in Telecom Services Industry- A systematic approach to prevent B2B churn using SAS

Paper 1414-2017. Krutharth Peravalli, Dr. Dmitriy Khots, West Corporation. ABSTRACT: It takes months to find

SAP Predictive Analytics Suite

Tania Pérez Asensio. Where is the Evolution of Business Analytics Heading? Organizations Are Maturing Their Approaches to Solving Business Problems. Reactive: Wait until a problem

Auctioning Experts in Credit Modeling

Robert Stine, The School, Univ of Pennsylvania. May, 2004. www-stat.wharton.upenn.edu/~stine. Opportunities: Anticipate default - Who are most likely to default in the

Brian Macdonald Big Data & Analytics Specialist - Oracle

Improving Predictive Model Development Time with R and Oracle Big Data Discovery. brian.macdonald@oracle.com. Copyright 2015, Oracle and/or its affiliates.

MANAGING SUPPLY. Competitive Strategy for A Sustainable Future. Ling Li Old Dominion University, USA

MANAGING SUPPLY CHAIN AND LOGISTICS: Competitive Strategy for A Sustainable Future. Ling Li, Old Dominion University, USA. World Scientific. NEW JERSEY, LONDON, SINGAPORE, BEIJING, SHANGHAI, HONG KONG, TAIPEI, CHENNAI

Using Decision Tree to predict repeat customers

Jia En Nicholette Li, Jing Rong Lim. Abstract: We focus on using feature engineering and decision trees to perform classification and feature selection on the

a. Category Report displays Sales by Product Categories b. Supplier Report displays Sales by Suppliers

Reservations/Sales. Sales Channel Report: This report displays the total sales, for a selected time period, from each of your distribution points or sales channels (for example call center and website,

Linear model to forecast sales from past data of Rossmann drug Store

Group id: G3. Abstract: In recent years, the explosive growth in data has resulted in the need to develop new tools to process data into knowledge