FINAL PROJECT REPORT IME672. Group Number 6


Ayushya Agarwal, Rishabh Vaish, Rohit Bansal, Abhinav Sharma, Dil Bag Singh

Introduction

Cell2Cell: The Churn Game. The cellular telephone industry has always offered a compelling value proposition: convenient, mobile telephone service. The wireless carriers have recently begun to manage their customer base, which involves three main areas: acquisition, development and retention. Attention has focused on retention, i.e. minimizing the number of customers who defect to another company. These defections are called customer churn. The big challenge today is therefore to increase customer loyalty before subscribers decide to leave, and to aim retention efforts at customers who are at risk of churning. This need for a proactive retention program has made predictive CRM (Customer Relationship Management) and churn modeling the new buzzwords in the cellular sector. Our mission is thus to: (1) develop a statistical model for predicting customer churn, (2) use the model to identify the most important drivers of churn, and (3) with these new insights, recommend a customer churn management program.

Data Set

The given data set consists of 71,047 rows and 78 variables (including a variable named CHURN, signifying whether the customer had left the company two months after observation). A variable named CALIBRAT was used to differentiate the validation dataset from the training dataset: the training dataset contained 40,000 customers and the validation dataset contained 31,047 customers.

Data Pre-processing

The dataset was divided into training and validation datasets using CALIBRAT as the partition variable (a value of 1 for training and 0 for validation). The variable CHURNDEP was set as the target variable, and some other variables (those not related to the business objective) were rejected. Using CALIBRAT we divided the data into two sets: data_training (CALIBRAT = 1) and data_validation (CALIBRAT = 0). We had cleaned the data previously, removing all outliers, missing values and absurd values. We had also coalesced various attributes into one by converting them from binary to nominal, and removed attributes using correlation analysis. After this the number of attributes was 53. We further reduced the attributes to 11 using rough sets; according to rough set theory, these 11 attributes retain almost the entire information of the 53 variables. We perform further classification on the reduced attributes.

Attribute Reduction

Attributes were reduced using rough sets, with the RoughSets library. First we converted data_training into a decision table using the SF.asDecisionTable command. Then, using the unsupervised quantiles method, we discretized the decision table with the D.discretization.RST and SF.applyDecTable commands. The number of attributes was then reduced to 11 by running the FS.quickreduct.RST command on the discretized table. The 11 selected attributes were RECCHRGE, DROPVCE, CSA, EQPDAYS, CUSTOMER, AGE1, INCOME, SETPRC, CRERAT, MARSTAT and CHURNDEP. From these, CSA and CUSTOMER were removed for the following reasons:

CUSTOMER: the customer ID obviously does not determine customer churn, so it was removed.
CSA: the number of factor levels was large, which hampered our churn prediction models and led to unruly models. To reduce computation time we removed the CSA attribute.

So the final 9 attributes used to build the prediction models were RECCHRGE, DROPVCE, EQPDAYS, AGE1, INCOME, SETPRC, CRERAT, MARSTAT and CHURNDEP.
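As an illustrative sketch (not the report's actual R code, which used FS.quickreduct.RST from the RoughSets package), the greedy QuickReduct search can be written in plain Python: attributes are added one at a time, each time picking the one that most increases the rough-set dependency degree, until the reduct carries the same dependency as the full attribute set. The toy rows and attribute names in the example are hypothetical.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values
    on the given attribute subset."""
    blocks = defaultdict(list)
    for i, r in enumerate(rows):
        blocks[tuple(r[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, attrs, decision):
    """Rough-set dependency degree: the fraction of rows whose
    equivalence class is consistent (one decision value per class)."""
    pos = 0
    for block in partition(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:
            pos += len(block)
    return pos / len(rows)

def quickreduct(rows, attrs, decision):
    """Greedy QuickReduct: grow the reduct until its dependency
    degree matches that of the full attribute set."""
    target = dependency(rows, attrs, decision)
    reduct = []
    while dependency(rows, reduct, decision) < target:
        best = max((a for a in attrs if a not in reduct),
                   key=lambda a: dependency(rows, reduct + [a], decision))
        reduct.append(best)
    return reduct
```

On a toy table where attribute A alone determines the decision, the search stops after one step and returns the single-attribute reduct.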

Data Modeling

A total of 8 different models were used to predict customer churn: Naive Bayes classification, logistic regression, decision tree, support vector machine, neural net, bagging, boosting and random forest.

Classifying Models

1. Naive Bayes Classification Model

Using the e1071 library, we built the Naive Bayes model with the naiveBayes command. The confusion matrix obtained gave this model an accuracy of 39.7%.

Lift chart
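The confusion matrices and lift charts in this report were produced in R; as a small illustrative Python sketch of the two quantities being reported, the following computes accuracy from a 2x2 confusion matrix and decile lift from model scores. The numbers in the example are made up, not the model's.

```python
def accuracy(cm):
    """Accuracy from a 2x2 confusion matrix [[TN, FP], [FN, TP]]."""
    tn, fp = cm[0]
    fn, tp = cm[1]
    return (tn + tp) / (tn + fp + fn + tp)

def lift_table(scores, labels, n_deciles=10):
    """Decile lift: churn rate in each score-ordered decile divided
    by the overall churn rate (what the lift charts plot)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    base_rate = sum(labels) / len(labels)
    size = len(order) // n_deciles
    lifts = []
    for d in range(n_deciles):
        idx = order[d * size:(d + 1) * size]
        rate = sum(labels[i] for i in idx) / len(idx)
        lifts.append(rate / base_rate)
    return lifts
```

A lift well above 1.0 in the top deciles is what makes a churn model usable for targeting retention offers.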

2. Logistic Regression Model

Using the stats library, we built the model with a generalized linear model (GLM). The resulting 0/1 confusion matrix with marginal sums gave an accuracy of %.

Lift chart
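The report fits this model with R's glm. As a rough illustration of what a binomial GLM optimises, here is a minimal one-feature logistic regression trained by gradient descent in Python; the feature values, learning rate and epoch count are made-up choices for the sketch, not the report's settings.

```python
import math

def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent
    on the logistic (binomial deviance) loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x   # gradient of the loss w.r.t. w
            gb += (p - y)       # gradient of the loss w.r.t. b
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def predict(w, b, x):
    """Classify at the usual 0.5 probability threshold."""
    return 1 if 1.0 / (1.0 + math.exp(-(w * x + b))) >= 0.5 else 0
```

On separable toy data the fitted boundary lands between the two classes, so the extremes are classified correctly.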

3. Decision Tree

Using the recursive partitioning (rpart) algorithm, we designed our predictive churn model (tree_model) with CHURNDEP as the target variable. The accuracy of the model came out to be 71.3% (roughly 72% as reported with the confusion matrix obtained).
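rpart grows the tree by repeatedly choosing, at each node, the split that most reduces impurity (Gini by default). A minimal Python sketch of that split search on one numeric attribute, with made-up values:

```python
def gini(labels):
    """Gini impurity of a set of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Find the threshold on one numeric attribute that minimises
    the weighted Gini impurity of the two child nodes, as rpart
    does when growing each node."""
    order = sorted(zip(values, labels))
    xs = [v for v, _ in order]
    ys = [l for _, l in order]
    n = len(ys)
    best = (float("inf"), None)  # (weighted impurity, threshold)
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # no usable threshold between equal values
        score = (i * gini(ys[:i]) + (n - i) * gini(ys[i:])) / n
        threshold = (xs[i - 1] + xs[i]) / 2
        best = min(best, (score, threshold))
    return best
```

When the classes separate cleanly, the search finds a zero-impurity split at the midpoint between the two groups.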

Lift Chart

4. Support Vector Model

We again used the e1071 library to build the SVM model. The confusion matrix gave an accuracy of 78.5%.

5. Neural Net

Using the neuralnet library, we fitted a neural net on data_training.
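The neuralnet library fits a feed-forward network of sigmoid units. As a self-contained illustration of that structure (not the fitted churn model, whose weights are not reproduced here), below is a tiny 2-2-1 network with hand-picked weights that computes XOR:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x1, x2):
    """Forward pass of a fixed 2-2-1 sigmoid network whose weights
    were hand-picked to realise XOR; training would instead learn
    such weights from data."""
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)    # hidden unit ~ logical OR
    h2 = sigmoid(20 * x1 + 20 * x2 - 30)    # hidden unit ~ logical AND
    return sigmoid(20 * h1 - 40 * h2 - 10)  # OR and not AND = XOR
```

The point of the sketch is that a single hidden layer of sigmoid units can already represent decision boundaries (like XOR) that no single linear model can.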

Ensemble Methods

1. Bagging

Using the treebag method, data_training was used to train the model. The 0/1 confusion matrix with marginal sums gave a model accuracy of 71.9%.

2. Boosting

In the boosted tree model, we first use three-fold cross-validation. We simply boost the tree-fitting model; the resulting confusion matrix gave an accuracy of 71.93%.

3. Random Forest

Finally, we ran a random forest algorithm. caret uses cross-validation to select the number of predictors; here we used three-fold cross-validation due to the computational cost. The confusion matrix obtained gave an accuracy of 71.8%.
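Both bagging (treebag) and random forests rest on the same two primitives: bootstrap resampling of the training rows, and majority voting over the base trees at prediction time. A minimal Python sketch of those two steps, with hypothetical 0/1 votes:

```python
import random

def bootstrap(rows, rng):
    """One bagging round: sample len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(votes):
    """Combine the 0/1 predictions of the base models, as bagging
    and random forests do at prediction time."""
    return 1 if sum(votes) * 2 > len(votes) else 0
```

Each base tree sees a slightly different bootstrap sample, so its errors are partly decorrelated from the others'; the vote then averages those errors away, which is why the ensemble accuracies above cluster around the single-tree figure rather than below it.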

KEY FACTORS THAT PREDICT CUSTOMER CHURN

The RoughSets technique was used to reduce the number of attributes from 52 to 10 and then to 8 (predictor counts, excluding the target CHURNDEP). The importance of these variables was calculated for the random forest classifying technique:

INCOME, RECCHRGE, SETPRC, CRERAT, MARSTAT, DROPVCE, EQPDAYS, AGE1

This shows that the attributes SETPRC, INCOME, EQPDAYS and AGE1 are the most important in predicting churn, a conclusion ratified by the decision tree formed. These attributes make perfect sense for predicting customer churn.

SETPRC is the handset price for the particular customer. A higher handset price indicates a more affluent customer, and one who has not entered into a contract with the telecom company for purchasing the handset. A more affluent customer is thus more likely to churn if not satisfied.

INCOME is the income of the customer. Greater income means greater buying capacity and the ability to churn if dissatisfied.

EQPDAYS measures how long the customer has kept the current handset. It makes business sense that a customer who changes his old cell phone is likely to churn, because many mobile service providers give new connections bundled with a cell phone.

AGE1 represents the age of the first household member; we can assume this is the age of the customer. Older customers, being financially independent, have a greater chance of churning, while younger customers with greater income have a high chance of churning too.

These, along with the other attributes RECCHRGE, CRERAT, MARSTAT and DROPVCE, form a subset that contains almost all the information present in the initial cleaned dataset, according to the theory of rough sets.

CALCULATION OF THE PROFITABILITY OF THE RETENTION PLAN

The classification model with the best accuracy is the SVM, at 78.5%. It accurately predicts that 61 customers will churn out of the total 31,047, and it also inaccurately predicts churn for 6,139 customers who in reality do not churn. We therefore need to take into consideration the profit that the 61 customers who were prevented from churning will bring into the company. This profit has to be greater than the cost of trying to retain all the customers the model predicts will churn. The monthly revenue from a customer is given in the data. With the 61 retained customers, the increase in revenue is 61 times this monthly revenue, and that amount is what we can spend to retain the 6,200 customers predicted to churn. This cost includes the service costs of the retained customers, the cost of the data mining, and the cost of the incentives given to all customers who were predicted to churn. This calculation is done on a per-month basis.
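Taking the monthly revenue per customer as a hypothetical placeholder (the report's actual figure is not reproduced here), the retention-budget arithmetic can be sketched as follows; the 61 and 6,139 counts come from the SVM confusion matrix above.

```python
REVENUE = 50.0            # hypothetical monthly revenue per customer
retained = 61             # churners correctly flagged (assumed retained)
flagged = 61 + 6139       # every customer the SVM predicts will churn

extra_revenue = retained * REVENUE            # monthly revenue saved
budget_per_flagged = extra_revenue / flagged  # spend available per flagged customer

print(extra_revenue, round(budget_per_flagged, 2))
```

Because the false positives (6,139) dwarf the true positives (61), the budget per flagged customer is a small fraction of the per-customer revenue, which is why the incentive plan has to be cheap per head.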

Possible Incentives Offered

Based on the important variables derived above, the following incentive plans can be offered to customers to reduce possible churn.

From the model we found that if AGE1 > 0.35 (normalised score) the customer churns, so we can provide special plans to older people, such as family-group plans with reduced call rates within the group. A family-group plan will motivate older people to use telecom services more to stay in touch with their family.

It can be concluded that customers with low age and high income have a higher probability of churning. We need to offer services targeting young customers, such as better messaging packs and social network plans; better services will also help retain these customers. Since these customers are affluent, they do not care so much about the cost of services as about the services themselves. They can be kept on a high-priority list to ensure that their calls do not get dropped.

From the model we also found that if SETPRC > 0.36 (normalised score) the customer is likely to churn. The telecom company should provide more options for contracting customers for longer periods on expensive handsets, which will help ensure these customers do not churn.

All customers predicted to churn can be given special bonus offers, such as free talk time or better top-up recharges, to keep them using the product. To see whether the schemes have worked, we will check whether the number of churning individuals decreased in the customer bases the schemes targeted. If so, we will continue the schemes; otherwise they will be modified: data mining will be applied to the data of customers who actually churned, the rules for their churning will be extracted again, and a new strategy will be implemented.