CHAPTER 8 PROFILING METHODOLOGY


8.1 INTRODUCTION

This research aims to develop a customer profiling methodology based on customer lifetime value, relationship, satisfaction, and behavior, using data mining techniques. The proposed methodology helps to maximize the customer base and identifies the potential loss of a customer at the earliest possible point.

Content analysis is used to analyze text data and to transform unstructured customer service information into structured customer service data. An ANFIS (Adaptive Neuro-Fuzzy Inference System) model is used to discover customer knowledge. Customer Lifetime Value (CLV), computed with the Recency, Frequency, Monetary and Term (RFMT) method, is used to find target customers. Clustering analysis can locate high-value customers: based on its result, the number of target customers with high loyalty, high interest, and a high purchase amount can be identified.

Customer satisfaction can be improved with the help of an expert system developed using Artificial Neural Networks. The expert system's role is to capture the knowledge of the experts and the data from the customer requirements, and then to process the collected data.

Prediction is based on the previous transactions of the customers, and the data is estimated with the help of clustering and association rules. Customers with similar purchasing behavior are first grouped by means of clustering techniques. Then, for each cluster, an association rule extractor identifies the products that are frequently bought together by the customers in that segment. CARMS has been proposed to predict sales. The system consists of consecutive, communicating stages that generate rules: data pre-processing, data partitioning, data transformation, and association rule mining.

8.2 UNSTRUCTURED DATA TO STRUCTURED DATA USING TEXT MINING

Text mining is used to transform unstructured customer service information into structured customer service data.
Data can be collected from the following sources: (i) the Electronic News System, (ii) the customer service hotline, and (iii) Word documents. Unstructured mail feedback details must also be collected. All of these feedback details were stored in text files. Using the RapidMiner tool, each line was tokenized and the occurrences of words were stored. The feedback on each product was rated and stored as Bad, Satisfactory, Good, Very Good, or Excellent. Based on this customer feedback, the quality of the products can be rated. Customer feedback details are maintained in the profiles.

Fuzzy Logic Toolbox software computes the membership function parameters that best allow the associated fuzzy inference system to track the given input/output data. The Fuzzy Logic Toolbox function that performs this membership function parameter adjustment is called anfis. The anfis function can be accessed either from the command line or through the ANFIS Editor GUI. Using a given input/output data set, anfis constructs a fuzzy inference system (FIS) whose membership function parameters are tuned (adjusted) using either a backpropagation algorithm alone or in combination with a least-squares type of method. This adjustment allows fuzzy systems to learn from the data they are modeling.

Rules have been created according to the number of products. The customer feedback and other details have been stored in a database. The feedback for each product has been maintained, and ANFIS has been used to generate rules for the customer feedback.

8.3 CUSTOMER LIFETIME VALUE ANALYSIS AND RELATIONSHIP

A customer is considered loyal if he or she purchases frequently, has purchased recently, and spends more money over the lifetime of the relationship. Conversely, a customer who has not purchased recently, has made few purchases in total, and has spent little money is a disloyal customer. Customer loyalty is therefore a useful feature for segmenting customers. Segmentation helps companies to tailor their offerings to each segment. One of the common ways of segmenting is by loyalty and profitability. The relationship between customer and firm is identified using the RFM method. The customer type based on term is listed in Table 8.1.
The CLV changes according to the term value. Long-term customers are valuable customers and should receive more attention. Customers of clusters 2, 3, and 6 are long-term customers. Customers of clusters 1, 4, 5, 7, and 8 are short-term customers; of these, the customers of cluster 1 are new customers. A comparison of the integrated CLV obtained with the RFM and RFMT models is shown in Table 8.2. The CLV accuracy for long-term customers is higher in the RFMT model. Table 8.3 lists the CLV based on the RFMT model.

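Table 8.1: Customer Type based on Term

Columns: Cluster | Recency | Frequency | Monetary | Term | Integrated Rating | CLV Rank | Customer Type

Cluster   Customer Type
1         New
2         Long-Term
3         Long-Term
4         Short-Term
5         Short-Term
6         Long-Term
7         Short-Term
8         Short-Term
Average

[The numeric entries of this table are not preserved in this transcription.]

The integrated rating of cluster j is computed as

$$C_I^j = W_R C_R^j + W_F C_F^j + W_M C_M^j + W_T C_T^j$$

where $W_R = 0.626$, $W_F = 0.236$, $W_T = 0.056$, and the value of $W_M$ is missing from this transcription.

Table 8.2: Comparison of Integrated CLV by RFM and RFMT Models

Columns: Cluster | Integrated Rating by RFM Model | Integrated Rating by RFMT Model | CLV Rank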
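The weighted sum given with Table 8.1 can be illustrated with a minimal Python sketch. It reads $C_R^j$, $C_F^j$, $C_M^j$, and $C_T^j$ as normalized cluster scores in [0, 1] where a higher score is better, which is an assumption, and the cluster scores below are hypothetical. The weights for R, F, and T are the ones quoted above; the weight for M is not given in this text, so 0.082 is used purely as a placeholder that makes the four weights sum to 1.

```python
from typing import Dict

# W_R, W_F and W_T are the weights quoted with Table 8.1.
# W_M is NOT given in the text: 0.082 is a placeholder assumption,
# chosen only so that the four weights sum to 1.
WEIGHTS: Dict[str, float] = {"R": 0.626, "F": 0.236, "M": 0.082, "T": 0.056}


def integrated_rating(scores: Dict[str, float],
                      weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of one cluster's normalized R, F, M and T scores."""
    return sum(weights[k] * scores[k] for k in ("R", "F", "M", "T"))


def rank_clusters(clusters: Dict[str, Dict[str, float]],
                  weights: Dict[str, float] = WEIGHTS) -> Dict[str, int]:
    """Rank clusters by integrated rating; rank 1 is the highest rating."""
    ratings = {name: integrated_rating(s, weights) for name, s in clusters.items()}
    ordered = sorted(ratings, key=lambda name: ratings[name], reverse=True)
    return {name: pos for pos, name in enumerate(ordered, start=1)}


# Hypothetical normalized scores, not values from Table 8.1.
example_clusters = {
    "A": {"R": 0.9, "F": 0.8, "M": 0.7, "T": 0.9},
    "B": {"R": 0.2, "F": 0.3, "M": 0.9, "T": 0.1},
    "C": {"R": 0.5, "F": 0.5, "M": 0.5, "T": 0.5},
}
example_ranks = rank_clusters(example_clusters)
```

Passing a weight dictionary with the T weight set to 0 (and the others renormalized as desired) gives an RFM-only rating, which is one way the two models could be compared on the same clusters.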

Table 8.3: CLV based on RFMT Model

Columns: Cluster | CLV Rank

The company can use three communication channels, listed here from the highest to the lowest implementation cost:

1. Verbal communications (face-to-face, telephone)
2. Written communications (catalogues, proposals, e-mails, letters, and training manuals)
3. External communications (media advertisements and web pages)

Using the ranking of the clusters shown in Table 8.3, the board of directors of the company can make better decisions about allocating each of these communication channels to the ranked clusters. Consequently, strategic decisions can be made about attracting new customers, retaining current customers, and increasing the satisfaction of high-value customers.

Figure 8.1 shows the comparison of the integrated CLV rating obtained with the RFM and RFMT models. The recency, frequency, monetary, and term analyses of each cluster are shown in Figures 8.2, 8.3, 8.4, and 8.5, respectively.

Fig. 8.1: Comparison of Integrated CLV Rating by RFM and RFMT Models

Fig. 8.2: Recency Analysis of Each Cluster

Fig. 8.3: Frequency Analysis of Each Cluster

Fig. 8.4: Monetary Analysis of Each Cluster

Fig. 8.5: Term Analysis of Each Cluster

8.4 IMPORTANCE OF RELATIONSHIP AND SATISFACTION

Customer satisfaction can be improved with the help of an expert system developed using Artificial Neural Networks. An expert system is a computer system that emulates the behavior of human experts in a well-specified, narrowly defined domain of knowledge. It captures the knowledge and heuristics that an expert employs in a specific task. The expert knowledge is recorded in the expert system together with the purchase details of eight groups of customers (male and female teen, young, adult, and old customers) for the chosen product characteristics (colors / scheme).

The aim of the research is to evaluate how well the result advised by the expert system agrees with the product (colors / design) actually chosen by each type of customer. The more customers are satisfied with the product (colors / design) advised by the expert system, the higher the accuracy of the expert system. A satisfied customer will maintain a good relationship with the company.

The correlation coefficients between male and female customers are shown in Figure 8.6. The correlation between male teen and female teen customers is negative. The correlation between male young and female young customers is strongly positive. The correlation between male adult and female adult customers is negative. The correlation between male old and female old customers is weakly positive.
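For readers who want to reproduce this kind of comparison, the following is a minimal sketch of a Pearson correlation coefficient in plain Python. The text does not state which correlation measure was used, so Pearson is an assumption, and the two pairs of preference counts below are hypothetical data, not values from the study.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and the two sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (dev_x * dev_y)


# Hypothetical counts of customers choosing each of five colour schemes.
male_young = [1, 2, 3, 4, 5]
female_young = [2, 4, 6, 8, 10]   # rises with male_young
male_teen = [1, 2, 3, 4, 5]
female_teen = [5, 4, 3, 2, 1]     # falls as male_teen rises

r_young = pearson(male_young, female_young)
r_teen = pearson(male_teen, female_teen)
```

The sign of the result gives the direction of the relationship (positive or negative), and its magnitude, between 0 and 1, gives the strength.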

Fig. 8.6: Correlation Coefficient between Male and Female Customers

8.5 CUSTOMER BEHAVIOR

An efficient CARMS architecture is proposed to discover rules based on customer groups. To obtain the rules, the customer and product domains have been bridged: clustering and association rule mining were combined to analyse the similarity between customer groups and their preferences for products. The complete set of rules must be stored in a separate knowledge base. RFMT-based Apriori, despite its simple logic and inherent pruning advantage, requires a large number of repeated scans of the input. The RFMT-based FP-Growth algorithm is therefore used to extract important and effective rules. Figure 8.7 shows the redundancy-free rules for cluster 1 of the bookstore dataset. Figure 8.8 shows the redundancy-free rules for cluster 1 of the life insurance dataset.

The following association rules are the best rules of the bookstore dataset.

1. [PoliticsBooks] --> [ChildBooks] (confidence: 0.900)
2. [PoliticsBooks] --> [GeogBooks] (confidence: 0.900)
3. [FrenchBooks] --> [GeogBooks] (confidence: 0.900)
4. [ChildBooks, ItBooks] --> [GeogBooks, PoliticsBooks] (confidence: 0.904)
5. [PoliticsBooks, ArtBooks] --> [GeogBooks, CookBooks] (confidence: 0.904)
6. [PoliticsBooks, ArtBooks] --> [GeogBooks, FrenchBooks] (confidence: 0.904)
7. [FrenchBooks, ItBooks] --> [GeogBooks, PoliticsBooks] (confidence: 0.904)
8. [PoliticsBooks, ArtBooks] --> [GeogBooks, ItBooks] (confidence: 0.904)
9. [ArtBooks, ItBooks] --> [GeogBooks, PoliticsBooks] (confidence: 0.904)
10. [PoliticsBooks, ArtBooks] --> [ChildBooks, GeogBooks, CookBooks] (confidence: 0.904)

Fig. 8.7: Redundant Free Rules for Bookstore Dataset
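The confidence attached to each rule above is the standard measure: the fraction of transactions containing the antecedent that also contain the consequent. A minimal Python sketch of that computation follows. It is not the RFMT-based FP-Growth implementation used in this chapter, and the baskets are hypothetical; only the category names are borrowed from the rules listed above.

```python
from typing import FrozenSet, Iterable, List


def support_count(transactions: List[FrozenSet[str]], items: FrozenSet[str]) -> int:
    """Number of transactions that contain every item in `items`."""
    return sum(1 for t in transactions if items <= t)


def confidence(transactions: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """confidence(X --> Y) = count(X and Y together) / count(X)."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    n_x = support_count(transactions, x)
    if n_x == 0:
        raise ValueError("antecedent never occurs in the transactions")
    return support_count(transactions, xy) / n_x


# Hypothetical baskets for one customer cluster (not the study's data).
baskets = [
    frozenset({"PoliticsBooks", "ChildBooks", "GeogBooks"}),
    frozenset({"PoliticsBooks", "ChildBooks"}),
    frozenset({"PoliticsBooks", "GeogBooks"}),
    frozenset({"FrenchBooks", "GeogBooks"}),
    frozenset({"ChildBooks"}),
]

conf_politics_child = confidence(baskets, ["PoliticsBooks"], ["ChildBooks"])
conf_french_geog = confidence(baskets, ["FrenchBooks"], ["GeogBooks"])
```

In this toy data, two of the three baskets containing PoliticsBooks also contain ChildBooks, so the first rule has confidence 2/3; the single FrenchBooks basket also contains GeogBooks, so the second has confidence 1.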

Fig. 8.8: Redundant Free Rules for Life Insurance Dataset

The following association rules are the best rules of the life insurance dataset.

1. [WealthPlus, JeevanSaral] --> [JeevanAnand] (confidence: 0.905)
2. [MarketPlus, KomalJeevan, JeevanAnand, JeevanVarsha] --> [PensionPlan] (confidence: 0.906)
3. [JeevanAnand, PensionPlan] --> [MarketPlus, KomalJeevan] (confidence: 0.907)
4. [JeevanAnand, PensionPlan] --> [MarketPlus, JeevanVarsha] (confidence: 0.907)
5. [KomalJeevan, JeevanAnand, JeevanVarsha] --> [PensionPlan] (confidence: 0.907)
6. [JeevanAnand, PensionPlan] --> [KomalJeevan, JeevanVarsha] (confidence: 0.907)
7. [MarketPlus, JeevanVarsha] --> [JeevanAnand] (confidence: 0.909)
8. [PensionPlan] --> [MarketPlus, JeevanVarsha] (confidence: 0.909)
9. [JeevanVarsha] --> [MarketPlus] (confidence: 0.917)
10. [JeevanAnand, JeevanSaral] --> [MarketPlus] (confidence: 0.917)

8.6 ANALYSIS OF PROFILES

The basic component of customer knowledge is the customer profile, which is obtained through the database and data mining technologies used in organizations (Adomavicius and Tuzhilin, 2001). Customer profiling is one of the most important strategies for learning more about customers. In summary, customer profiling is the technique that converts raw information about customers into strategic-support knowledge, which reinforces the value of the goods that companies offer to customers. The customer profile describes the characteristics of the customer who could really benefit from the product or service. The customer profile based on CLV for cluster 3 is shown in Figure 8.9. The customer profile based on CLV for cluster 4 is shown in Figure 8.10.

Fig. 8.9: Customer Profile based on CLV for Cluster 3

Fig. 8.10: Customer Profile based on CLV for Cluster 4

The execution times of the RFMT-based Apriori and FP-Growth algorithms are shown in Figure 8.11. The execution time of the RFMT-based FP-Growth algorithm is low.

Fig. 8.11: Execution Time Comparison

The customers have been segmented based on the RFMT variables. The long-term, short-term, and new customers are identified efficiently. The customer profiles must be kept in the following order:

1. Long-term and loyal customers' data.
2. New and loyal customers' data.
3. Short-term and loyal customers' data.
4. Short-term and disloyal customers' data.

8.7 SUMMARY

The unstructured data must be transformed into structured data. The customer feedback details must be tokenized, and the feedback for each item must be recorded in the customer database. The profiling methodology is based on CLV, relationship, satisfaction, and behavior. Customers with similar purchasing behavior are first grouped by means of clustering techniques. Then, for each cluster, an association rule extractor identifies the products that are frequently bought together by the customers in that segment.
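The first step summarized above, tokenizing feedback and recording a rating per item, can be sketched in plain Python. This is not the RapidMiner process used in the study. The tokenization rule (lower-cased runs of letters, digits, and apostrophes) and the mapping of the five rating labels to the scores 1 to 5 are assumptions made for illustration, and the feedback lines are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List

# The five rating labels come from Section 8.2; treating them as an
# ordinal scale scored 1 (Bad) to 5 (Excellent) is an assumption.
RATING_SCALE = ("Bad", "Satisfactory", "Good", "Very Good", "Excellent")


def tokenize(line: str) -> List[str]:
    """Split one feedback line into lower-cased word tokens."""
    return re.findall(r"[a-z0-9']+", line.lower())


def word_occurrences(lines: Iterable[str]) -> Counter:
    """Count how often each token occurs across all feedback lines."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(tokenize(line))
    return counts


def rating_score(label: str) -> int:
    """Map a rating label to its position on the assumed 1-to-5 scale."""
    return RATING_SCALE.index(label) + 1


# Hypothetical feedback lines, not data from the study.
feedback = ["Good product, good price", "Bad packaging"]
occurrences = word_occurrences(feedback)
```

The resulting counts and rating scores are the kind of structured fields that could then be stored per product in the customer database.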