Profiling Methodology with reference to Customer Lifetime Value, Relationship, Satisfaction and Behavior Using Data Mining Techniques.


Synopsis submitted in partial fulfillment of the Degree of Doctor of Philosophy (Ph.D.) in Computer Science

By P. Isakki alias Devi
Under the guidance of Dr. S. P. Rajagopalan
School of Computer Science
March 2012

SYNOPSIS OF THE PH.D. THESIS ENTITLED Profiling Methodology with reference to Customer Lifetime Value, Relationship, Satisfaction and Behavior Using Data Mining Techniques

SUBMITTED BY P. ISAKKI ALIAS DEVI

Data mining is the non-trivial extraction of novel, implicit, and actionable knowledge from large datasets. It automates the detection of relevant patterns in a database and helps to increase customer revenue and customer profitability. Data mining provides a structure for recording complete customer information, detecting important customers systematically, and identifying individual, valuable customers.

Customer lifetime value, computed with the recency, frequency and monetary method, is used to find target customers. Clustering analysis is useful in locating high-value customers. Based on the result of the clustering analysis, the number of target customers with high loyalty, high interest, and a high amount of purchase can be identified.

Data mining technologies are used to analyze customer behavior in order to form the customer's profile. The expert system's role is to capture the knowledge of the experts and the data from the customer requirements, and then to process the collected data and form the appropriate rules. The customer's behavior is determined from the correlation coefficient. The analysis of customer behavior is used to maintain a good relationship with customers in order to maximize customer satisfaction.

Prediction is based on the customers' previous transactions, and data is estimated with the help of clustering and association rules. Knowledge extracted from transaction records can be used to improve service levels and increase sales. Customer details are segmented, and association rules are used to identify customer behavior.

This research aims to develop a profiling methodology with reference to customer lifetime value, relationship, satisfaction and behavior using data mining techniques. The proposed methodology helps to maximize the customer base and identifies a potential loss of a customer at the earliest possible point.

The present thesis is organised into the following chapters:

1. Introduction
2. Literature Survey
3. Mining Unstructured Data using Artificial Neural Network and Fuzzy Inference Systems Model for Customer Relationship Management
4. A Framework for Customer Lifetime Value using Data Mining Techniques
5. The Expert System designed to improve Customer Satisfaction
6. Analysis of Customer Behavior using Clustering and Association Rules
7. Integration of Clustering and Association Rule Mining to mine customer data and predict sales
8. Profiling methodology
9. Conclusions

Chapter 1 Introduction

This chapter is introductory in nature and discusses the basis of the profiling methodology using data mining techniques. The major activities in the area of data mining for profiling are briefly described. The chapter acts as a short guide to the related literature survey of the thesis. The basic notions and definitions needed for the work presented in the thesis are briefly reviewed, which also introduces the notation followed in the rest of the thesis.

Chapter 2 Literature Survey

An extensive literature survey was carried out as part of this research to analyse profiling methodologies that use data mining techniques. The profiling methodology is based on customer lifetime value, relationship, satisfaction and behavior.

Customer lifetime value, computed with the recency, frequency and monetary (RFM) method, is used to find target customers. Customer data usually consists of many variables, and a large body of research suggests that RFM variables are a good source for predicting customer behavior. Recency variables store the time elapsed since the last purchase or use of the service; a lower value suggests a higher probability that the customer will make a repeat purchase. Frequency variables capture how often the service is used; in general, it can be assumed that the higher the frequency, the more satisfied the customer is with the service. A monetary variable is the total sum of money a customer spends on services over a certain time period. Customers with high monetary values are the ones an organization should be most interested in retaining.

The expert system's role is to capture the knowledge of the experts and the data from the customer requirements, and then to process the collected data and form the appropriate rules.

Prediction is based on the customers' previous transactions, and data is estimated with the help of clustering and association rules. Customers with similar purchasing behavior are first grouped by means of clustering techniques. Then, for each cluster, an association rule extractor is used to identify the products that are frequently bought together by the customers in that segment. A Clustering based Association Rule Mining System (CARMS) is proposed to predict sales.
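To make the three RFM variables concrete, the following minimal Python sketch derives them from a transaction log. The record layout (customer id, purchase date, amount), the sample data and the function name `rfm_table` are assumptions made for this illustration; the synopsis does not specify how the variables were computed.

```python
from datetime import date


def rfm_table(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount).
    Recency is taken here as days since the most recent purchase (an assumption).
    """
    table = {}
    for customer, day, amount in transactions:
        last, freq, money = table.get(customer, (None, 0, 0.0))
        if last is None or day > last:
            last = day  # keep the most recent purchase date
        table[customer] = (last, freq + 1, money + amount)
    return {c: ((today - last).days, f, m) for c, (last, f, m) in table.items()}


# Hypothetical sample data
transactions = [
    ("A", date(2012, 1, 5), 120.0),
    ("A", date(2012, 2, 20), 80.0),
    ("B", date(2011, 11, 2), 40.0),
]
rfm = rfm_table(transactions, today=date(2012, 3, 1))
print(rfm)
```

In this sample, customer A has the lower recency, the higher frequency and the higher monetary value, so A would rank ahead of B on all three variables.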

Chapter 3 Mining Unstructured Data using Artificial Neural Network and Fuzzy Inference Systems Model for Customer Relationship Management

Data mining technologies can be applied to analyze customer behavior in order to form the right customer profile. This research uses content analysis to transform unstructured textual content into structured data.

Unstructured data has no identifiable structure. It includes bitmap images and objects, text, and other data types that are not part of a database; the term is a generic label for any corporate information that is not held in a database. Textual unstructured data is generated in media such as messages, PowerPoint presentations, Word documents, collaboration software and instant messages. Human intervention is required to make unstructured data machine readable.

Data can be collected from the following sources:
(i) The Electronic News System
(ii) The customer service hotline
(iii) Word documents

The company can extract data from customer service center systems and convert it into Excel format for study personnel. Content analysis is then used to analyze the content of customer inquiries. After the analysis is completed, the data is combined with other data in the database, and the necessary programs are written to transform and clean it. Finally, the data is integrated and its accuracy is verified by checking items such as customer listings and data formats.

In this way, content analysis transforms unstructured customer service information into structured customer service data, which can then be analyzed to discover more latent knowledge. The Artificial Neural Network and Fuzzy Inference Systems (ANFIS) model is used to discover the customer knowledge.
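One simple way to picture the content-analysis step is keyword-based coding, in which each free-text inquiry is mapped to counts per category. The sketch below is only an illustration under that assumption: the categories, keywords and the function name `code_inquiry` are hypothetical, and the synopsis does not state which coding scheme was used.

```python
import re

# Hypothetical coding scheme: category -> keywords that signal it
CODING_SCHEME = {
    "billing": {"invoice", "charge", "refund"},
    "delivery": {"late", "shipping", "delivery"},
    "product": {"broken", "defect", "quality"},
}


def code_inquiry(text):
    """Turn one free-text inquiry into a structured record of category counts."""
    words = re.findall(r"[a-z]+", text.lower())
    return {
        category: sum(1 for w in words if w in keywords)
        for category, keywords in CODING_SCHEME.items()
    }


record = code_inquiry("My delivery was late and the invoice shows a wrong charge.")
print(record)
```

Records of this shape can be stored as rows in a table and joined with other customer data, which is the kind of structured input a downstream model needs.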

Chapter 4 A Framework for Customer Lifetime Value using Data Mining Techniques

To build an accurate customer profile, the analysis of customer data needs to be completed first. Recency, Frequency, Monetary (RFM) analysis is used to determine customer behavior and to analyze market segments:
(1) R (Recency): the period since the last purchase; a lower value corresponds to a higher probability of the customer making a repeat purchase;
(2) F (Frequency): the number of purchases made within a certain period; a higher frequency indicates greater loyalty;
(3) M (Monetary): the money spent during a certain period; a higher value indicates that the company should focus more on that customer.

RFM analysis is well known and widely applied as a method for identifying high-response customers in marketing promotions and for improving overall response rates. Using RFM analysis, the company can identify opportunities to create marketing that is relevant to its customers.

Given the relative weights of R, F and M and the RFM values of an individual customer, a customer lifetime value (CLV) can be calculated in the following steps:
Step 1: Standardize each customer's RFM values
Step 2: Calculate each customer's CLV
Step 3: Calculate the CLV of each cluster

Based on the result of the clustering analysis, the number of target customers with high loyalty, high interest, and a high amount of purchase can be identified. Clustering customers into different groups not only improves the quality of recommendation but also helps decision-makers identify market segments more clearly and thus develop more effective strategies.
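The three steps can be sketched in Python as follows. This is a minimal illustration, not the thesis's implementation: it assumes min-max scaling for the standardization step (with recency reversed, since a lower recency is better), a weighted sum for the customer CLV, and the mean of member CLVs for the cluster CLV. The weights, sample data, cluster memberships and function names are all hypothetical.

```python
def min_max(values, reverse=False):
    """Scale values to [0, 1]; reverse=True maps the smallest value to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return [1.0 - s for s in scaled] if reverse else scaled


def customer_clv(rfm, weights):
    """Steps 1 and 2: standardize RFM, then take a weighted sum per customer.

    rfm: {customer: (recency, frequency, monetary)}; weights: (w_r, w_f, w_m).
    """
    ids = list(rfm)
    r = min_max([rfm[c][0] for c in ids], reverse=True)  # lower recency is better
    f = min_max([rfm[c][1] for c in ids])
    m = min_max([rfm[c][2] for c in ids])
    w_r, w_f, w_m = weights
    return {c: w_r * r[i] + w_f * f[i] + w_m * m[i] for i, c in enumerate(ids)}


def cluster_clv(clv, clusters):
    """Step 3: average the CLV of the customers in each cluster."""
    return {
        name: sum(clv[c] for c in members) / len(members)
        for name, members in clusters.items()
    }


# Hypothetical data: RFM values, relative weights and cluster memberships
rfm = {"A": (10, 8, 500.0), "B": (60, 2, 100.0), "C": (110, 5, 300.0)}
weights = (0.2, 0.3, 0.5)
clusters = {"high": ["A"], "low": ["B", "C"]}

clv = customer_clv(rfm, weights)
by_cluster = cluster_clv(clv, clusters)
print(clv, by_cluster)
```

Ranking clusters by their average CLV is one way to single out the segment with high loyalty and a high amount of purchase.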

Chapter 5 The Expert System designed to improve Customer Satisfaction

Customer satisfaction can be improved with the help of an expert system developed using Artificial Neural Networks (ANN). The expert system's role is to capture the knowledge of the experts and the data from the customer requirements, and then to process the collected data and form the appropriate rules for choosing products. To identify hidden patterns in customers' needs, the Artificial Neural Network technique is applied to classify the products based on a list of selected information. In addition, the expert system has been validated with different customer types.

The expert system should have three main components:
1. User interface
2. Inference engine for making decisions
3. Database for storing the data and rules and for training the system

Customers interact with the interface of the expert system to ask for and obtain advice. The inference engine consists of:
- the production rules derived from expert knowledge;
- the customer data classified by the ANN system.

Another main result is the knowledge obtained from data analysis, which shows in detail how customer behaviour regarding preferred products can be combined with knowledge captured from the experts to predict the product. This greatly benefits the manufacturer, which can offer the right new product to the right customer group and thereby achieve customer satisfaction.

The customer's behavior is identified from the correlation coefficient. The correlation coefficient (r) is a measure of the degree of correlation between two quantities or variables. The linear correlation coefficient measures the strength and the direction of a linear relationship between two variables.
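As a small illustration of the linear correlation coefficient, the Python sketch below computes Pearson's r for two lists of observations. The variable names and sample values are hypothetical, and the synopsis does not say which pairs of variables were correlated.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical observations: store visits and spend per customer
visits = [1, 2, 3, 4, 5]
spend = [2.0, 4.0, 6.0, 8.0, 10.0]
r_positive = pearson_r(visits, spend)        # perfectly increasing relationship
r_negative = pearson_r(visits, spend[::-1])  # perfectly decreasing relationship
print(r_positive, r_negative)
```

The sign of r gives the direction of the relationship and its magnitude, between 0 and 1, gives the strength.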

Chapter 6 Analysis of Customer Behavior using Clustering and Association Rules

The analysis of customer behavior is used to maintain a good relationship with customers in order to maximize customer satisfaction; it can also improve customer loyalty and retention. Based on past purchase data, trends can be identified for launching products configured for customers of different genders.

Prediction is based on the customers' previous transactions, and data is estimated with the help of clustering and association rules. Customers with similar purchasing behavior are first grouped by means of clustering techniques. Then, for each cluster, an association rule extractor is used to identify the products that are frequently bought together by the customers in that segment. Clustering analysis is a data mining technique that maps data objects into previously unknown groups of objects with high similarity.

Definition 1: Given a set of items $I = \{I_1, I_2, \ldots, I_s\}$ and a database of transaction records $D = \{t_1, t_2, \ldots, t_n\}$, where $t_i = \{I_{i1}, I_{i2}, \ldots, I_{ik}\}$ and $I_{ij} \in I$, an association rule is an implication of the form $X \Rightarrow Y$, where $X, Y \subset I$ and $X \cap Y = \emptyset$.

Definition 2: The support ($s$) for an association rule $X \Rightarrow Y$ is the percentage of transactions in the database that contain $X \cup Y$. That is, $\text{support}(X \Rightarrow Y) = P(X \cup Y)$, where $P$ is the probability.

Definition 3: The confidence or strength ($\alpha$) for an association rule $X \Rightarrow Y$ is the ratio of the number of transactions that contain $X \cup Y$ to the number of transactions that contain $X$. That is, $\text{confidence}(X \Rightarrow Y) = P(Y \mid X)$.

Product association rules can be used to motivate customers to increase their purchases and stay loyal to the company. The behavior of customers and their relationships can be readily identified, and the most frequent itemsets can be found from the database.
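Definitions 2 and 3 translate directly into code. The following Python sketch computes the support and confidence of a rule over a small transaction database; the item names and sample transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Share of the transactions containing X that also contain Y."""
    support_x = support(transactions, x)
    if support_x == 0:
        raise ValueError("confidence is undefined when X never occurs")
    return support(transactions, set(x) | set(y)) / support_x


# Hypothetical transaction database
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
s = support(transactions, {"bread", "milk"})       # 2 of 4 transactions
c = confidence(transactions, {"bread"}, {"milk"})  # 2 of the 3 containing bread
print(s, c)
```

A rule is usually kept only if both values exceed chosen minimum thresholds, which filters out item combinations that are rare or only weakly associated.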

Chapter 7 Integration of Clustering and Association Rule Mining to mine customer data and predict sales

This chapter presents a predictive mining approach that predicts sales for a new location based on existing data. A new methodology is also proposed to identify customer behaviour. The methodology integrates two data mining approaches, clustering and association rule mining. A Clustering based Association Rule Mining System (CARMS) is proposed to predict sales at a different location. The system consists of consecutive stages that communicate with one another to generate rules: data pre-processing and data partitioning, data transformation, association rule mining, and presentation modules.

Before rule mining, raw data must be pre-processed to be useful for knowledge discovery. All target data should be organized into a usable transaction database. This requires a clear understanding of the variables and the selection of the attributes that are most pertinent to generating rules.

Classification of customer data is important for business support and decision making, and newly emerging trends in business processes need to be identified. Sales patterns from customer data can be used in forecasting, which has great potential for decision making, strategic planning and market competition. The customer and product domains are bridged by clustering: clustering and association rule mining are combined to analyse the similarity between customer groups and their preferences for products. The complete set of rules generated can be stored in a separate knowledge base.
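The idea of mining rules separately inside each customer cluster can be sketched as below. This is not the CARMS implementation, whose algorithms the synopsis does not describe: the cluster labels are assumed to come from an earlier clustering step, only two-item rules are mined, and the thresholds, data and the function name `rules_per_cluster` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def rules_per_cluster(baskets, cluster_of, min_support=0.5, min_confidence=0.6):
    """Mine two-item rules X => Y separately for each customer cluster.

    baskets: list of (customer_id, set_of_items)
    cluster_of: {customer_id: cluster_label}, assumed to be produced earlier
    Returns {label: sorted list of (x, y, support, confidence)}.
    """
    grouped = defaultdict(list)
    for customer, items in baskets:
        grouped[cluster_of[customer]].append(set(items))

    rules = {}
    for label, transactions in grouped.items():
        n = len(transactions)
        item_count = defaultdict(int)
        pair_count = defaultdict(int)
        for t in transactions:
            for item in t:
                item_count[item] += 1
            for pair in combinations(sorted(t), 2):
                pair_count[pair] += 1

        found = []
        for (a, b), count in pair_count.items():
            if count / n < min_support:
                continue  # the pair is too rare within this cluster
            for x, y in ((a, b), (b, a)):
                conf = count / item_count[x]
                if conf >= min_confidence:
                    found.append((x, y, count / n, conf))
        rules[label] = sorted(found)
    return rules


# Hypothetical baskets and cluster labels
baskets = [
    ("c1", {"tea", "sugar"}),
    ("c2", {"tea", "sugar", "milk"}),
    ("c3", {"pen", "paper"}),
    ("c4", {"pen", "paper"}),
    ("c4", {"pen"}),
]
cluster_of = {"c1": "grocery", "c2": "grocery", "c3": "office", "c4": "office"}
rules = rules_per_cluster(baskets, cluster_of)
for label, found in rules.items():
    print(label, found)
```

Mining per cluster keeps rules that are strong within one segment from being diluted by the transactions of unrelated segments, and the resulting rule sets could be stored per cluster in a knowledge base.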

Chapter 8 Profiling Methodology

This research aims to develop a customer profiling methodology with reference to customer lifetime value, relationship, satisfaction and behaviour using data mining techniques. The proposed methodology helps to maximize the customer base and identifies a potential loss of a customer at the earliest possible point. The customer profile can be designed with reference to the following:

- Content analysis is used to analyze text data and to transform unstructured customer service information into structured customer service data. The ANFIS model is used to discover customer knowledge.
- Customer lifetime value, computed with the recency, frequency and monetary method, is used to find target customers. Clustering analysis can locate high-value customers: based on its result, the number of target customers with high loyalty, high interest, and a high amount of purchase can be identified.
- Customer satisfaction can be improved with the help of the expert system developed using Artificial Neural Networks. The expert system's role is to capture the knowledge of the experts and the data from the customer requirements, and then to process the collected data and form the appropriate rules for choosing products.
- Prediction is based on the customers' previous transactions, and data is estimated with the help of clustering and association rules. Customers with similar purchasing behavior are first grouped by means of clustering techniques. Then, for each cluster, an association rule extractor is used to identify the products that are frequently bought together by the customers in that segment.
- CARMS is proposed to predict sales. The system consists of consecutive stages that communicate with one another to generate rules: data pre-processing and data partitioning, data transformation, association rule mining, and presentation modules.

Chapter 9 Conclusions

The scholar has developed a profiling methodology with reference to customer lifetime value, relationship, satisfaction and behavior using data mining techniques. The proposed methodology helps to maximize the customer base and identifies a potential loss of a customer at the earliest possible point. The RFM analytic approach is used to evaluate customer loyalty and contribution in the field of marketing management and to assess customer lifetime value. Content analysis is used to analyze text data and to transform unstructured customer service information into structured customer service data. The ANFIS model is used to discover customer knowledge. The expert system is designed to improve customer satisfaction. Customer data is classified with the help of clustering, and association rules are used to identify customer behavior.

Future work can extend the profiling methodology to cross-selling and up-selling opportunities. Customers who are stable at the time of profile generation are assumed to be the most sensitive to competing offers. The work can also be extended to take changes in the business environment into account.

List of papers published based on this thesis

I. Publications in International Journals
1. P. Isakki alias Devi and Dr. S. P. Rajagopalan, A framework for Customer Lifetime Value using Data Mining Techniques, International Journal of Computer Science and Information Engineering, Volume 3, Number 1, June 2011, pp ISSN: X.
2. P. Isakki alias Devi and Dr. S. P. Rajagopalan, Mining Unstructured Data using Artificial Neural Network and Fuzzy Inference Systems Model for Customer Relationship Management, International Journal of Computer Science Issues (IJCSI), Volume 8, Issue 4, July 2011, pp ISSN (online version):
3. P. Isakki alias Devi and Dr. S. P. Rajagopalan, The Expert System designed to improve Customer Satisfaction, Advanced Computing: An International Journal (ACIJ), Volume 2, Number 6, November 2011, pp 69-84, ISSN : [Online] ; X [Print].

II. Publications in International Conferences
1. Integration of Clustering and Association Rule Mining for Predicting Sales, S.A. Engineering College, 24th and 25th February 2012, Proceedings of the International Conference on Information, Communication & Embedded Systems (ICICES 2012).

III. Publications in National Conferences
1. Application of Data Mining for an Effective Customer Relationship Management, Tagore Engineering College, 11th and 12th April 2011, Proceedings of the National Conference on Recent Innovations in Computing Technology (NCRICT 11).
2. Mining Customer Data Using Clustering and Association Rule Mining, Bhaktavatsalam Memorial College for Women, 2nd and 3rd March 2012, Proceedings of the National Conference on Challenges in Business Practices.
