Checking and Analysing Customers Buying Behavior with Clustering Algorithm


Pal. Jour. V.16, I.3, No , Copyright 2017 by Palma Journal, All Rights Reserved
Available online at:

Checking and Analysing Customers Buying Behavior with Clustering Algorithm

Reza Molaei, Master of Electronic Commerce, Khajeh Nasir-Al-Deen Toosi Industrial University, Tehran
Somayeh Alizadeh, Assistant Professor of Department of Information Technology, Industrial Engineering College, Khajeh Nasir-Al-Deen Toosi Industrial University, Tehran

Abstract

Because many producers compete in industries such as the food industry, companies increasingly need interactive and effective relationships with their customers: they must understand customer needs and, by analyzing those needs, close the gap between the company and its customers. Data mining helps customer relationship management reach its ultimate goals, which are the profitability of the company and the loyalty and satisfaction of its customers. In this paper, customer value is determined by the RFM method. In the next step, the customers are categorized by value using the K-means clustering algorithm. Finally, the purchasing behavior of the customers of a food company is analyzed so that the managers of the company can make strategic decisions for each cluster based on sufficient knowledge about that cluster.

Keywords: Customer relationship management, RFM model, Customer behavior.

Introduction

As the value and rights of customers grow and their needs become the top priority in today's businesses, companies focus on increasing customer loyalty and satisfaction in order not to lose them. To face a competitive and complicated environment and turn customers into permanent ones, companies should implement innovative methods for collecting and analyzing customer needs [1]. Customer relationship management is a well-known strategy for achieving this goal by finding and retaining customers.
The most important goal in relationships with customers is finding permanent and profitable customers for a company [2]. Many companies in different industries hold a large amount of information about their customers, which is a great asset for them: they can discover hidden knowledge in the raw data [3]. Data mining is the process of obtaining knowledge and recognizing hidden patterns in large amounts of data using mathematical and statistical relations, artificial intelligence and machine learning techniques. The common concept in all definitions of data mining is discovering hidden knowledge in large databases. In general, this means that companies can gain opportunities by analyzing customer information, discovering patterns and useful knowledge in the data, and using this knowledge for decision making [4].

In this paper, we study the customers of a company that produces and distributes chocolate and cookies. Considering the high market share of this company and the presence of many producers in this field, identifying customers and categorizing their needs is of high importance. The paper is structured as follows: first, customer relationship management, the RFM model and the K-means algorithm are explained; the second section describes the model and the research process; the last section presents the conclusion.

Data mining in CRM

The concept of CRM was developed in marketing in the early 1980s. It is important from four perspectives: identifying, gaining, retaining and increasing customers. Finding an agreed definition for CRM is difficult. We can describe CRM as a comprehensive strategy for gaining, retaining and collaborating with customers in order to create value for companies and customers [5]. Customer relationship management is a comprehensive business and marketing strategy which integrates technology, processes and all the activities of the business with a focus on the customer.
CRM is part of the strategy of a company for identifying customers, keeping them satisfied and turning them into permanent customers. It also helps the company to maximize customer value [6]. In other words, customer relationship management is a method, a system and, most importantly, a business guideline which aims to categorize customers and manage them in order to optimize customer value in the long term. In fact, customer relationship management means finding customers, approaching them, satisfying them and retaining them [7]. In every stage of obtaining customers, increasing customer value and retaining profitable customers, discovering patterns in the database can increase profitability, whether it is combined with customer relationship management or executed as an independent application [8]. Customer relationship management is a process which consists of monitoring customers, collecting proper data, managing and evaluating the data and, finally, creating a real advantage in customer relationships from the obtained data [9].

Richards and Jones [10] have created a list of expected benefits of customer relationship management by examining 26 previously conducted studies. They selected six main advantages among all the others:

- Improving the ability to target profitable customers
- Integration of the communication channels with customers
- Efficiency and effectiveness of the sales force
- Personalizing marketing messages
- Customizing services and products
- Efficiency and effectiveness of customer services

An important marketing concept implemented in today's businesses is identifying the value of each customer. This concept leads the business towards personalization, value creation and better services. The value of each customer is defined as the profit gained in each transaction. We are usually interested in evaluating each customer in the long term and over more transactions.
In other words, by comparing the calculated value against the expected value, we can determine the quality of a customer for the company. Therefore, the value and frequency of transactions may indicate the potential share of each customer, which is what management looks for. A common model for measuring customer value is RFM, which consists of three factors: Recency, Frequency and Monetary. Recency is the length of time since the last transaction, frequency is the number of transactions and monetary is the value of the transactions in terms of money [11]. The RFM model is often used with clustering techniques and has a long history in direct marketing. Using the RFM model, decision makers can effectively identify valuable customers and then develop efficient marketing strategies [12].

Clustering algorithm

Data mining is the process of discovering useful information in large amounts of stored data. It searches for useful patterns in volumes of data that could not be explored without such a process [13]. Clustering is one of the tools of data mining. It partitions data objects into groups [14] in such a way that the data in a cluster are most similar to each other and least similar to the members of other clusters [15]. K-means is one of the simplest and best-known clustering algorithms. It was developed by MacQueen and works by computing the mean of each cluster and assigning each data point to the cluster whose mean is closest [16]. In this algorithm, K is the number of clusters, which is specified in advance by the user. In this research, we cluster our customers using the R, F and M attributes as the input parameters of the algorithm.

A model for evaluating the behavior of the customers of the company

In this section, a model for analyzing customer behavior using data mining techniques is presented.
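As a minimal sketch of how the R, F and M attributes that feed this model could be derived from a purchase log, the following assumes each transaction is a (customer, date, amount) tuple. The function name `rfm_table`, the reference date and the sample records are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount).
    today: reference date from which recency is measured (an assumption).
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in transactions:
        # Recency needs only the most recent purchase date per customer.
        last_purchase[customer] = max(last_purchase.get(customer, day), day)
        frequency[customer] += 1       # number of transactions
        monetary[customer] += amount   # total value of all transactions
    return {
        c: ((today - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }


# Hypothetical purchase log, for illustration only.
transactions = [
    ("A", date(2012, 3, 1), 120.0),
    ("A", date(2013, 5, 20), 80.0),
    ("B", date(2011, 11, 2), 40.0),
]
rfm = rfm_table(transactions, today=date(2013, 6, 30))
```

Here customer A has a short recency, two transactions and a total value of 200.0, while customer B has a single, much older purchase. Raw values like these would normally be converted to scores before clustering; the conversion rule is not specified in the text and is left out of this sketch.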
The main objective of this model is to help the managers of the chocolate and cookie producing company make better decisions based on complete knowledge of the customer clusters. To cluster customer behaviors, we have used a standard process called CRISP, which has been developed by data mining experts [16]. This standard describes the data mining cycle as business understanding, data understanding, data preprocessing, modeling, evaluation and deployment. The main steps of the methodology of this research follow the CRISP methodology. Since the goal is to cluster customers, we first clustered them based on the RFM model and the K-means algorithm and, in the next step, assigned a label to each cluster according to its characteristics. Figure 1 shows the main steps of the methodology of this study.

Figure 1: Main steps of the methodology of this study

Preparing data

Assigning value to customers based on RFM method

In this method, customer behavior can be analyzed and categorized based on three parameters:

- Recency: the time period since the last transaction of the customer.
- Frequency: the number of transactions of the customer with the company.
- Monetary value: the total monetary value of all the transactions of the customer.

For each customer, we used the purchase dates, the total monetary value of the purchases and the date of the most recent purchase. In the RFM method, the total value of each customer is estimated using a parameter called the RFM score. The higher the RFM score of a record, the higher the value of that record or, to be precise, of that customer for the company. The R, F and M parameters can have different weights in calculating the RFM score. In this paper, we assumed the same weight for all three parameters.

Clustering customer value

We used the K-means algorithm to cluster customers, with the R, F and M scores as input parameters. To find the best clusters for the customers, we used the Dunn index [17]. The goal is to have the highest similarity among the members of each cluster and the highest difference between the members of two different clusters [18]. The Dunn index was calculated for two to ten clusters.
After comparing the Dunn index values for these numbers of clusters, we concluded that four clusters is optimal: the value of the index was smallest for four clusters, so four was selected as the optimum number of customer clusters. Figure 2 displays the Dunn index for two to ten clusters.
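The procedure of running K-means for several candidate values of K and comparing a Dunn index can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the nine (R, F, M) score triples are made up, the candidate range is cut to two to four clusters because the toy set is so small, and the deterministic farthest-point seeding is a choice made here so the result is reproducible. Under the usual definition (smallest distance between clusters divided by largest cluster diameter), larger values mean more compact and better separated clusters, so the sketch keeps the K with the largest value; a study that selects the smallest value may be using a different convention.

```python
import math


def farthest_point_seeds(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all seeds."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    return seeds


def kmeans(points, k, max_iter=100):
    """Plain K-means: assign each point to the nearest mean, recompute means."""
    centers = farthest_point_seeds(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to lose all its members.
            updated.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
                if members else centers[j]
            )
        if updated == centers:  # means stopped moving: converged
            break
        centers = updated
    return labels


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    groups = list(groups.values())
    if len(groups) < 2:
        raise ValueError("need at least two non-empty clusters")
    max_diameter = max(math.dist(a, b) for g in groups for a in g for b in g)
    min_separation = min(
        math.dist(a, b)
        for i, g in enumerate(groups) for h in groups[i + 1:]
        for a in g for b in h
    )
    return min_separation / max_diameter if max_diameter > 0 else math.inf


# Hypothetical (R, F, M) scores on a 1-to-5 scale, for illustration only.
points = [
    (1, 1, 1), (1, 2, 1), (2, 1, 1),
    (5, 5, 5), (5, 4, 5), (4, 5, 5),
    (5, 1, 1), (5, 2, 1), (4, 1, 1),
]
scores = {k: dunn_index(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
```

On this toy data the three visible groups are recovered and `best_k` is 3. With real customer data the candidate range would be wider, and random restarts or a library implementation of K-means would be a more robust choice than the simple seeding used here.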

Figure 2: Evaluating the number of clusters with Dunn index

After calculating the optimal number of clusters, we determined the number of members in each cluster, as shown in figure 3.

Figure 3: Number of members in each cluster

Table 1 shows the mean and the intervals of the RFM parameters for each cluster.

Table 1. Mean and intervals of RFM parameters for each cluster
Columns: Recency (Min, Max, Ave.), Frequency (Min, Max, Ave.), Monetary (Min, Max, Ave.), RFM-Score (Min, Max). Rows: Cluster 1 to Cluster 4.

Labeling clusters

To understand customer behaviors more precisely once the number of clusters is known, we can assign a meaningful label to each cluster based on the common characteristics of its members [19]. In this paper, we have labeled the customer clusters as follows:

1) Loyal customers: These customers interact with the company frequently and are therefore considered loyal.
2) Satisfied customers: These customers rank second in terms of purchase frequency and have good interactions with the company.
3) Future customers of the company: The purchase frequency of this cluster is lower than that of the second cluster. However, the R parameter of its members shows that, despite their short acquaintance with the company, their purchases have high value, so the company has high hopes for the profitability of these customers. Competition for this cluster is more intense than for the second one because of the short lifetime of these customers with the company and the high monetary value of their purchases.
4) Runaway customers: Customers of this group have low frequency, purchase value and recency scores. They bring little profit to the company.

After grouping the customers in four clusters, their profiles are as follows:

Cluster 1 customers: These customers have the highest purchase value and frequency. In other words, they are the most profitable customers of the company and are labeled as loyal customers.
Cluster 2 customers: They rank second in terms of the frequency and value of their purchases. One of the best things that can happen to the company is for members of this cluster to become cluster one customers. We label them as satisfied customers.
Cluster 3 customers: Competition for this cluster is more intense than for the second one, since these customers have had fewer interactions with the company. We label them as future customers.
Cluster 4 customers: In the past three years, the frequency and monetary value of the purchases in this cluster have been low, and these customers may turn to competitors, so we label them as runaway customers.

Analyzing the behavior of each cluster

The benefit of clustering similar customers into different groups is that we can have a special strategy for each group of customers [4]. The following is the cluster analysis for the past three years according to figure 4. Note that data for 2013 is available only for the first half of the year, so the analysis for that year covers only this period. By examining the charts in figure 4, we conclude that the customers in the first cluster have the highest purchase frequency and monetary value. They have also had recent transactions with the company; in other words, it has not been a long time since their last purchase.
The strategies that company managers can adopt for this group of customers are:

- Devising methods for retaining the loyalty of this cluster.
- Using the knowledge and experience of these customers to create loyalty in other clusters.

Examining the second cluster in the charts of figure 4 makes it clear that, even though the purchases of these customers do not have a high monetary value, their frequency is high and they are considered good customers of the company. Proposed strategies for this group of customers are:

- Using the knowledge and experience of cluster one customers for cluster two customers.
- Devising incentives for turning these customers into loyal ones.
- Examining solutions for decreasing purchase frequency and increasing purchase value in order to optimize logistics and distribution costs.

According to figure 4, the frequency of purchases in the third cluster is relatively low but the monetary value is high, so the distribution cost for these customers is lower than for the second cluster: in each transaction, they buy a larger quantity of goods. Another important point is that these customers have only recently started their interactions with the company, so their acquaintance is short. Strategies that can be adopted by the managers are:

- Since the lifetime of the interactions in this group is short, competition is intense; the interactions of these customers with competitors should therefore be analyzed and followed up.
- Advertising products and devising incentives for this group.
- Preparing questionnaires, which can help to better understand these customers and identify their needs.

By examining the charts in figure 4, we see that the customers in the fourth cluster have the lowest purchase frequency and value. They have had few interactions with the company and are, so to speak, running away.
The company should pay attention to the considerable number of members in this cluster, who make up a large portion of its customers (71 out of 365). Proposed strategies for this cluster are:

- Devising solutions for better understanding the needs of these customers and building closer relationships with them.
- Devising promotions and discounts for these customers.
- Finding out whether they have turned to competitors to buy their products and, if so, for what reasons.
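The labeling rules described in the text can be expressed as a small rule-based sketch. It assumes each cluster is summarized by its mean R, F and M scores, that the two highest-frequency clusters are the loyal and satisfied ones, and that the remaining two are separated by monetary value. The function name `label_clusters` and the profile numbers are hypothetical; they are not the values of Table 1.

```python
def label_clusters(profiles):
    """Map four cluster ids to labels from their mean RFM scores.

    profiles: {cluster_id: {"recency": r, "frequency": f, "monetary": m}}
    The ranking rules below are one reading of the text, not the authors' code.
    """
    if len(profiles) != 4:
        raise ValueError("this sketch expects exactly four clusters")
    # Highest purchase frequency first.
    by_frequency = sorted(profiles, key=lambda c: profiles[c]["frequency"], reverse=True)
    loyal, satisfied, *rest = by_frequency
    # Of the two low-frequency clusters, the one with high-value purchases is
    # treated as "future" customers, the other as "runaway" customers.
    future, runaway = sorted(rest, key=lambda c: profiles[c]["monetary"], reverse=True)
    return {
        loyal: "Loyal customers",
        satisfied: "Satisfied customers",
        future: "Future customers",
        runaway: "Runaway customers",
    }


# Hypothetical mean scores per cluster, for illustration only.
profiles = {
    0: {"recency": 4.6, "frequency": 4.8, "monetary": 4.9},
    1: {"recency": 3.9, "frequency": 3.7, "monetary": 2.1},
    2: {"recency": 4.2, "frequency": 2.2, "monetary": 4.0},
    3: {"recency": 1.5, "frequency": 1.3, "monetary": 1.2},
}
labels = label_clusters(profiles)
```

A mapping like this makes it straightforward to attach the per-cluster strategies listed above to each customer record, although in practice the labels would be checked against the full cluster profiles rather than assigned by rank alone.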

Figure 4: Clusters comparison conceptual charts (* In order to draw the position of clusters 2 and 3 relative to each other, we have used A as the mean point for the parameters of the horizontal and vertical axes.)

Conclusion

In this study, a model was presented for evaluating and analyzing the purchasing behavior of the customers of a chocolate and cookies producing company. The RFM method and the K-means algorithm were used to cluster the customers. In the cluster behavior analysis section, a label was assigned to each cluster in order to better understand its performance. In the last section, the purchase frequency, recency and monetary value of each cluster of customers were examined. The results show the interactions of the company with its customers and the level of stickiness of those interactions during the three years studied. These results help the managers of the company analyze the reasons for high or low stickiness and develop specific advertising, promotional and other strategies for each cluster.

References

A. Berson, S. Smith and K. Thearling, "Building Data Mining Applications for CRM", Manhattan: McGraw-Hill.
A. Parvatiyar and J. N. Sheth, "Customer relationship management: Emerging", Journal of Economic and Social Research, pp. 1-34.
B. Stone, "Successful Direct Marketing Methods", Lincolnwood.
C. H. Cheng and Y. S. Chen, "the segmentation of customer value via RFM model and RS theory", Expert Systems with Applications, vol. 36.
E. Turban, E. McLean and Wetherbe, "Information Technology for Management: Making Connections for Strategic Advantage", 2nd ed., New York: Wiley.
F. Bijnen, A. H. van and P. B. Baillif, "In-line structure measurement of food products", Powder Technology, vol. 124.
H. Chuang, "A study on the application of data mining techniques to enhance customer lifetime value based on the department store industry", The Seventh International Conference on Machine Learning and Cybernetics.
H. Edelstein, "Building Profitable Customer Relationships With Data Mining", Two Crows Corporation, CRM today, White Paper, pp. 1-12.
H.-S. Kim, Y.-G. Kim and C.-W. Park, "Integration of firm's resource and capability to implement enterprise CRM: A case study of a retail bank in Korea", Decision Support Systems.
J. Davis and E. Joyner, "Successful Customer Relationship Management", SAS Institute Inc., U.S.A.
J. Dunn, "Well separated clusters and fuzzy partitions", Cybernetics and Systems: An International Journal, vol. 4.
J. Han and M. Kamber, "Data Mining: Concepts and Techniques", San Francisco, CA: Morgan Kaufmann.
J. MacQueen, "Some methods for classification and analysis of multivariate observations", in Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley.
K. A. Richards and E. Jones, "Customer relationship management: Finding value drivers", Industrial Marketing Management, vol. 37.
M. Berry and G. Linoff, "Data Mining Techniques for Marketing, Sales, and Customer Relationship Management", 2nd ed., Wiley Computer Publishing.
M. Tan, "Data Mining: Concepts and Techniques", San Francisco, CA: Morgan Kaufmann.
N. Jafari Momtaz, S. Alizadeh and Mahya, "A new model for assessment fast food customer behavior case study", British Food Journal, vol. 115.
O. Gök, "Linking account portfolio management to customer information: using customer satisfaction metrics for portfolio analysis", Industrial Marketing Management, vol. 38.
V. Ravi, "Advances in Banking Technology and Management: Impacts of ICT and CRM", Information Science Reference, Hershey, New York: Yurchak Printing Inc, 2008.