Product Recommender System for Small-Scale Niche Businesses using Association Rule Mining


Julia Di Russo
ANR SNR u
Master Thesis
Master Data Science: Business and Governance
Academic Year
Tilburg University
Date: May 15th, 2017
Supervisor: Sander Bakkes
Second reader: Drew Hendrickson
Supervisor at OnMarc: Kevin Van Kalkeren & Melissa Paans
Provider of the dataset: OnMarc
Faculty: Tilburg School of Humanities

Table of Contents
Preface
Abstract
Introduction
Related Work
  General methods for recommender systems
  Common techniques for the evaluation of recommender systems
  Recommender systems for small-scale businesses
  Frequent item set mining & association rule mining
  Summary of related work
Method
  Dataset description
  Pre-processing of the data
  Exploratory data analysis
  Experimental procedure
  Evaluation of the model
Results
  Performance of the algorithm on the validation set
  Performance obtained on the test set
  Statistical analysis of the accuracy scores on test set
Conclusion & Discussion
  Answers to the Research Questions
  Answer to the problem statement
  General Discussion
  Limitations and Recommendations for future research
References
Appendix

Preface
I would like to thank my academic supervisor Sander Bakkes for his help with the writing of this thesis. I would also like to sincerely thank Kevin and Melissa for their advice and their valuable presence during this research, as well as Peter for his valuable advice regarding the analysis of my results. Moreover, I would like to thank the management of OnMarc for making it possible for me to carry out my thesis project in the company as well as for providing me with the datasets. Additionally, I would like to thank Michiel for the daily support he gave me throughout my Master's.

Abstract
Over the last decades, online retailing has experienced substantial growth worldwide. Consequently, product recommender systems have become important both for the leading companies in the online retailing market and for smaller-scale online retailers. While many successful recommendation methods have already been published, small-scale retailers are limited in their computational capabilities and in the amount of data available, creating a new area of challenges almost untouched in the field of recommender systems. Therefore, this thesis aims at designing an accurate product recommender system for small-scale niche businesses using two popular market basket analysis methods: frequent item set mining and association rule mining. The performance of the chosen method on unseen data was measured only with an offline experiment, and a popularity baseline was chosen as a relevant point of comparison for the accuracy of our method. As a second part of this research, the impact of the temporal and geographical dimensions on the accuracy of the association rules was investigated by measuring whether the rules are more accurate when they are generated on a temporally or geographically split dataset. In conclusion, our recommender system performs better than the chosen popularity baseline but does not cover a large part of the products available on the website of the retailer. However, the association rules generated on the split datasets had an accuracy score lower than the baseline score. These results may be due to the low number of transactions available for most of these analyses. Finally, it was determined that the predictions are most accurate when the association rules are based on 2 or 3 products that were previously basket-added.
To gain more insight into the accuracy and efficiency of this recommender system, we recommend evaluating this recommendation method with an online experiment.

1. Introduction
Over the last decades, online retailing has experienced substantial growth worldwide. Indeed, according to a study by the Centre for Retail Research, this sector grew by 18.6% in Europe in 2015 and by 16.7% in 2016, making it the fastest growing retail market in that region. The increasing competition in the online retailing market has forced these businesses to improve their techniques and their online environment in order to target their customers accurately and to maintain a high level of satisfaction. As the number of products available online is extremely large, online retailers now aim at instantly providing their customers with the products they are seeking, reducing the effort required from their customers as much as possible. Consequently, product recommender systems have become important for the leading companies in the online retailing market. According to Huseynov, Huseynov & Özkan (2016), recommender systems can commonly be described as intelligent software providing easily accessible, high-quality recommendations for online consumers. Such systems are now considered serious business tools (Schafer et al, 2001), helping customers to find the item they are seeking or suggesting additional ones to them. Accurate recommender systems offer numerous advantages. In fact, previous research has established that accurate product recommender systems help online customers make better decisions during their purchases, reducing the time and effort put into their search. It was also found that the use of such systems can increase the number of purchases (Huseynov, Huseynov & Özkan, 2016).
Often cited as examples, Amazon.com, the largest online retailer in the world, and Netflix.com, the largest online distributor of streaming media, are two well-known and successful websites that established efficient product recommenders using collaborative filtering methods (among other techniques) (Marlin, Adams, Sadasivam & Houston, 2013). Collaborative filtering is one of the most popular methods used in recommender systems and is based on the opinions or ratings given by previous users (Schafer et al, 2007). As reported in an article about their recommendation system, Netflix's researchers estimate that the combined effect of personalization and recommendation has probably saved the company around 1 billion dollars per year (Gomez-Uribe & Hunt, 2016). Frequently used and cited in recommendation system studies, collaborative filtering filters items using the implicit or explicit ratings of users (see Related Work section). However, the quality of recommendations obtained with collaborative filtering can be significantly lower when data is sparse, which is often the case for small-scale online retailers (Cai et al, 2014).

In fact, traditional recommendation techniques are often hard to apply for small-scale retailers because of the small amount of data available as well as the lack of data about returning users (Kaminskas, Bridge, Foping & Roche, 2015). As stated above, many techniques, such as collaborative filtering methods, require a large amount of data and a large number of purchases to train the recommender system. Moreover, small-scale companies do not commonly have the resources and the computing capability that some of the large-scale systems cited above require (Chen, Miller & Dagher, 2014). Although many companies do not benefit from the same number of visitors as Netflix or, in the case of retailers, the same number of buying users as Amazon, very little attention has been paid to the design of recommender systems for small-scale businesses (see Related Work section for more details). Nevertheless, for small online retailers, an efficient product recommender is fundamental as it can have a significant influence on their sales revenue and can sometimes determine the success or failure of their business. Moreover, small-scale online retailers often display a large range of products that customers have to browse through and can benefit greatly from an accurate and efficient recommender system. This study aims at designing an accurate and simple recommendation algorithm within the limits of the data that can be gathered by small online businesses, taking into account the potentially limited computing capability of such businesses. This research will also allow us to gain more detailed and useful insights into product recommendations for small-scale retailers. Firstly, this study focuses on the design of an algorithm adapted to the data and to the structure of the website of the selected small-scale online retailer.
However, this thesis might also be of interest to other small-scale retailers who wish to design and implement, or simply adapt, such a recommendation system without having the resources to research it. As recommender system algorithms for small-scale retailers are scarce, a new functional algorithm could provide a great opportunity for the expansion of the small-scale retailing business. Secondly, this study intends to provide other researchers in this field with new knowledge and insights regarding product recommendations for small-scale online retailers. While the scientific focus is currently on the use and analysis of big or significantly large datasets, this research will also provide a detailed approach to the design of recommendation algorithms with sparse data and will provide knowledge about the current possibilities in this relatively untouched field of research.

Problem Statement: To what extent can we accurately predict the next product bought by an online customer using a simple recommender system based on sparse purchase data?

To answer the problem statement, a product recommender system will be designed using purchase data from a real small-scale online retailer. The recommender system will be based on the literature and on previous work carried out in the field of recommendation systems for small-scale retailers. To cover all aspects of this question, the problem has been split into three research questions. Firstly, to establish a reliable product recommender system and to accurately predict the next product to be purchased, the recommendation method chosen is based on the purchase history of previous customers. Therefore, the first research question of this thesis is expressed as follows:

Research Question 1: To what extent does the purchase history of previous customers accurately predict the next product that a new customer will purchase?

To answer this question, we will attempt to design a recommender system using frequent item set mining and association rule mining based on the purchase history of previous customers. As described in more detail in the Related Work section, frequent item set mining and association rule mining are two popular basket analysis methods often used to find patterns in transactional data. As stated in the first research question, the recommendation method chosen for this research is based on the products basket-added and then purchased by the customer. The number of products included in a transaction can vary greatly from one user to another. Therefore, we aim at determining the number of products needed to make a confident recommendation to the user. Consequently, Research Question 2 is presented as follows:

Research Question 2: How many basket-added products must the recommendation be based on to obtain a highly accurate prediction?

This question will be answered by finding the average number of products used in the most accurate association rules generated with our data set.
As suggested by Chen et al (2014), the geographical dimension and the temporal dimension can be two important factors to consider when recommending products to customers. To measure the influence of these two factors on the accuracy of the method chosen, Research Question 3 is formulated as follows:

Research Question 3: To what extent do the geographical origin and the temporal dimension improve the accuracy of the predictions?

This question will be answered with three different analyses. First, we will investigate the effect of the temporal dimension on the accuracy of the association rules. To investigate whether the day of the week influences the prediction accuracy, we will split the transactions into those carried out during the week and those carried out during the weekend. As we also aim at finding out whether the time of day has an influence on the prediction accuracy, we will perform the same analysis for transactions made before and after noon (12 pm). Second, the transactions will be split between the most represented countries in this dataset. The association rules will be generated in all cases and the accuracy of the rules compared.

This study is organized as follows. The next section gives an overview of the previous and related work done in the field of product recommender systems, focusing on the systems designed for small-scale retailers. The method section then introduces the selected small-scale retailer, describes the dataset used for this research and explains the different manipulations carried out to extract the frequent item sets, generate the association rules and measure the performance of the newly designed method. In the fourth part, the results of the study are described thoroughly, including important measures such as the accuracy and performance of the designed algorithm. The discussion and the conclusion follow, stating the strengths and weaknesses of this research and summarizing its findings.

2. Related Work
A considerable amount of literature has been published in the field of recommender systems. The vast majority of these studies focused on identifying new recommendation techniques to push the limits of their algorithm's accuracy or on reviewing previous techniques to seek out the best performance and efficiency. While much can be said about the field of recommender systems, this section focuses on a few aspects of this subject. First, we present in Section 2.1 the general categories of recommendation methods and their differences. Subsequently, Section 2.2 and Section 2.3 respectively present the most common methods of evaluation for recommendation systems and highlight the previous studies done in the field of recommender systems for small online retailers. Finally, we introduce in Section 2.4 the two popular basket analysis methods chosen for the design of our recommendation system: frequent item set mining and association rule mining.

2.1. General methods for recommender systems
One study conducted by Huseynov et al (2014), as well as another study conducted the same year by Cai et al (2014), give a clear overview of the general organization of recommender systems and establish the main characteristics of the main methods currently available. According to Cai et al (2014), recommendation methods are most commonly separated into three categories: collaborative filtering, content-based methods and hybrid methods combining the previous two categories. Content-based systems commonly try to recommend items to a user by matching the characteristics of the user profile with the characteristics of certain products. These profiles are built using the characteristics of the products previously rated by that user (Lops et al, 2011). Content-based filtering methods have the great advantage of being able to provide recommendations for any new product that has not yet been rated by users, and do not rely on the number of ratings given by other users (Lops et al, 2011).
However, besides requiring the building of extensive user interest profiles, content-based systems encounter several limitations. Indeed, as stated in the study of Lops et al (2011), content-based systems cannot provide very reliable recommendations for new users and tend to recommend new products that are similar to previously bought products. For instance, if a customer recently purchased a coffee machine, the same customer might be recommended another similar coffee machine during their next visit to the retailer's website. In contrast, the recommendations of collaborative filtering systems are based on the ratings given by previous users with a similar taste (Huseynov et al, 2016). This recommendation technique can include temporal dynamics, which makes it flexible and able to adapt to changing trends and to the user's changing tastes (Koren et al, 2011). As stated by Herlocker et al (2004), many collaborative filtering algorithms have been built for datasets in which the number of users is much larger than the number of products to recommend. As this method is based on users' ratings, a recommender system based on collaborative filtering cannot recommend items which have not yet been rated by users, limiting the number of items covered by the algorithm (Herlocker et al, 2004). Considering that explicit ratings are not always available, some collaborative filtering recommender systems also take into account implicit ratings such as basket-adds, clicks on product pages and mouse movements. While both previous methods present major limitations, hybrid recommender systems combine them to generate reliable recommendations and to avoid the drawbacks of each individual system. Several techniques are available to create hybrid recommendations, such as weighted hybrid recommenders, switching hybrid recommenders or mixed hybrid recommenders (Burke et al, 2002). A weighted hybrid recommender commonly combines the results of both recommendation techniques available and adjusts the weight of each technique according to the quality of the prediction. In contrast, a switching hybrid recommender simply switches between recommendation techniques depending on the situation it is facing. While both of these recommenders typically produce one single recommendation, mixed hybrid recommenders simultaneously provide the user with all the recommendations from all techniques included in the hybrid system (Burke et al, 2002).

2.2. Common techniques for the evaluation of recommender systems
Evaluating or comparing the accuracy and performance of existing recommendation techniques is one of the main issues in this field.
As briefly discussed above, many studies in the field of product recommender systems aim at reviewing the existing methods and algorithms to provide users with an objective overview of their performance in cases other than the one they were built for. However, reviewing the efficiency of recommendation algorithms is a difficult task, as many algorithms are adapted to a certain type of dataset and will not yield the same results in different contexts (Herlocker et al, 2004). Moreover, many different metrics are used to measure the accuracy of recommender systems and there seems to be no standard metric currently available (Herlocker et al, 2004). Among classification accuracy metrics, precision and recall are the two measures most often used to evaluate the performance of collaborative systems. As explained by Raghavan, Jung and Bollmann (1989), recall is the number of relevant instances retrieved divided by the total number of relevant instances, and precision is the number of relevant instances retrieved divided by the total number of retrieved instances. While precision and recall have been popular in this field for decades, many other metrics can also be chosen to evaluate information retrieval systems, such as the ROC curve, another classification accuracy metric. The ROC curve is an alternative to precision and recall which attempts to measure the ability of the system to distinguish between relevant information and noise (Herlocker et al, 2004). A critical advantage of classification accuracy metrics when using sparse data is their ability to ignore recommendations for items that do not have any ratings. Nevertheless, different types of tasks require different types of metrics, including rank accuracy metrics or predictive accuracy metrics (e.g. mean absolute error), considerably increasing the number of metrics used in this field (Herlocker et al, 2004). This lack of standardization and the large number of metrics used in research often complicate the task of comparing the performance of recommender systems designed by different authors.

2.3. Recommender systems for small-scale businesses
As stated previously, small-scale retailers commonly require a different approach for the design of their recommender system as they face several limitations. Indeed, small-scale businesses frequently face three major challenges: the sparsity of their data, the low number of returning users on their website and the potentially limited computational capabilities at their disposal. However, few studies have investigated the matter of product recommendations for small-scale businesses. Recent work by Kaminskas et al (2015), based on the data of two small-scale online retailers, has established a new hybrid approach allowing small-scale online retailers to produce accurate product recommendations with an item-centric approach based on two techniques: one using the product co-occurrences in the browsing history and one focusing on the textual description of the items.
While an approach relying on association rule mining alone does not provide recommendations for all products available on the site because the data is too sparse, this study addresses the problem by including the textual descriptions of the items. This research also uses the theme of products as a feature, based on the categories that were initially assigned manually by the retailer. However, this approach only intends to pair products together and relies on all items having a sufficient textual description available on the website. Also, it is important to mention that, to solve the problem of data sparsity, the researchers only focus on products viewed and not on product purchases, limiting the validity of this recommender system. The same authors conducted another study one year later in an identical context and with the same retailers using a very similar approach. On top of the product views, this new technique adds basket events such as basket-adds to the initial hybrid approach including association rule mining and text-based similarity (Kaminskas et al, 2016). Interestingly, in both studies, the researchers were able to carry out an offline and an online evaluation of their recommender system. Both recommender systems were implemented on the websites of the small-scale retailers so that the real influence on purchases could be measured. It was found that users engaging with the newly displayed recommendations generated a higher number of completed orders and a higher revenue in both cases. Remaining in the field of recommender systems for small-scale retailers, another study by Chen et al (2014) attempted to design a product recommender using association rule mining and common features such as the month of purchase, the country of origin, the product last selected by the customer and the previous purchases of the customer. The recommender system designed in this study uses the Apriori algorithm and frequency analysis to include the demographic features in the association rule model. As efficiency and scalability were two major objectives in this study, the algorithm was run with several test sets to record its runtime. The results showed that the recommender system was both scalable and efficient, as the runtime increased linearly with the size of the test set and the system produced recommendations in less than 0.1 seconds. To evaluate the content of the recommendations supplied by the newly designed method, the algorithm was tested on new instances and yielded an accuracy of 56% with 100 instances. Nevertheless, the accuracy was severely reduced when tested with a larger dataset of 200 instances, dropping to only 28%. While the previous studies on small-scale retailers used different approaches and features in their datasets, they all based their recommender systems on association rule mining. Many algorithms have been built for association rule mining, among them Apriori, Eclat and Partition (Hipp, Güntzer & Nakhaeizadeh, 2000).
According to the results of a study by Hipp, Güntzer & Nakhaeizadeh (2000), which reviewed several association rule mining algorithms, there is unexpectedly no significant difference between the run times of these different algorithms or in their performance on basket-like data.

2.4. Frequent item set mining & association rule mining
As mentioned previously, frequent item set mining and association rule mining are two popular data mining methods commonly chosen by retailers for basket analysis. Often used in combination with association rule mining, frequent item set mining is a data mining method initially created for market basket analysis and aimed at finding hidden patterns in the purchasing behaviour of customers. While frequent item set mining algorithms find the recurring patterns in the transactional data, association rule mining algorithms subsequently use the frequent item sets to create association rules. However, frequent item set mining is now also used more widely for different types of tasks such as finding regularities in certain variables (Borgelt, 2012). In a study by Geyer-Schulz & Hahsler (2002), which attempted to evaluate recommender systems using frequent item set mining, it was found that frequent item sets obtained from purchase histories and yielding a high accuracy appear to match the concept of useful recommendation as given by the KDD (community for data mining, data science and analytics).

First, frequent item sets are extracted from a transaction database. Each frequent item set is generated with a support value which represents the number of transactions that include the item set. The minimum support value (set by the user) determines which item sets will be considered frequent. Despite the popularity and simplicity of this method, a recurring problem with frequent item set mining, especially in large databases, is that the number of frequent item sets obtained can often become extremely high (Borgelt, 2012). Indeed, frequent item set mining follows the Apriori principle, which can be stated as follows: if an item set is frequent, then all of its subsets must be frequent (Kumar et al, 2006). This means that if {a, b, c} is a frequent item set, all its subsets, such as {a, b}, {b, c}, {a}, {b}, {c}, are also frequent item sets. Nevertheless, maximal frequent item sets and closed frequent item sets can help reduce the number of sets generated. A frequent item set is maximal if none of its supersets is frequent, while a frequent item set is closed if none of its supersets has the same support (Borgelt, 2012). By retaining only maximal frequent item sets or closed frequent item sets, one can significantly reduce the number of item sets generated. Unsurprisingly, maximal frequent item set mining is one of the most investigated topics in the large field of data mining. Popular in this field, the DepthProject algorithm was designed in 2000 by three researchers from IBM and aims at efficiently finding maximal item sets in long databases using a depth-first technique (Agarwal, Aggarwal & Prasad, 2000).
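The notions of support, frequent, maximal and closed item sets described above can be made concrete with a minimal Python sketch. The toy transactions, the threshold `MIN_SUPPORT` and the function names are hypothetical and chosen for illustration only; they are not taken from the thesis dataset or its implementation, and the brute-force enumeration stands in for an efficient miner such as Apriori, DepthProject or MAFIA.

```python
from itertools import combinations

# Hypothetical toy transactions (not taken from the thesis dataset).
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
    {"b", "c"},
]
MIN_SUPPORT = 2  # absolute count, chosen arbitrarily for illustration


def frequent_item_sets(transactions, min_support):
    """Return {item set: support count} for every frequent item set.

    Brute-force enumeration by increasing size, kept simple for readability.
    """
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            # Support = number of transactions containing the candidate.
            support = sum(candidate <= t for t in transactions)
            if support >= min_support:
                frequent[candidate] = support
                found_any = True
        if not found_any:
            # Apriori principle: if no set of this size is frequent,
            # no larger set can be frequent either.
            break
    return frequent


def maximal(frequent):
    """Frequent item sets with no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


def closed(frequent):
    """Frequent item sets with no proper superset of equal support."""
    return {
        s
        for s, support in frequent.items()
        if not any(s < other and support == frequent[other] for other in frequent)
    }


freq = frequent_item_sets(transactions, MIN_SUPPORT)
print(len(freq), "frequent,", len(closed(freq)), "closed,", len(maximal(freq)), "maximal")
```

On this toy data, seven item sets are frequent, five of them are closed and only {a, b, c} is maximal, which illustrates how the two filters shrink the output.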
While DepthProject might be considered the most efficient algorithm known for maximal frequent item set mining, newer techniques are being built to try to maximise the efficiency of these algorithms, such as the MAFIA technique from Burdick et al (2001), which also aims at mining maximal frequent item sets in long databases. Functioning like a tree, this technique uses several efficient pruning components to trim the tree at several levels and significantly reduce the running time. The MAFIA algorithm showed a running time five times shorter than that of the DepthProject algorithm when applied to the same publicly available data sets. Secondly, association rule mining uses the frequent item sets previously generated to create rules describing which items tend to occur together. For instance, an association rule can be expressed as "if a customer bought the products a and b, the next product bought is likely to be c". As association rule mining often generates a very large number of rules, this method typically has two objective measures, called support and confidence, that must be manually tuned to filter the useful association rules. Only association rules that reach the minimum support and the minimum confidence are retained by the recommender system. To begin with, the support measure is slightly different from the support of frequent item sets, as it represents the fraction of transactions containing the items in the rule. The equation below explains the calculation for an example rule with a premise item a and a recommendation item c.

\[
\text{support} = \frac{\text{number of transactions incl. } a \text{ and } c}{\text{total number of transactions}} \qquad (1)
\]

The confidence represents the fraction of transactions with item a that also contain item c. It is calculated for each association rule according to the following formula (Lai & Cerpa, 2001):

\[
\text{confidence} = \frac{\text{number of transactions incl. } a \text{ and } c}{\text{number of transactions incl. } a} \qquad (2)
\]

As concluded by Geyer-Schulz & Hahsler (2002), association rules do not have model assumptions, making them a flexible and easy-to-tune model that can be applied to a vast range of data. However, the previously mentioned study of Kaminskas et al (2015) shows that association rules alone often do not cover the total number of products that a retailer offers, especially when the number of transactions available to train the algorithm is low. Conversely, with a large data set, the association rule algorithm follows the same pattern as the frequent item set algorithm and the number of rules generated can quickly become enormous. To avoid the long running time that comes with a large number of association rules, Lin et al (2002) designed a collaborative recommender system which adjusts the parameters of the association rule mining algorithm during the mining process to generate a number of rules within a predefined range. Their approach yielded a better accuracy than traditional correlation-based methods and reduced the running time needed to provide a good recommendation. Other measures are also available to reduce the number of association rules while retaining those of high interest. One of the most popular measures used to filter a large number of association rules is the lift, also called interest. Lift selects the association rules by measuring their interestingness (or added value) according to the following formula:

\[
\text{lift} = \frac{P(c \cap a)}{P(a)\,P(c)} \qquad (3)
\]

The interpretation of the lift value is as follows. When the lift is greater than 1, the items are positively associated.
If the lift is exactly 1, a and c are independent of each other and co-occur in the database only by chance. However, many researchers agree that the use of the lift measure can sometimes be problematic, as it tends to yield high values for rules whose support is close to the minimum support value. Lift is therefore unstable, as it is likely to vary with any change of the minimum support value (Hahsler & Hornik, 2007).
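As an illustration of the three measures discussed in this section, the following Python sketch computes support, confidence and lift for a rule with premise a and recommendation c. The toy transactions and the function name `rule_metrics` are hypothetical and serve only as an example; the sketch follows the definitions given above (support as a fraction of all transactions, lift as the joint probability divided by the product of the individual probabilities) and is not the implementation used in this thesis.

```python
# Hypothetical toy transactions (not taken from the thesis dataset).
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
    {"b", "c"},
]


def rule_metrics(transactions, premise, consequent):
    """Return (support, confidence, lift) for the rule premise -> consequent."""
    premise, consequent = frozenset(premise), frozenset(consequent)
    n = len(transactions)
    n_premise = sum(premise <= t for t in transactions)
    n_consequent = sum(consequent <= t for t in transactions)
    n_both = sum((premise | consequent) <= t for t in transactions)
    if n == 0 or n_premise == 0 or n_consequent == 0:
        return 0.0, 0.0, 0.0  # the rule cannot be evaluated on this data
    support = n_both / n                 # fraction of transactions with a and c
    confidence = n_both / n_premise      # fraction of a-transactions that contain c
    lift = support / ((n_premise / n) * (n_consequent / n))
    return support, confidence, lift


for premise, consequent in [({"a"}, {"c"}), ({"b"}, {"c"})]:
    s, c, l = rule_metrics(transactions, premise, consequent)
    print(sorted(premise), "->", sorted(consequent),
          f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

On this toy data, the rule a → c has a lift of about 0.83 (below 1), whereas b → c has a lift of 1.25, so only the second rule indicates a positive association even though both rules have a non-trivial confidence.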

2.5. Summary of related work
Kaminskas et al (2015) and Chen et al (2014) are two recent studies in the field of small-scale retailers that provide us with valuable insight for our research. We choose to follow their method closely, as association rule mining appears to be a successful method yielding a satisfactory accuracy in their online experiments. Consequently, frequent item set mining and association rule mining are the two methods that will be used in the design of our recommender system for sparse data. However, to provide the field with new insights, we wish to analyse the application of association rule mining further and to also investigate the influence of the temporal and geographical dimensions on the accuracy of the rules. The next section will further justify this choice and describe the implementation of the algorithms.

3. Method

This section first describes the content and features of the dataset analysed in this thesis (Section 3.1). Section 3.2 then explains the cleaning of the data set and the transformation needed to obtain the transaction format required by the chosen data mining method. Next, we provide an exploratory data analysis in Section 3.3, and in Section 3.4 we present the process used to apply the data mining technique that extracts patterns and recommendations from the transactions. Lastly, in Section 3.5, we explain the evaluation method chosen to measure the performance of our model and the baseline selected for comparison.

3.1. Dataset description

The small-scale online retailer whose data is analysed in this study sells high-priced luxury shoe accessories on its website, which is available in three languages: English, Dutch and German. Its online shop receives around 11,000 visitors each month and records around 350 purchasing visitors per month. This retailer's characteristics are similar to those of the retailers included in the study of Kaminskas et al (2015), as they all have a sparse number of transactions and operate in a niche market. It is important to consider that this retailer does not actively gather personal data about its users, such as gender or age, during the registration or purchasing process, which limits the amount of demographic data available. Moreover, unique user IDs were not available at the start of our research, so we could not track sessions that belonged to the same user, and all sessions were assumed to be independent of each other. Considering these limitations, a new approach (as compared to previous studies in this field) was taken to design an accurate product recommender system for this small-scale online retailer.
Meaningful events such as a product being added to the customer's basket (a basket-add) were included in the dataset with the date of the purchase completion and the name of each product added to the basket during a single session. Along with these features, and as partially mentioned in the research questions, day of the week, country of origin and city of origin are three user-focused features available in our data, following the model of Chen et al (2014), who include similar demographic features in their analysis. The datasets used for training the algorithm contain around 3 months of collected data. The data was extracted from the Celebrus tracking system and was initially available in three CSV files containing, respectively, the basket adds, the visitors' countries and cities of origin, and the recording of specific goals in the system, such as a client completing a purchase order or adding a product to their wish list. The wish list is a common feature on many websites that allows users to keep the products they might want to purchase later in a personal list, often only available with a user account. A wish-listed product is then easily retrievable on the user's next visit. Considering the low number of purchases, the sessions of users that only wish-listed products were also kept in the file and treated like sessions including a purchase completion.

Basket removals were not available in our datasets as they could not be extracted from the database. However, since only sessions with a meaningful goal such as a purchase completion were retained, all products basket-added during the retained sessions are highly likely to have been purchased. Therefore, it was assumed that all basket-added products from a session including an order completion were purchased. Since the data available for this study is very sparse and severely limits the quality of our analysis, new CSV files with new data had to be extracted twice, a few months later, to validate the algorithm and eventually to test it. The new files had the same structure and features as the training files. Consequently, as the additional data was not available at the beginning of our research, the validation was carried out on data that did not originate from the same period as the training data.

3.2. Pre-processing of the data

This section justifies and describes the cleaning of the dataset and the transformation tasks that had to be performed on it before our analysis.

Cleaning and transformation of the data set

As our data is separated into three different datasets, a transformation task must be performed. Using Python and an additional open source Python library called pandas, the three datasets were merged by session number to obtain a complete file and to gather all available information for each session (see Figure 1). Some of the session number columns had to be renamed in several files in order for the merging to succeed.
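The merging step described above can be sketched as follows. This is a dependency-free illustration in plain Python, not the pandas code used in the thesis; all column names (`session_number`, `Session`, `product`, `country`, `city`, `goal_name`) and rows are hypothetical, and the use of an inner join is an assumption.

```python
def rename_key(rows, old_key, new_key):
    """Return copies of the rows with one column renamed."""
    return [
        {(new_key if key == old_key else key): value for key, value in row.items()}
        for row in rows
    ]


def merge_on(left, right, key):
    """Inner join of two lists of dict rows on a shared key column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in index.get(left_row[key], [])
    ]


# Hypothetical extracts standing in for the three CSV files.
basket_adds = [
    {"session_number": 1, "product": "pommadier-cream"},
    {"session_number": 1, "product": "schoen-oprekker"},
    {"session_number": 2, "product": "saphir-renovateur"},
]
origins = [
    {"Session": 1, "country": "Netherlands", "city": "Tilburg"},
    {"Session": 2, "country": "Belgium", "city": "Antwerp"},
]
goals = [{"session_number": 1, "goal_name": "order completed"}]

# The key column is named differently in one file, so it is renamed first.
origins = rename_key(origins, "Session", "session_number")
merged = merge_on(merge_on(basket_adds, origins, "session_number"),
                  goals, "session_number")
```

With pandas, the same idea would typically be expressed with `DataFrame.rename(columns=...)` followed by `DataFrame.merge(..., on=...)`. In this sketch the inner join also drops session 2, which has no recorded goal.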
The merged dataset was initially composed of around 41,000 rows and about 15 columns. Since we wish to predict the next product to be purchased, we aim to keep only sessions that showed a high interest in the products. Therefore, the merged dataset was cleaned to retain only sessions including a purchase order completion or a wish list goal completion. Due to the formatting of the tracking system export, some empty columns were present in the file and had to be removed. Columns that were irrelevant or of no use after the exploratory data analysis, such as Goal Name and Total times goal achieved, were deleted. Merging can duplicate columns, as identical columns with identical information might be present in several of the merged files (e.g. columns including the date of purchase). This was the case for the dates and times of the different actions. These duplicated date and time columns were deleted to retain only the date column and the time column of the goal completion. Additional columns, such as Month and Day of the week, were created from the date of goal completion.

Figure 1. Illustration of the merging of the three datasets.

Two important issues remained in the merged dataset. First, certain rows or sessions contained a goal completion while the product column was empty or contained a missing value. Consequently, the rows with no meaningful name in the product column were deleted from the dataset. Secondly, certain pages were wrongly considered as products by the tracking system, such as retour-service Nederland or any product name including http and consisting of a redirecting link to a page such as a social media page. Therefore, all the rows including the words retour, order, or http in the product column were also removed. Ultimately, the dataset had to be grouped by session number to obtain only one row per session, and all basket-added products in each session were grouped in a new column as tuples. As the chosen algorithm can only be applied to a list of transactions, this column of tuples, which includes all transactions of our dataset, was extracted into a new variable and will be referred to as the transactions dataset for the rest of this thesis.

Temporal splitting of the dataset

Our second research question partly focuses on measuring whether the temporal dimension has an influence on the accuracy of the prediction. Surprisingly, we could not find any literature on the measurement of the temporal influence on association rules. Therefore, two analyses were carried out to determine whether the time or day of purchase has an influence on the accuracy of the obtained association rules. The first analysis aims to determine whether there is a significant difference between association rules in weekday transactions and weekend transactions. This question is answered by splitting the original transactions dataset between weekdays and weekend days and generating association rules for each period. This split was carried out by filtering the data set according to the Weekday column, retaining only the days from Monday to Friday for one part and only Saturday and Sunday for the other. If the rules are significantly different between the two periods, the accuracy of the rules generated on the split dataset should be higher than the accuracy of the rules generated on the complete dataset. The second analysis aims to determine whether association rules are also significantly different between transactions carried out before 12 pm and transactions carried out after 12 pm. This analysis was executed in the same way as the previous one. The entire dataset was again split into two new datasets according to the time of the goal completion, in order to have one dataset with all transactions completed before 12 pm and another with all transactions completed after 12 pm.

Geographical splitting of the dataset

Additionally, as our third research question focuses on measuring the influence of the geographical dimension on the accuracy of the association rules, we split the dataset to generate association rules individually for each of the three most represented countries. To that end, three new data sets were created to retain transactions from each of the three countries most represented in the data set: the Netherlands, Great Britain and Belgium.
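The cleaning, grouping and temporal splitting steps described above can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: the field names (`session`, `product`, `completed_at`) and the rows are hypothetical, the word filter is applied case-insensitively, and a transaction completed exactly at 12:00 is counted as "after 12 pm". None of these details is specified in the text.

```python
from datetime import datetime

EXCLUDED_WORDS = ("retour", "order", "http")  # the three words named in the text


def build_transactions(rows):
    """Drop unusable rows, then group the products of each session into a tuple."""
    sessions = {}
    for row in rows:
        product = (row.get("product") or "").strip()
        if not product or any(word in product.lower() for word in EXCLUDED_WORDS):
            continue  # empty product name or a page wrongly tracked as a product
        record = sessions.setdefault(
            row["session"], {"completed_at": row["completed_at"], "products": []}
        )
        record["products"].append(product)
    return [
        {"completed_at": r["completed_at"], "products": tuple(r["products"])}
        for r in sessions.values()
    ]


def split_by(records, predicate):
    """Return (transactions matching the predicate, all other transactions)."""
    matching = [r["products"] for r in records if predicate(r)]
    others = [r["products"] for r in records if not predicate(r)]
    return matching, others


def is_weekday(record):
    return record["completed_at"].weekday() < 5  # Monday=0 ... Friday=4


def is_before_noon(record):
    return record["completed_at"].hour < 12


monday_morning = datetime(2017, 3, 6, 9, 30)
saturday_afternoon = datetime(2017, 3, 11, 15, 0)
rows = [
    {"session": 1, "product": "pommadier-cream", "completed_at": monday_morning},
    {"session": 1, "product": "schoen-oprekker", "completed_at": monday_morning},
    {"session": 2, "product": "retour-service Nederland", "completed_at": saturday_afternoon},
    {"session": 2, "product": "saphir-renovateur", "completed_at": saturday_afternoon},
    {"session": 3, "product": "", "completed_at": datetime(2017, 3, 12, 10, 0)},
]

records = build_transactions(rows)
weekday_tx, weekend_tx = split_by(records, is_weekday)
morning_tx, afternoon_tx = split_by(records, is_before_noon)
```

A plain substring filter like this one can also remove legitimate products whose names happen to contain one of the three words, so the retained list is worth inspecting by hand.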
The geographical split was carried out by filtering the dataset according to the value in the column Country. The data set with purchases from the Netherlands was significantly larger than the two other data sets (N Netherlands = 961, N Great Britain = 148, N Belgium = 137). The same pre-processing tasks were carried out on the validation dataset and on the test dataset after their extraction from the system.

3.3. Exploratory data analysis

Once the dataset is cleaned and transformed, the list of unique products sold consists of 488 items. Most of the sessions retained in our dataset have a completed order goal (N = 1591) and only a low number of sessions only added one or more products to the wish list (N = 27). The most frequent product was basket-added 487 times while the second most frequent product was basket-added 195 times (see Table 1). A total of 1618 sessions were retained for this analysis. The sessions originate from 37 countries and only 6 sessions had a country of origin that could not be tracked. There were 432 sessions carried out on a weekend day and 1186 on a weekday. The three most represented countries in this dataset are the Netherlands (N = 1125), Great Britain (N = 162) and Belgium (N = 159). Considering the low number of purchases available for other countries, only the three most represented countries were kept for the analysis of the geographical origin of the transaction.

Table 1. Number of basket adds for the top 5 most popular products on the training set.

Name of the product                  N product purchased
cederhouten-schoenspanners           487
paar-cederhouten-schoenspanners      195
pommadier-cream                      142
schoen-oprekker                      118
saphir-renovateur                    117

It is important to mention that some products were duplicated in the dataset under two or three different names, in Dutch, English or German, as the website is available in these three languages. Consequently, the total number of unique products appearing in our transaction dataset does not reflect the real number of unique products available on the website, as it might contain the name of certain products in several languages (e.g. cedar-shoe-trees and cederhouten-schoenspanners are two different names for the same product sold on two different versions of the same website). Nevertheless, as each transaction could only be completed on one of the three sites, each transaction is always composed of product names in a single language.
Therefore, a product cannot appear under names in more than one language within a single transaction, and this matter affects neither the extraction nor the accuracy of the association rules. As this cleaning task would be very time-consuming, we decided not to translate the entire list of product names. However, the presence of several names for one product was taken into account when calculating the baseline and did not negatively influence its reliability (see Section 3.5).

3.4. Experimental procedure

Since the association rule mining method was a successful choice for a similar small-scale retailer case in the previous studies of Chen et al (2014) and Kaminskas et al (2016), this data mining method was chosen for the design of our recommender system algorithm. This method was also chosen for its simplicity and its flexibility, as described by Geyer-Schulz & Hahsler (2002). To generate the frequent item sets from our data, a frequent item set mining algorithm was applied to the transactions dataset created in Section 3.2. Two functions from the pymining package were used to extract the frequent item sets and calculate their support (see Related Work, Section 2.4). All integers between 2 and 6 were tried as minimum support value in order to determine the optimal minimum support and to avoid a lack of frequent item sets. Indeed, a minimum support that is too low would generate a large number of frequent item sets and slow down the recommendation system, while a minimum support that is too high would generate very few frequent item sets and prevent recommendations for a large fraction of the products available. Moreover, a very low minimum support might provide us with a high number of unreliable frequent item sets, and a very high minimum support might force the recommender system to ignore very relevant frequent item sets that do not meet the criterion. The values tried for the minimum support were kept at or below 6 because of the low number of transactions available. Once a sufficient number of frequent item sets was obtained, the association rule mining function was used to generate rules from the frequent item sets. As explained in Section 2.4, the confidence is an additional parameter of the association rule mining algorithm that can be used to filter the obtained association rules. Several values of minimum confidence were tried in order to determine the best parameters for generating highly accurate rules. Figure 2 gives a short overview of the transaction mining process.
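The two-step process described above (frequent item sets first, then rules filtered by confidence) can be sketched as follows. This is a brute-force, dependency-free stand-in written for illustration; it is not the pymining implementation used in the thesis. The function names, the toy transactions and the restriction to rules that recommend a single item are assumptions. Support is counted as an absolute number of transactions.

```python
from itertools import combinations


def frequent_item_sets(transactions, min_support):
    """Map each item set found in at least `min_support` transactions to its count."""
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        # Enumerate every non-empty subset (exponential; only for small baskets).
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {item_set: n for item_set, n in counts.items() if n >= min_support}


def association_rules(item_sets, min_confidence):
    """Build (premise, recommendation, support, confidence) tuples."""
    rules = []
    for item_set, support in item_sets.items():
        if len(item_set) < 2:
            continue
        for item in item_set:
            premise = item_set - {item}
            # Every subset of a frequent item set is itself frequent,
            # so the premise is always present in `item_sets`.
            confidence = support / item_sets[premise]
            if confidence >= min_confidence:
                rules.append((premise, frozenset({item}), support, confidence))
    return rules


# Toy transactions, invented for illustration only.
transactions = [
    ("cream", "brush", "cloth"),
    ("cream", "brush", "cloth"),
    ("cream", "brush"),
    ("cream", "shoe-trees"),
    ("shoe-trees",),
]
item_sets = frequent_item_sets(transactions, min_support=2)
rules = association_rules(item_sets, min_confidence=0.8)
```

Raising `min_support` shrinks `item_sets`, and raising `min_confidence` shrinks `rules`. This is the trade-off between running time and product coverage discussed above.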

Figure 2. Process of extraction of frequent item sets and association rules.

The association rules are generated in the following format: if a and b occur together, then recommend c, followed by the support and the confidence score. The example below shows a rule obtained from the transactions currently analysed.

(frozenset({'applicator-cloth-by-saphir', 'polishing-cloth-by-saphir', 'pommadier-cream'}), frozenset({'pate-de-luxe-wax-shoe-polish-100ml'}), 4, 0.8)

This rule can be translated into natural language as follows: if the product 'applicator-cloth-by-saphir', the product 'polishing-cloth-by-saphir' and the product 'pommadier-cream' are in the transaction, then the product 'pate-de-luxe-wax-shoe-polish-100ml' is highly likely to be the next product added to the transaction. The support is 4 and the confidence of this rule is 0.8.

3.5. Evaluation of the model

Association rules are commonly complex to validate, and the validation of such an algorithm is often very challenging in an offline environment. To validate our model and tune the parameters of the association rules algorithm, the association rules were applied to the cleaned validation set with several combinations of parameters. The measures used to validate and evaluate our recommender system are the accuracy and the coverage, two common measures of the performance of such machine learning algorithms (Geyer-Schulz & Hahsler, 2002). The accuracy measures the share of correct recommendations compared to the total number of possible recommendations, and the coverage represents the share of items for which recommendations are available, compared to the total number of items available. The accuracy is calculated in the following way. For each rule that had both its premise and its recommendation in one of the validation set transactions, a score of +1 was added for that rule in the correct count list.
If the same transaction includes more items than the ones present in the rule, the correct count remains positive, as the recommendation would still have been correct. If a transaction contained only the premise of the rule and not its recommendation, a score of +1 was added for that rule in the incorrect count list. The rules that did not have their premise in any transaction were ignored and did not influence the accuracy score. The accuracy per rule was then computed by dividing the number of correct counts of each rule by the sum of the correct and incorrect counts of the same rule. The accuracy per rule is a meaningful measurement in this thesis, as it provides the accuracy scores needed later to determine how many basket-added products are needed to create an accurate prediction. This analysis is part of our third research question, as described later in this section. Lastly, to know the average performance of the complete set of association rules, the average accuracy score was computed over all association rules found or partially found in the validation set.

Once the best performing parameters on the validation set were found, these parameters were applied to generate the rules that would be evaluated on the test set. The frequent item sets and association rules were generated in the same way for the transactions that had been temporally split, using the same parameters as for the original transaction set. The frequent item sets and association rules were also generated for the transaction set split by country, using the parameters retained from the validation of the complete transaction dataset. The accuracy and coverage scores were compared between the whole dataset and the temporally split datasets in order to determine whether splitting between two different periods (weekday and weekend transactions, or transactions before and after 12 pm) led to a better accuracy of our model. The same analysis was carried out for the geographically split dataset to measure whether the geographical split improved the accuracy of the rules.

Baseline comparison

To compare the performance of our recommender system to a relevant baseline, we selected the 5 most popular products in our training dataset and calculated the accuracy that would have been obtained by recommending these top 5 products. This baseline follows the method of Chen et al (2014), who used the top 8 most popular products of their small-scale retailer as the baseline for their recommender system.
As our data set is smaller than the one used in that study and the number of products sold is lower, we decided to limit the number of most popular products to 5 instead of 8. However, as the website is available in three languages, the most popular products are represented three times. To address this issue, the purchases of each popular product were summed across the three websites. Subsequently, the total number of basket adds for our top 5 most popular products was divided by the total number of basket adds to obtain our baseline score. This popularity baseline allows us to determine whether recommending popular products would be an easier and better performing solution than providing personalized recommendations with our association rule system. To evaluate whether our recommender system performs significantly better than the baseline, a statistical analysis of the accuracy scores on the test set was carried out by calculating the confidence interval of each accuracy score and comparing it to the confidence interval of the baseline's accuracy on the test set.
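The baseline calculation described above can be sketched as follows. The alias table (mapping a translated product name to one canonical name) is a hypothetical helper, and the toy basket adds are invented. The text does not state how the confidence intervals were computed, so the normal-approximation interval shown here is an assumption made purely for illustration.

```python
from collections import Counter
from math import sqrt


def popularity_baseline(basket_adds, aliases, top_n=5):
    """Share of all basket adds that falls on the `top_n` most popular products."""
    # Sum the language variants of a product under one canonical name.
    canonical = [aliases.get(name, name) for name in basket_adds]
    top = Counter(canonical).most_common(top_n)
    score = sum(count for _, count in top) / len(canonical)
    return [name for name, _ in top], score


def normal_interval(share, n, z=1.96):
    """Approximate 95% interval for a proportion (assumed method, clipped to [0, 1])."""
    half_width = z * sqrt(share * (1 - share) / n)
    return max(0.0, share - half_width), min(1.0, share + half_width)


# The pair below is the example given in the text; the counts are invented.
aliases = {"cedar-shoe-trees": "cederhouten-schoenspanners"}
basket_adds = (
    ["cederhouten-schoenspanners"] * 3
    + ["cedar-shoe-trees"] * 2
    + ["pommadier-cream"] * 3
    + ["schoen-oprekker"] * 2
)
top_products, score = popularity_baseline(basket_adds, aliases, top_n=2)
low, high = normal_interval(score, len(basket_adds))
```

If two intervals computed this way do not overlap, the difference between the corresponding scores is unlikely to be due to chance alone. With samples as small as this toy one, the normal approximation is crude and an exact interval would be preferable.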

Calculation of the length of the most accurate rules

To answer our third research question, regarding the number of basket-added items needed to influence the accuracy of the product recommender system, we used the accuracy score previously calculated for each association rule. As most of the rules with a high number of correct counts had a minimum accuracy score of 0.50, we chose to consider that these association rules are very likely to provide a highly accurate prediction. A list was created retaining only association rules with an accuracy score higher than 0.50, and the mean number of items in the premise of these rules was computed. In other words, the length of the premise represents the number of basket adds on which the association rule based the recommendation (see Figure 3). The mode and the standard deviation were also computed to learn whether the number of products needed for an accurate association rule varies much and whether the mean is reliable. These numbers were computed both on the validation set and on the test set for comparison. It is important to consider that this average is computed using only the accuracy scores of the rules that were partially or entirely present in the validation set or in the test set. The number of rules that were ignored during the evaluation was also calculated to give an insight into the relevance of the association rules generated on the training set.

Figure 3. Association rule composition. This figure shows the two parts of an association rule. The premise is the basis from which the conclusion is drawn, meaning the products already basket-added by the user in the current session. The conclusion is the outcome of the rule, meaning the final product to be recommended.
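The per-rule accuracy and the premise-length statistics described above can be sketched as follows. The rules use the (premise, recommendation, support, confidence) tuple format shown earlier; the rules, the validation transactions and the function names are hypothetical. The text does not say whether a sample or a population standard deviation was used, so the population version (`pstdev`) is an assumption.

```python
from statistics import mean, mode, pstdev


def accuracy_per_rule(rules, transactions):
    """Map rule index -> correct / (correct + incorrect); rules never triggered are skipped."""
    scores = {}
    for index, (premise, recommendation, _support, _confidence) in enumerate(rules):
        correct = incorrect = 0
        for transaction in transactions:
            items = set(transaction)
            if premise <= items:                 # premise found in the transaction
                if recommendation <= items:      # recommended product was also added
                    correct += 1
                else:
                    incorrect += 1
        if correct + incorrect:
            scores[index] = correct / (correct + incorrect)
    return scores


def premise_length_stats(rules, scores, threshold=0.5):
    """Mean, mode and standard deviation of premise length for accurate rules."""
    lengths = [len(rules[i][0]) for i, score in scores.items() if score > threshold]
    return mean(lengths), mode(lengths), pstdev(lengths)


# Toy rules and validation transactions, invented for illustration only.
rules = [
    (frozenset({"cream"}), frozenset({"brush"}), 3, 0.75),
    (frozenset({"brush", "cloth"}), frozenset({"cream"}), 2, 1.0),
    (frozenset({"cloth"}), frozenset({"cream"}), 2, 1.0),
    (frozenset({"wax"}), frozenset({"cream"}), 2, 1.0),
]
validation = [
    ("cream", "brush"),
    ("cream", "cloth", "brush"),
    ("cream",),
    ("cloth", "brush", "cream"),
    ("cloth",),
]
scores = accuracy_per_rule(rules, validation)
ignored_rules = len(rules) - len(scores)
average_length, most_common_length, spread = premise_length_stats(rules, scores)
```

In this toy case the rule with premise "wax" never occurs in the validation transactions, so it is counted in `ignored_rules` and does not affect the scores.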

4. Results

In this section, we present the detailed results of our experiment. In Section 4.1, we describe the scores obtained during the tuning of the parameters carried out on the validation set. Subsequently, in Section 4.2, we present the final performance of the chosen data mining technique using accuracy and coverage scores. These scores are also presented for the temporally and geographically split data sets to compare their results to the performance on the general data set. Finally, the number of products needed for an accurate prediction on the test set is given.

4.1. Performance of the algorithm on the validation set

First, this section presents the results obtained on the validation set when we apply the chosen algorithms with different sets of parameters. Secondly, we present the number of association rules generated for each temporal and geographical split using the best performing parameters.

Tuning of the parameters on the complete data set

This section describes the results obtained on the validation set during the tuning of the model parameters. A total of 1618 unique transactions were obtained from the dataset. The frequent item set mining algorithm was run three times, with a minimum support varying from 2 to 4, in order to obtain the optimal number of frequent item sets. More than 1 million frequent item sets were generated with a minimum support of 2, 813 with a minimum support of 3, and 468 with a minimum support of 4. A minimum support of 2 was found to generate too many frequent item sets, as we aim to design a simple and efficient product recommender that does not have a long running time (see Table 2). As shown in the table, such a large number of frequent item sets would produce a similar or larger number of association rules, requiring a long running time or a large computational capacity to produce recommendations.

Table 2. Number of frequent item sets and association rules generated with different support and confidence scores. (Columns: support; number of frequent item sets; number of association rules generated with a minimum confidence of 0.8; number of association rules generated with a minimum confidence of 0.5.)

Subsequently, the association rules were generated several times with frequent item sets of a minimum support of 3 and with various minimum confidence values between 0.5 and 0.95. Figure 4 displays the accuracy and coverage scores obtained on the validation set with the different sets of association rules.

Figure 4. Evolution of accuracy and coverage with different minimum confidence scores (minimum support is constant at 3). Vertical axis: accuracy (in percentage). Horizontal axis: minimum confidence of generated association rules, from 0.5 to 0.95 in steps of 0.05. Series: accuracy, coverage.

The best accuracy score on the validation set was obtained when the association rules were generated with a minimum confidence of [...]. Indeed, Figure 4 clearly shows that the accuracy decreases again when the confidence rises above [...]. The best coverage is measured at a minimum confidence of 0.5, as coverage decreases continuously as the confidence increases. However, one can observe that the coverage remains stable when the confidence reaches 0.85 or


More information

New Customer Acquisition Strategy

New Customer Acquisition Strategy Page 1 New Customer Acquisition Strategy Based on Customer Profiling Segmentation and Scoring Model Page 2 Introduction A customer profile is a snapshot of who your customers are, how to reach them, and

More information

MEASURING, MODELING AND MONITORING YOUR LOCKBOX

MEASURING, MODELING AND MONITORING YOUR LOCKBOX MEASURING, MODELING AND MONITORING YOUR LOCKBOX A Practical Guide It is a good practice for corporations to review their remittance systems on a regular basis as the number of electronic payments increases.

More information

Predicting user rating for Yelp businesses leveraging user similarity

Predicting user rating for Yelp businesses leveraging user similarity Predicting user rating for Yelp businesses leveraging user similarity Kritika Singh kritika@eng.ucsd.edu Abstract Users visit a Yelp business, such as a restaurant, based on its overall rating and often

More information

IO1 A2: QUESTIONNAIRE REPORT (DANMAR, PL)

IO1 A2: QUESTIONNAIRE REPORT (DANMAR, PL) IO1 A2: QUESTIONNAIRE REPORT (, PL) [This publication reflects the views only of the author, and the Commission cannot be held responsible for any use which may be made of the information contained herein.]

More information

CHAPTER 4 A FRAMEWORK FOR CUSTOMER LIFETIME VALUE USING DATA MINING TECHNIQUES

CHAPTER 4 A FRAMEWORK FOR CUSTOMER LIFETIME VALUE USING DATA MINING TECHNIQUES 49 CHAPTER 4 A FRAMEWORK FOR CUSTOMER LIFETIME VALUE USING DATA MINING TECHNIQUES 4.1 INTRODUCTION Different groups of customers prefer some special products. Customers type recognition is one of the main

More information
