E-commerce models for banks profitability


Data Mining VI 485

V. Aggelis
Egnatia Bank SA, Greece

Abstract

The use of data mining methods in e-business is already of great assistance in prediction, knowledge management, and decision support. In e-commerce in particular, a significant number of metrics have been tested and used for measuring parameters of interest. In most cases these parameters relate to customer habits and customer profitability. Nowadays, many merchants cooperate with banks to authorize the credit card transactions through which their products are purchased. Banks, in turn, are interested in measuring the profit and the strength of this cooperation. In this paper we introduce two models for this purpose: (a) a merchant clustering model and (b) a bank revenue predictive model. In the first model, a bank scores and classifies its cooperating merchants using a number of parameters; in the second, a bank predicts its revenue from e-commerce transactions.

Keywords: data mining, clustering, k-means algorithm, predictive model, linear regression.

1 Introduction

Banking and financial data are generally processed with data mining methods such as linear regression, clustering, classification and others, aiming at the development of patterns, rules and predictive models, and ultimately at forecasting. These methods produce interesting and useful results. However, not all results lead to firm conclusions. For this reason the judgment of both the data miner and the user is essential in evaluating the results, and especially the efficiency of the predictive models. Cooperation between experts in data mining and people with good knowledge of the data sets is therefore important for the proper evaluation of a predictive model. In banking this combination is necessary because of the particular nature of bank data and of bank market rules.

A specific kind of bank service is credit card authorization over the internet (e-commerce). E-commerce is relatively new, which makes the extraction of relevant features very important. Since current trends suggest that its use will increase, a bank is naturally concerned with enlarging its share of cooperating merchants in this area and with increasing its revenue.

In the present paper, merchant clustering is studied along with the ranking of merchants according to a new merchant scoring analysis. We also develop and evaluate a predictive model for the bank's revenue from e-commerce. The software used is SPSS Clementine 7.0.

Section 2 describes clustering techniques and algorithms and gives a general description of predictive models. Section 3 describes the calculation of merchant clustering and the procedure of the predictive model. Section 4 contains experimental results derived from the data set of section 3, and section 5 states conclusions and future work.

2 Theoretical background

2.1 Clustering basics

Clustering techniques [1, 4] belong to the group of undirected data mining tools. The goal of undirected data mining is to discover structure in the data as a whole. There is no target variable to be predicted, so no distinction is made between independent and dependent variables.

Clustering techniques combine observed examples into clusters (groups) that satisfy two main criteria:

- each cluster is homogeneous: examples that belong to the same cluster are similar to each other;
- each cluster differs from the other clusters: examples that belong to one cluster differ from the examples of other clusters.

Depending on the clustering technique, clusters can be expressed in different ways:

- identified clusters may be exclusive, so that any example belongs to only one cluster;
- they may be overlapping, so that an example may belong to several clusters;
- they may be probabilistic, whereby an example belongs to each cluster with a certain probability;
- clusters may have a hierarchical structure, with a crude division of examples at the highest level of the hierarchy that is refined into subclusters at lower levels.

2.2 K-means algorithm

K-means [1, 4, 8, 9] is one of the simplest clustering algorithms. It takes as input a predefined number of clusters, the k of its name. "Means" stands for average: the average location of all the members of a particular cluster. When dealing with clustering techniques, one adopts the notion of a high-dimensional space in which the orthogonal dimensions are the attributes of the table of analysed data. The value of each attribute of an example represents the distance of the example from the origin along that attribute's axis. To use this geometry effectively, the values in the data set must all be numeric and should be normalized, so that overall distances in the multi-attribute space are computed fairly.

K-means is a simple, iterative procedure in which the crucial concept is the centroid. A centroid is an artificial point in the space of records that represents the average location of a particular cluster. The coordinates of this point are the averages of the attribute values of all examples that belong to the cluster. The steps of the K-means algorithm are given in Figure 1.

1. Randomly select k points (they can also be examples) as the seeds for the centroids of k clusters.
2. Assign each example to the centroid closest to it, forming in this way k exclusive clusters of examples.
3. Calculate the new centroids of the clusters by averaging all attribute values of the examples belonging to the same cluster (centroid).
4. Check whether the cluster centroids have changed their "coordinates". If yes, start again from step 2. If not, cluster detection is finished and all examples have their cluster memberships defined.

Figure 1: K-means algorithm.

Usually this iterative procedure of redefining centroids and reassigning examples to clusters needs only a few iterations to converge.

2.3 Predictive models basics

A model is an abstract representation of a real-world process. A typical form of a model is Y = aX + b, where Y and X are variables and a and b are parameters. In a predictive model [12, 13, 14, 15], one variable is expressed as a function of the others.
This permits the value of the response variable to be predicted from given values of the others (the predictor variables). The response variable in general predictive models is often denoted by $Y$, and the $p$ predictor variables by $X_1, \ldots, X_p$. The model yields predictions $\hat{y} = f(x_1, \ldots, x_p; \theta)$, where $\hat{y}$ is the prediction of the model and $\theta$ represents the parameters of the model structure. When $Y$ is quantitative, the task of estimating a mapping from the p-dimensional $X$ to $Y$ is known as regression. Prediction models [1, 14, 16] in which the response variable is a linear function of the predictor variables yield the prediction:

$\hat{Y} = a_0 + \sum_{j=1}^{p} a_j X_j \qquad (1)$

where $\theta = \{a_0, \ldots, a_p\}$. We write $\hat{Y}$ rather than simply $Y$ on the left of the expression because it is a model constructed from the data. In other words, the values of $\hat{Y}$ are values predicted from the $X$, not values actually observed.

3 Clustering merchants and predicting in banking data set

The term "merchant" describes an on-line shop that has conducted at least one transaction during the period examined. The data sample used concerns the first quarter of 2004. The following variables are calculated for this period.

Authorized transactions: we use the OK parameter for the authorized transactions. OK is the count of authorized transactions that the customers of the on-line shop conducted within the period of interest (1st quarter 2004).

Amount of authorized transactions: we use the OKAmt parameter for the amount of authorized transactions. OKAmt is the total amount of authorized transactions within the period stated above.

Void transactions: the Void parameter is used for the void transactions. A transaction can be void for many reasons, such as an unauthorized card transaction, a blocked credit card, connection errors and others. Void is the count of void transactions that the customers of the on-line shop conducted within the period of interest.

Amount of void transactions: the VoidAmt parameter is used for the amount of void transactions. VoidAmt is the total amount of void transactions within the first quarter of 2004.

Revenue: we refer to the bank's revenue, which is a percentage of the total authorized amount for every cooperating on-line shop. For the purposes of our case study, the revenue is 3% of the total authorized amount.

Recency: recency is the date of the customer's last transaction. Since the value of recency contributes to the score, a numeric value is necessary.
Therefore, we define the rec variable as the number of days between the first date of the period (1/1/2004) and the date of the customer's last transaction. To keep the results stable, amounts are expressed in thousands of euro. In addition, rec is set to zero if the value of OK is less than 3, which means that the customers of the on-line shop did not conduct at least one purchase per month.

The merchant score is calculated using the formula:

Score = OK + OKAmt - Void - VoidAmt + Revenue + rec.

A sample of the data set on which the data mining methods are applied is given in Table 1.
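As an illustration, the score calculation of this section can be sketched in Python. The class and function names below are hypothetical and are not part of the original study, which was carried out in SPSS Clementine. The sketch also assumes that the two void terms enter the score with a negative sign and that revenue, like the other amounts, is expressed in thousands of euro.

```python
from dataclasses import dataclass
from datetime import date

PERIOD_START = date(2004, 1, 1)  # first date of the period, as stated in the text
REVENUE_RATE = 0.03              # revenue is 3% of the total authorized amount


@dataclass
class MerchantQuarter:
    """One merchant's figures for the quarter (hypothetical container)."""
    ok: int                  # count of authorized transactions
    ok_amt: float            # authorized amount, thousands of euro
    void: int                # count of void transactions
    void_amt: float          # void amount, thousands of euro
    last_transaction: date   # date of the last customer transaction


def recency(m: MerchantQuarter) -> int:
    """Days from the period start to the last transaction; zero if OK < 3."""
    if m.ok < 3:
        return 0
    return (m.last_transaction - PERIOD_START).days


def revenue(m: MerchantQuarter) -> float:
    """Bank revenue from this merchant, in thousands of euro."""
    return REVENUE_RATE * m.ok_amt


def score(m: MerchantQuarter) -> float:
    """Merchant score; the minus signs on the void terms are an assumption."""
    return m.ok + m.ok_amt - m.void - m.void_amt + revenue(m) + recency(m)
```

For a hypothetical merchant with OK = 10, OKAmt = 2.0, Void = 1, VoidAmt = 0.5 and a last transaction on 1/3/2004, the sketch gives rec = 60, Revenue = 0.06 and Score = 70.56.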

Table 1: Sample data set. Columns: Merchant Id, OK, OKAmt, Void, VoidAmt, Revenue, rec (amounts in thousands of euro).

Merchant clustering is performed using the K-means algorithm discussed in section 2.

To generate a prediction, the stepwise linear regression method [2, 6] was used. The stepwise method of field selection builds the equation in steps, as the name implies. The initial model is the simplest possible, with no input fields in the equation. At each step, the input fields not yet in the model are evaluated, and if the best of them adds significantly to the predictive power of the model, it is added. In addition, the input fields currently in the model are re-evaluated to determine whether any of them can be removed without significantly detracting from the model; if so, they are removed. The process is then repeated, and other fields are added or removed. When no more fields can be added to improve the model, and none can be removed without detracting from it, the final model is generated.

Figure 2: Score distribution.
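The stepwise procedure described above can be illustrated with a short Python sketch. It is not the Clementine implementation. The description speaks of fields that add "significantly" to the model; as a simplifying assumption, the sketch replaces that significance test with a hypothetical threshold, min_gain, on the change in adjusted R-square. The function names are likewise illustrative.

```python
import numpy as np


def adjusted_r2(X, y, cols):
    """Adjusted R-square of a least-squares fit of y on an intercept plus X[:, cols]."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    p = len(cols)
    return 1.0 - (rss / (n - p - 1)) / (tss / (n - 1))


def stepwise_select(X, y, min_gain=0.01):
    """Return the column indices chosen by a forward/backward stepwise search."""
    selected = []                        # initial model: no input fields
    score = adjusted_r2(X, y, selected)  # intercept-only model
    changed = True
    while changed:
        changed = False
        # Forward step: add the best unused field if it helps enough.
        candidates = [c for c in range(X.shape[1]) if c not in selected]
        if candidates:
            best = max(candidates, key=lambda c: adjusted_r2(X, y, selected + [c]))
            best_score = adjusted_r2(X, y, selected + [best])
            if best_score - score > min_gain:
                selected.append(best)
                score = best_score
                changed = True
        # Backward step: drop any field whose removal costs less than min_gain.
        for c in list(selected):
            rest = [s for s in selected if s != c]
            rest_score = adjusted_r2(X, y, rest)
            if score - rest_score < min_gain:
                selected = rest
                score = rest_score
                changed = True
    return selected
```

Because a field is added only when it gains more than min_gain and removed only when it costs less than min_gain, the search cannot cycle and always terminates.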

4 Experimental results

4.1 Clustering

As seen in the histogram of Figure 2, the score distribution is concentrated at values below 500. This is a natural trend, since the majority of merchants have small transaction numbers. Applying the K-means algorithm results in the 2 clusters of Figure 3. Next to each cluster one can see the number of members as well as the average value of each variable. This clustering results in the distribution of Table 2 and Figure 4. The most important observations concerning these results are the following:

- There are two basic categories of merchants, the Big ones and the Small ones.
- Big merchants bring the most revenue to the bank.
- Merchants with many void transactions are usually not among the big merchants.

4.2 Prediction model

The stepwise method builds the following prediction in two steps (Figure 5).

$\widehat{\mathrm{Revenue}} = (2.364) \cdot \mathrm{OK} + (\ ) \cdot \mathrm{Void} \qquad (2)$

To evaluate and test the appropriateness of model (2), a lift chart was used along with indicative measures such as R, R-square, adjusted R-square and linear correlation.

Figure 3: K-means clusters.

Table 2: Clustering results.

Cluster 1 (88,46%) Big 90% Cluster 2 (11,54%) Small 10%
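The clustering step behind these results can be illustrated by a minimal Python sketch of the four steps of Figure 1, preceded by the normalization recommended in section 2.2. It is a sketch under assumptions, not the Clementine implementation: min-max scaling, Euclidean distance, seeding from randomly chosen examples and the names normalize and kmeans are choices made for illustration.

```python
import numpy as np


def normalize(table):
    """Min-max scale each attribute (column) to the range [0, 1]."""
    table = np.asarray(table, dtype=float)
    lo, hi = table.min(axis=0), table.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (table - lo) / span


def kmeans(points, k, seed=0, max_iter=100):
    """Return (labels, centroids) following the four steps of Figure 1."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: choose k distinct examples as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 2: assign each example to its closest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute each centroid as the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```

With k = 2 and one row per merchant holding the variables of section 3, a call such as kmeans(normalize(rows), 2) would produce a two-group split of the kind reported here; which group is labelled Big or Small is left to the analyst.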

Figure 4: Clusters.

Figure 5: Prediction model.

An example of a lift chart is shown in Figure 6. As can be seen, the chart starts well above 1.0 on the left, remains on a high plateau as we move to the right, and then trails off sharply towards 1.0 on the right side of the chart. Using the prediction of the model shows the actual lift.

4.3 Other measures

Other measures of the suitability of the model are supplied in Figure 7.

Figure 6: Lift chart.

Figure 7: Model summary.

The degree to which two or more predictors (X variables) are related to the response (Y) variable is expressed by the correlation coefficient R, which is the square root of R-square [D. Hand, H. Mannila, P. Smyth, 2001; Clementine 7.0 User's Guide, 2002; K. Joreskog, 1999; N.R. Draper and H. Smith, 1998]. To interpret the direction of the relationship between variables, one looks at the signs (plus or minus) of the regression parameters (θ). If a parameter is positive, the relationship of that variable with the dependent variable is positive; if it is negative, so is the relationship. As can be seen in Figure 7, the value of R for the second-step model is appropriate, since it is close to 1. Additionally, it can be observed that a decrease of ASMR and an increase of BSV are accompanied by an increase of the count of Logins in e-banking services.

R square is commonly used as a measure of a model's goodness of fit. An R square value near 1 indicates a near-perfect fit. The R square value obtained is considered satisfactory and indicates an acceptable model, bearing in mind that:

- R square is a non-decreasing function of the number of predictor variables present in the model; that is, adding more historical data and predictor variables (X's) almost always increases R square. This is because the addition of predictor variables to the model reduces the prediction errors.
- R square assumes that the data set being analysed is the entire population, while in fact it represents only a sample of the population.

Adjusted R square measures the proportion of the variation in the response variable due to the predictor variables. Unlike R square, adjusted R square accounts for the degrees of freedom associated with the sums of squares. Therefore, even though the residual sum of squares decreases or remains constant as new predictor variables are added, this is not necessarily the case for the residual variance. For this reason, adjusted R square is generally considered a more accurate goodness-of-fit measure than R square. If adjusted R square is significantly lower than R square, this normally means that some predictor variables are missing. The absence of these variables causes the variation in the dependent variable to be measured improperly. The closer the adjusted R square is to 1, the better the model is. The adjusted R square value obtained is almost the same as R square, indicating therefore an acceptable model.

Figure 8: Linear correlation.
Finally, Figure 8 shows the level of linear correlation of the model. Since this value approaches unity, it indicates a strong positive relation, such that high predicted values are associated with high actual values and vice versa.
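The measures discussed in this section can be computed directly from actual and predicted values. The Python sketch below uses the standard definitions of R-square and adjusted R-square for a linear model with an intercept. The function name fit_measures and the sample numbers in the note that follows are hypothetical and are not taken from the study.

```python
import numpy as np


def fit_measures(y_actual, y_pred, n_predictors):
    """Return R, R-square, adjusted R-square and linear correlation as a dict."""
    y = np.asarray(y_actual, dtype=float)
    y_hat = np.asarray(y_pred, dtype=float)
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)     # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - rss / tss
    # Adjusted R-square penalizes the degrees of freedom used by the predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {
        "R": float(np.sqrt(max(r2, 0.0))),  # square root of R-square
        "R2": float(r2),
        "adjusted_R2": float(adj_r2),
        # Pearson correlation between actual and predicted values.
        "linear_correlation": float(np.corrcoef(y, y_hat)[0, 1]),
    }
```

With actual values 1, 2, 3, 4, 5, predictions 1.1, 1.9, 3.2, 3.8, 5.0 and two predictors, the sketch gives R-square = 0.99 and adjusted R-square = 0.98. The small gap between the two reflects the penalty for the two predictors on a sample of only five observations.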

5 Conclusions and future work

In the present paper it is shown that a bank can use merchant scoring to rank its cooperating merchants according to a two-level model. This result was obtained using the K-means method. The e-banking unit of a bank may therefore easily identify its most important merchants. When continuously retrained, the model also reveals how merchants move between levels, giving the bank administration the opportunity to reduce merchant leakage. At the same time, the approach to merchants and the promotion of new services and products are improved, since the bank knows that a merchant is more likely to respond to a promotion campaign if it belongs to the 10% of most beneficial ones. Correct recognition and analysis of the clustering results gives the e-banking unit of a bank an advantage over the competition. Merchant clustering could be the subject of further exploitation and research.

This study also describes the development of a predictive model of the bank's revenue in relation to merchants' transactions, and supplies experimental results. It is concluded that there is a strong relation between the bank's revenue and merchants' transactions as a whole. Future plans include the development of predictive models using other sources. The use of other clustering algorithms, as well as other data mining methods, is a promising and challenging issue for future work.

References

[1] D. Hand, H. Mannila, P. Smyth. Principles of Data Mining. The MIT Press, 2001.
[2] Clementine 7.0 User's Guide. SPSS, Integral Solutions Limited, 2002.
[3] Clementine Application Template for Customer Relationship Management 6.5. SPSS, Integral Solutions Limited.
[4] K. Collier, B. Carey, E. Grusy, C. Marjaniemi, and D. Sautter. A Perspective on Data Mining. Northern Arizona University.
[5] J. Curry and A. Curry. The Customer Marketing Method: How to Implement and Profit from Customer Relationship Management.
[6] S.A. Madeira. Comparison of Target Selection Methods in Direct Marketing. MSc Thesis, Technical University of Lisbon, 2002.
[7] Retain Customers and Reduce Risk. White Paper, COMPAQ.
[8] P. Bradley and U. Fayyad. Refining Initial Points for K-Means Clustering. Proc. 15th International Conf. on Machine Learning.
[9] H. Zha, C. Ding, M. Gu, X. He and H. Simon. Spectral Relaxation for K-means Clustering. Neural Info. Processing Systems.
[10] Data-Driven Analysis Tools and Techniques. White Paper, DataPlus Millennium, 2001.
[11] K. Im and S. Park. A Study on Analyzing Characteristics of Target Customers from Refined Sales Data. APIEMS.
[12] D. Foster and R. Stine. Variable Selection in Data Mining: Building a Predictive Model for Bankruptcy. Center for Financial Institutions Working Papers, Wharton School Center for Financial Institutions, University of Pennsylvania.
[13] B. Zupan, J. Demsar, M. Kattan, M. Ohori, M. Graefen, M. Bohanec, and J.R. Beck. Orange and Decisions-at-Hand: Bridging Predictive Data Mining and Decision Support. Workshop Integrating Aspects of Data Mining, Decision Support and Meta-Learning.
[14] A. Raftery, D. Madigan, and J. Hoeting. Bayesian Model Averaging for Linear Regression Models. Journal of the American Statistical Association.
[15] P. Laud and J. Ibrahim. Predictive Model Selection. Journal of the Royal Statistical Society.
[16] N.R. Draper and H. Smith. Applied Regression Analysis. John Wiley & Sons, Inc, 1998.


More information

A Framework for the Optimizing of WWW Advertising

A Framework for the Optimizing of WWW Advertising A Framework for the Optimizing of WWW Advertising Charu C. Aggarwal, Joel L. Wolf and Philip S. Yu IBM T.J. Watson Research Center, Yorktown Heights, New York Abstract. This paper discusses a framework

More information

Cold-start Solution to Location-based Entity Shop. Recommender Systems Using Online Sales Records

Cold-start Solution to Location-based Entity Shop. Recommender Systems Using Online Sales Records Cold-start Solution to Location-based Entity Shop Recommender Systems Using Online Sales Records Yichen Yao 1, Zhongjie Li 2 1 Department of Engineering Mechanics, Tsinghua University, Beijing, China yaoyichen@aliyun.com

More information

The prediction of economic and financial performance of companies using supervised pattern recognition methods and techniques

The prediction of economic and financial performance of companies using supervised pattern recognition methods and techniques The prediction of economic and financial performance of companies using supervised pattern recognition methods and techniques Table of Contents: Author: Raluca Botorogeanu Chapter 1: Context, need, importance

More information

Chapter 6: Customer Analytics Part II

Chapter 6: Customer Analytics Part II Chapter 6: Customer Analytics Part II Overview Topics discussed: Strategic customer-based value metrics Popular customer selection strategies Techniques to evaluate alternative customer selection strategies

More information

STAT 2300: Unit 1 Learning Objectives Spring 2019

STAT 2300: Unit 1 Learning Objectives Spring 2019 STAT 2300: Unit 1 Learning Objectives Spring 2019 Unit tests are written to evaluate student comprehension, acquisition, and synthesis of these skills. The problems listed as Assigned MyStatLab Problems

More information

Appendix A Mixed-Effects Models 1. LONGITUDINAL HIERARCHICAL LINEAR MODELS

Appendix A Mixed-Effects Models 1. LONGITUDINAL HIERARCHICAL LINEAR MODELS Appendix A Mixed-Effects Models 1. LONGITUDINAL HIERARCHICAL LINEAR MODELS Hierarchical Linear Models (HLM) provide a flexible and powerful approach when studying response effects that vary by groups.

More information

A Survey on Recommendation Techniques in E-Commerce

A Survey on Recommendation Techniques in E-Commerce A Survey on Recommendation Techniques in E-Commerce Namitha Ann Regi Post-Graduate Student Department of Computer Science and Engineering Karunya University, India P. Rebecca Sandra Assistant Professor

More information
