PREDICTION OF CRM USING REGRESSION MODELLING


Aroushi Sharma #1, Ayush Gandhi #2, Anupam Kumar #3
#1, 2 Students, Dept. of Computer Science, MAIT, GGSIP University, Delhi, INDIA
#3 Assistant Prof., Dept. of Computer Science, MAIT, GGSIP University, Delhi, INDIA
1 sharma.aroushi@gmail.com 2 ayushgandhi16@gmail.com 3 anupamkumar@mait.ac.in

Abstract- Regression analysis can be applied to improve the customer experience and to retain customers for the organization. This predictive analytics technique is proposed to predict laptop sales using various customer-related attributes such as base, width, processor configuration, date of purchase, the individual's income and the price of the product. This paper proposes an implementation of regression analysis for the prediction of CRM (Customer Relationship Management). Applying regression analysis to this data identifies the attributes that actually drive laptop sales, so that the business can focus on those attributes to improve its quality and profits.

Keywords- CRM, Forecasting with Regression, Logistic Regression, Predictive Analytics, Regression Analysis, Segmented Regression

I. INTRODUCTION
Customer Relationship Management (CRM) is an ERP (Enterprise Resource Planning) component that manages the organization's current customers and its future prospects. CRM analyses a customer's historical data to determine the elements that could help retain that customer and thereby improve the business. CRM also helps bring new business to the organization through a technique called lead management. Leads are potential customers who may do business with the organization. The historical data that supports customer retention is obtained through various channels such as reviews, telephone, the company's website, email feedback, social media and many more.
Adopting this technique may lead to favoritism toward a particular group of people, but this issue can be resolved by using the CRM technique and its related approaches efficiently. This paper proposes a regression model for CRM in order to retain the customers of any business.

II. MOTIVATIONS
Sales prediction is the backbone of a business plan and a major requirement for every organization today. Companies measure a business and its growth by sales, and sales prediction sets the standard for expenses, growth and profit.

1939 www.ijaegt.com

III. OBJECTIVES
Here we apply the concepts of regression analysis to predict laptop sales by analyzing the relationships among the various customer-related attributes. In this paper we propose a Regression Modeling technique, which deals with the correlation and association between statistical variables; the variables taken here are treated in a symmetric way. The steps of Regression Modeling are: first develop a CRM model that collects and analyzes data and targets the desired customer by finding relationships between customer attributes, then generate the Regression Model, then apply Regression Modeling for data mining, and finally generate results. We discuss each of these steps and analyze the results to predict the future by examining relationships among the various data sets.

Predictive analytics
Predictive analytics is one of the data mining strategies that can be used to generate and gather information from data. The information collected can be used to predict behavioral patterns or current trends. It has applications in crime detection and investigation and in the identification of suspects. Fraudulent credit card users can be tracked with this technique. It can be applied to predict any unknown event, whether it belongs to the past, present or future. Predictive analytics uses information collected from past experience and evaluates the relationships between the explanatory variables and the predicted variables. These relationships help to predict the unknown events. Predictive modeling, data mining and machine learning are important components of predictive analytics. These components help analyze past and present facts in order to make predictions about future events.
However, the accuracy and usefulness of these results depend on several assumptions, such as no outliers, no multicollinearity, linearity and normality, and of course on the quality of the analysis being done.

Regression Model
The above issue can be addressed by applying the Regression Modeling technique, a data mining technique based on statistical methods for CRM that analyses continuous-valued attributes. It is used to estimate the probability values associated with the data cube cells. The greater the number of attributes, the greater the number of dimensions in the cuboid. These dimensions are mapped to attributes of the collected data set. The dimensions can further be reduced to 3-D and 2-D cuboids for a particular set of attributes. Thus the higher-order cuboids can be built from the lower-order cuboids. This is the building block of most of our Regression Model. Our Regression Model includes the following steps:

Regression Analysis
The data can be analyzed with the help of statistical analytic techniques.

These techniques include Linear Regression, one of the simplest forms of regression. It is used to find the relationship between a random variable Y, known as the response variable, and another variable X, known as the predictor variable, where this relationship is linear. The linear regression equation is:

$$y = a + bx$$

where the variance of Y is assumed to be constant, and a and b are regression coefficients that specify the Y-intercept and the slope of the line, respectively. The coefficients can be solved by the method of Least Squares, which minimizes the error between the actual data and the estimated line, where

$$b = \frac{N \sum XY - (\sum X)(\sum Y)}{N \sum X^2 - (\sum X)^2}, \qquad a = \frac{\sum Y - b \sum X}{N}$$

Logistic Regression
Logistic regression is a type of regression model in which the dependent variable (DV) is categorical. Logistic regression was developed by the statistician David Cox in 1958 (although much work was done on the single-independent-variable case almost two decades earlier). The binary logistic model is used to estimate the probability of a binary response based on one or more predictor (or independent) variables. As such it is not merely a classification method; it could be called a qualitative response/discrete choice model in the terminology of economics.

Segmented Regression
Segmented regression is a technique in regression analysis in which the predictor variables are partitioned into intervals and a separate line segment is fitted to each interval. Segmented regression is useful when the independent variables, clustered into different groups, exhibit different relationships between the variables in these regions. Segmented regression analysis can also be performed on multivariate data by partitioning the various independent variables. The boundaries between the segments are called breakpoints.
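As a rough illustration of fitting a separate line on each side of a breakpoint, the following Python sketch applies the least-squares slope and intercept formulas to two intervals. The data, the breakpoint at x = 4, and the helper names `fit_line` and `fit_segmented` are hypothetical and chosen only for illustration; the paper does not give its own data or code, and in practice the breakpoint usually has to be estimated rather than assumed known.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


def fit_segmented(xs, ys, break_x):
    """Fit one line to points with x <= break_x and another to the rest."""
    left = [(x, y) for x, y in zip(xs, ys) if x <= break_x]
    right = [(x, y) for x, y in zip(xs, ys) if x > break_x]
    left_fit = fit_line([x for x, _ in left], [y for _, y in left])
    right_fit = fit_line([x for x, _ in right], [y for _, y in right])
    return left_fit, right_fit


# Hypothetical data: steep growth up to x = 4, much flatter growth afterwards.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.0, 4.0, 6.0, 8.0, 8.5, 9.0, 9.5, 10.0]

(left_a, left_b), (right_a, right_b) = fit_segmented(xs, ys, break_x=4)
print(f"left segment:  y = {left_a:.2f} + {left_b:.2f}x")
print(f"right segment: y = {right_a:.2f} + {right_b:.2f}x")
```

Fitting the two intervals independently keeps the sketch short, but it does not force the two lines to meet at the breakpoint; a continuity constraint would need a different formulation.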
Segmented linear regression is the form in which the relationship within each interval is estimated using linear regression.

Forecasting with Regression
We can use the following equation to generate forecasts from a simple linear model:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

Here x is the value of the predictor variable for which we want a forecast. Providing any value of x in this equation generates a corresponding forecast $\hat{y}$. A fitted value is the value of $\hat{y}$ obtained by doing the calculation with an observed value of x from the dataset. It is not a genuine forecast, because the actual value of y for that predictor value was used in estimating the model, so the value of $\hat{y}$ is affected by the true value of y. For $\hat{y}$ to be a genuine forecast, x must be a new value, that is, one that is not in the data used to estimate the model.
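The distinction between fitted values and genuine forecasts can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the observations are invented, and the function names `fit_simple_linear` and `predict` are hypothetical. The coefficients are computed with the closed-form least-squares expressions for the slope and intercept.

```python
def fit_simple_linear(xs, ys):
    """Return (beta0, beta1) for y = beta0 + beta1*x by least squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    beta1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    beta0 = (sum_y - beta1 * sum_x) / n
    return beta0, beta1


def predict(beta0, beta1, x):
    """Evaluate the estimated line at x."""
    return beta0 + beta1 * x


# Hypothetical observations used to estimate the model.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
beta0, beta1 = fit_simple_linear(xs, ys)

# Fitted values: predictions at x values that were used in estimation.
fitted_values = [predict(beta0, beta1, x) for x in xs]

# Genuine forecast: prediction at an x value that was not in the data.
new_x = 6
forecast = predict(beta0, beta1, new_x)

print("fitted values:", [round(v, 2) for v in fitted_values])
print(f"forecast at x = {new_x}: {forecast:.2f}")
```

The same `predict` call produces both kinds of value; what makes the last one a forecast is only that `new_x` played no part in estimating `beta0` and `beta1`.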

IV. RESULTS
Screenshots
Normality condition

FIGURE 1 BEFORE NORMALIZATION
This is an exponential curve; no logarithmic or exponential transformations have been applied here. An exponential function is any function where the variable is the exponent of a constant.

FIGURE 2 AFTER NORMALIZATION
Normalization here means transforming the data toward a normal distribution. It is distinct from database normalization, which is the process of organizing the columns (attributes) and tables (relations) of a relational database to minimize data redundancy. This is a bell-shaped curve, which shows that a log transformation has been applied to the predictor's exponential curve. The normal distribution is the bell curve (or normal curve).
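The effect of a log transformation on an exponentially growing predictor can be checked numerically. The Python sketch below is illustrative only: the values are invented (powers of two), and skewness is used here as a simple stand-in for the visual check in the figures, which is an assumption rather than the paper's stated procedure. A skewness near zero indicates a symmetric distribution.

```python
import math


def skewness(values):
    """Population skewness: third central moment divided by the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical predictor that grows exponentially (strongly right-skewed).
raw = [1, 2, 4, 8, 16, 32, 64, 128, 256]

# Natural-log transform; all values must be positive for this to be defined.
logged = [math.log(v) for v in raw]

skew_before = skewness(raw)
skew_after = skewness(logged)

print(f"skewness before log transform: {skew_before:.3f}")
print(f"skewness after log transform:  {skew_after:.3f}")
```

Symmetry alone does not establish normality, so a check like this would normally be paired with a histogram or a formal normality test.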

Summary.
ISSN No: 2309-4893
The VIF, or Variance Inflation Factor, is used to check the no-multicollinearity condition, which must hold to satisfy the assumptions made before applying the linear regression technique.
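For predictor j, the VIF is 1 / (1 - R_j^2), where R_j^2 is obtained by regressing that predictor on all the others. The Python sketch below covers only the special case of exactly two predictors, where R_j^2 equals the squared Pearson correlation between them. The `price` and `income` values and the function names are hypothetical; the paper does not list its data.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two of them."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictor columns (arbitrary units).
price = [1, 2, 3, 4, 5]
income = [2, 1, 4, 3, 5]

vif = vif_two_predictors(price, income)
print(f"VIF = {vif:.3f}")
```

A VIF of 1 means the predictors are uncorrelated. A common rule of thumb treats values above roughly 5 to 10 as a sign of problematic multicollinearity, although the exact cutoff is a matter of convention.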

The summary of the estimation done by the linear modeling technique (applied using the lm function) can be seen here.

V. FUTURE WORKS AND CONCLUSION
Linear regression techniques have been applied, and it can be seen how the target customer can be reached by applying linear regression analysis. Further, Multiple Regression Analysis can be applied where the model is based on two or more predictor variables. The additional constants can then be obtained by applying the Least Squares method. Non-linear models can also be converted to linear models. As seen in our approach, a log-linear model can be implemented on the collected data sets, where the attributes can be transformed from categorical labels to continuous-valued attributes. An iterative technique can thus be followed to build higher-order cubes from lower-order cubes.