PREDICTION OF CRM USING REGRESSION MODELLING

Aroushi Sharma #1, Ayush Gandhi #2, Anupam Kumar #3
#1, 2 Students, Dept. of Computer Science, MAIT, GGSIP University, Delhi, INDIA
#3 Assistant Prof., Dept. of Computer Science, MAIT, GGSIP University, Delhi, INDIA
1 sharma.aroushi@gmail.com 2 ayushgandhi16@gmail.com 3 anupamkumar@mait.ac.in

Abstract- Regression analysis can be applied to improve customer experience and to retain customers for an organization. This paper proposes the implementation of regression analysis for the prediction of CRM (Customer Relationship Management). The technique is used to predict laptop sales from various customer-related attributes such as base, width, processor configuration, date of purchase, the individual's income, and the price of the product. Applying regression analysis to this data reveals the attributes that actually drive laptop sales, so that the business can focus on those attributes to improve its quality and profits.

Keywords- CRM, Forecasting with Regression, Logistic Regression, Predictive Analytics, Regression Analysis, Segmented Regression

I. INTRODUCTION

Customer Relationship Management is an ERP (Enterprise Resource Planning) component that deals with managing an organization's current customers and its future prospects. CRM analyses the historical data of a customer to determine the elements that could help retain that customer and therefore improve the business. CRM also helps bring new business to the organization through a technique called lead management. Leads are potential customers who may do business with the organization. The historical data that supports CRM in customer retention is obtained through channels such as reviews, telephone, the company's website, email feedback, social media, and many more.
Adopting this technique may lead to favoritism toward a particular group of people, but this issue can be resolved by using the CRM technique and its related approaches efficiently. This paper proposes a regression model for CRM in order to retain the customers of any business.

II. MOTIVATIONS

Sales prediction is the backbone of a business plan and a major requirement for every organization today. Companies measure a business and its growth by sales, and sales prediction sets the standard for expenses, growth and profit.

1939 www.ijaegt.com

III. OBJECTIVES

We apply the concepts of regression analysis to predict laptop sales by analyzing the relationships among the various customer-related attributes. We propose a regression modeling technique that deals with the correlation and association between statistical variables; the variables taken here are treated in a symmetric way. The steps of regression modeling are as follows: first, develop a CRM model that collects and analyzes data and targets the desired customers by finding relationships between customer attributes; then generate the regression model; next, apply regression modeling for data mining; and finally generate the results. We then discuss each step and analyze the results, predicting the future by examining relationships among the various data sets.

Predictive analytics

Predictive analytics is one of several data mining strategies that can be used to generate and gather information from data. The information collected can be used to predict behavioral patterns or current trends. It has applications in crime detection and investigation and in the identification of suspects; for example, fraudulent credit card users can be tracked with this technique. More generally, it can be applied to the prediction of any unknown event, whether it belongs to the past, present, or future. Predictive analytics uses information collected from past experience to evaluate the relationships between the explanatory (predictor) variables and the predicted outcome. These relationships help to predict the unknown events. Predictive modeling, data mining and machine learning are important components of predictive analytics. They help analyze previous and present facts in order to make predictions about future events.
However, the accuracy and usefulness of these results depend on several assumptions, such as the absence of outliers, no multicollinearity, linearity and normality, and of course on the quality of the analysis itself.

Regression Model

The issue above can be addressed by applying a regression modeling technique to data mining. The technique is based on statistical methods for CRM and analyzes continuous-valued attributes. It is used to estimate the probability values associated with the data cube cells. The more attributes there are, the more dimensions the cuboid has. These dimensions are mapped to the attributes of the collected data set. The dimensions can further be reduced to 3-D and 2-D cuboids for a particular set of attributes, so higher-order cuboids can be built from lower-order cuboids. This forms the building block of most of our regression model, which includes the following steps:

Regression Analysis

The data can be analyzed with the help of statistical analysis techniques.
These techniques include linear regression, which is one of the simplest forms of regression. It is used to find the relationship between a random variable Y, known as the response variable, and another variable X, known as the predictor variable, where the relationship is assumed to be linear. The linear regression equation is

$$y = a + bx$$

where the variance of Y is assumed to be constant, and a and b are regression coefficients that specify the Y-intercept and the slope of the line, respectively. The coefficients can be solved for with the method of least squares, which minimizes the sum of squared errors between the actual data and the estimated line:

$$b = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}{N\sum X^2 - \left(\sum X\right)^2}, \qquad a = \frac{\sum Y - b\sum X}{N}$$

where N is the number of observations.

Logistic Regression

Logistic regression is a type of regression model in which the dependent variable (DV) is categorical. Logistic regression was developed by the statistician David Cox in 1958 (although much work was done on the single-independent-variable case almost two decades earlier). The binary logistic model is used to estimate the probability of a binary response based on one or more predictor (independent) variables. As such it is not merely a classification method; in the terminology of economics it could be called a qualitative response/discrete choice model.

Segmented Regression

Segmented regression is a technique in regression analysis in which the predictor variables are partitioned into intervals and a separate line segment is fitted to each interval. Segmented regression is useful when the independent variables, clustered into different groups, exhibit different relationships between the variables in these regions. Segmented regression analysis can also be performed on multivariate data by partitioning the various independent variables present. The boundaries between the segments are referred to as breakpoints.
Segmented linear regression is the variant in which the relationships within the intervals are estimated using linear regression.

Forecasting with Regression

Forecasts from a simple linear model are easily generated with the equation

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

Here x is the value of the predictor variable for which we want a forecast. If we supply a value for x in this equation, we obtain the corresponding forecast $\hat{y}$. A fitted value is the value of $\hat{y}$ that results when the calculation is done with an observed value of x from the dataset. A fitted value is not a genuine forecast, because the actual value of y for that predictor value was used in estimating the model, so the value of $\hat{y}$ is affected by the true value of y. For $\hat{y}$ to be a genuine forecast, x must take a new value, that is, one that does not appear in the data used to estimate the model.
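The least-squares formulas for the slope b and the intercept a, together with the distinction between fitted values and genuine forecasts, can be illustrated with a minimal sketch. The data below (laptop price against units sold) are hypothetical and chosen only for illustration; they are not the data set used in this paper, and the function names `fit_line` and `predict` are likewise assumptions. Here a plays the role of $\hat{\beta}_0$ and b the role of $\hat{\beta}_1$.

```python
def fit_line(xs, ys):
    """Closed-form least-squares estimates (a, b) for y = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: (N*SUM(XY) - SUM(X)*SUM(Y)) / (N*SUM(X^2) - SUM(X)^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: (SUM(Y) - b*SUM(X)) / N
    a = (sum_y - b * sum_x) / n
    return a, b


def predict(a, b, x):
    """Evaluate the estimated line at x."""
    return a + b * x


# Hypothetical observations: price (in thousands) and units sold.
prices = [30, 40, 50, 60, 70]
units = [95, 82, 71, 58, 44]

a, b = fit_line(prices, units)

# Fitted values: the line evaluated at x values that were used in estimation.
fitted = [predict(a, b, x) for x in prices]

# Genuine forecast: 55 does not appear in the estimation data.
forecast = predict(a, b, 55)

print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
print(f"forecast at x = 55: {forecast:.2f}")
```

Because the estimated line passes through the point of means, the fitted values always average to the mean of the observed y values, which is one quick sanity check on an implementation of these formulas.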
IV. RESULTS

Screenshots

Normality condition

FIGURE 1 BEFORE NORMALIZATION

The distribution follows an exponential curve; no logarithmic or exponential transformation has been applied here. An exponential function is any function in which the variable is the exponent of a constant.

FIGURE 2 AFTER NORMALIZATION

Normalization here means transforming the predictor so that its distribution is approximately normal. It should not be confused with database normalization, which is the process of organizing the columns (attributes) and tables (relations) of a relational database to minimize data redundancy. The bell-shaped curve shows that a log transformation has been applied to the predictor's exponential curve. The normal distribution is the bell curve (or normal curve).
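The effect of the log transformation on the shape of a distribution can be sketched numerically. The example below is an illustration under stated assumptions, not the paper's actual analysis: the `incomes` values are hypothetical, and the sample skewness (third standardized moment) is used here as a simple stand-in for visually inspecting the histogram. A skewness near zero indicates a symmetric distribution, as the bell curve is.

```python
import math


def skewness(values):
    """Sample skewness: third central moment divided by variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical, exponentially growing predictor values (assumption).
incomes = [1, 2, 4, 8, 16, 32, 64, 128, 256]

# Log transformation: exponential growth becomes evenly spaced values.
log_incomes = [math.log(v) for v in incomes]

raw_skew = skewness(incomes)      # strongly positive: long right tail
log_skew = skewness(log_incomes)  # approximately zero: symmetric

print(f"skewness before transformation: {raw_skew:.3f}")
print(f"skewness after log transformation: {log_skew:.3f}")
```

A symmetric distribution is not necessarily normal, so in practice this check would be combined with a histogram or a normal quantile plot.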
Summary.

ISSN No: 2309-4893

The VIF (Variance Inflation Factor) is used to check the no-multicollinearity condition, one of the assumptions that must hold before the linear regression technique is applied.
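The VIF of a predictor is defined as 1 / (1 - R²), where R² is obtained by regressing that predictor on all the other predictors. The sketch below covers only the simplest case of exactly two predictors, where R² reduces to the squared Pearson correlation between them; with more predictors a multiple regression would be needed. The attribute values and the helper names `pearson_r` and `vif_two_predictors` are hypothetical and serve only as an illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with two predictors, R^2 equals r squared."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical attribute values (assumptions, not the paper's data).
income = [20, 35, 50, 65, 80]
price = [31, 39, 52, 58, 70]   # moves closely with income
width = [14, 15, 13, 15, 14]   # unrelated to income

vif_income_price = vif_two_predictors(income, price)
vif_income_width = vif_two_predictors(income, width)

print(f"VIF (income, price): {vif_income_price:.1f}")
print(f"VIF (income, width): {vif_income_width:.1f}")
```

A VIF close to 1 indicates that a predictor is nearly uncorrelated with the others. A commonly used rule of thumb treats values above roughly 5 to 10 as a sign of problematic multicollinearity, although the exact threshold is a matter of convention.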
The summary of the estimation produced by the linear modeling technique (applied using the lm function) can be seen here.

V. FUTURE WORKS AND CONCLUSION

Linear regression techniques have been applied, and it can be seen how the target customer can be reached through linear regression analysis. Further, multiple regression analysis is applicable when the model is based on two or more predictor variables. The additional coefficients can then be obtained by applying the least squares method. Non-linear models can also be converted to linear models. As seen in our approach, a log-linear model can be implemented on the collected data sets, with the attributes transformed from categorical labels to continuous-valued attributes. Thus an iterative technique can be followed to build higher-order cubes from lower-order cubes.

VI. REFERENCES

[1] Bueren, A., Schierholz, R., Kolbe, L., Brenner, W.: Customer Knowledge Management - Improving Performance of Customer Relationship Management with the Knowledge Management. St. Gallen, Switzerland
[2] Weiss, S.M., Kulikowski, C.A.: Computer Systems That can Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and the Expert Systems. Morgan Kaufmann, San Mateo, CA (2011)
[3] Weiss, S.M., Indurkhya, N.: Predictive Data Mining. Morgan Kaufmann, San Francisco (2012)
[4] Li, Y.-M.: A general linear-regression analysis applied to the 3-parameter Weibull distribution. IEEE Transactions on Reliability 43(2), 255-263 (2014)
[5] Shani, D. and Chalasani, S., "Exploiting niches using relationship marketing", The Journal of Consumer Marketing, Vol. 9, No. 3, pp. 33-42, 2010.
[6] Porter, M. E. and Millar, V. E., "How Information Gives You Competitive Advantage", Harvard Business Review, No. 4, pp. 149-160, 2010.
[7] Douglas C. Montgomery, Elizabeth A. Peck, G. Geoffrey Vining: Introduction to Linear Regression Analysis, Department of Statistics, Blacksburg, VA, pp. 104-117, 2012.
[8] V. V Das, R. Vijaykumar et al. (Eds.): ICT 2010, CCIS 101, Springer-Verlag Berlin Heidelberg, pp. 195-200, 2010.