A comparative study of Linear learning methods in Click-Through Rate Prediction
2015 International Conference on Soft Computing Techniques and Implementations (ICSCTI)
Department of ECE, FET, MRIU, Faridabad, India, Oct 8-10, 2015

Antriksh Agarwal, Avishkar Gupta, Dr. Tanvir Ahmad
Department of Computer Engg, Jamia Millia Islamia, Okhla, Delhi
antriksh5235@gmail.com, avishkar.gupta.delhi@gmail.com, tahmad2@jmi.ac.in

Abstract
A major challenge in the current era of search engine advertising is choosing which advertisements to show in response to a user query. This significantly impacts the overall user experience and, more importantly, the advertising revenue stream for the search engine provider. Predicting click-through rates (CTR) for an advertisement is a massive-scale learning problem that is central to the multi-billion dollar online advertising industry. This study examines the performance of some well-known statistical learning methods (linear and logistic) with respect to their efficiency in predicting the click-through rate of an impression, where an impression can simply be defined as an instance of a particular advertisement, with each instance defined in terms of the learning parameters in our data set. Our data set contained three types of independent attributes that act as regressors for our dependent variable: the app through which the ad was clicked, the site type, and the domain to which it led, together with other anonymised variables. The algorithm parameters were fine-tuned to obtain promising results. In addition, a dimensionality check on the data set was conducted to explore the possibilities for dimensionality reduction. Logistic loss (log-loss) was used as the validation index in all cases.
Our observations led us to the conclusion that, with minimal data preprocessing, linear models give competitive results suited to most practical applications in which the learning method chosen should not be computationally expensive. We further verify this claim by comparing the performance of linear models on various subsets of the data set attributes, showing that the performance of the linear techniques was consistent throughout.

Keywords: Logistic Regression; Click-Through Rate; Linear Models; Logistic Classifier-Regressor; Advertising

I. INTRODUCTION
Advertising via sponsored search results has become the platform for companies to gain a reputation for themselves beyond local markets. It acts as a major revenue source for search engine providers such as Google, Yahoo and Bing, generating revenues of the order of 25 billion dollars and upwards [2]. One way to predict the effectiveness of a marketing campaign would be to record the user's reaction to the ad when it shows up. However, since that is not feasible with current technology, the click-through rate, which tells us how many visitors merely "initiated action" in response to the showing of the ad, serves as a metric for understanding user behavior. Different advertisers target different kinds of users: a mountaineering equipment company will be interested in users who may have bought some sporting gear recently, and an airline would prefer to display its ads to people who are frequent fliers. Click-through rate prediction plays an important role in this area of sponsored advertising. A higher click-through rate is a clear indicator of the success of an online marketing campaign: it means that more users are clicking the ad, which means the campaign is reaching its target audience.
Click-through rate prediction is therefore necessary to be able to further optimize ad placement in the sponsored search market. The sponsored search advertising model exploits two key aspects of on-line advertising [3]. First, the user enters a query to the search engine, which is a giveaway of their intent and determines the type of advertisement that would be shown to them. Second, if a user follows the link, the success can be attributed directly to the search engine provider in the case of sponsored search. However, where these advertisements are placed on websites as banner ads and the like, a large number of factors come into play and things are not so straightforward. The positioning of the ad on the site and the device being used to browse the site are some common ones. Also, the advertising on these sites is directly linked to the traffic volume on the original website where the ad is displayed. Because of this, it is necessary to factor in these attributes when trying to calculate the click-through rate for an advertisement on a site other than that of a search engine provider.

Our work is aimed at predicting CTR in these cases, where the advertisement is not necessarily displayed on a search portal. In these scenarios user queries are no longer available to exploit, and factors such as the theme of the website then have to be taken into account. Most work in this area has been carried out by search engine providers, but their techniques are not applicable 'as-is' here because in most cases they do not accommodate these attributes.

/15/$ IEEE Track 3 : Hybrid Intelligence - 97

Click-through rate can be defined as:
\[ \mathrm{CTR} = \frac{\text{number of clicks}}{\text{number of impressions}} \]
where each impression refers to one showing of the ad.

This paper attempts to compare the performance of some well-known linear and logistic learning methods in click-through rate prediction and touches on the key role that CTR prediction plays in sponsored search. We chose linear models, since training a single-layer model would allow us to handle significantly larger data sets and larger models than have been reported elsewhere. Also, the data pre-processing was kept to a minimum, as our objective was to draw a comparison based on the performance of classification on the data set.

II. RELATED WORKS
Craswell, Ramsey, et al. [5] analyzed the effect of a link's position in determining the probability that the link will be clicked. They compared four real-world models with logistic regression. They proposed a cascade model that is parameter-free and can be applied to click observations without the need for training data. Their model, however, performed badly in lower ranks. Their results went into depth about how the position of an ad will affect its probability of being clicked, just like a search result. Azin Ashkan, Charles L.A. Clarke et al. estimated ad click-through rate by exploring user queries and click-through logs. Their findings indicate that the rank of an ad, the query intent, the number of ads displayed on the result page and similar factors are effective in estimating click-through rate [6]. A related paper [7] by Zhong, Wang et al. explored the user's post-click behavior (such as the dwell time on the clicked document, and whether there are further clicks on the clicked document). They monitored the user's post-click activity after leaving the search page and proposed a click model. The works of Ye Chen, Tak W. Yan [8] and several others hint at the positional-bias problem in unison. The works of Jingfang Xu et al. [9] and Ben Carterette & Rosie Jones [10] touch on the problem of minimizing relevance judgment errors.
Their findings provided a way of comparing ranking functions by predicting relevance from click-through rates. In addition to this, previous eye-tracking experiments and studies on explaining position-bias of user clicks provide a spectrum of hypotheses and models on how an average user examines and possibly clicks web documents returned by a search engine with respect to the submitted query.

III. PROPOSED ARCHITECTURE
Fig. 1 describes the methodology we employed for the prediction of click-through rate based on the independent variables. It has the following modules: Data Pre-processing, Logistic Loss based Classifier Selection, Linear Models and Dimensionality Reduction.

A. Procuring the Data-set
This is the primary step, during which data obtained from logs of websites are used to derive the independent variables. A raw (not scaled) data set is obtained and saved in a standard format (e.g. CSV).

Fig. 1. Proposed Architecture of Prediction
B. Data Pre-processing
Various data pre-processing steps such as data scaling, field removal and format conversion were applied. They can be summarized as follows.

1) Feature Selection: In the field removal step, columns like ID and Serial No. were removed from the data, since these columns were used for identifying the rows and have no role in classification. Candidate features are chosen out of the features obtained in the previous step such that their removal does not affect the accuracy of the classification model. Among those candidates, the ones for which we have a rationale for removal are removed, such as the identifiers in each table.

2) Feature Engineering: Features such as time and hour, which are given in a date-time format in the table, had to be separated, and special functions were created for this purpose.

3) Feature Extraction: Often features are not given as continuous values but as categorical ones. When the data of a particular feature consist of discrete values instead of the continuous values that are usually used to classify, we cannot use these features directly with the estimators. The estimators expect the input to be continuous and would interpret the categories as being ordered, which is often not desired. One possibility for converting categorical features into features that can be used is feature hashing. Feature hashing, also known as the hashing trick, is a fast and space-efficient way of vectorising features, i.e. turning categorical features into indices in a vector or matrix. It works by applying a hash function to the features and using their hash values as indices directly, rather than looking the indices up in an associative array; the hash value determines the column index in the sample matrix.

C. Log-Loss Based Classifier Selection
In this step the emphasis is on the selection of the classification algorithm. The data set should be tried on various machine learning (ML) algorithms.
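As an illustration of the hashing trick described in the Feature Extraction step above, the following is a minimal sketch in plain Python. It is not the code used in this study: the helper name `hash_features`, the bucket count of 16, the choice of SHA-256 as the hash function and the sample rows are all assumptions made for the example.

```python
import hashlib


def hash_features(row, n_buckets=16):
    """Turn a dict of categorical fields into a fixed-length count vector.

    Hypothetical helper for illustration; n_buckets is an arbitrary choice.
    """
    vec = [0] * n_buckets
    for field, value in row.items():
        token = f"{field}={value}".encode("utf-8")
        # Use a stable hash: Python's built-in hash() is salted per process
        # for strings, so it would not give reproducible column indices.
        digest = hashlib.sha256(token).digest()
        # The hash value itself picks the column; no lookup table is kept.
        index = int.from_bytes(digest[:8], "big") % n_buckets
        vec[index] += 1
    return vec


# Made-up impressions that reuse field names from the data set.
row_a = {"site_category": "news", "app_category": "games", "device_type": "1"}
row_b = {"site_category": "news", "app_category": "tools", "device_type": "1"}

vec_a = hash_features(row_a)
vec_b = hash_features(row_b)
print(vec_a)
print(vec_b)
```

Every row maps to a vector of the same length no matter how many distinct category values exist. The price is that two different values can collide in one bucket, which becomes less likely as `n_buckets` grows.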
Trying several algorithms in this way aids in the selection of the base learner. Logarithmic loss is the loss function used in multinomial logistic regression and extensions of it such as neural networks, defined as the negative log-likelihood of the true labels given a probabilistic classifier's predictions.

Logistic loss for y_i ∈ {0, 1}:
\[ L = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right] \]
where \hat{y}_i is a prediction for the i-th instance.

Logistic loss for y_i ∈ {-1, 1}:
\[ L = \frac{1}{m} \sum_{i=1}^{m} \log \left( 1 + e^{-y_i p_i} \right) \]
where p_i is a raw score from the model and i ∈ {1, ..., m}.

D. Classifier
Identifying to which of a set of categories a new observation belongs, on the basis of a training set whose category membership is known and the statistical relationship among the variables in it, is commonly referred to as classification. It includes many techniques for modelling and analysing several variables and finally derives a relationship between a dependent variable and the independent variables. Classifier performance depends greatly on the characteristics of the data to be classified. Various empirical tests have to be performed to compare classifier performance and to find the characteristics of the data that determine it.

A large number of algorithms for classification can be phrased in terms of a linear function that assigns a score to each possible category k by combining the feature vector of an instance with a vector of weights, using a dot product. The predicted category is the one with the highest score. This type of score function is known as a linear predictor function and has the following general form:
\[ \operatorname{score}(X_i, k) = \beta_k \cdot X_i \]
where X_i is the feature vector for instance i, \beta_k is the vector of weights corresponding to category k, and score(X_i, k) is the score associated with assigning instance i to category k. Algorithms with this basic setup are known as linear classifiers.

E. Dimensionality Reduction (DR)
DR is the process of removing variables from the data set that are correlated with each other and might degrade the classifier accuracy. The following steps were performed in order to improve the accuracy.
1) Iterative Classification: Each variable in the data set is excluded one at a time and a model is built using Logistic Regression. Features whose exclusion results in a logistic loss lower than the default logistic loss (when no variable is removed) are noted down.

2) Dimension Reduction: Candidate features are chosen out of the features obtained in the previous step such that their removal does not affect the accuracy of the classification model. Among those candidates, the ones for which we have a rationale for removal are removed. This accuracy-driven DR approach is also known as the wrapper approach.
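The two DR steps above can be sketched as follows. This is a minimal plain-Python illustration on synthetic data, not the scikit-learn setup used in the experiments: the helper names (`fit_logreg`, `removal_candidates` and so on), the learning rate, the number of epochs and the three made-up features are all assumptions for the example.

```python
import math
import random


def predict(w, x):
    """Logistic model: w[0] is the bias, w[1:] are the feature weights."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))


def fit_logreg(X, y, epochs=30, lr=0.1):
    """Fit logistic regression with plain stochastic gradient steps."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = predict(w, x) - target  # gradient of the log-loss w.r.t. z
            w[0] -= lr * err
            for j, xj in enumerate(x):
                w[j + 1] -= lr * err * xj
    return w


def log_loss(w, X, y, eps=1e-15):
    """Mean negative log-likelihood of binary labels under the model."""
    total = 0.0
    for x, target in zip(X, y):
        p = min(max(predict(w, x), eps), 1.0 - eps)
        total -= target * math.log(p) + (1 - target) * math.log(1.0 - p)
    return total / len(y)


def drop_column(X, j):
    """Return a copy of X without feature column j."""
    return [row[:j] + row[j + 1:] for row in X]


def removal_candidates(X_tr, y_tr, X_va, y_va, names):
    """Exclude each feature in turn; keep those whose removal lowers log-loss."""
    baseline = log_loss(fit_logreg(X_tr, y_tr), X_va, y_va)
    candidates = []
    for j, name in enumerate(names):
        w = fit_logreg(drop_column(X_tr, j), y_tr)
        loss = log_loss(w, drop_column(X_va, j), y_va)
        if loss < baseline:
            candidates.append((name, loss))
    return baseline, candidates


# Synthetic data: only "f0" drives the label; "f1" and "f2" are noise.
rng = random.Random(0)
names = ["f0", "f1", "f2"]
X = [[rng.gauss(0, 1) for _ in names] for _ in range(300)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * row[0])) else 0 for row in X]
X_tr, y_tr, X_va, y_va = X[:200], y[:200], X[200:], y[200:]

baseline, candidates = removal_candidates(X_tr, y_tr, X_va, y_va, names)
print("baseline log-loss:", round(baseline, 4))
print("removal candidates:", candidates)
```

The function only produces the candidate list of step 1; the final choice in step 2 still relies on a rationale for each removal, which is a judgment the code does not make.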
IV. EXPERIMENTAL SETUP

A. Experimental Data-Set
We used ten days of sub-sampled click-through rate data provided by Avazu [1], made available on Kaggle, an online portal for data science. This set consisted of about 40 million lines of training data, which we further sub-sampled to create a split of training and testing data using the first 500,000 records. The sets had the following attributes.

Fig. 2. Types of attributes in the data-sets, categorized according to their physical intuition

The feature we had to predict among the attributes above is 'click'; the others function as the independent features. Out of the independent features, the parameters C1 and C14-C21 represent categorical features (where each value represents an ID and has no quantitative significance) whose meaning was anonymized by Avazu for business reasons. These anonymized features represent various attributes such as the dimensions of the advertisement. The features whose meaning was known, even though they also contain hashed strings, were likewise categorical (discrete) features. They covered the following attributes:

site_id, site_domain and site_category: specify the site on which an impression of the advertisement was put.
app_id, app_domain and app_category: specify the app in which the advertisement, or the webpage with the advertisement, was shown.
device_id, device_ip, device_model, device_type and device_conn_type: identify the device of the user on which the impressions were shown.

Prototyping Tools
We used the classifiers provided in the scikit-learn toolkit [11] available for the Python programming language. The SciPy toolkit featuring NumPy, SciPy and all associated packages was also used.

B. Experimental Procedure
The experiment was then conducted using the architecture proposed in Section III. We tested the architecture using three learning methods: vanilla logistic regression, Stochastic Gradient Descent (SGD Classifier) and a Bayesian method (multinomial Bayes).
Logistic regression and Stochastic Gradient Descent were chosen because logistic regression attempts to minimize log-loss, and SGD supports different loss functions and penalties for classification. We used a Bayesian method to show that the features are not independent of each other, as otherwise the naïve Bayes assumption of feature independence would make it a viable option as well. This was done to find which classifier performed best given our set of chosen features. Some variables, such as id, app_id, site_id and site_domain, were removed at the start, so that the models do not use these distinct-valued attributes to create additional features that are too specific to an impression. This was also done so as not to unnecessarily increase the size of the dataset. Other variables were removed and tested to see which variables best fit our classification. The logistic loss computed provided a fair deal of insight into how well the algorithm was performing on our sub-sampled data, with click being the binary target feature for which the probability of classification was calculated.

Table I. Comparison of different learning methods with their output log-losses.
Learning Method Applied    Logistic Loss
SGD Classifier
Logistic Regression
Multinomial Bayes

We applied various learning methods to check which one gave us the best results. As can be seen above, Logistic Regression gave us the best output. We then conducted experiments to find out what set of attributes, taken together, gave us the best estimate of the click-through rate. For this we tested our results with iterative reduction and dimension reduction.

Table II. Using linear models to see how attribute removal changed the output.
Attributes Removed    Log-Loss
None
app_category
site_category
device_conn_type
C1, C14-C
C1, app_category, site_category
C1, device_conn_type

C. Logistic Regression
Logistic regression is a regression model in which the dependent variable is categorical. Logistic regression measures the relationship between the dependent variable and the independent variables by estimating probabilities using a logistic function. The mathematics of logistic regression
begins with the explanation of the logistic function. The logistic function is useful because it can take an input with any value from negative to positive infinity, whereas the output always takes values between zero and one and hence is interpretable as a probability. The logistic function is defined as follows:
\[ \sigma(t) = \frac{1}{1 + e^{-t}} \]
If t is viewed as a linear function of an explanatory variable x (or of a linear combination of explanatory variables), then we express t as follows:
\[ t = \beta_0 + \beta_1 x \]
And the logistic function can now be written as:
\[ F(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}} \]
Note that F(x) is interpreted as the probability of the dependent variable equaling a "success" or "case" rather than a failure or non-case. It is clear that the response variables Y_i are not identically distributed: P(Y_i = 1 | X) differs from one data point X_i to another, though they are independent given the design matrix X and the shared parameters \beta.

V. RESULTS AND DISCUSSION
The experiment showed that among the listed linear models (in Table I), Logistic Regression was the best algorithm for finding the click-through rate. This result is plausible because logistic regression returns well-calibrated predictions, as it directly optimizes log-loss: gradient descent in logistic regression minimizes the cost represented by equation (2). Hence, in a way, logistic regression can be expected to give better log-loss values. Although Stochastic Gradient Descent (SGD) did not give a better result than logistic regression, it is an online classifier that does not need to be given all the data at the same time. For logistic regression, we have to feed all the data at the same time, and with the amount of data we had in the dataset, we did consider using other algorithms before trying to use logistic regression for better results. It can be argued that the cost function being minimized in SGD is very close to equation (2), but the version summed over the terms gave a better result than the one that was not summed. We also know that the performance of SGD improves as the size of the data grows. So, the improvement of SGD with the size of data is good. Thus, it is very much possible that if we had supplied the whole of the data set that was available to us, we might have been able to show that SGD was better than Logistic Regression. Other methods like Naïve Bayes tend to push probabilities to 0 or 1. This is mainly because Naïve Bayes assumes that features are conditionally independent given the class, which was not the case in this dataset.

Another result that caught our eye was that adding or removing any of the variables did not contribute much to improving the log-loss value. This shows that, as stated in the book [10], a regression model does not imply a cause-and-effect relationship between the independent and the dependent variables. Even though a strong empirical relationship may exist between them, it cannot be considered evidence that the classifier features and the response are related in a cause-and-effect manner. To establish causality, the relationship between the classifier features and the response must have a basis outside the sample data.

VI. CONCLUSION AND FUTURE SCOPE
Given the excellent performance of Logistic Regression in comparison to the other models, and the consistency in results shown when using this technique across all sets of features, our recommendation would be to use logistic regression in a practical situation where one can afford to run batch learning jobs frequently. However, the SGD classifier came in as a close second, and because it is an online learning method, classification can be improved by partially fitting any new data that comes in. SGD is therefore ideal for situations where data is constantly flowing in rather than arriving in batches. For such workflows, Logistic Regression would require training the classifier with the entire dataset each time some modification needs to be made. Also, from our dimensionality reduction efforts, it is clear that classification will remain consistent even if one of the key features is not present in the data set.

We can look into more robust data preprocessing models and observe how preprocessing in different ways affects our results. We can also look into up-and-coming data-intensive, parallel programming techniques and GPU-based programming to incorporate larger data sets, since at present we were able to use only part of the training data. Other potentially interesting future work would be to observe how variety websites such as aggregation or social media platforms compare to theme-specific sites that focus on only one aspect of content, such as sports portals. This constitutes our future work.

REFERENCES
[1] emarketer, April 2009
[2] Broder, Josifovski, Introduction to Computational Advertising at Stanford, Lecture Notes, 2009
[3] Nick Craswell, Onno Zoeter, Michael Taylor, Bill Ramsey, An Experimental Comparison of Click Position-Bias Models.
[4] Azin Ashkan, Charles L.A. Clarke, Eugene Agichtein, Qi Guo, Estimating Ad Click-through Rate through Query Intent Analysis.
[5] Zhong et al., Incorporating Post-Click Behaviors into a Click Model.
[6] Ye Chen, Tak W. Yan, Position-Normalized Click Prediction in Search Advertising.
[7] Xu et al., Improving Quality of Training Data for Learning to Rank Using Click-Through Data.
[8] Ben Carterette, Rosie Jones, Evaluating Search Engines by Modeling the Relationship Between Relevance and Clicks.
[9] M. Young, The Technical Writer's Handbook. Mill Valley, CA: University Science,
[10] Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." The Journal of Machine Learning Research 12 (2011):
Michael Dimitras A12465780 CSE 190 Assignment 2 Predicting Airbnb Bookings by Country 1: Dataset Description For this assignment, I selected the Airbnb New User Bookings set from Kaggle. The dataset is
More informationGlobal Media Intelligence Report
Q3 2013 Neustar Aggregate Knowledge Global Media Intelligence Report TABLE OF CONTENTS THE GLOBAL MEDIA INTELLIGENCE REPORT Where Math Men Meet Mad Men 3 About the Report 3 EXECUTIVE SUMMARY 4 COST INDEX
More informationAirbnb Price Estimation. Hoormazd Rezaei SUNet ID: hoormazd. Project Category: General Machine Learning gitlab.com/hoorir/cs229-project.
Airbnb Price Estimation Liubov Nikolenko SUNet ID: liubov Hoormazd Rezaei SUNet ID: hoormazd Pouya Rezazadeh SUNet ID: pouyar Project Category: General Machine Learning gitlab.com/hoorir/cs229-project.git
More informationPredict Commercial Promoted Contents Will Be Clicked By User
Predict Commercial Promoted Contents Will Be Clicked By User Gary(Xinran) Guo garyguo@stanford.edu SUNetID: garyguo Stanford University 1. Introduction As e-commerce, social media grows rapidly, advertisements
More informationA Study of Financial Distress Prediction based on Discernibility Matrix and ANN Xin-Zhong BAO 1,a,*, Xiu-Zhuan MENG 1, Hong-Yu FU 1
International Conference on Management Science and Management Innovation (MSMI 2014) A Study of Financial Distress Prediction based on Discernibility Matrix and ANN Xin-Zhong BAO 1,a,*, Xiu-Zhuan MENG
More informationBig Data. Methodological issues in using Big Data for Official Statistics
Giulio Barcaroli Istat (barcarol@istat.it) Big Data Effective Processing and Analysis of Very Large and Unstructured data for Official Statistics. Methodological issues in using Big Data for Official Statistics
More informationApplication of Decision Trees in Mining High-Value Credit Card Customers
Application of Decision Trees in Mining High-Value Credit Card Customers Jian Wang Bo Yuan Wenhuang Liu Graduate School at Shenzhen, Tsinghua University, Shenzhen 8, P.R. China E-mail: gregret24@gmail.com,
More informationPrediction of Google Local Users Restaurant ratings
CSE 190 Assignment 2 Report Professor Julian McAuley Page 1 Nov 30, 2015 Prediction of Google Local Users Restaurant ratings Shunxin Lu Muyu Ma Ziran Zhang Xin Chen Abstract Since mobile devices and the
More informationClassifying Search Advertisers. By Lars Hirsch (Sunet ID : lrhirsch) Summary
Classifying Search Advertisers By Lars Hirsch (Sunet ID : lrhirsch) Summary Multinomial Event Model and Softmax Regression were applied to classify search marketing advertisers into industry verticals
More informationAirbnb Capstone: Super Host Analysis
Airbnb Capstone: Super Host Analysis Justin Malunay September 21, 2016 Abstract This report discusses the significance of Airbnb s Super Host Program. Based on Airbnb s open data, I was able to predict
More informationSalford Predictive Modeler. Powerful machine learning software for developing predictive, descriptive, and analytical models.
Powerful machine learning software for developing predictive, descriptive, and analytical models. The Company Minitab helps companies and institutions to spot trends, solve problems and discover valuable
More informationML Methods for Solving Complex Sorting and Ranking Problems in Human Hiring
ML Methods for Solving Complex Sorting and Ranking Problems in Human Hiring 1 Kavyashree M Bandekar, 2 Maddala Tejasree, 3 Misba Sultana S N, 4 Nayana G K, 5 Harshavardhana Doddamani 1, 2, 3, 4 Engineering
More informationWhen to Book: Predicting Flight Pricing
When to Book: Predicting Flight Pricing Qiqi Ren Stanford University qiqiren@stanford.edu Abstract When is the best time to purchase a flight? Flight prices fluctuate constantly, so purchasing at different
More informationHow to Drive. Online Marketing through Web Analytics! Tips for leveraging Web Analytics to achieve the best ROI!
How to Drive Online Marketing through Web Analytics! Tips for leveraging Web Analytics to achieve the best ROI! an ebook by - Delhi School Of Internet marketing Table of Content Introduction Chapter1:
More informationChurn Prediction Model Using Linear Discriminant Analysis (LDA)
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 18, Issue 5, Ver. IV (Sep. - Oct. 2016), PP 86-93 www.iosrjournals.org Churn Prediction Model Using Linear Discriminant
More informationOnline appendix for THE RESPONSE OF CONSUMER SPENDING TO CHANGES IN GASOLINE PRICES *
Online appendix for THE RESPONSE OF CONSUMER SPENDING TO CHANGES IN GASOLINE PRICES * Michael Gelman a, Yuriy Gorodnichenko b,c, Shachar Kariv b, Dmitri Koustas b, Matthew D. Shapiro c,d, Dan Silverman
More informationBitcoin UTXO Lifespan Prediction
Bitcoin UTXO Lifespan Prediction Robert Konrad & Stephen Pinto December, 05 Background & Motivation The Bitcoin crypto currency [, ] is the most widely used and highly valued digital currency in existence.
More informationState-of-the-Art Diamond Price Predictions using Neural Networks
State-of-the-Art Diamond Price Predictions using Neural Networks Charley Yejia Zhang, Sean Oh, Jason Park Abstract In this paper, we discuss and evaluate models to predict the prices of diamonds given
More informationStrength in numbers? Modelling the impact of businesses on each other
Strength in numbers? Modelling the impact of businesses on each other Amir Abbas Sadeghian amirabs@stanford.edu Hakan Inan inanh@stanford.edu Andres Nötzli noetzli@stanford.edu. INTRODUCTION In many cities,
More informationPredicting and Explaining Price-Spikes in Real-Time Electricity Markets
Predicting and Explaining Price-Spikes in Real-Time Electricity Markets Christian Brown #1, Gregory Von Wald #2 # Energy Resources Engineering Department, Stanford University 367 Panama St, Stanford, CA
More informationA STUDY ON STATISTICAL BASED FEATURE SELECTION METHODS FOR CLASSIFICATION OF GENE MICROARRAY DATASET
A STUDY ON STATISTICAL BASED FEATURE SELECTION METHODS FOR CLASSIFICATION OF GENE MICROARRAY DATASET 1 J.JEYACHIDRA, M.PUNITHAVALLI, 1 Research Scholar, Department of Computer Science and Applications,
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 3/8/2015 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 High dim. data Graph data Infinite data Machine
More information3DCNN for Lung Nodule Detection And False Positive Reduction
3DCNN for Lung Nodule Detection And False Positive Reduction Ping An Technology (Shenzhen) Co.,Ltd, China WUTIANBO484@pingan.com.cn Abstract The diagnosis of pulmonary nodules can be roughly divided into
More informationModeling User Click Behavior in Sponsored Search
Modeling User Click Behavior in Sponsored Search Vibhanshu Abhishek, Peter S. Fader, Kartik Hosanagar The Wharton School, University of Pennsylvania, Philadelpha, PA 1914, USA {vabhi, faderp, kartikh}@wharton.upenn.edu
More informationIBM SPSS & Apache Spark
IBM SPSS & Apache Spark Making Big Data analytics easier and more accessible ramiro.rego@es.ibm.com @foreswearer 1 2016 IBM Corporation Modeler y Spark. Integration Infrastructure overview Spark, Hadoop
More informationWhat about streaming data?
What about streaming data? 1 The Stream Model Data enters at a rapid rate from one or more input ports Such data are called stream tuples The system cannot store the entire (infinite) stream Distribution
More informationBusiness Analytics & Data Mining Modeling Using R Dr. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee
Business Analytics & Data Mining Modeling Using R Dr. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Lecture - 02 Data Mining Process Welcome to the lecture 2 of
More informationFraud Detection for MCC Manipulation
2016 International Conference on Informatics, Management Engineering and Industrial Application (IMEIA 2016) ISBN: 978-1-60595-345-8 Fraud Detection for MCC Manipulation Hong-feng CHAI 1, Xin LIU 2, Yan-jun
More informationNICE Customer Engagement Analytics - Architecture Whitepaper
NICE Customer Engagement Analytics - Architecture Whitepaper Table of Contents Introduction...3 Data Principles...4 Customer Identities and Event Timelines...................... 4 Data Discovery...5 Data
More informationKeyword Performance Prediction in Paid Search Advertising
Keyword Performance Prediction in Paid Search Advertising Sakthi Ramanathan 1, Lenord Melvix 2 and Shanmathi Rajesh 3 Abstract In this project, we explore search engine advertiser keyword bidding data
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 2 Classic model of algorithms You get to see the entire input, then compute some function of it In this context,
More informationProfit Optimization ABSTRACT PROBLEM INTRODUCTION
Profit Optimization Quinn Burzynski, Lydia Frank, Zac Nordstrom, and Jake Wolfe Dr. Song Chen and Dr. Chad Vidden, UW-LaCrosse Mathematics Department ABSTRACT Each branch store of Fastenal is responsible
More informationPROGRAMMATIC DEMYSTIFY DIGITAL. From the Digital Experts: An essential bite-size guide to the acronyms and underpinning of modern digital advertising.
PROGRAMMATIC DEMYSTIFY DIGITAL From the Digital Experts: An essential bite-size guide to the acronyms and underpinning of modern digital advertising. P O W E R ED B Y CONTENTS 1 - Demystify 2 - We demystify
More informationA HYBRID MODERN AND CLASSICAL ALGORITHM FOR INDONESIAN ELECTRICITY DEMAND FORECASTING
A HYBRID MODERN AND CLASSICAL ALGORITHM FOR INDONESIAN ELECTRICITY DEMAND FORECASTING Wahab Musa Department of Electrical Engineering, Universitas Negeri Gorontalo, Kota Gorontalo, Indonesia E-Mail: wmusa@ung.ac.id
More informationAPPLYING MACHINE LEARNING IN MOBILE DEVICE AD TARGETING. Leonard Newnham Chief Data Scientist
APPLYING MACHINE LEARNING IN MOBILE DEVICE AD TARGETING Leonard Newnham Chief Data Scientist Introduction Who is LoopMe? What we do The problem we solve Data Predictive models Bidders Future Research Lessons
More informationMulti-Touch Attribution
Multi-Touch Attribution BY DIRK BEYER HEAD OF SCIENCE, MARKETING ANALYTICS NEUSTAR A Guide to Methods, Math and Meaning Introduction Marketers today use multiple marketing channels that generate impression-level
More informationAzure ML Studio. Overview for Data Engineers & Data Scientists
Azure ML Studio Overview for Data Engineers & Data Scientists Rakesh Soni, Big Data Practice Director Randi R. Ludwig, Ph.D., Data Scientist Daniel Lai, Data Scientist Intersys Company Summary Overview
More informationCustomer Relationship Management in marketing programs: A machine learning approach for decision. Fernanda Alcantara
Customer Relationship Management in marketing programs: A machine learning approach for decision Fernanda Alcantara F.Alcantara@cs.ucl.ac.uk CRM Goal Support the decision taking Personalize the best individual
More informationENGG1811: Data Analysis using Excel 1
ENGG1811 Computing for Engineers Data Analysis using Excel (weeks 2 and 3) Data Analysis Histogram Descriptive Statistics Correlation Solving Equations Matrix Calculations Finding Optimum Solutions Financial
More informationNew restaurants fail at a surprisingly
Predicting New Restaurant Success and Rating with Yelp Aileen Wang, William Zeng, Jessica Zhang Stanford University aileen15@stanford.edu, wizeng@stanford.edu, jzhang4@stanford.edu December 16, 2016 Abstract
More informationPredicting Yelp Ratings From Business and User Characteristics
Predicting Yelp Ratings From Business and User Characteristics Jeff Han Justin Kuang Derek Lim Stanford University jeffhan@stanford.edu kuangj@stanford.edu limderek@stanford.edu I. Abstract With online
More informationKnowledgeSTUDIO. Advanced Modeling for Better Decisions. Data Preparation, Data Profiling and Exploration
KnowledgeSTUDIO Advanced Modeling for Better Decisions Companies that compete with analytics are looking for advanced analytical technologies that accelerate decision making and identify opportunities
More informationPreference Elicitation for Group Decisions
Preference Elicitation for Group Decisions Lihi Naamani-Dery 1, Inon Golan 2, Meir Kalech 2, and Lior Rokach 1 1 Telekom Innovation Laboratories at Ben-Gurion University, Israel 2 Ben Gurion University,
More informationClassic model of algorithms
Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit
More informationEnhanced Cost Sensitive Boosting Network for Software Defect Prediction
Enhanced Cost Sensitive Boosting Network for Software Defect Prediction Sreelekshmy. P M.Tech, Department of Computer Science and Engineering, Lourdes Matha College of Science & Technology, Kerala,India
More informationJeffrey D. Ullman Stanford University/Infolab. Slides mostly developed by Anand Rajaraman
Jeffrey D. Ullman Stanford University/Infolab Slides mostly developed by Anand Rajaraman 2 Classic model of (offline) algorithms: You get to see the entire input, then compute some function of it. Online
More informationA Soft Classification Model for Vendor Selection
A Soft Classification Model for Vendor Selection Arpan K. Kar, Ashis K. Pani, Bijaya K. Mangaraj, and Supriya K. De Abstract This study proposes a pattern classification model for usage in the vendor selection
More information2015 The MathWorks, Inc. 1
2015 The MathWorks, Inc. 1 MATLAB 을이용한머신러닝 ( 기본 ) Senior Application Engineer 엄준상과장 2015 The MathWorks, Inc. 2 Machine Learning is Everywhere Solution is too complex for hand written rules or equations
More informationReal Estate Appraisal
Real Estate Appraisal CS229 Machine Learning Final Project Writeup David Chanin, Ian Christopher, and Roy Fejgin December 10, 2010 Abstract This is our final project for Machine Learning (CS229) during
More informationData Science in a pricing process
Data Science in a pricing process Michaël Casalinuovo Consultant, ADDACTIS Software michael.casalinuovo@addactis.com Contents Nowadays, we live in a continuously changing market environment, Pricing has
More informationC3 Products + Services Overview
C3 Products + Services Overview AI CLOUD PREDICTIVE ANALYTICS IoT Table of Contents C3 is a Computer Software Company 1 C3 PaaS Products 3 C3 SaaS Products 5 C3 Product Trials 6 C3 Center of Excellence
More informationFINAL PROJECT REPORT IME672. Group Number 6
FINAL PROJECT REPORT IME672 Group Number 6 Ayushya Agarwal 14168 Rishabh Vaish 14553 Rohit Bansal 14564 Abhinav Sharma 14015 Dil Bag Singh 14222 Introduction Cell2Cell, The Churn Game. The cellular telephone
More informationStock Prediction using Machine Learning
Stock Prediction using Machine Learning Yash Omer e-mail: yashomer0007@gmail.com Nitesh Kumar Singh e-mail: nitesh.321.singh@gmail.com Awadhendra Pratap Singh e-mail: apsingh1096@gmail.com Dilshad Ashmir
More informationModel Selection, Evaluation, Diagnosis
Model Selection, Evaluation, Diagnosis INFO-4604, Applied Machine Learning University of Colorado Boulder October 31 November 2, 2017 Prof. Michael Paul Today How do you estimate how well your classifier
More informationRECOGNIZING USER INTENTIONS IN REAL-TIME
WHITE PAPER SERIES IPERCEPTIONS ACTIVE RECOGNITION TECHNOLOGY: RECOGNIZING USER INTENTIONS IN REAL-TIME Written by: Lane Cochrane, Vice President of Research at iperceptions Dr Matthew Butler PhD, Senior
More information2 Maria Carolina Monard and Gustavo E. A. P. A. Batista
Graphical Methods for Classifier Performance Evaluation Maria Carolina Monard and Gustavo E. A. P. A. Batista University of São Paulo USP Institute of Mathematics and Computer Science ICMC Department of
More informationCase studies in Data Mining & Knowledge Discovery
Case studies in Data Mining & Knowledge Discovery Knowledge Discovery is a process Data Mining is just a step of a (potentially) complex sequence of tasks KDD Process Data Mining & Knowledge Discovery
More informationINSIGHTS. Driving Decisions With Data: iquanti s Hybrid Approach to Attribution Modeling. Ajay Rama, Pushpendra Kumar
INSIGHTS Driving Decisions With Data: iquanti s Hybrid Approach to Attribution Modeling Ajay Rama, Pushpendra Kumar TABLE OF CONTENTS Introduction The Marketer s Dilemma 1. Media Mix Modeling (MMM) 1.
More informationFinding Hidden Intelligence with Predictive Analysis of Data Mining
Finding Hidden Intelligence with Predictive Analysis of Data Mining Rafal Lukawiecki Strategic Consultant, Project Botticelli Ltd rafal@projectbotticelli.com Objectives Show use of Microsoft SQL Server
More informationNVIDIA AND SAP INDUSTRY CHALLENGES INTEGRATED SOLUTION
NVIDIA AND SAP ACCELERATING ENTERPRISE INTELLIGENCE Deep learning is a collection of statistical machine learning techniques that is transforming every digital business. Applications using deep learning
More informationMulti-classes Feature Engineering with Sliding Window for Purchase Prediction in Mobile Commerce
5 IEEE 5th International Conference on Data Mining Workshops Multi-classes Feature Engineering with Sliding Window for Purchase Prediction in Mobile Commerce Qiang Li, Maojie Gu, Keren Zhou and Xiaoming
More informationGetting Started with HLM 5. For Windows
For Windows Updated: August 2012 Table of Contents Section 1: Overview... 3 1.1 About this Document... 3 1.2 Introduction to HLM... 3 1.3 Accessing HLM... 3 1.4 Getting Help with HLM... 3 Section 2: Accessing
More informationThe People-Based Marketing Strategy. Optimize campaign success with humanized data.
The People-Based Marketing Strategy Optimize campaign success with humanized data. 01 Introducing: People-Based Marketing In an ever-evolving technological world, it s more imperative than ever to adapt
More information