Introduction to Data Mining

Transcription:

Introduction to Data Mining
Lecture #16: Advertising
Seoul National University

In This Lecture
- Learn the online bipartite matching problem, the greedy algorithm for it, and the notion of competitive ratio
- Learn the problem of web advertising, the Adwords problem, and algorithms for them

Online Algorithms
- Classic model of algorithms
  - You get to see the entire input, then compute some function of it
  - In this context, this is called an offline algorithm
- Online algorithms
  - You get to see the input one piece at a time, and need to make irrevocable decisions along the way
  - Similar to the data stream model
- Why do we care?

Outline
- Online Bipartite Matching
- Web Advertising

Example: Bipartite Matching
[Figure: bipartite graph with boys 1, 2, 3, 4 on one side and girls a, b, c, d on the other]
- Nodes: boys and girls; edges: preferences
- Goal: Match boys to girls so that the maximum number of preferences is satisfied (but no person can be matched with two or more persons)

Example: Bipartite Matching
[Figure: the same bipartite graph of boys 1, 2, 3, 4 and girls a, b, c, d]
- $M = \{(1,a), (2,b), (3,d)\}$ is a matching
- Cardinality of the matching: $|M| = 3$

Example: Bipartite Matching
[Figure: the same bipartite graph of boys 1, 2, 3, 4 and girls a, b, c, d]
- $M = \{(1,c), (2,b), (3,d), (4,a)\}$ is a perfect matching
- Perfect matching: all vertices of the graph are matched
- Maximum matching: a matching that contains the largest possible number of matches

Matching Algorithm
- Problem: Find a maximum matching for a given bipartite graph
  - A perfect one if it exists
- There is a polynomial-time offline algorithm based on augmenting paths (Hopcroft & Karp 1973, see http://en.wikipedia.org/wiki/hopcroft-karp_algorithm)
- But what if we do not know the entire graph upfront?

Online Graph Matching Problem
- Initially, we are given the set of boys
- In each round, one girl's choices are revealed
  - That is, the girl's edges are revealed
- At that time, we have to decide to either:
  - Pair the girl with a boy
  - Not pair the girl with any boy
- Example application: assigning tasks to servers (given a task and the list of servers that can process it, determine which server processes the task)

Online Graph Matching: Example
[Figure: boys 1, 2, 3, 4 and girls a, b, c, d; resulting matching: (1,a), (2,b), (3,d)]

Greedy Algorithm
- Greedy algorithm: an algorithm that follows the heuristic of making the locally optimal choice at each stage, in the hope of finding a global optimum
- Greedy algorithm for the online graph matching problem:
  - Pair the new girl with any eligible boy
  - If there is none, do not pair the girl
- How good is the algorithm?
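The greedy rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `greedy_online_matching` and the edge lists in `arrivals` are assumptions chosen so that the result reproduces the matching (1,a), (2,b), (3,d) from the earlier example.

```python
def greedy_online_matching(arrivals):
    """Greedy online matching.

    arrivals: list of (girl, eligible_boys) pairs in arrival order.
    Returns the list of (boy, girl) pairs that were matched.
    """
    matched_boys = set()
    matching = []
    for girl, boys in arrivals:
        # Pair the new girl with the first eligible boy who is still free.
        for boy in boys:
            if boy not in matched_boys:
                matched_boys.add(boy)
                matching.append((boy, girl))
                break
        # If no eligible boy is free, the girl stays unmatched for good.
    return matching


# Hypothetical preference lists (the original figure is not available).
arrivals = [("a", [1, 2]), ("b", [2]), ("c", [1]), ("d", [3])]
print(greedy_online_matching(arrivals))
```

Each decision is made when the girl arrives and is never revisited, which is what makes the algorithm online.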

Competitive Ratio
- For input I, suppose greedy produces the matching $M_{greedy}$ while an optimal matching is $M_{opt}$
- Competitive ratio $= \min_{\text{all possible inputs } I} \left( |M_{greedy}| / |M_{opt}| \right)$
  (that is, greedy's worst performance over all possible inputs I)
- For example, if the competitive ratio is 0.4, we are assured that the greedy algorithm gives an answer that is at least 40% as large as the optimal algorithm's, for ANY input

Analyzing the Greedy Algorithm
- Claim: the greedy algorithm for the bipartite matching problem has competitive ratio 0.5
- Proof: (next 2 slides)

Analyzing the Greedy Algorithm
[Figure: boys 1, 2, 3, 4 and girls a, b, c, d, with the set B of boys and the set G of girls marked relative to $M_{opt}$]
- Consider the case $M_{greedy} \ne M_{opt}$
- Consider the set G of girls matched in $M_{opt}$ but not in $M_{greedy}$
- Let B be the set of boys adjacent to girls in G. Then every boy in B is already matched in $M_{greedy}$:
  - If there were a boy not matched by $M_{greedy}$ adjacent to a non-matched girl, then greedy would have matched them
- Since the boys in B are already matched in $M_{greedy}$:
  (1) $|M_{greedy}| \ge |B|$

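Analyzing the Greedy Algorithm
[Figure: boys 1, 2, 3, 4 and girls a, b, c, d, with the sets B and G marked relative to $M_{opt}$]
- Summary so far:
  - G is the set of girls matched in $M_{opt}$ but not in $M_{greedy}$
  - (1) $|M_{greedy}| \ge |B|$
- (2) $|G| \le |B|$, since G has at least $|G|$ neighbors (its partners in the optimal matching)
  - So: $|G| \le |B| \le |M_{greedy}|$
- (3) By the definition of G also: $|M_{opt}| \le |M_{greedy}| + |G|$
  - The worst case is when $|G| = |B| = |M_{greedy}|$
- Hence $|M_{opt}| \le 2\,|M_{greedy}|$, and therefore $|M_{greedy}| / |M_{opt}| \ge 1/2$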

Worst-case Scenario
[Figure: boys 1, 2, 3, 4 and girls a, b, c, d; greedy matches only (1,a) and (2,b)]
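The ratio of 1/2 can be checked on a small instance. The sketch below is an illustration under stated assumptions: the edge lists in `worst_case` are hypothetical (the original figure is not available), picked so that greedy ends with (1,a) and (2,b), and the offline optimum is computed with a plain augmenting-path search (Kuhn's algorithm) rather than Hopcroft-Karp.

```python
def greedy_size(arrivals):
    """Size of the matching built by greedy, taking the first free boy."""
    used = set()
    size = 0
    for _, boys in arrivals:
        free = [b for b in boys if b not in used]
        if free:
            used.add(free[0])
            size += 1
    return size


def maximum_matching_size(arrivals):
    """Offline optimum via simple augmenting paths (Kuhn's algorithm)."""
    adjacency = dict(arrivals)
    girl_of_boy = {}  # boy -> girl currently matched to him

    def try_assign(girl, seen):
        for boy in adjacency[girl]:
            if boy in seen:
                continue
            seen.add(boy)
            # Take a free boy, or re-route the girl who currently holds him.
            if boy not in girl_of_boy or try_assign(girl_of_boy[boy], seen):
                girl_of_boy[boy] = girl
                return True
        return False

    return sum(1 for girl in adjacency if try_assign(girl, set()))


# Hypothetical worst case: a and b grab the only boys that c and d accept.
worst_case = [("a", [1, 3]), ("b", [2, 4]), ("c", [1]), ("d", [2])]
print(greedy_size(worst_case), maximum_matching_size(worst_case))
```

On this instance greedy matches 2 pairs while the optimum matches all 4, so the ratio is exactly 1/2.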

Outline
- Online Bipartite Matching
- Web Advertising

History of Web Advertising
- Banner ads (1995-2001)
  - Initial form of web advertising
  - Popular websites charged $X for every 1,000 impressions of the ad
    - Called the CPM rate (cost per mille; "mille" means thousand in Latin)
    - Modeled on TV and magazine ads
  - From untargeted to demographically targeted
  - Low click-through rates
    - Low ROI (return on investment) for advertisers

Performance-based Advertising
- Introduced by Overture around 2000
  - Advertisers bid on search keywords
  - When someone searches for that keyword, the highest bidder's ad is shown
  - The advertiser is charged only if the ad is clicked on
- A similar model was adopted by Google, with some changes, around 2002
  - Called Adwords

Ads vs. Search Results

Web 2.0
- Performance-based advertising works!
  - Multi-billion-dollar industry
- Interesting problem: What ads to show for a given query?
  - (Today's lecture)
- If I am an advertiser, which search terms should I bid on and how much should I bid?
  - (Not the focus of today's lecture)

Adwords Problem
- Given:
  1. A set of bids by advertisers for search queries
  2. A click-through rate for each advertiser-query pair
  3. A budget for each advertiser (say for 1 month)
  4. A limit on the number of ads to be displayed with each search query
- Respond to each search query with a set of advertisers such that:
  1. The size of the set is no larger than the limit on the number of ads per query
  2. Each advertiser has bid on the search query
  3. Each advertiser has enough budget left to pay for the ad if it is clicked upon

Adwords Problem
- A stream of queries arrives at the search engine: $q_1, q_2, \dots$
- Several advertisers bid on each query
- When query $q_i$ arrives, the search engine must pick a subset of advertisers whose ads are shown
- Goal: Maximize the search engine's revenue
  - Simple solution: Instead of raw bids, use the expected revenue each time the ad is shown (i.e., Bid * CTR)
- Clearly we need an online algorithm!

The Adwords Innovation

Advertiser   Bid     CTR     Bid * CTR
A            $1.00   1%      1 cent
B            $0.75   2%      1.5 cents
C            $0.50   2.5%    1.25 cents

CTR = click-through rate; Bid * CTR = expected revenue

The Adwords Innovation (advertisers ordered by expected revenue)

Advertiser   Bid     CTR     Bid * CTR
B            $0.75   2%      1.5 cents
C            $0.50   2.5%    1.25 cents
A            $1.00   1%      1 cent
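The reordering in the table can be reproduced with a short Python sketch. The data is the table above; the function name `rank_by_expected_revenue` is an illustrative assumption.

```python
# (advertiser, bid in dollars, click-through rate) taken from the table above
ads = [("A", 1.00, 0.01), ("B", 0.75, 0.02), ("C", 0.50, 0.025)]


def rank_by_expected_revenue(ads):
    """Order ads by bid * CTR, highest expected revenue first."""
    return sorted(ads, key=lambda ad: ad[1] * ad[2], reverse=True)


for name, bid, ctr in rank_by_expected_revenue(ads):
    # Convert dollars to cents for display.
    print(f"{name}: {bid * ctr * 100:.2f} cents")
```

Advertiser A has the highest raw bid but ends up last, because its low click-through rate makes each display worth the least.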

Complications: Budget
- Two complications:
  - Budget
  - The CTR of an ad is unknown
- Each advertiser has a limited budget
  - The search engine guarantees that the advertiser will not be charged more than their daily budget

Complications: CTR
- CTR: Each ad has a different likelihood of being clicked
  - Advertiser 1 bids $2, click probability = 0.1
  - Advertiser 2 bids $1, click probability = 0.5
  - Click-through rate (CTR) is measured historically
- Very hard problem: exploration vs. exploitation
  - Exploit: Should we keep showing an ad for which we have good estimates of the click-through rate, or
  - Explore: Should we show a brand new ad to get a better sense of its click-through rate?

Greedy Algorithm
- Our setting: a simplified environment
  - For each query, show only 1 ad
  - All advertisers have the same budget B
  - All ads are equally likely to be clicked
  - The value of each ad is the same (= 1)
    - Revenue increases by 1 whenever an ad is clicked
- The simplest algorithm is greedy:
  - For a query, pick any advertiser who has bid 1 for that query
  - The competitive ratio of greedy is 1/2
  - Why?

Greedy Algorithm
- The simplest algorithm is greedy:
  - For a query, pick any advertiser who has bid 1 for that query
  - The competitive ratio of greedy is 1/2
  - Why?
    - It is exactly the same problem as bipartite matching
    - The revenue is the size of the matching
[Figure: bipartite graph with advertisers 1, 2, 3, 4 on one side and queries a, b, c, d on the other]

Bad Scenario for Greedy
- Two advertisers A and B
  - A bids on query x, B bids on x and y
  - Both have budgets of $4
- Query stream: x x x x y y y y
  - Worst-case greedy choice: B B B B, after which the four y queries go unassigned (B's budget is spent and A did not bid on y)
  - Optimal: A A A A B B B B
  - Competitive ratio = 1/2
- This is the worst case!
  - Note: The greedy algorithm is deterministic; it always breaks ties in the same way
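The scenario can be replayed in Python. In this sketch the function `run_greedy` and its `preference` argument (a fixed, deterministic tie-breaking order) are illustrative assumptions; every bid is worth 1, as in the simplified setting.

```python
def run_greedy(queries, bids, budget, preference):
    """Assign each query to the first advertiser in `preference`
    that bid on it and still has budget; None means unassigned."""
    remaining = {adv: budget for adv in bids}
    assignment = []
    for q in queries:
        eligible = [a for a in preference if q in bids[a] and remaining[a] > 0]
        if eligible:
            chosen = eligible[0]
            remaining[chosen] -= 1  # every bid is worth 1
            assignment.append(chosen)
        else:
            assignment.append(None)
    return assignment


bids = {"A": {"x"}, "B": {"x", "y"}}
queries = list("xxxxyyyy")

worst = run_greedy(queries, bids, budget=4, preference=["B", "A"])
best = run_greedy(queries, bids, budget=4, preference=["A", "B"])
print(worst)  # B takes every x, so the y queries find no one with budget
print(best)   # A takes the x queries, leaving B's budget for y
```

Counting the non-`None` entries gives a revenue of 4 for the bad tie-breaking order and 8 for the good one, which is the ratio of 1/2.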

BALANCE Algorithm [MSVV]
- BALANCE algorithm by Mehta, Saberi, Vazirani, and Vazirani
  - For each query, pick the advertiser with the largest unspent budget
  - Break ties arbitrarily (but in a deterministic way)

Example: BALANCE
- Two advertisers A and B
  - A bids on query x, B bids on x and y
  - Both have budgets of $4
- Query stream: x x x x y y y y
- BALANCE choice: A B A B B B, after which the last two y queries go unassigned (B's budget is spent)
  - Optimal: A A A A B B B B
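A minimal Python sketch of BALANCE on this example follows. The function name `balance` is an assumption, and ties are broken in favor of the alphabetically first advertiser, which is one deterministic choice among many.

```python
def balance(queries, bids, budgets):
    """BALANCE with unit bids: give each query to the eligible
    advertiser with the largest unspent budget (None if nobody can pay)."""
    remaining = dict(budgets)
    assignment = []
    for q in queries:
        eligible = [a for a in sorted(bids) if q in bids[a] and remaining[a] > 0]
        if not eligible:
            assignment.append(None)
            continue
        # max() returns the first maximal element, so ties go to the
        # alphabetically first advertiser.
        chosen = max(eligible, key=lambda a: remaining[a])
        remaining[chosen] -= 1
        assignment.append(chosen)
    return assignment


bids = {"A": {"x"}, "B": {"x", "y"}}
result = balance(list("xxxxyyyy"), bids, {"A": 4, "B": 4})
print(result)
```

Six of the eight queries are served, compared with eight for the optimal assignment, giving a ratio of 3/4 on this input.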

BALANCE on 2 Advertisers
- Claim: For BALANCE on 2 advertisers, the competitive ratio is 3/4
- Proof: (next 3 slides)

BALANCE on 2 Advertisers
- Consider a simple case (w.l.o.g.):
  - 2 advertisers, $A_1$ and $A_2$, each with budget $B$ ($\ge 1$)
  - Number of queries: $2B$
  - (*) The optimal solution exhausts both advertisers' budgets, i.e., every query is assigned to an advertiser
- BALANCE must exhaust at least one advertiser's budget:
  - If not, there would be some query assigned to neither advertiser, even though both advertisers have some remaining budget, which contradicts (*)
  - Assume BALANCE exhausts $A_2$'s budget, but allocates $x$ queries fewer than the optimal solution
  - Revenue: $BAL = 2B - x$

BALANCE on 2 Advertisers
[Figure: in the optimal solution, B queries are allocated to $A_1$ and B queries to $A_2$. Under BALANCE, $A_2$'s budget is exhausted, $A_1$ receives $y$ queries, and $x$ of $A_1$'s budget is not used]
- Optimal revenue $= 2B$
- Assume BALANCE gives revenue $= 2B - x = B + y$ (so $x + y = B$)
- The unassigned queries are ones the optimal solution assigns to $A_2$ (if we could assign them to $A_1$ we would, since $A_1$ still has budget)
- Goal: Show that $y \ge x$
  - Case 1) At most 1/2 of $A_1$'s queries got assigned to $A_2$: then $y \ge B/2$
  - Case 2) More than 1/2 of $A_1$'s queries got assigned to $A_2$: then $x \le B/2$ (proof: next slide)
- BALANCE revenue is minimum for $x = y = B/2$
- Minimum BALANCE revenue $= 3B/2$
- Competitive ratio $= 3/4$

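BALANCE on 2 Advertisers
[Figure: optimal allocation (B queries each to $A_1$ and $A_2$) next to the BALANCE allocation ($A_2$ exhausted, $y$ queries to $A_1$, $x$ of $A_1$'s budget not used)]
- Claim: in Case 2, when more than 1/2 of $A_1$'s queries got assigned to $A_2$, $x \le B/2$
- Proof:
  - Consider the last query (of $A_1$) that is assigned to $A_2$
  - At that time (right before it is assigned to $A_2$), budget of $A_2 \ge$ budget of $A_1$
  - Also, at that time, budget of $A_2 \le \frac{1}{2}B$
  - Thus, budget of $A_1 \le \frac{1}{2}B$
  - Since the budget only decreases, $x \le \frac{1}{2}B$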

BALANCE: General Result
- In the general case of many bidders (with the simplified bids and budgets assumed so far), the worst-case competitive ratio of BALANCE is $1 - 1/e \approx 0.63$
  - Interestingly, no online algorithm has a better competitive ratio!
- Let's see the worst-case example that gives this ratio

Worst case for BALANCE
- N advertisers: $A_1, A_2, \dots, A_N$
  - Each with budget $B > N$
- Queries:
  - $N \cdot B$ queries appear in N rounds of B queries each
- Bidding:
  - Round 1 queries: bidders $A_1, A_2, \dots, A_N$
  - Round 2 queries: bidders $A_2, A_3, \dots, A_N$
  - Round i queries: bidders $A_i, \dots, A_N$
- Optimum allocation: allocate the round i queries to $A_i$
  - Optimum revenue: $N \cdot B$

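BALANCE Allocation
[Figure: bars for $A_1, A_2, A_3, \dots, A_{N-1}, A_N$; each round adds a layer of height $B/N$, then $B/(N-1)$, then $B/(N-2)$, and so on, to the advertisers still bidding]
- BALANCE spreads the B queries of round 1 evenly over all N advertisers ($B/N$ each)
- After k rounds, the sum of allocations to each of the advertisers $A_k, \dots, A_N$ is
  $S_k = S_{k+1} = \dots = S_N = \sum_{i=1}^{k} \frac{B}{N-(i-1)}$
- If we find the smallest k such that $S_k \ge B$, then after k rounds we cannot allocate any queries to any advertiser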

BALANCE: Analysis
[Figure: the terms $B/N, B/(N-1), \dots, B/(N-(k-1))$ accumulate into $S_1, S_2, \dots$ until $S_k = B$; dividing everything by B, the terms $1/N, 1/(N-1), \dots, 1/(N-(k-1))$ accumulate until $S_k = 1$]

BALANCE: Analysis
[Figure: the terms $1/1, 1/2, 1/3, \dots, 1/N$ in a row; the last k terms, $1/(N-(k-1))$ through $1/N$, sum to $S_k = 1$, and the first $N-k$ terms sum to $\ln(N) - 1$]
- Fact: $H_n = \sum_{i=1}^{n} 1/i \approx \ln(n)$ for large n (result due to Euler)
- All N terms sum to about $\ln(N)$, and the last k terms sum to 1, so the first $N-k$ terms sum to about $\ln(N-k)$ but also to $\ln(N) - 1$
- $S_k = 1$ implies: $H_{N-k} = \ln(N) - 1 = \ln(N/e)$
- We also know: $H_{N-k} \approx \ln(N-k)$
- So: $N - k = N/e$
- Then: $k = N(1 - 1/e)$
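The approximation can be checked numerically. The sketch below (function name `rounds_until_exhausted` is an illustrative assumption) normalizes the budget to 1 and finds the smallest k for which the partial sum $1/N + 1/(N-1) + \dots + 1/(N-(k-1))$ reaches 1, then compares $k/N$ with $1 - 1/e$.

```python
import math


def rounds_until_exhausted(n):
    """Smallest k with sum_{i=1..k} 1/(n-(i-1)) >= 1 (budget normalized to 1)."""
    total = 0.0
    for k in range(1, n + 1):
        total += 1.0 / (n - (k - 1))
        if total >= 1.0:
            return k
    return n


n = 1000
k = rounds_until_exhausted(n)
print(k / n, 1 - 1 / math.e)
```

For large N the two printed values should agree closely, since the error of the $\ln$ approximation to the harmonic numbers shrinks as N grows.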

BALANCE: Analysis
- So after the first $k = N(1 - 1/e)$ rounds, we cannot allocate a query to any advertiser
- Revenue $= B \cdot N (1 - 1/e)$
- Competitive ratio $= 1 - 1/e$
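The worst-case instance can also be simulated directly. This is a sketch under stated assumptions: the function name `balance_worst_case` and the parameters N = 20, B = 200 are illustrative, ties go to the lowest-numbered advertiser, and with such a small N the ratio is only approximately $1 - 1/e$.

```python
import math


def balance_worst_case(n, budget):
    """Run BALANCE on the worst-case instance: n rounds of `budget`
    queries, where round i is bid on by advertisers A_i, ..., A_n."""
    remaining = [budget] * n  # remaining[j] is the unspent budget of A_{j+1}
    revenue = 0
    for rnd in range(n):
        for _ in range(budget):
            # Eligible bidders are rnd..n-1; pick the largest unspent budget.
            j = max(range(rnd, n), key=lambda idx: remaining[idx])
            if remaining[j] == 0:
                break  # every eligible bidder is exhausted
            remaining[j] -= 1
            revenue += 1
    return revenue


n, budget = 20, 200
ratio = balance_worst_case(n, budget) / (n * budget)  # optimum is n * budget
print(ratio, 1 - 1 / math.e)
```

Increasing N (while keeping B > N) should move the simulated ratio closer to the limit.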

General Version of the Problem
- Arbitrary bids and arbitrary budgets!
- Consider one query q and advertiser i
  - Bid $= x_i$
  - Budget $= b_i$
- In this general setting BALANCE can be terrible
  - Consider two advertisers $A_1$ and $A_2$
  - $A_1$: $x_1 = 1$, $b_1 = 110$
  - $A_2$: $x_2 = 10$, $b_2 = 100$
  - Suppose we see 10 instances of q
  - BALANCE always selects $A_1$ and earns 10
  - Optimal earns 100
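The failure is easy to reproduce. In the sketch below, `balance_with_bids` is a hypothetical helper that applies the plain BALANCE rule (largest unspent budget) while charging each advertiser its own bid; the numbers are those of the example above.

```python
def balance_with_bids(num_queries, bids, budgets):
    """Plain BALANCE with non-unit bids: always choose the advertiser
    with the largest unspent budget, ignoring how much it bids."""
    remaining = dict(budgets)
    revenue = 0
    for _ in range(num_queries):
        eligible = [a for a in bids if remaining[a] >= bids[a]]
        if not eligible:
            break
        chosen = max(eligible, key=lambda a: remaining[a])
        remaining[chosen] -= bids[chosen]
        revenue += bids[chosen]
    return revenue


bids = {"A1": 1, "A2": 10}
budgets = {"A1": 110, "A2": 100}
print(balance_with_bids(10, bids, budgets))
```

At every one of the 10 decisions $A_1$'s unspent budget (110 down to 101) exceeds $A_2$'s 100, so the low bidder always wins and the revenue is 10 instead of 100.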

Generalized BALANCE
- Arbitrary bids: consider query q and bidder i
  - Bid $= x_i$
  - Budget $= b_i$
  - Amount spent so far $= m_i$
  - Fraction of budget left over: $f_i = 1 - m_i / b_i$
  - Define $\psi_i(q) = x_i \left(1 - e^{-f_i}\right)$
- Allocate query q to the bidder i with the largest value of $\psi_i(q)$
- Same competitive ratio ($1 - 1/e$)
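A minimal Python sketch of the $\psi$ rule follows, applied to the two-advertiser example with bids 1 and 10 and budgets 110 and 100. The function name `generalized_balance` and the rule that an advertiser must be able to afford its full bid are assumptions made for illustration.

```python
import math


def generalized_balance(num_queries, bids, budgets):
    """Allocate each query to the bidder maximizing
    psi_i = x_i * (1 - exp(-f_i)), where f_i = 1 - m_i / b_i."""
    spent = {a: 0.0 for a in bids}

    def psi(a):
        fraction_left = 1.0 - spent[a] / budgets[a]
        return bids[a] * (1.0 - math.exp(-fraction_left))

    revenue = 0.0
    for _ in range(num_queries):
        # Assumption: a bidder is eligible only if it can pay its full bid.
        eligible = [a for a in bids if budgets[a] - spent[a] >= bids[a]]
        if not eligible:
            break
        chosen = max(eligible, key=psi)
        spent[chosen] += bids[chosen]
        revenue += bids[chosen]
    return revenue


bids = {"A1": 1, "A2": 10}
budgets = {"A1": 110, "A2": 100}
print(generalized_balance(10, bids, budgets))
```

Because $\psi$ weighs the bid as well as the remaining budget fraction, $A_2$ wins all 10 queries here: even with only a tenth of its budget left, $10(1 - e^{-0.1}) \approx 0.95$ still exceeds $A_1$'s $1 - e^{-1} \approx 0.63$.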

Questions?