Influence Maximization on Social Graphs. Yu-Ting Wen

1 Influence Maximization on Social Graphs Yu-Ting Wen

2 Outline: Background; Models of influence (Linear Threshold, Independent Cascade); Influence maximization problem; Proof of performance bound; Compute objective function; Algorithmic Efforts; Broader optimization objectives; Targeted Influence Maximization in Social Networks [CIKM 16]

3 Social Network and Spread of Influence. A social network plays a fundamental role as a medium for the spread of influence among its members: opinions, ideas, information, innovation. Direct marketing uses word-of-mouth effects to significantly increase profits (Gmail, Tupperware popularization, Microsoft Origami). Adapted from the author's slides.

4 Problem Setting. Given: a limited budget B for initial advertising (e.g. giving away free samples of a product) and estimates of the influence between individuals. Goal: trigger a large cascade of influence (e.g. further adoptions of a product). Question: which set of individuals should the budget B be spent on? Applications besides product marketing: spreading an innovation; detecting stories in blogs.

5 What we need: Form models of influence in social networks. Obtain data about a particular network (to estimate interpersonal influence). Devise an algorithm to maximize the spread of influence.

7 Models of Influence. Two basic classes of diffusion models: threshold and cascade. General operational view: a social network is represented as a directed graph, with each person (customer) as a node. Nodes start either active or inactive. An active node may trigger activation of neighboring nodes. Monotonicity assumption: active nodes never deactivate.

9 Linear Threshold Model. A node v has a random threshold $\theta_v \sim U[0,1]$. A node v is influenced by each neighbor w according to a weight $b_{vw}$ such that $\sum_{w \text{ neighbor of } v} b_{vw} \le 1$. A node v becomes active when at least a (weighted) $\theta_v$ fraction of its neighbors are active: $\sum_{w \text{ active neighbor of } v} b_{vw} \ge \theta_v$.
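The activation rule above can be sketched as a short simulation. This is a minimal illustration, not code from the talk: the function name `simulate_linear_threshold` and the `in_weights` dictionary layout are assumptions made here, and the weights into each node are assumed to sum to at most 1.

```python
import random


def simulate_linear_threshold(in_weights, seeds, rng=None):
    """Run one Linear Threshold diffusion and return the final active set.

    in_weights[v] maps each in-neighbor w of v to the weight b_vw
    (hypothetical layout; every non-seed node should appear as a key).
    """
    rng = rng or random.Random()
    # Each node draws its threshold theta_v uniformly at random.
    thresholds = {v: rng.random() for v in in_weights}
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, weights in in_weights.items():
            if v in active:
                continue
            # Total weight of the in-neighbors of v that are already active.
            total = sum(b for w, b in weights.items() if w in active)
            if total >= thresholds[v]:
                active.add(v)
                changed = True
    return active
```

Because active nodes never deactivate, updating nodes one at a time reaches the same final set as updating them in synchronized rounds.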

10 Example [Figure: Linear Threshold diffusion on nodes u, v, w, x; legend: inactive node, active node, threshold, active neighbors; the process stops when no further node can be activated.]

12 Independent Cascade Model. When node v becomes active, it has a single chance of activating each currently inactive neighbor w. The activation attempt succeeds with probability $p_{vw}$.
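A minimal sketch of one run of this process follows. The function name `simulate_independent_cascade` and the `out_probs` layout (node to out-neighbor to probability) are assumptions for illustration only.

```python
import random
from collections import deque


def simulate_independent_cascade(out_probs, seeds, rng=None):
    """Run one Independent Cascade diffusion and return the final active set.

    out_probs[v] maps each out-neighbor w of v to the probability p_vw
    (hypothetical layout).
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        v = frontier.popleft()
        # v is processed exactly once, so each edge (v, w) gets one attempt.
        for w, p in out_probs.get(v, {}).items():
            if w not in active and rng.random() < p:
                active.add(w)
                frontier.append(w)
    return active
```

Each node enters the queue at most once, which is what gives every edge a single activation attempt.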

13 Example [Figure: Independent Cascade diffusion on nodes u, v, w, x; legend: inactive node, active node, newly active node, successful attempt, unsuccessful attempt; the process stops when no new node is activated.]

15 Influence Maximization Problem [KDD 03: Maximizing the Spread of Influence through a Social Network]. Influence of a node set S: f(S), the expected number of active nodes at the end if S is the initial active set. Problem: given a parameter k (budget), find a k-node set S that maximizes f(S). This is a constrained optimization problem with f(S) as the objective function.

16 f(S): properties (to be demonstrated). Non-negative (obviously). Monotone: $f(S + v) \ge f(S)$. Submodular: let N be a finite set; a set function $f : 2^N \to \mathbb{R}$ is submodular iff $\forall S \subseteq T \subseteq N,\ \forall v \in N \setminus T:\ f(S + v) - f(S) \ge f(T + v) - f(T)$ (diminishing returns).
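To make the two definitions concrete, the sketch below checks them by brute force on a small ground set. The helper names `subsets`, `is_monotone` and `is_submodular` are introduced here for illustration; the check is exponential in the size of the ground set, so it is only meant for toy examples.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_monotone(ground, f):
    """Check f(S + v) >= f(S) for every subset S and element v."""
    return all(f(S | {v}) >= f(S) for S in subsets(ground) for v in ground)


def is_submodular(ground, f):
    """Check diminishing returns for all S within T and all v outside T."""
    ground = frozenset(ground)
    for T in subsets(ground):
        for S in subsets(T):
            for v in ground - T:
                if f(S | {v}) - f(S) < f(T | {v}) - f(T):
                    return False
    return True
```

A set-coverage function (the size of the union of the sets chosen) passes both checks, while f(S) = |S|^2 is monotone but fails the diminishing-returns test.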

17 Bad News. For a submodular function f that takes only non-negative values and is monotone, finding a k-element set S for which f(S) is maximized is an NP-hard optimization problem. It is NP-hard to determine the optimum for influence maximization under both the independent cascade model and the linear threshold model.

18 Good News. We can use the greedy algorithm! Start with an empty set S. For k iterations: add to S the node v that maximizes f(S + v) - f(S). How good (or bad) is it? Theorem: the greedy algorithm is a (1 - 1/e)-approximation. The resulting set S activates at least (1 - 1/e) > 63% of the number of nodes that any size-k set could activate.
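The greedy loop described above can be sketched as follows. The function name `greedy_seed_selection` is an assumption, and `f` is any set function supplied by the caller; in practice it would be an estimate of the expected spread.

```python
def greedy_seed_selection(nodes, k, f):
    """Pick up to k nodes, each time adding the largest marginal gain of f."""
    selected = set()
    for _ in range(k):
        base = f(selected)
        best, best_gain = None, float("-inf")
        for v in nodes:
            if v in selected:
                continue
            # Marginal gain f(S + v) - f(S) of adding v to the current set.
            gain = f(selected | {v}) - base
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # fewer than k candidate nodes are available
            break
        selected.add(best)
    return selected
```

Each iteration evaluates f once per remaining node, so the cost of the whole procedure is dominated by how expensive f is to evaluate.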

20 Key 1: Prove submodularity. $\forall S \subseteq T \subseteq N,\ \forall v \in N \setminus T:\ f(S + v) - f(S) \ge f(T + v) - f(T)$

22 Submodularity for Independent Cascade. Coins for edges are flipped during activation attempts. We can pre-flip all coins and reveal the results immediately. Active nodes in the end are exactly those reachable via green paths from the initially targeted nodes. So we study reachability in green graphs.

23 Submodularity, Fixed Graph. Fix a green graph G; g(S) is the set of nodes reachable from S in G. Submodularity: $g(T + v) - g(T) \subseteq g(S + v) - g(S)$ when $S \subseteq T$. Here $g(S + v) - g(S)$ is the set of nodes reachable from $S + v$ but not from $S$. [Figure: node set V containing $S \subseteq T$, with the regions g(S), g(T) and g(v).] From the picture: $g(T + v) - g(T) \subseteq g(S + v) - g(S)$ when $S \subseteq T$ (indeed!).

24 Submodularity of the Function. Fact: a non-negative linear combination of submodular functions is submodular. $f(S) = \sum_{G} \Pr(G \text{ is green graph}) \cdot g_G(S)$, where $g_G(S)$ is the nodes reachable from S in G. Each $g_G(S)$ is submodular (previous slide), and the probabilities are non-negative.
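On a tiny graph the sum over green graphs can be evaluated literally, which makes the formula easy to check by hand. The sketch below assumes the Independent Cascade model and counts the nodes in each reachable set; the names `reachable` and `exact_spread` and the `edge_probs` layout are introduced here for illustration. It enumerates all 2^|E| coin outcomes, so it is only usable on very small graphs.

```python
from itertools import product


def reachable(edges, seeds):
    """Return the set of nodes reachable from seeds along the given edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        u = stack.pop()
        for v in adjacency.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def exact_spread(edge_probs, seeds):
    """Sum Pr(G) * |g_G(S)| over every possible green graph G."""
    edges = list(edge_probs)
    total = 0.0
    for outcome in product([True, False], repeat=len(edges)):
        prob = 1.0
        green = []
        for edge, is_green in zip(edges, outcome):
            p = edge_probs[edge]
            prob *= p if is_green else 1.0 - p
            if is_green:
                green.append(edge)
        total += prob * len(reachable(green, seeds))
    return total
```

For a chain a -> b -> c with both probabilities equal to 0.5 and seed set {a}, the four green graphs contribute 0.25 * 3 + 0.25 * 2 + 0.25 * 1 + 0.25 * 1 = 1.75.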

25 Submodularity for Linear Threshold. Use a similar green graph idea: once a graph is fixed, the reachability argument is identical. How do we fix a green graph now? Each node picks at most one incoming edge, with probabilities proportional to edge weights. This is equivalent to the linear threshold model (trickier proof).
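A minimal sketch of this sampling step follows, assuming that node v picks the edge from w with probability b_vw and picks no edge with the remaining probability 1 - sum of its incoming weights. The function name `sample_lt_live_edges` and the `in_weights` layout are assumptions for illustration.

```python
import random


def sample_lt_live_edges(in_weights, rng=None):
    """Sample one green graph for the Linear Threshold model.

    in_weights[v] maps each in-neighbor w of v to the weight b_vw
    (hypothetical layout; weights into a node sum to at most 1).
    Returns a list of (w, v) edges with at most one edge into each node.
    """
    rng = rng or random.Random()
    live = []
    for v, weights in in_weights.items():
        r = rng.random()
        cumulative = 0.0
        for w, b in weights.items():
            cumulative += b
            if r < cumulative:
                live.append((w, v))  # v selected its incoming edge from w
                break
        # If the loop ends without a break, v keeps no incoming edge.
    return live
```

The set of nodes reachable from the seeds in the sampled graph then plays the same role as in the independent cascade argument.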

27 Key 2: Evaluating f(S)

28 Evaluating f(S). How to evaluate f(S)? How to compute it efficiently is still an open question. But: very good estimates can be obtained by simulation, repeating the diffusion process often enough (polynomial in n and 1/ε) to achieve a (1 ± ε)-approximation to f(S). A generalization of the Nemhauser/Wolsey proof shows that the greedy algorithm is then a (1 - 1/e - ε′)-approximation.
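The simulation-based estimate can be sketched as a plain Monte Carlo average. This example assumes the Independent Cascade model; the function name `estimate_spread`, the `out_probs` layout and the default of 1000 runs are illustrative choices, not values from the talk.

```python
import random


def estimate_spread(out_probs, seeds, runs=1000, seed=0):
    """Estimate f(S) by averaging the final active count over many runs.

    out_probs[v] maps each out-neighbor w of v to the probability p_vw
    (hypothetical layout).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            v = frontier.pop()
            # Each newly active node gets one attempt per out-neighbor.
            for w, p in out_probs.get(v, {}).items():
                if w not in active and rng.random() < p:
                    active.add(w)
                    frontier.append(w)
        total += len(active)
    return total / runs
```

More runs tighten the estimate at a proportional cost, and this cost is paid for every candidate node in every greedy iteration.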

30 Algorithmic Efforts. Influence maximization problem: discrete- and continuous-time models; hardness and greedy approximation. Improving efficiency and scalability: ranking and score-based heuristics; sketches, reverse influence sampling. Context-aware influence maximization: topic- and time-aware; location-aware; competitive and comparative; dynamic influence maximization. Other optimization objectives: business-oriented optimization (adoption, profit, recommendation, etc.); robustness of influence maximization.

32 Targeted Influence Maximization in Social Networks Chonggang Song, Wynne Hsu, Mong Li Lee

33 Introduction. How does online influence spread in a location-based social network? Setting: an event at a location; a limited budget of seed users. Constraints: time; geographical regions.

34 Preliminaries. A user is influenced if his/her neighbors have propagated information about an event to him/her. A user is registered if he/she is influenced and has decided to participate in the event.

35 Research Questions. Traditional IM problem: how to influence the largest number of users online. Targeted influence maximization problem: identify the k users that result in the most registered participants.

36 Targeted Influence Maximization. Input: a network G = (V, E). Each node $v \in V$ is associated with a location $l_v$ and a probability login(v) of being online. Each edge <u, v> has an influence probability p(u, v). Given: a deadline α and an event location γ. Output: a set S of k users that maximizes the expected number of registered users.

37 Sampling-based method. Applies a greedy selection process based on maximum coverage to find the k nodes that cover the largest number of reverse reachable (RR) sets built from randomly sampled nodes in the graph. Models the propagation delay incurred by login events (temporal constraints).
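The sketch below illustrates the general idea with plain RR sets under the Independent Cascade model followed by greedy maximum coverage. It is not the weighted, deadline-aware variant from the paper: login probabilities, locations and the deadline are left out, and the names `sample_rr_set`, `greedy_max_coverage` and the `in_probs` layout are assumptions.

```python
import random


def sample_rr_set(in_probs, nodes, rng):
    """Sample one RR set: nodes that reach a random root via live edges.

    in_probs[v] maps each in-neighbor u of v to the probability p(u, v)
    (hypothetical layout).
    """
    root = rng.choice(nodes)
    rr = {root}
    stack = [root]
    while stack:
        v = stack.pop()
        # Walk edges backwards, keeping each with its influence probability.
        for u, p in in_probs.get(v, {}).items():
            if u not in rr and rng.random() < p:
                rr.add(u)
                stack.append(u)
    return rr


def greedy_max_coverage(rr_sets, k):
    """Pick up to k nodes that together hit as many RR sets as possible."""
    remaining = [set(rr) for rr in rr_sets]
    seeds = []
    for _ in range(k):
        counts = {}
        for rr in remaining:
            for u in rr:
                counts[u] = counts.get(u, 0) + 1
        if not counts:  # every RR set is already covered
            break
        best = max(counts, key=counts.get)
        seeds.append(best)
        # RR sets containing the chosen node are covered and dropped.
        remaining = [rr for rr in remaining if best not in rr]
    return seeds
```

The fraction of sampled RR sets hit by the chosen seeds, scaled by the number of nodes, serves as an estimate of their expected spread.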

38 Generating WRR Tree: Weighted Reverse Reachable (WRR) tree

39 Generating WRR Tree (cont.)

40 Greedy Selection

41 Experiments. Given: a random target location. Evaluate: the number of registered users that results from the influence of the k users returned by each method.

42 Experiments with Deadline and Location: the deadline is set to α = 10.

43 Conclusions. Targeted influence maximization in social networks: temporal and geographical constraints; Weighted Reverse Reachable tree; a solution with a (1 - 1/e - ε) approximation bound; greedy Target-IM+ algorithm. Future work: multi-campaign influence maximization tasks; considering event types and user preferences.

44 Future Directions of IM. Theoretical problem: efficient approximation algorithms under specific conditions. Practical problem: influence analysis from different networks.

45 Thanks!