Influencer Communities
Influencers are having many different conversations
1.0 Background

A unique feature of social networks is that people with common interests tend to follow (or friend) similar people within the same community. As in any community, influence depends more on who you know (i.e., connections) than on how many people you know. As such, influence is not uniform across a social network but depends on the specific community.

In our analysis of social network influencers [1], we describe a method to determine the top influencers in a topic network. A topic network is a social network defined by a user's query (topic). It narrows the population of interest from the entire social network to the specific set of users interested in a topic, so that they can be marketed to more effectively. However, we also observe that even within a single topic network, influencers may affect different clusters of users to varying degrees. In the social network field, these clusters are called communities.

Figure 1. Network Graph Communities
1.1 INFLUENCERS ARE NOT ONE-DIMENSIONAL

The prevalent method of identifying influencers returns a single ordered list of users for a given topic. However, this flat list fails to capture the multi-dimensional nature of social networks. Even for a precisely defined topic, there are multiple conversations occurring across multiple communities. For effective influencer marketing, the goal is to target the key influencers in each community for maximal reach (and conversion). A single ordered list does not achieve this goal because the top 20 (or 50) influencers typically belong to the same community. Targeting only these top influencers could mean missing other, smaller communities, which may have higher engagement or conversion rates.

1.2 VIRAL MARKETING THROUGH INFLUENCER COMMUNITIES

Precisely targeting a handful of top influencers within each community can be more effective than engaging an arbitrary number of influencers. Strategically targeting a list of influencers in each of the major communities may achieve maximal reach and engagement while reducing overall marketing resources and spend.

In this paper, we outline a novel method to identify influencers and their communities. We return a set of N distinct communities and the top influencers in each of those communities. Additionally, we give an aggregated list of the top influencers across all communities (as described in [1]) to show the relative order of all the influencers. The visualization of the communities and the influencers allows end users to understand the scale and relative significance of each influencer within their community.

Understanding conversations, influencers, and communities is essential for effective viral marketing. The Sysomos analytic software [1] provides tools to understand all of these aspects of social media marketing campaigns, including the ability to listen to, measure, and engage with influencers and communities.
2.0 Influencer Communities

2.1 COMMUNITY DETECTION USING MODULARITY

In the field of social network analytics research, Modularity [2] is a popular measure for detecting communities within a network, and many algorithms [2, 3] have been developed to optimize it. The measure rewards partitions in which nodes are densely connected within a cluster (high intra-cluster connectivity) and sparsely connected between clusters (low inter-cluster connectivity). While searching over all possible partitions of a network is known to be computationally prohibitive, many scalable heuristic algorithms [3] have been devised that give good real-world results for large networks. The output of these modularity-based community detection algorithms is a mapping from each node in the network to a number representing its community.

2.2 INFLUENCER COMMUNITIES ALGORITHM

The community detection algorithm is implemented using Gephi [4], an open-source graph analysis package, and the visualization is done using both Gephi [4] and the D3 JavaScript library [5]. The community detection results are combined with the Sysomos Influencer score, which uses a PageRank-based algorithm [6], to produce an influencer metric across users and communities (for a given topic). The high-level algorithm for Influencer Communities is outlined as follows:

1.) For a given user query (topic), use the Sysomos search engine to return all the tweets from the specified time period.
2.) From the list of returned tweets, extract the tweet authors (user handles), order them by Authority score, and take the top N user handles. N can be increased, subject to the scalability of the architecture and the desired response time of the system (e.g. a 5 second response).
3.) Retrieve the follower list for each of the top N user handles and build the follower network induced by those N handles. Followers that do not appear in the list of N handles are discarded.
4.) Run PageRank on this interconnected network of users to calculate the Influencer metric.
5.) Run Modularity on this interconnected network to find the set of M communities.

[Flow diagram stages: Sysomos keyword search; top authors of tweets; Twitter follower graph; PageRank influence score; Modularity communities]
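Steps 3 and 4 above can be illustrated with a minimal Python sketch. It is not the Sysomos or Gephi implementation. The input shape (`followers_of`), the function names, the sample handles, and the damping factor of 0.85 are all assumptions made for illustration. Edges point from follower to followed, so rank flows toward the accounts that are followed.

```python
from collections import defaultdict


def induced_follower_edges(followers_of, handles):
    """Step 3 (sketch): keep only follower links between the top-N handles.

    followers_of maps a handle to the handles that follow it (assumed shape).
    Returns directed edges (follower, followed).
    """
    keep = set(handles)
    edges = set()
    for followed in handles:
        for follower in followers_of.get(followed, ()):
            # Followers outside the top-N list are discarded.
            if follower in keep and follower != followed:
                edges.add((follower, followed))
    return sorted(edges)


def pagerank(nodes, edges, damping=0.85, iterations=100, tol=1e-10):
    """Step 4 (sketch): plain power-iteration PageRank on a directed graph."""
    nodes = list(nodes)
    n = len(nodes)
    if n == 0:
        return {}
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical sample data: "outsider" is not in the top-N list and is dropped.
followers_of = {
    "brand": ["runner_a", "runner_b", "gadget_x", "outsider"],
    "runner_a": ["runner_b"],
    "runner_b": ["runner_a"],
    "gadget_x": [],
}
handles = ["brand", "runner_a", "runner_b", "gadget_x"]
edges = induced_follower_edges(followers_of, handles)
scores = pagerank(handles, edges)
for handle in sorted(scores, key=scores.get, reverse=True):
    print(handle, round(scores[handle], 4))
```

In this toy network the handle followed by every other handle receives the highest score. A production system would more likely rely on an established graph library than on a hand-written iteration.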
The modularity algorithm has one main parameter called Resolution that controls both the density and the number of the communities returned. Through empirical analysis, we decided on a default Resolution value of 2.0, which gives about 2 to 10 communities for the user to interpret. In addition, the user may control this parameter to generate higher or lower granularity of communities as needed for business decisions.

Each user handle will be assigned a Modularity class identifier (Mod ID) and a PageRank score. We return these results to the analyst as a graphic visualization as shown in Figure 2.

Figure 2. Influencer Communities graphic for the athletic shoes topic. The communities can be color-coded and the size of each node will correspond to its influencer score. The edges in the network will show connections between the users (nodes) both within each community and across different communities.
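To make the role of the Mod ID, the PageRank score, and a resolution parameter more concrete, the sketch below scores a given node-to-community mapping and lists the top handles per community. Several assumptions apply. The graph is treated as undirected and unweighted. The `resolution` argument follows one common generalization of Newman's modularity, in which larger values favor more, smaller communities. Gephi's Resolution setting may be scaled or oriented differently, so values are not interchangeable with the default quoted above. All function names and sample data are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of, resolution=1.0):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m  -  resolution * (d_c / (2 * m)) ** 2
    where m is the number of edges, L_c the number of edges inside c, and
    d_c the total degree of the nodes in c. resolution=1.0 gives Newman's Q.
    The resolution term is an assumed formulation, not Gephi's exact one.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)
    degree = defaultdict(int)
    for u, v in edges:
        degree[community_of[u]] += 1
        degree[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(
        internal[c] / m - resolution * (degree[c] / (2 * m)) ** 2 for c in degree
    )


def top_influencers(community_of, pagerank_score, k=3):
    """Group handles by Mod ID and keep the k highest-scoring handles in each."""
    groups = defaultdict(list)
    for handle, mod_id in community_of.items():
        groups[mod_id].append(handle)
    return {
        mod_id: sorted(members, key=lambda h: pagerank_score[h], reverse=True)[:k]
        for mod_id, members in groups.items()
    }


# Hypothetical graph: two triangles joined by a single edge (c, d).
sample_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_communities = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_community = {node: 0 for node in two_communities}
sample_scores = {"a": 0.10, "b": 0.20, "c": 0.30, "d": 0.25, "e": 0.10, "f": 0.05}

print(modularity(sample_edges, two_communities))   # splitting the triangles scores higher
print(modularity(sample_edges, one_community))     # a single community scores 0
print(top_influencers(two_communities, sample_scores, k=2))
```

Community detection algorithms such as the one in [3] search for the mapping that maximizes a score of this kind. The sketch only evaluates a mapping that is already given.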
2.2.1 COMMUNITY CONVERSATIONS

While the identities of the user handles in each community may give some insight into the demographics of the community, the analyst may want a more concrete description. To provide it, we take the sample of tweets returned from the search query and run a frequency count on the relevant terms to generate a word cloud of the popular terms in the conversations of each community. With this graphic, an analyst can easily identify the behavioural characteristics of each community and use this information to create more targeted messages to the influencers in each community.

2.3 INFLUENCER COMMUNITIES CASE STUDIES

We use the underlying Twitter data from the Sysomos system [1], which is selected by a user-defined list of Boolean keyword search terms over a specified period of time. In the next sections, we highlight the salient points of each case study to demonstrate the efficacy of Influencer Communities.

2.3.1 THE ATHLETIC SHOES BRAND - CASE STUDY

The darker shaded groups in Figures 2, 3 and 4 correspond, respectively, to the three largest communities in the athletic shoes topic network. The highlighted community (blue) in Figure 2 corresponds to the largest set of influencers. Judging from the word cloud and the user handles, the conversation in this community appears to be about the brand's sneakers and shoes. In Figure 3, the second largest community (orange) has conversations about a specific smartwatch manufactured by the brand for training. There are also many gadget review handles in this community, such as Engadget, CNET, Mashable, FastCompany, and Gizmodo. In Figure 4, the brand's Twitter handle associated with running is part of a smaller community (green), which includes serious running handles such as YohanBlake, RunBlogRun, LondonMarathon, B_A_A (Boston Athletic Association), and RunningNetwork.
The business insight in this case study is that the brand's Twitter handle associated with running may be well connected to the serious running community (green), but is not well connected to the larger influencer communities of sneaker aficionados (blue) and gadget reviewers (orange). For effective influencer marketing, the brand may want to reach out to the key influencers in those other communities and tailor its marketing message to the conversations within those communities.
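The per-community term counts behind the word clouds used in this case study (Section 2.2.1) could be sketched as follows. This is an illustrative sketch only. The tokenizer, the stopword list, the minimum word length, and the sample tweets and handles are assumptions. The paper does not specify how "relevant terms" are selected.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real system would use a much fuller one.
STOPWORDS = {"the", "a", "an", "and", "for", "to", "of", "in", "on", "my", "is", "rt"}


def community_terms(tweets, community_of, top_k=5):
    """Count term frequencies per community (Mod ID) from (author, text) pairs."""
    counts = defaultdict(Counter)
    for author, text in tweets:
        mod_id = community_of.get(author)
        if mod_id is None:
            continue  # author is not part of the influencer network
        # Keep hashtags and mentions as single tokens.
        words = re.findall(r"[#@]?[a-z0-9_]+", text.lower())
        counts[mod_id].update(
            w for w in words if w not in STOPWORDS and len(w) > 2
        )
    return {mod_id: c.most_common(top_k) for mod_id, c in counts.items()}


# Hypothetical sample tweets and community assignments.
sample_tweets = [
    ("runner_a", "New marathon shoes for the marathon season"),
    ("runner_b", "Marathon training in my new shoes"),
    ("gadget_x", "Smartwatch review: the smartwatch tracks training"),
    ("unknown_user", "Unrelated tweet that is skipped"),
]
sample_communities = {"runner_a": 0, "runner_b": 0, "gadget_x": 1}

terms = community_terms(sample_tweets, sample_communities)
for mod_id, top_terms in sorted(terms.items()):
    print(mod_id, top_terms)
```

The resulting (term, count) pairs per Mod ID are the kind of input a word cloud renderer would take, with the count driving the size of each term.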
Figure 3. Influencer Community 1 graphic for the athletic shoes topic.
Figure 4. Influencer Community 3 graphic for the athletic shoes topic.
2.3.2 THE PERSONAL CARE PRODUCTS BRAND - CASE STUDY

Figures 5 and 6 show, in darker shading, the two largest communities in the brand's product (soap) topic network. Figure 5 shows the largest community (blue), whose members have relatively low influence scores. Judging from the user handles and the word cloud, they appear to be mommy bloggers interested in terms such as saving, shopping, win, prize, and a large grocery chain (supermarket). Moreover, the brand's girlsunstoppable campaign seems to resonate with this community. Figure 6 shows a smaller community, which contains the official corporate handles of the brand as well as some semi-influential beauty bloggers.

The business insight in this study is that while the brand's handle is well connected among influential beauty bloggers, the brand may need to reach out more to the mommy bloggers, since they form the larger community. Again, the brand may have to tailor its message to the influencers in this community without alienating others. Higher engagement in the mommy blogger community may indeed lead to higher sales conversion, while the brand continues to manage its relationship with the beauty blogger community.

Figure 5. Influencer Community 1 graphic for the personal care products topic
Figure 6. Influencer Community 2 graphic for the personal care products topic
3.0 Summary

In this paper, we present a novel method for identifying influencers in the social communities detected in the network associated with a user's query topic. We show that influencers do not have uniform characteristics, and that there are in fact communities of influencers even within a given topic network. These communities can be visualized in a network graphic to display the relative influence of individuals and their respective communities. Additionally, word clouds of each community's conversation can be used to understand the behavioural characteristics of individual communities. These insights bring a new level of real-time analytics to social network analysis and help analysts make smarter and more cost-effective business decisions for viral marketing.

REFERENCES

[1] E.D. Kim, B. Keng (2013). Contextual Influencer Graphs on Social Networks. Technical white paper (Sysomos blog). US Patent Application # 61/895,539.
[2] M. E. J. Newman (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences USA 103 (23): 8577-8582.
[3] V.D. Blondel, J.-L. Guillaume, R. Lambiotte and E. Lefebvre (2008). Fast unfolding of community hierarchies in large networks. J. Stat. Mech. 2008 (10): P10008. doi:10.1088/1742-5468/2008/10/p10008.
[4] Gephi, an open-source network analysis and visualization software package. www.gephi.org
[5] D3 (Data-Driven Documents) JavaScript library. www.d3js.org
[6] L. Page, S. Brin, R. Motwani and T. Winograd (1999). The PageRank citation ranking: Bringing order to the Web.

Website: sysomos.com
Blog: blog.sysomos.com
Twitter: @sysomos
Facebook: facebook.com/sysomos
2014 Marketwire L.P. All rights reserved. 201406