Branching dynamics of viral information spreading

José Luis Iribarren and Esteban Moro

Physical Review E 84, 046116 (2011)  [pdf]

Abstract
Despite its importance for rumors or innovations propagation, peer-to-peer collaboration, social networking, or marketing, the dynamics of information spreading is not well understood. Since the diffusion depends on the heterogeneous patterns of human behavior and is driven by the participants’ decisions, its propagation dynamics shows surprising properties not explained by traditional epidemic or contagion models. Here we present a detailed analysis of our study of real viral marketing campaigns where tracking the propagation of a controlled message allowed us to analyze the structure and dynamics of a diffusion graph involving over 31 000 individuals. We found that information spreading displays a non-Markovian branching dynamics that can be modeled by a two-step Bellman-Harris branching process that generalizes the static models known in the literature and incorporates the high variability of human behavior. It explains accurately all the features of information propagation under the “tipping point” and can be used for prediction and management of viral information spreading processes.
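
To make the "tipping point" concrete, here is a minimal worked equation for a generic two-step branching process. The symbols are notation assumed here for illustration and are not taken from the paper: \bar r_1 is the mean number of people reached by a seed participant, and \bar r is the mean number reached by each later participant. Summing the expected number of participants over generations gives:

```latex
\mathbb{E}[\text{cascade size}]
  = 1 + \bar r_1 \sum_{k=0}^{\infty} \bar r^{\,k}
  = 1 + \frac{\bar r_1}{1 - \bar r},
  \qquad \bar r < 1 .
```

In this simplified picture the sum diverges as \bar r approaches 1, which marks the tipping point; below it, cascades stay finite on average. The expression says nothing about timing, which is where the waiting-time distribution of a Bellman-Harris process enters.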

Affinity Paths and information diffusion in social networks

José Luis Iribarren and Esteban Moro
Social Networks 33, 134-142 (2011)  [pdf]

Abstract
Widespread interest in the diffusion of information through social networks has produced a large number of Social Dynamics models. A majority of them use theoretical hypothesis to explain their diffusion mechanisms while the few empirically based ones average out their measures over many messages of different contents. Our empirical research tracking the step-by-step email propagation of an invariable viral marketing message delves into the content impact and has discovered new and striking features. The topology and dynamics of the propagation cascades display patterns not inherited from the email networks carrying the message. Their disconnected, low transitivity, tree-like cascades present positive correlation between their nodes probability to forward the message and the average number of neighbors they target and show increased participants’ involvement as the propagation paths length grows. Such patterns not described before, nor replicated by any of the existing models of information diffusion, can be explained if participants make their pass-along decisions based uniquely on local knowledge of their network neighbors affinity with the message content. We prove the plausibility of such mechanism through a stylized, agent-based model that replicates the Affinity Paths observed in real information diffusion cascades.

Relationship mining

Each day trillions of emails, phone calls, blog comments, Twitter messages, exchanges in online social networks, etc. take place. Not only has the number of communications increased; each of these transactions also leaves a digital trace that can be recorded to reconstruct our high-frequency human activity. What matters is not only the amount and variety of the recorded data. Its high-frequency character and comprehensive nature have also allowed researchers, companies and agencies to investigate individual and group dynamics at an unprecedented level of detail and to apply the results to client modeling, organizational analysis or epidemic spreading [1].

However, for technical or privacy reasons, only the existence of those exchanges is known, not their content. Thus we can quantify the intensity and frequency of an interaction but not its type. For decades, social science has measured relationships between individuals in the currency of tie strength, introduced by Granovetter [2]. Weak ties (loose acquaintances) help to disseminate ideas and innovations between different groups and to find a job or new information, while strong ties (family, trusted friends) hold organizations and social groups together and can affect emotional health. Despite its success in explaining these phenomena, the tie strength of human relationships is vaguely defined in most large-scale empirical social research. Specifically, relationships are generally quantified by the intensity or duration of communication, although these measures are known to have significant drawbacks as predictors of tie strength [3,4]. Multiplexity, rhythm and depth of communication seem to be better predictors of tie strength than intensity [4]. Incorporating those metrics in the data mining of online communication might improve the definition of relationships between individuals and in turn transform our understanding of individual dynamics and its impact on our lives, organizations and society [5]. The challenge is to unveil social relationships in social media rather than mere interactions between individuals, which in general over-represent the real structure of a social group [6] (see figure). This is of paramount importance for understanding the propagation of ideas, opinions, commercial messages, etc. in social networks, since most links declared in social networks might be meaningless from a relationship point of view.

Undressing the social network: considering all e-mail interactions in an academic social network (left) yields a highly dense and connected social network, while keeping only strong interactions (based on each individual's relative frequency of communication) renders the social group sparser and disconnected.
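
One way to filter ties by relative frequency of communication is sketched below. It is a minimal illustration, not the procedure used for the figure: the function name strong_ties, the reciprocity rule and the 0.2 threshold are all assumptions made for the example.

```python
from collections import Counter


def strong_ties(emails, threshold=0.2):
    """Return the set of undirected ties judged 'strong'.

    `emails` is an iterable of (sender, recipient) pairs. A tie is kept
    when, for BOTH people, the other accounts for at least `threshold`
    of the emails they sent. The rule and the default threshold are
    illustrative assumptions.
    """
    pair_counts = Counter(emails)          # emails per directed pair
    sent_totals = Counter()                # emails sent per person
    for (sender, _), n in pair_counts.items():
        sent_totals[sender] += n

    def share(a, b):
        # Fraction of a's outgoing email that is addressed to b
        total = sent_totals[a]
        return pair_counts[(a, b)] / total if total else 0.0

    ties = set()
    for a, b in pair_counts:
        if a != b and share(a, b) >= threshold and share(b, a) >= threshold:
            ties.add(frozenset((a, b)))
    return ties


if __name__ == "__main__":
    # Hypothetical toy log: names and counts are made up
    log = ([("ana", "ben")] * 8 + [("ben", "ana")] * 5
           + [("ana", "carl"), ("carl", "ana"), ("ben", "carl")]
           + [("carl", "dora")] * 9)
    all_ties = {frozenset(pair) for pair in log}
    print(len(all_ties), "ties with any contact")
    print(len(strong_ties(log)), "strong ties")
```

Because each share is measured relative to the sender's own volume, a person who writes few emails can still have strong ties, which raw message counts would miss.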

References

  1. D. Lazer et al. Computational Social Science, Science 323, 721 (2009)
  2. M. S. Granovetter, The Strength of Weak Ties, The American Journal of Sociology 78(6), 1360 (1973)
  3. P. V. Marsden and K. E. Campbell, Measuring Tie Strength, Social Forces 63(2), 482 (1984)
  4. E. Gilbert and K. Karahalios, Predicting Tie Strength with Social Media, presented at CHI 2009
  5. C. T. Butts, Revisiting the Foundations of Network Analysis, Science 325, 414 (2009)
  6. B. A. Huberman, D. M. Romero, and F. Wu, Social networks that matter, First Monday 14(1) (2009).

Note: This article appears in the catalog of the exhibition “Culturas del Cambio: Átomos Sociales y Vidas Electrónicas” at the Center Arts Santa Mónica. Thanks to Josep Perelló for his kind invitation to contribute.

The speed and reach of forwarded emails, rumors, and hoaxes in electronic social networks

We have just published an experimental/theoretical work on the speed of information diffusion in social networks in Physical Review Letters. Specifically, we have studied the impact of the heterogeneity of human activity on the propagation of emails, rumors, hoaxes, etc. By tracking email marketing campaigns executed by IBM Corporation in 11 European countries, we were able to compare their viral propagation with our theory (see the campaign details below).

The results are very simple. Let me give you an example: the typical time between two emails sent by the same person is around 1 day. Traditional models of information diffusion would then predict a characteristic infection time of about 1 day. However, some email computer viruses spread widely in a matter of hours (sometimes minutes), while some viral propagations (for example the Veuve-Clicquot hoax) last for years. How can that occur? The reason is that traditional models neglect the large heterogeneity in the frequency of human activity: the average time between emails (1 day) does not actually represent the population as a whole. In fact, most of us respond very quickly to emails, but some take a long time to do so. This fact (previously discovered by others) has a profound consequence for the way information spreads:

  1. When information spreads “successfully”, in the sense that it propagates and reaches most of the population (i.e. it surpasses the tipping point), its propagation speed is determined by the most active people.
  2. However, when information reaches just a small fraction of the population (below the tipping point), its propagation is controlled by those who take a long time to respond or forward, and the spreading is very slow.

This phenomenon, as explained in our paper, has consequences for viral marketing, the diffusion of fads and hoaxes, and opinion dynamics, because the speed at which their messages propagate depends strongly on the size of the sub-communities of very active and not-so-active people. For example, in our campaigns (which were below the tipping point yet successful from a viral marketing perspective), endogenous propagation of the commercial message lasted for months while the average time between getting the message and forwarding it was only 1 day. We also found that messages do not “go viral”: they are viral because of the diffusion mechanism they use, but their spreading success largely depends on the social network's propensity and heterogeneous behavior.
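
The gap between a 1-day average response time and propagation lasting months can be explored with a small simulation. The sketch below is a toy age-dependent branching process, not the model fitted in the paper: the forwarding probability, the fan-out of 1 to 3 recipients, the lognormal shape parameter and all function names are assumptions chosen for illustration. Both delay models have a mean of 1 day; only their variability differs.

```python
import heapq
import random

SIGMA = 2.0  # assumed spread of the lognormal response times


def heterogeneous_delay(rng):
    # Lognormal with mean 1 day: mean = exp(mu + sigma^2 / 2), so mu = -sigma^2 / 2.
    # Most draws are a few hours, a few are weeks or months.
    return rng.lognormvariate(-SIGMA ** 2 / 2, SIGMA)


def homogeneous_delay(rng):
    # Exponential with mean 1 day (the memoryless, "traditional" choice)
    return rng.expovariate(1.0)


def simulate_cascade(rng, draw_delay, forward_prob=0.2, max_fanout=3):
    """Simulate one cascade below the tipping point; return (size, duration_in_days).

    Assumptions: the seed always forwards; every later recipient forwards
    with probability `forward_prob`; a forwarder waits `draw_delay(rng)`
    days and then reaches between 1 and `max_fanout` new people.
    """
    arrivals = [(0.0, True)]  # (time the message arrives, is_seed)
    size = 0
    duration = 0.0
    while arrivals:
        t, is_seed = heapq.heappop(arrivals)
        size += 1
        duration = t  # arrivals are popped in time order, so the last is the latest
        if not is_seed and rng.random() >= forward_prob:
            continue  # this person does not pass the message along
        send_time = t + draw_delay(rng)
        for _ in range(rng.randint(1, max_fanout)):
            heapq.heappush(arrivals, (send_time, False))
    return size, duration


def run(draw_delay, n_cascades=2000, seed=1):
    rng = random.Random(seed)
    return [simulate_cascade(rng, draw_delay) for _ in range(n_cascades)]


if __name__ == "__main__":
    for name, draw in [("heterogeneous", heterogeneous_delay),
                       ("homogeneous", homogeneous_delay)]:
        results = run(draw)
        durations = sorted(d for _, d in results)
        mean_size = sum(s for s, _ in results) / len(results)
        print(name, "mean size:", round(mean_size, 2),
              "median duration:", round(durations[len(durations) // 2], 2),
              "longest:", round(durations[-1], 2))
```

With these settings each later participant reaches 0.4 new people on average, so every cascade dies out. Comparing the longest durations under the two delay models shows how a few slow responders can stretch a cascade well beyond what the 1-day mean suggests.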

Finally, our work has some consequences for the way we model and understand human dynamics, since it shows that there is no such thing as a typical time scale in human dynamics. This is in sharp contrast with epidemic models, information diffusion models, etc., in which the heterogeneity in human activity and frequency is usually neglected in favor of a more homogeneous picture of human activity.

About the empirical data:
The viral marketing campaigns were conducted by IBM using the typical “refer-a-friend” mechanism, which led to the endogenous diffusion of information. The campaigns’ offerings were promoted on the IBM homepage, where initial participants heard about them. Their primary marketing objective was to generate subscriptions to the company’s on-line newsletter. Subscriptions were entered through a form located on the campaign’s main web page (a.k.a. registration page). Additionally, a viral propagation mechanism, accessible through a button on the registration page, was available to foster the propagation of the message. The button caption enticed visitors to recommend the page to friends and colleagues by offering, as an additional incentive to forward the page, tickets for a prize draw to win a laptop computer. More technical details about the campaigns can be found in Appendix D of the arXiv version of our paper.

Press coverage:

  • ‘Infectious’ people spread memes across the web, New Scientist (12/08/09)
  • Email hoaxes are like viruses, The Inquirer (10/08/09)
  • The flow of viral video, ABC News (8/08/09)
  • New model for social marketing campaigns details why some information ‘goes viral’, PhysOrg (6/08/09)
  • Los perezosos frenan los rumores en Internet, ABC.es (14/8/09)
  • Party people spread viral internet memes, ComputerWeekly (14/8/09)
  • Desvelan las claves de la difusión de la información en las redes sociales, PlataformaSINC.es (7/9/09)
  • Nuevas claves para la difusión de información en las redes sociales, Noticias Madri+d (7/9/09)