First mentioned on Edge.org, and now percolating through the information research ecosystem, is a new field called “Computational Social Science”. Another way I would describe it is “Experimental Social Science.” What does that mean? Well, with the advent of the information age and the rich data collection made possible by digital interfaces, we now have enough data to tackle previously intractable or unmeasurable social phenomena. Instead of relying on our intuition or a limited set of personal experiences, we can ask and experimentally answer questions about how human beings behave on a large scale. That is a tremendously exciting capability, and it produces some counter-intuitive findings, as you will see below.
Today, I am going to review the first part of a paper co-authored by Cameron Marlow, who founded and used to lead the Data Science team at Facebook.
The paper, published in PNAS, is called Structural Diversity in Social Contagion.
In particular, the part that I am going to review covers the first question the paper asks, which we will call the Recruitment Question:
“How does one’s contact neighborhood impact the decision to respond to an invitation to join FB?”
For the sake of the readers, I’m going to give you the actionable punchline before going into the analysis:
If you want to convert someone on FB, have 4-5 people who all know the target, but do not know each other, send the target an invitation.
Now, for the analysis….
Briefly, two definitions:
- Contact Neighborhood – the people (nodes) who are connected to a person of interest (the target). For the Recruitment Question, the contact neighborhood is the set of people who have imported the target’s email address.
- Structural Diversity – the complexity and number of connected components of the neighborhood. A connected component is a group of contacts who are linked to one another, directly or through other contacts, and have no links to the rest of the neighborhood. For the Recruitment Question, we are going to focus on the number of connected components.
And the main findings:
- The number of connected components is strongly and positively correlated with the invitation acceptance rate, i.e. the more separate connected components that have ties to the target, the higher the rate of acceptance. This is the key result, so pay attention! As the number of connected components increases, recruitment success climbs sharply.
- The size of the contact neighborhood, after controlling for the number of connected components, is actually negatively correlated with recruitment success.
- Restricting the analysis to demographically homogeneous neighborhoods preserves this trend, so it is not the ‘cultural diversity’ of the contact neighborhood but its ‘connected component diversity’ that is responsible for this type of conversion.
- Using co-tagging of pictures to infer connections provides further confirmation:
i. If two disconnected nodes have been co-tagged in a picture, recruitment success is about the same as if the two nodes were connected.
ii. If two connected nodes are co-tagged (an indicator of increased connection strength), recruitment success drops even further.
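To make the “number of connected components” idea concrete, here is a minimal sketch of how one might count them for a single target. It is my own illustration, not code from the paper: the function name `count_components`, the contact ids, and the friendship pairs are all made up, and I am assuming the neighborhood is given as a list of contacts plus a list of who-knows-whom pairs.

```python
from collections import deque


def count_components(contacts, friendships):
    """Count connected components among a target's contacts.

    contacts: iterable of contact ids (the target is NOT included).
    friendships: iterable of (a, b) pairs; pairs that involve anyone
    outside the contact neighborhood are ignored.
    """
    nodes = set(contacts)
    adjacency = {node: set() for node in nodes}
    for a, b in friendships:
        if a in nodes and b in nodes and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        # Each unvisited contact starts a new component; a breadth-first
        # search then marks everyone reachable from that contact.
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for friend in adjacency[current] - seen:
                seen.add(friend)
                queue.append(friend)
    return components


# Hypothetical neighborhoods of the same size (4 contacts each).
four_strangers = count_components(["a", "b", "c", "d"], [])
two_pairs = count_components(["a", "b", "c", "d"], [("a", "b"), ("c", "d")])
print(four_strangers, two_pairs)  # 4 2
```

Both toy neighborhoods have four contacts, but the first has four components and the second only two, which is exactly the distinction the findings above turn on.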
Take Away Lesson:
In studying recruitment, it is not the size of the contact neighborhood, but the number of diverse endorsements (that is, the number of connected components) that is critical for recruitment conversion.
What it means practically:
If you want something to be adopted by a user, instead of trying to reach as many people as possible, reach as many different types/groups of people as possible. In this example, the number of connected components is a proxy for the underlying topology of social groups, whose members are connected to one another but not to members of the other groups.
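If you wanted to act on this, one simple approach would be to pick a single inviter from each social group around the target. The sketch below is only my illustration of that advice, not a method from the paper: the function name `one_inviter_per_component` and the names in the example are hypothetical, and it assumes you already know which of the target's contacts know each other.

```python
def one_inviter_per_component(contacts, friendships):
    """Pick one contact from each connected component of the neighborhood.

    contacts: iterable of contact ids (the target is NOT included).
    friendships: iterable of (a, b) pairs among those contacts.
    Returns the first-listed contact of every component, in input order.
    """
    contacts = list(contacts)
    parent = {contact: contact for contact in contacts}

    def find(x):
        # Follow parent links to the group's representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the groups of every pair of contacts who know each other.
    for a, b in friendships:
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    chosen = {}
    for contact in contacts:
        chosen.setdefault(find(contact), contact)
    return list(chosen.values())


# Hypothetical example: two pairs of friends plus one loner.
inviters = one_inviter_per_component(
    ["ann", "bob", "cat", "dan", "eve"],
    [("ann", "bob"), ("cat", "dan")],
)
print(inviters)  # ['ann', 'cat', 'eve']
```

Here five contacts collapse into three independent endorsements, so asking ann, cat and eve covers every group without doubling up inside any of them.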
Food for thought:
Q: How does this generalize to recruitment processes outside Facebook?
Full Paper can be found here: http://cameronmarlow.com/media/ugander-structural-2012a.pdf
Like this? Leave a comment/like if you want me to write Pt2, which focuses on user engagement.