Jason Baldridge: Social network analysis and audience segmentation — Live from the Brands-Only Summit

Coverage of this session by Kristen Platt of SocialMedia.org. Connect with her by following her on Twitter.

3:35 — SocialMedia.org's Kurt Vanderah introduces People Pattern's Jason Baldridge.

3:36 — Jason: A bunch of what I'm going to talk about is research done by other people around the world.

3:37 — Jason: Let's start off with some text analysis: quoting patterns in political coverage. Measuring bias is subjective and hard. Personal estimates of bias are influenced by the availability heuristic.

3:39 — Jason: The research on quoting patterns used automated tracking of quotations from Obama's speeches. When you take that data, dimensionality reduction reveals two main bias dimensions: (one) independent-mainstream and (two) foreign-liberal-conservative.

3:41 — Jason: Now let's think about social networks and virality. Information cascades can propagate via broadcast and viral diffusion. Most cascades contain both broadcast and viral spreading.

3:43 — Jason: Virality almost never happens. He shows Twitter cascades characterized by structural virality, which increases down and to the right (the rarest kind). It turns out 99% of content adoptions terminate in a single generation. The largest image and video cascades are low on structural virality.
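For readers who want to see what "structural virality" measures, here is a minimal Python sketch. It assumes the definition commonly used in the diffusion research: the mean shortest-path distance between all pairs of nodes in a cascade tree. The function name and the toy cascades are illustrative and do not come from the talk.

```python
from collections import deque


def structural_virality(edges):
    """Mean shortest-path distance over all node pairs in a cascade tree.

    `edges` is a list of (parent, child) pairs; the tree is treated as
    undirected when measuring distances.
    """
    adjacency = {}
    for parent, child in edges:
        adjacency.setdefault(parent, set()).add(child)
        adjacency.setdefault(child, set()).add(parent)

    node_count = len(adjacency)
    if node_count < 2:
        return 0.0

    total_distance = 0
    for source in adjacency:
        # Breadth-first search gives hop counts from `source` to every node.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in distance:
                    distance[neighbor] = distance[current] + 1
                    queue.append(neighbor)
        total_distance += sum(distance.values())

    # Each unordered pair was counted twice, matching the n * (n - 1) divisor.
    return total_distance / (node_count * (node_count - 1))


# Toy broadcast: one account reaches four followers directly.
broadcast = [("root", "a"), ("root", "b"), ("root", "c"), ("root", "d")]
# Toy viral chain: each adopter passes the content to exactly one more person.
chain = [("root", "a"), ("a", "b"), ("b", "c"), ("c", "d")]

print(structural_virality(broadcast))  # 1.6
print(structural_virality(chain))      # 2.0
```

Both toy cascades reach five people, but the chain scores higher because adoptions sit further from one another. A pure broadcast stays below 2 no matter how large it grows.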

3:44 — Jason says that broadcast is by far the dominant mode to reach large audiences. This means pay-to-play when you need to go big reliably.

3:45 — Jason shares the Contagion Model: Information infects nodes, which become active. Information spreads from active nodes along the network edges. Given observed information cascades, you can infer the network using the contagion model. It's not a perfect network, or necessarily a social network, but it gives you data to work with.
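The forward half of a contagion model, where information spreads from active nodes along edges, can be sketched in a few lines of Python. This is an independent-cascade style simulation written for illustration, not the specific model from the talk; the graph, the seed set, and the single transmission probability `p` are all assumptions. Inferring a network from observed cascades, the step Jason describes, is the harder inverse problem and is not shown here.

```python
import random


def simulate_cascade(graph, seeds, p, rng=None):
    """Spread information from `seeds` along edges of `graph`.

    `graph` maps each node to a list of nodes it can pass information to.
    Each newly active node gets one chance to activate each inactive
    neighbor, succeeding with probability `p`.
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor not in active and rng.random() < p:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


# Hypothetical toy network: a blog feeds two news sites, which feed readers.
toy_graph = {
    "blog": ["news1", "news2"],
    "news1": ["reader1", "reader2"],
    "news2": ["reader3"],
}

# A seeded generator makes a run repeatable.
reached = simulate_cascade(toy_graph, ["blog"], p=0.5, rng=random.Random(0))
print(sorted(reached))
```

Running the simulation many times with different seeds gives a distribution of cascade sizes, which is the kind of data an inference method would then try to explain.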

3:47 — Jason: Blogs and mainstream media swap influence during the course of an event. Increased blog influence proportion correlates with social unrest.

3:49 — “However, is virality/contagion a bad metaphor?” He shares some examples: Taylor Swift has 65 million Twitter followers who can receive her messages. One individual cannot sneeze on and infect that many people simultaneously. Another example: The likelihood of disease infection increases independently with exposure to different infected individuals, but “infection” by an idea increases greatly when exposed to it by multiple, independent parties.

3:50 — Jason talks about the “majority illusion”: Social networks tend to look asymmetrical. The connectedness of “infected” people greatly impacts the perception of others. A minority opinion can appear extremely popular to each individual.
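A small worked example makes the majority illusion concrete. The Python sketch below uses a made-up eight-person network in which only the two best-connected people hold an opinion; the network, the names, and the "more than half of my neighbors" threshold are assumptions chosen for illustration.

```python
def majority_illusion_fraction(graph, active):
    """Share of nodes for whom more than half of their neighbors are active."""
    fooled = 0
    for node, neighbors in graph.items():
        if not neighbors:
            continue  # Isolated nodes see nobody, so they cannot be fooled.
        active_share = sum(1 for n in neighbors if n in active) / len(neighbors)
        if active_share > 0.5:
            fooled += 1
    return fooled / len(graph)


# Two hubs (A and B) know each other; every other person knows only one hub.
network = {
    "A": ["B", "c", "d", "e"],
    "B": ["A", "f", "g", "h"],
    "c": ["A"], "d": ["A"], "e": ["A"],
    "f": ["B"], "g": ["B"], "h": ["B"],
}
opinion_holders = {"A", "B"}

print(len(opinion_holders) / len(network))                   # 0.25
print(majority_illusion_fraction(network, opinion_holders))  # 0.75
```

Only a quarter of the network holds the opinion, yet three quarters of the people see it held by most of their contacts, because the opinion holders are the ones everyone is connected to.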

3:51 — Jason talks about personality classification: Language production provides a window on personality at scale. Then, you can target ads based on users' personality! Twitter users whose language indicates higher openness and lower neuroticism are more likely to respond positively to an ad.

3:53 — Jason looks at race and sharing: Frequency of sharing for topics on social media varies by race.

3:54 — Tailored audiences: Human analysis and machine learning can be used to characterize and identify personas using social media profiles. Then, you can create promoted tweet copy informed by persona-based keywords.

3:55 — Jason: Persona-based campaigns with audience-driven ad copy produced higher engagement at lower cost per conversion. I'll end with micro-segmentation: We have limited attention and many options. The best, most relevant content is often created by those with very similar passions, interests, and demographics.

3:56 — Jason concludes:

  1. Large-scale analysis of networks and documents reveals hidden patterns.
  2. Pay-to-play to reliably get your word out.
  3. Audience understanding is essential: demographics, personality and microsegment relevance.


Q: Would ads be less effective if everyone is targeting open-minded consumers?

A: Jason: If everyone is bidding for them, the cost will go up to target them. So then maybe you have an opportunity to engage another audience at a lower cost.

Q: What are some other ways to collect data beyond Twitter?

A: Jason: If you can get the data at all, all of these techniques work. Facebook is hard because you don't have access to that data and you're not positive who is connected to whom. But there are some other things you can do, such as assigning users IDs that you can track and collect data on. At People Pattern, we're creating profiles for real people and linking those to their social profiles so that we can create a database based on the individuals.

Q: Is it difficult to match users across networks? And do they behave differently across different networks?

A: Jason: We use gimmies like username matching and actual name matching, in addition to location. People who are in the same location tend to be connected to each other. Once you have that data, you have a better basis for inferring who else they are connected to. In terms of different personas acting on different networks, I know from my own experience that people do act differently on each platform: Twitter professionally, Facebook personally, etc.

Q: Are there any characteristics of virality?

A: Jason: One study looked at the memorability of quotes. It seems there are certain quotes that people tend to remember more. Interestingly, unique words used in syntactic patterns tend to be remembered most. So that may have something to do with something going viral. Otherwise, it could just be dumb luck!