New research from the University of Pennsylvania attempts to understand the personality traits of Americans and the well-being of the communities they live in, by studying what they tweet. In studying a mind-boggling volume of 37 billion tweets, the researchers at the World Well-Being Project have created an interactive map of U.S. counties with scores for each on select well-being indicators. The project has been busy: A year ago, it tracked heart disease trends based on a billion tweets. It is now working on projects in Spain, Mexico and the U.K., and is in the early stages of a project in China.
The work of the World Well-Being Project could find business applications – for example, insurers could use the data when trying to price stress factors or risky behavior in their premiums, and real estate companies could use it to understand their markets better. But it also could help reveal aspects of communities, such as in health care, that present opportunities for timely intervention by policy makers and governments, said Johannes Eichstaedt, a data scientist and University of Pennsylvania postdoctoral fellow in psychology. He co-founded the World Well-Being Project with Penn psychology professor Martin Seligman in 2011.
Eichstaedt discussed the far-reaching implications of the project’s research on the Knowledge at Wharton radio show on SiriusXM channel 111.
Below are some key takeaways from the discussion:
Dimensions of Well-being
The objective of the World Well-Being Project is “to get a fairly nuanced assessment of what people’s lives are like without ever having to ask them,” said Eichstaedt, drawing comparisons with conventional survey methods. “We are piggy-backing on the larger trends of the last 15 years in artificial intelligence, machine learning and pattern recognition to sift through the [37 billion] tweets, find language patterns that are indicative of certain emotional and cognitive states and combine them into larger-scale estimates of what these communities are like.” The dimensions of well-being that the researchers measured included life satisfaction, emotional happiness and also stress. The research process involves geo-tagging each tweet by county.
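As a rough illustration of that pipeline, the idea of scoring language in geo-tagged tweets and rolling the results up by county might be sketched as follows. This is a toy example and not the project's actual method: the word list, its weights, the function names and the county codes are all assumptions made for illustration, and the real work relies on machine-learned language models rather than a hand-written lexicon.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon (an assumption for illustration only):
# positive weights for upbeat words, negative for stressed ones.
LEXICON = {
    "love": 1.0, "great": 1.0, "happy": 1.0,
    "tired": -1.0, "hate": -1.0, "stressed": -1.0,
}

def tweet_score(text):
    """Mean lexicon weight of the matched words in one tweet, or None if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else None

def county_scores(tweets):
    """Average the per-tweet scores within each county.

    `tweets` is an iterable of (county_id, text) pairs, i.e. tweets that
    have already been geo-tagged by county.
    """
    by_county = defaultdict(list)
    for county, text in tweets:
        score = tweet_score(text)
        if score is not None:  # skip tweets with no lexicon words
            by_county[county].append(score)
    return {county: sum(s) / len(s) for county, s in by_county.items()}

# Made-up sample data; the county codes are illustrative identifiers.
sample = [
    ("42101", "Love this city, great day"),
    ("42101", "So tired and stressed"),
    ("06037", "Happy happy happy"),
    ("06037", "nothing to report"),
]
scores = county_scores(sample)
```

Averaging per tweet before averaging per county keeps one long tweet from outweighing several short ones. That is a design choice made for this sketch, not something the researchers describe.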
Improving by Measuring
Eichstaedt says tracking well-being will eventually lead to improved happiness for the people who are being studied. “You can only improve what you measure,” he said. “If you … do your measurement well enough, you unleash natural market forces, because well-being is desirable: People want to be happy, they want to live with happy people, people who do well.”
Well-being is often a key element that business owners or CEOs consider when deciding where to locate or where to expand. Insights derived from the World Well-Being Project could make those determinations easier, and could also convince policy-makers to take action that would address factors that could be dragging down their communities’ well-being scores, Eichstaedt says.
Finding the Right Data
The project’s early admirers were academics and scientists, such as behavioral economists who realized “that all these variables are incredibly hard to measure for communities, and we might have a method here where we can do this for the first time,” said Eichstaedt.
But how can researchers pinpoint the optimal type or volume of data needed to take accurate measurements? “It is a nuanced question to think through what is the right level of [data] aggregation and the right level of analysis to measure psychological states and to think about policy,” he said. For starters, the county level or zip code level might be the right size where a change in a particular neighborhood could be measured, he added.
Although such research could be done using posts from other social media sites, such as Facebook and Instagram, tweets are the best fit, said Eichstaedt. “It is public, so it is very easy to mass-collect, [and] you don’t need to secure consent from individuals, which means you don’t have to recruit them or pay them,” he added. He noted that it’s easier to derive and track meaning from text rather than images.
The Business Case for Well-being
Information about community well-being has a number of potential real-world business applications, Eichstaedt noted. For one, the real estate industry could use it to assess the desirability of different neighborhoods – and, in the future, even determine community health at the block level. Eichstaedt said it’s already possible to use overlays in ArcGIS – a geographic information system that combines maps with other data – to see that his own close neighbors “use laptops and drink a lot of lattes.”
As for uses in the future, Eichstaedt wondered about the possibilities for political analysis. “We see what is it in communities that makes people stressed, that makes people happy, that makes people satisfied” – all of which could factor in when they decide whom to vote for or which political party to support, he said.
The larger applications of such research depend on the incentives, or “who cares and for what reason,” he said. For example, in a state that is covered entirely by one health insurer – meaning there is no competition – information about people’s stress levels “becomes actionable, because you have somebody who cares that the stress doesn’t turn into smoking, diabetes or heart disease,” he explained. “The more you have a policy framework around [the data] that rewards people for doing something about it, the more this will be used for something better.”
Wrongful Uses and Protections
Eichstaedt recalled last year’s scandal involving the U.K. insurer Admiral trying to use Facebook-derived personality estimates in the pricing of its insurance. “So, if I know that you are neurotic, that you have problems with impulse control, that you are risk-seeking or reward-seeking, I’d like to take that into account in pricing my insurance to you,” he said. Admiral was forced to pull the plans following a media and public outcry.
“But there will come a time when everything you write in shared digital spaces will be part of these market forces,” Eichstaedt predicted. He called for “21st century data ownership” and “digital privacy frameworks” that establish an individual’s ownership of the information gleaned from his or her data.