Twitter is nothing if not versatile; it’s been used for everything from procrastination to revolution. Could the data it produces also be used to understand, and even help plan, cities?
Justin Hollander, an associate professor of urban and environmental policy and planning at Tufts, thinks it can. About a year ago, he founded the university’s Urban Attitudes Lab, which is devoted to studying how “big data” can inform planning and policy. He and his colleagues have begun developing models to analyze Twitter posts for key words and sentiments, in an effort to harness “the free-flowing ideas and thoughts that people express on social media,” he says.
“A lot of what I’m trying to uncover is, where are people happy? What makes them happy?” says Hollander. “This has the potential to revolutionize how local governments in particular plan for the future.”
The idea sounds uncannily reminiscent of Project Cyberfolk, a 1970s plan by the socialist government of Chile to monitor the real-time happiness of the Chilean people in response to government policy, as chronicled in a new book. Back then, the “algedonic meter,” a mood-measuring device that was supposed to transmit its inputs through the TV networks, sounded like science fiction; now, people constantly inform the world how they feel about lunch and what they think of the latest government snafu.
The challenge is extracting the meaningful information from that torrent of data. So far, Hollander has completed just one study, which attempted to compare the sentiments expressed in Twitter posts with those expressed in civic meetings. He collected 122,187 tweets geotagged to New Bedford, Mass., from February through April of this year, and analyzed them with an automated tool designed to classify sentiments as positive or negative. About 7 percent of the messages were categorized as positive, and 5.5 percent as negative; in the minutes of the city's civic meetings, the corresponding figures were 1.7 percent and 0.7 percent. Zeroing in on terms related to civic life, he then compared the frequency of 24 key words in the tweets and the minutes. He found some similarity in the rankings of how often Twitter users and meeting participants used the key words, which included "school," "health," "safety," "parks" and "children."
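To make the method concrete, here is a minimal sketch of the two steps described above — lexicon-based sentiment labeling and key-word frequency ranking. The word lists are tiny placeholders of my own; Hollander's actual tool and his full set of 24 key words are not public in this article, and a real classifier would handle negation, slang and (as noted below) sarcasm far better.

```python
from collections import Counter

# Tiny illustrative lexicons -- stand-ins for whatever vocabulary a real
# sentiment tool uses. These specific words are assumptions for the demo.
POSITIVE = {"love", "great", "happy", "good", "beautiful"}
NEGATIVE = {"hate", "awful", "sad", "bad", "broken"}

# A few of the civic key words named in the study (the full list had 24).
KEY_WORDS = {"school", "health", "safety", "parks", "children"}

def classify(text):
    """Label a message positive, negative, or neutral by counting lexicon hits."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

def keyword_frequencies(messages):
    """Rank the civic key words by how often they appear across all messages."""
    counts = Counter()
    for text in messages:
        for w in text.lower().split():
            if w in KEY_WORDS:
                counts[w] += 1
    return counts.most_common()

# Invented sample messages, standing in for geotagged tweets.
tweets = [
    "love the new parks downtown",
    "school traffic is awful again",
    "health fair at the parks this weekend",
]
print(Counter(classify(t) for t in tweets))
print(keyword_frequencies(tweets))
```

The same frequency ranking, run separately over tweets and over meeting minutes, is what allows the two rankings to be compared, as in the study.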
The study was too preliminary to furnish any conclusions, but the main advance was to develop a systematic way of categorizing sentiments. (This automated method still isn’t perfect; it has trouble detecting sarcasm, for instance.) Hollander and a colleague are currently working on another study examining Twitter posts from four cities in Massachusetts that will be voting on whether to allow casinos. After the election, the professors will analyze the Twitter content in light of the results. At this point, a major objective is to assess how what people say on Twitter matches up — or doesn’t — with what they say and do in other settings.
Hollander is far from alone in his interest in using social media data to analyze cities. After all, it's a vast trove of data, tantalizingly accessible; Twitter alone generates about 500 million short messages per day. "Some research starts with, here's this problem we want to solve, and some starts with, here's this opportunity, let's see what we can do with it," says Dan Tasse, a doctoral candidate in human-computer interaction at Carnegie Mellon who authored a recent conference paper titled "Using Social Media to Understand Cities." "This was kind of the latter … we saw, 'Well, shoot, we have pretty detailed information, and it's just publicly available.'"
This mode of research offers certain clear advantages. Aside from the sheer quantity of data, there’s the geographic specificity of geotagged posts. (Only a small percentage of posts are geotagged, but the absolute number is still huge.) Indeed, most of the work based on social media has relied exclusively on location data to determine mobility patterns; Hollander is one of the few researchers who is focusing on the message content as well.
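Selecting the geotagged subset is itself straightforward: posts carrying coordinates can be filtered against a bounding box for the city of interest. A minimal sketch, assuming hypothetical post records with a `geo` field and approximate, illustrative coordinates for New Bedford:

```python
# Approximate bounding box for New Bedford, Mass. -- rough, illustrative
# coordinates only: (west, south, east, north) in lon/lat degrees.
NEW_BEDFORD_BBOX = (-71.00, 41.55, -70.85, 41.70)

def in_bbox(lon, lat, bbox):
    """Return True if the point (lon, lat) falls inside the bounding box."""
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north

# Invented sample records; real posts would come from a platform's API,
# and most would have geo=None, since only a small share are geotagged.
posts = [
    {"text": "great day at the parks", "geo": (-70.93, 41.64)},
    {"text": "no location here", "geo": None},
    {"text": "hello from Boston", "geo": (-71.06, 42.36)},
]

local = [p for p in posts if p["geo"] and in_bbox(*p["geo"], NEW_BEDFORD_BBOX)]
print(len(local))  # only the first post falls inside the box
```

Mobility-pattern work goes a step further, linking the successive locations of each account rather than pooling posts by place; the filtering step, however, looks much the same.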
Another obvious boon is the speed at which new data appears. The census is conducted every 10 years; the American Community Survey is annual. Tweets, by contrast, provide new data by the second. “You could do it every hour if you wanted,” Hollander says. “You wouldn’t want to do that, but certainly more often than every 10 years.”
The hope, then, is that the data can provide some of the same knowledge as older methods, but much more quickly and cheaply. For example, one recent study examined the venues in Foursquare check-ins to assess a neighborhood’s socioeconomic level. But Hollander sees another benefit: The spontaneous, freely volunteered opinions on social media might be more authentic reflections of public attitudes than what people reveal in the arguably more artificial context of a survey. “There are always going to be power imbalances, when there’s an official at your door,” says Hollander. “There’s a tendency for people to introduce their own biases in the responses.”
Of course, social media data is just as likely to contain biases. There’s “a lot more posturing,” Hollander acknowledges; the awareness that we are presenting our thoughts to the world will inevitably shape what we say. What’s more, certain groups are still more likely to use social media than others. Due to this unrepresentativeness, it’s very risky to extrapolate too much from the data, especially to inform policy.
For these reasons, skeptics have cautioned against relying too heavily on this tempting new resource. Robert Goodspeed, an assistant professor of urban and regional planning at the University of Michigan, wrote a paper last year on the data’s “limited usefulness.” “Data mining social media alone, you only have one data point. You only know where they go but not why, or you know what they’re saying but not what they’re doing,” Goodspeed says. “It’s very large, but no matter how large one perspective gets, it’s still at the end of the day one perspective.”
Still, even Goodspeed believes that social media data can play some role; he advocates combining it with tried-and-true methods. Hollander, too, agrees that this form of information is a supplement, not a replacement: In-person interviews, surveys, focus groups and official data are all essential in patching together an understanding of a city. The intention is that this novel material can add another layer to the picture, and perhaps give us a starting point for asking deeper questions.
Longer term, Hollander believes that not only academics but also city governments should consult this rich source — our present-day algedonic meter — to help make decisions, along with holding meetings and using other planning tools. “I’m not saying they shouldn’t do the survey,” he says, “but look at the social media data as well.”
The Science of Cities column is made possible with the support of the John D. and Catherine T. MacArthur Foundation.
Rebecca Tuhus-Dubrow was Next City’s Science of Cities columnist in 2014. She has also written for the New York Times, Slate and Dissent, among other publications.