Mining social media #research

Charlotte Stephenson considers the value of social media data

I check my Facebook news feed every morning: cue pictures of food-covered babies, duckface selfies, football rants, celebrity scandals and, if I’m lucky, a video of a dog dressed as a human – I hardly start the day feeling enlightened.

The popularity of social media continues to rise and so does the mass of consumer-generated information held in cyberspace. This ready-collected, naturally occurring, real-time data offers a significant opportunity for academic researchers. But is there any value in the content?

Social media users voluntarily share vast amounts of information. They might disclose their location, what videos they’ve watched, what food they’ve eaten, their favourite music and political preferences. Online profiles can reveal someone’s age, gender, location and social circle. Social media trends can tell us not only what the public is thinking but how they feel about it. All this data is there, on a virtual plate, waiting to be analysed.

Data mining

Such data would be extremely tricky to analyse manually. Happily, computer whizzes have built programs that detect high-quality content in social media posts. These programs ascertain the sentiment of posts, monitor trends and identify connections between users. Some social networking sites, such as Twitter, have open data sharing policies and offer a proportion of the data to consumers free of charge.
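To give a flavour of what such programs do, here is a minimal Python sketch of two of those tasks: scoring the sentiment of a post and spotting a trending hashtag. The word lists, the example posts and the function names are all invented for illustration; real tools use far larger lexicons or trained models, and this is not the method of any particular product.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, invented for this sketch only.
POSITIVE = {"love", "great", "brilliant", "happy", "enjoyed"}
NEGATIVE = {"hate", "awful", "boring", "sad", "terrible"}


def tokenize(post):
    """Lower-case a post and split it into words, keeping # and @ prefixes."""
    return re.findall(r"[#@]?\w+", post.lower())


def sentiment(post):
    """Return (positive word count) minus (negative word count)."""
    words = tokenize(post)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def trending_hashtags(posts, top=3):
    """Count hashtags across all posts and return the most common ones."""
    counts = Counter(
        w for post in posts for w in tokenize(post) if w.startswith("#")
    )
    return counts.most_common(top)


# Made-up posts standing in for real social media data.
posts = [
    "Love the new film, brilliant cast #cinema",
    "Awful queue at the cinema, boring trailers #cinema #fail",
    "Happy Friday everyone #weekend",
]

scores = [sentiment(p) for p in posts]
top_tags = trending_hashtags(posts, top=1)
print(scores)
print(top_tags)
```

A word-counting approach like this misses sarcasm, negation and slang, which is one reason the bias and quality concerns around social media data matter so much.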

Social media data analysis has shown great potential. Researchers claim to have used Twitter to predict box office revenue and election results. Google search queries have been used to estimate flu outbreaks and stock market trends. During the Arab Spring, social media activity was shown to have shaped political debates: increases in dialogue about democracy, liberty and revolution immediately preceded mass protests.

The pitfalls

Of course there are drawbacks to using this data. We know that the majority of social media content is created by a minority of users. Those who are vocal online are likely to share certain attributes, like extraversion and youth, and are more likely to have extreme opinions, all of which are likely to bias the data. Despite this, the reach of social media is growing: in 2014, 91 per cent of households in the UK had an internet connection, and, currently, a whopping 93 per cent of internet users aged 16-24 have at least one social media profile.

Using social media also poses ethical challenges. Ensuring anonymity is a problem. It is possible to link so-called private posts to specific users. Consent is another issue. Even though online content is intended for public viewing, users don’t knowingly sign up to be the subject of a research project. The ethical responsibilities that fall on researchers conducting social media research are currently fuzzy, but legal guidance is progressing quickly.

So, where does this leave us? It’s a no-brainer that using social media data has its advantages. It’s vast, fast, free, occurs naturally and can be accessed instantly. What’s more, methods for analysing and collecting it are advancing rapidly, as is the legal governance of its use in academic research. The ethical challenges pose a significant problem for researchers, but are they deal-breakers or just points worth noting in your discussion section?

Charlotte Stephenson

Are you using social media data in your research? Join the conversation on Twitter @CERP_UK #socialmediadata
