
Post Six: Scraping the Web for Data

As much as I love memes, the true value of social media rests within its ability to give a voice to marginalised people. Where the media was once dominated by the lucky few in power, the monopoly has been dismantled and the individual reigns supreme.

While the exact function of this new social media machine is difficult to pin down, it is undeniably a rich source of information about prevailing attitudes towards a specific issue – in my case, climate change.

Twitter scrape

While I initially planned to dive into the trenches of Reddit, climate change is a topic that already struggles to find shape. I needed something that could give me lots of succinct data, quickly. For that reason, I chose Twitter as my data source. It is a platform that has a strict character limit, uses hashtags, allows reposting and has a fairly balanced representation of well-known people (journalists, activists, artists, politicians etc.) and regular folk.

I began my data scraping with very loose parameters. Using a Twitter archiver plug-in, I generated a Google spreadsheet that tracked use of the #climatechange hashtag with ‘Australia’ as a keyword. The result was some 1,500 tweets over a period of 12 days.

Twitter scraping using #climatechange and key term ‘Australia’ (2016)

There were a few things that stood out to me when reading this data set:

  • The majority of the tweets were from mobile sources. Many conclusions could be drawn from this, but it is telling of the mobility of technology and suggests that the responses are made in a raw state. In other words, the tweeter is unlikely to sit and contemplate their response; they’re more likely to post on a whim.
  • Roughly half of the tweets containing #climatechange and the keyword ‘Australia’ came from outside Australia. Paris, Colombia, Czech Republic, USA, UK, Italy and more were all represented. My first thought was that this reflects the fact that climate change is a global concern, uninhibited by borders. My second thought was that individuals like to transfer blame, using another country’s failings (in this case coral bleaching, emissions, and Australia’s G20 ranking) as a scapegoat for what is in fact a collective issue.
  • More than 85% of the tweets were RTs (retweets). These are less emotional and tend to be a way to share information or show an individual’s interest. If I were to run this scraper again, I would refine the search parameters to omit retweets so that I could get more unique information.
  • Nevertheless, it’s important to realise that most of the conversation is driven by sources external to social media: typically newspaper reports, journal articles or opinion pieces. These serve as the impetus for conversation/argument over social media, but rarely generate as much conversation as comments/tweets directly from a Twitter user. One gets the feeling that retweets are slightly less emotive and more detached, which seems to fit the current approach to the climate change model.
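The retweet share noted in the list above, and the idea of leaving retweets out next time, could also be checked on the exported spreadsheet itself. What follows is only a minimal sketch: it assumes the sheet has been saved as CSV, that the columns are called text and source (hypothetical names, not the archiver's documented headers), and that a retweet can be recognised by the classic "RT @" prefix. The three sample rows are made-up placeholders, not tweets from my data set.

```python
import csv
import io

# Placeholder rows for illustration only; they are NOT from the real data set.
# The column names "text" and "source" are assumptions about the export layout.
SAMPLE_CSV = """text,source
RT @someone: Reef bleaching report #climatechange Australia,Twitter for iPhone
Thinking about Australia's emissions today #climatechange,Twitter Web Client
RT @another: G20 ranking #climatechange Australia,Twitter for Android
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def is_retweet(text):
    """Treat a tweet as a retweet if its text starts with 'RT @'."""
    return text.lstrip().startswith("RT @")


def split_retweets(rows):
    """Return (originals, retweets) from an iterable of row dicts."""
    originals, retweets = [], []
    for row in rows:
        if is_retweet(row["text"]):
            retweets.append(row)
        else:
            originals.append(row)
    return originals, retweets


def retweet_share(rows):
    """Fraction of rows that are retweets (0.0 for an empty data set)."""
    rows = list(rows)
    if not rows:
        return 0.0
    _, retweets = split_retweets(rows)
    return len(retweets) / len(rows)


if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    originals, _ = split_retweets(rows)
    print(f"{retweet_share(rows):.0%} retweets, {len(originals)} original tweet(s) kept")
```

Filtering after the export keeps the full data set intact, so the retweets can still be counted even if only the original tweets are read closely.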


In greater depth

While I think that the broad overview was instrumental in grasping the presence of my issue in social media, it also just gave me a chance to understand the actual process and purpose of data scraping.

To supplement my findings and respond more closely to my vein of research, I decided to use the advanced search function in Twitter to look closely at a small population of tweeters within a tight timeframe.

My search parameters were:

Climate / change / #mwf16

I decided to look at the tweets coming from those in attendance at the 2016 Melbourne Writers Festival. The reason being, I’m incredibly interested in the intersection of art and science and think that writers are instrumental in giving voice to the marginalised people and taboo issues of their time.

While the results were limited (32 tweets), the findings were profound. In relation to my interests specifically, there was a great focus on the impact of climate change on Indigenous communities and its subsequent relabelling as climate trauma.

Twitter search (2016)

In dissecting the language used, I discovered the sentiment to be one that sat somewhere between disillusionment and apathy. The tweeters expressed a sense of empowerment in relation to the speakers, but a sense of hopelessness and anger when speaking about specific issues including Indigenous rights and environmental trauma.

Summing it up

As I mentioned in a previous post, social media doesn’t always focus on what’s salient to our progress as a human race, but there’s still a great deal to be learnt. In fact, the absence of much discussion around issues such as climate change is telling of the attitudes and emotions that surround it. Perhaps it’s not so much disinterest as it is disengagement or distress.

People feel reluctant to play with concepts that are too daunting. My own analysis of social media (while minute) has shown me that a great majority of people do understand the seriousness of climate change – on the Twitter platform anyway. What doesn’t emerge are any viable solutions.

If I were to re-do this task, I’d love to have the time to learn some basic coding. I think that the creation of a Twitter bot which could create a kind of poetic, generative dataset would be really insightful.

To expand upon my earlier primary research (cultural probe), I’d create a bot that looked at Twitter users’ descriptions of sounds, sights, smells and sensations within a particular location. I think that the sensory experience is fundamentally important when looking at why people do/don’t react to climate change as you think they would. It would be set out as a kind of general “Sydney is (x) today #proof”, which plays on the use of the hashtag by climate change denialists.
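To make the bot idea a little more concrete, here is a rough sketch of just the text-generating half, assuming the sensory descriptions have already been collected somehow. The function names, the template constant and the placeholder descriptions are all hypothetical, and nothing here touches Twitter itself: collecting and posting would need the platform's API, which this sketch leaves out.

```python
import random

# The "(x) is ... today #proof" pattern from the post, as a fill-in template.
TEMPLATE = "{city} is {description} today #proof"


def compose_post(city, description):
    """Slot one sensory description into the template.

    Whitespace is collapsed and the description lower-cased so that it
    reads as part of the sentence.
    """
    cleaned = " ".join(description.split()).lower()
    if not cleaned:
        raise ValueError("description must not be empty")
    return TEMPLATE.format(city=city, description=cleaned)


def pick_post(city, descriptions, rng=None):
    """Choose one description at random and compose a post from it."""
    rng = rng or random.Random()
    return compose_post(city, rng.choice(list(descriptions)))


if __name__ == "__main__":
    # Placeholder descriptions for illustration; not collected from real users.
    placeholder_descriptions = ["humid and still", "smelling of rain"]
    print(pick_post("Sydney", placeholder_descriptions))
```

Passing in a seeded random.Random would make the choice repeatable, which could be handy for comparing runs of the generative dataset.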



Header Image: 

Ozaslan, M. 2014, Step, Saatchi Art, viewed 22nd September 2016.
