Post 6: Scraping the web for data

The internet has long been a space where people voice opinions and thoughts, whether about the most banal or the most controversial and politically charged of issues. In scraping data I looked specifically at Twitter because tweets are succinct, cover a large variety of data and media, and, more importantly for me, are easier to access thanks to their highly categorised nature. Furthermore, Twitter’s advanced search function lets me narrow results by specifics such as hashtags, words, locations, accounts and times, while also excluding other channels of information.

In my first data scrape, using the Twitter Archiver, I searched for the hashtag #climatechange along with tweets containing the word “Australia” to better localise the information I gathered. The data covers the previous 5 days (moving backward from September 5th):

Screenshot of the most recent Twitter results relating to #climatechange and “Australia”

Points to be made from my initial data scrape:

1. A large portion of the tweets I gathered were retweets of Greenpeace, WWF and other environmentally focused organisations rather than opinion tweets. This makes it difficult to gauge the emotional landscape of users regarding climate change, as the information isn’t personalised and is often just headlines that are generalised or sensationalised for effect.

2. Many of these also centred on the Great Barrier Reef and other oceanic issues affecting Australia, as well as policy and government action on climate change.

3. Many of the tweets did not necessarily have an Australian origin. This could probably have been avoided by using the location function on the data scraper; however, I came across many problems trying to use the geotag feature, so I resorted to using Australia as a search term instead.
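As a rough illustration of the first point, retweets could be separated from original tweets before reading them for sentiment. The sketch below is only an assumption about how that might be done: it treats any tweet whose text starts with "RT @" as a retweet (the classic retweet convention), the function name split_retweets is made up for this example, and the sample tweets are invented rather than taken from my scrape.

```python
def split_retweets(tweets):
    """Split tweet texts into (originals, retweets).

    Assumption: a retweet is any tweet whose text begins with "RT @".
    Quote tweets and other formats would need different handling.
    """
    originals, retweets = [], []
    for text in tweets:
        if text.strip().startswith("RT @"):
            retweets.append(text)
        else:
            originals.append(text)
    return originals, retweets


# Invented sample data, for illustration only.
sample_tweets = [
    "RT @Greenpeace: Reef bleaching continues #climatechange Australia",
    "Honestly worried about what #climatechange means for Australia",
    "RT @WWF: New report on oceans #climatechange Australia",
]

originals, retweets = split_retweets(sample_tweets)
print(len(originals), "original tweets,", len(retweets), "retweets")
```

Keeping only the originals would leave a smaller but more personal set of tweets to read through.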

By contrast, I also did a more generalised global data scrape and found that the following were the top hashtags accompanying tweets containing #climatechange:

  1. G20
  2. cop21
  3. climate
  4. iucncongress
  5. auspol
  6. betterbusinesssx17
  7. parisagreement
  8. unfao
  9. actonclimate
  10. science
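A ranking like the one above can be produced by counting which other hashtags appear in tweets that contain #climatechange. The following is a minimal sketch of that idea, not the method the Twitter Archiver itself uses: the function name co_occurring_hashtags is hypothetical, hashtags are matched with a simple regular expression and lowercased, and the sample tweets are invented for illustration.

```python
import re
from collections import Counter

# Simple hashtag pattern: "#" followed by letters, digits or underscores.
HASHTAG = re.compile(r"#(\w+)")


def co_occurring_hashtags(tweets, target="climatechange", top_n=10):
    """Count hashtags that appear alongside `target`, most common first."""
    counts = Counter()
    for text in tweets:
        # Use a set so each hashtag is counted once per tweet.
        tags = {tag.lower() for tag in HASHTAG.findall(text)}
        if target in tags:
            counts.update(tags - {target})
    return counts.most_common(top_n)


# Invented sample data, for illustration only.
sample_tweets = [
    "Leaders meet at #G20 to discuss #climatechange",
    "#ClimateChange is on the agenda again #G20 #auspol",
    "Reef bleaching worsens #climatechange #auspol #science",
    "Nothing to do with the topic #football",
]

for tag, count in co_occurring_hashtags(sample_tweets):
    print(tag, count)
```

Lowercasing means #G20 and #g20 are treated as the same hashtag, which seems a reasonable choice given that Twitter hashtags are not case sensitive.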

It was interesting to see that discussion of climate change tended to emerge in response to the activity of climate organisations or government and political bodies, as opposed to being active, self-generated discussion. This suggests that, despite climate change being a big global issue, it is not something that individuals necessarily act upon or are actively conscious of day to day. Even so, it does reveal that people, when prompted, are interested in how bigger powers are looking to reverse the effects of climate change, and that they engage in a more global discussion of the issues rather than a localised one.