Post 6: scraping the web

For this task, I have chosen to scrape data from Twitter. Twitter is known as a microblogging social media site, in which users have profiles from which they can send tweets, i.e. messages of no more than 140 characters. Users can include links, images, and videos in their tweets. On each user’s home page is a ‘timeline’, i.e. a feed made up of tweets from the users they ‘follow’. Users can interact with others by adding others’ tweets to their favourites, ‘retweeting’ (broadcasting that tweet from their own account), and replying to tweets. Users can also tweet directly to other users by including the recipient’s username in the tweet.

The unique qualities of Twitter come from its ease of access and strict character limit. It is very, very easy and quick to send a tweet, and so Twitter is a perfect tool for live, ‘in-the-moment’ updates on a particular event, situation, or state of mind. Its character limit also tends to force brevity; messages are condensed, increasing immediacy, often at the expense of nuance and complexity. This latter quality likely contributes to the highly polarised nature of many dialogues on Twitter.

The process I used to scrape Twitter for data consisted of using the advanced search function to find tweets in which people used the exact phrase “my mental health.” I used this phrase because I was curious about people’s personal perspectives on their own mental health.

I first limited my search results to those near Sydney, New South Wales. When that turned out to be a little limiting, I expanded the search to include tweets from all over Australia, and eventually from all over the world (so long as the tweets were in English).
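For anyone who would rather type the search than fill in the advanced search form, the three searches described above can be sketched as query strings. This is only a rough sketch: I used the form itself, so the `near:` and `lang:` operators, the search URL, and the helper names `build_query` and `search_url` are assumptions on my part, and Twitter may handle them differently (particularly `near:` with a whole country).

```python
from urllib.parse import urlencode

PHRASE = "my mental health"


def build_query(phrase, near=None, lang=None):
    """Build a search string for an exact phrase, optionally narrowed.

    The near: and lang: operators are assumed to behave like the
    location and language fields of the advanced search form.
    """
    parts = [f'"{phrase}"']  # double quotes ask for the exact phrase
    if near:
        parts.append(f'near:"{near}"')
    if lang:
        parts.append(f"lang:{lang}")
    return " ".join(parts)


def search_url(query):
    """Turn a query string into a search URL (assumed URL pattern)."""
    return "https://twitter.com/search?" + urlencode({"q": query})


# The three stages of the search, from narrowest to widest.
stages = [
    build_query(PHRASE, near="Sydney"),
    build_query(PHRASE, near="Australia"),
    build_query(PHRASE, lang="en"),
]

for query in stages:
    print(search_url(query))
```

Keeping the phrase fixed and changing only the location or language makes it easy to see how widening the search changes the number and kind of results.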

Below is a five-point summary of my findings:

  1. Most of the tweets I viewed mentioned mental health in some kind of negative context. This is unsurprising, as those who do not suffer issues with mental health are probably less likely to talk about it.
  2. An interestingly large number of tweets cited things or activities that improved the user’s mental health but are not typically thought of as treatments (i.e. they are usually viewed as non-medical solutions), e.g. chocolate, music, going on walks.
  3. Quite a few people talked about non-participation due to their mental health, e.g. missing out on a concert, not engaging with negative interactions from other users, dropping out of classes. This shows that mental health, just like physical health, can stop people from going about their lives as they otherwise might. It also shows that these users in particular have come to understand that certain actions they take may exacerbate their problems, and that they have acted accordingly.
  4. There were quite a few tweets correlating exercise with improvements in mental health. Some of these were personal experiences, others were in the form of links to articles, etc.
  5. Not all the tweets discussed mental health in a totally serious/medical context; some were jokes. Some of these, I believe, were unrelated to the medical phenomenon of mental health; others appeared to be jokes made as a way of coping and self-expression by those suffering mental health issues.

I imagine that I could use this type of data to create a visual design response in the form of a chart or graph that visualises the various types of tweets about mental health, separated into categories, e.g. tweets about currently poor mental health, tweets about good or improving mental health, tweets with humour, and tweets without. It would be interesting to map the correlations between these qualities. Mapping age and location might also reveal interesting insights.
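As a first step towards such a chart, the tweets would need to be labelled by hand and then counted per category. Below is a minimal sketch of that counting step, assuming each tweet has been given a `sentiment` label and a `humour` flag. Those field names, the `tally` helper, and the sample records are placeholders of my own invention, not real findings from the search.

```python
from collections import Counter


def tally(labelled):
    """Count tweets for each (sentiment, humour) combination."""
    return Counter((t["sentiment"], t["humour"]) for t in labelled)


# Placeholder labels only, to show the shape of the data.
# These are NOT results from the actual search.
sample = [
    {"sentiment": "negative", "humour": False},
    {"sentiment": "negative", "humour": True},
    {"sentiment": "improving", "humour": False},
    {"sentiment": "negative", "humour": False},
]

counts = tally(sample)

# Each line pairs a category combination with its count; these numbers
# could feed a bar chart or a small grid of sentiment against humour.
for (sentiment, humour), n in sorted(counts.items()):
    label = "with humour" if humour else "without humour"
    print(f"{sentiment}, {label}: {n}")
```

Counting pairs of labels rather than single labels is what would let the chart show how the qualities overlap, for example how many of the negative tweets were also jokes.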

In this task, I focussed on the phrase “my mental health”, which revealed very personal tweets. I imagine it would also be interesting to generate a visual design response comparing personal and impersonal tweets on the topic, and the differences in tone and content between them.
