POST 6: Scraping The Web For Data


In our current world, we interact and communicate with each other primarily through social media. The online network is one of the strongest platforms in the world where people from anywhere can connect, share and value each other’s opinions.

For this exercise, I have chosen Twitter, which is one of the most popular online social networking platforms. Twitter was created by Jack Dorsey, Evan Williams, Biz Stone, and Noah Glass in March 2006 and launched in July 2006. It allows users to post messages of up to 140 characters, called “Tweets”. Users can share their opinions and values, engage in debate, and just enjoy their time socialising on Twitter. Twitter is rapidly growing in popularity worldwide and becoming a strong voice in the online platform arena. In 2012 more than 100 million users posted 340 million tweets a day, and in 2013 it was one of the ten most visited websites; it has been called “The SMS of the internet”. Right now in 2016, Twitter has more than 310 million active users.

Twitter is a unique platform that acts as a critical outlet, especially for the media, businesses and politicians. In the past, a political party or politician would usually address an issue or share an opinion with a small target group. Today they use Twitter to express their opinions, and anyone in the world who follows the politician or party can share their thoughts on the matter and possibly take part in a debate.

It is a platform where people can discuss their opinions and perspectives without fear of being recognised. Journalists now use Twitter as one of their most active platforms to create awareness and spread news around the world. Fans of celebrities also use the platform to follow their idols and get a sense of their lifestyle. Twitter uses the concept of hashtagging, which makes any particular topic or event globally accessible. Any Twitter user can search for the highest trending hashtags to find the most discussed topic of the day.


While scraping the web through the Twitter archive, I was interested in learning how to create both a Google spreadsheet and an Excel spreadsheet. I learnt both ways, and the Twitter archive was really easy to follow step by step. All I needed to do was create a search rule for Twitter. At this point in my research I kept the search very broad, as I thought that narrowing it down to a minor issue of data privacy would be too difficult.



However, I wanted to expand my knowledge of data scraping with a different tool, and I found ‘DATAPIPELINE’. It was easy to navigate, made it much easier to find resources for creating a spreadsheet, and showed the results instantly. It didn’t require a Google account. I simply entered keywords for the issue I was looking for and it showed all the results. It was also easy to refresh the search and, once satisfied, I could export the spreadsheet. It also allowed me to search different categories of content such as top re-tweets, top favourites and top hashtags, among other things. The spreadsheet also helped me to investigate other information such as the tweet creation time, geolocation etc.



A flow chart of the web scraping research task through DATAPIPELINE:

Step 1: Issue – Online Privacy
Step 2: Refine Keywords – Social Media privacy
Step 3: Find out top re-tweets, top favourites, top hashtags, top mentions, top URLs, top users.
Step 4: Visit top re-tweets, hashtags, URLs
Step 5: Export to Excel
Step 6: Refine key issue through the ‘Advance Search’ feature.
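
The middle steps of this flow chart can be sketched in a few lines of Python. This is only an illustration of the idea and not how DATAPIPELINE itself works: the sample tweets, the column names (user, text, retweets) and the function names are all made up for the example.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical rows standing in for scraped tweets (not real data).
SAMPLE_TWEETS = [
    {"user": "alice", "text": "Think before you post #onlineprivacy #socialmedia", "retweets": 12},
    {"user": "bob", "text": "@alice agreed, privacy matters #onlineprivacy", "retweets": 3},
    {"user": "carol", "text": "Who reads the terms? #dataprivacy @bob", "retweets": 7},
]

HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")


def top_items(tweets, pattern, n=3):
    """Step 3: count hashtags or mentions across all tweet texts."""
    counts = Counter()
    for tweet in tweets:
        counts.update(match.lower() for match in pattern.findall(tweet["text"]))
    return counts.most_common(n)


def top_retweets(tweets, n=3):
    """Step 3: return the n tweets with the highest retweet count."""
    return sorted(tweets, key=lambda t: t["retweets"], reverse=True)[:n]


def export_csv(tweets):
    """Step 5: write the tweets as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["user", "text", "retweets"])
    writer.writeheader()
    writer.writerows(tweets)
    return buffer.getvalue()


if __name__ == "__main__":
    print(top_items(SAMPLE_TWEETS, HASHTAG))
    print([t["user"] for t in top_retweets(SAMPLE_TWEETS, 2)])
    print(export_csv(SAMPLE_TWEETS))
```

Saving the text returned by export_csv to a .csv file gives a spreadsheet that opens in Excel or Google Sheets.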

In the first analysis of the scraped data, I found 500 tweets relating to online social media privacy. This initial Twitter research allowed me to observe the current issues and concerns about online privacy in general. There were contrasting opinions on the issue: some commenters took a more jovial view of it, while others considered it more seriously. It was a great learning exercise for me to find out the critical issues and public opinions surrounding this issue. This research showed me that the topic isn’t an isolated one, but rather something that connects people worldwide through Twitter.



One of the top re-tweets was about privacy, and specifically about its violation in relation to women. It was interesting to see how people debate their privacy concerns. Some people argued that if we post photos on social media and then feel violated by others, we should think about the type of photos we are posting. The other argument is that, as a matter of freedom, we should feel safe to post whatever we like without any fear; it is a question of choice. Our society and our approach to privacy need to change, and I strongly feel we are losing our empathy.

Other tweets I found took a humorous angle, relating online privacy to bovine privacy. This could be a light-hearted way to draw people’s attention to the issue. I think it is a very clever approach, as most people enjoy humour.

After this initial research, I was interested in finding out more about the filters used on photos on social media. Are there any privacy concerns about them? How are people trying to protect their privacy, and what is the debate?

So for my advanced search I tried to identify one of these minor issues. I put more emphasis on social media photo privacy, using the hashtags #socialmedia, #dataprivacy and #onlineprivacy.
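
A refined search like this one could be approximated as a simple filter over tweet text. The sketch below is only my guess at the logic, not the tool’s actual search rule: the three hashtags come from my search, while the keyword list and the function names are assumptions made for the example.

```python
import re

# Hashtags taken from the search described above.
SEARCH_HASHTAGS = {"socialmedia", "dataprivacy", "onlineprivacy"}
# Assumed keywords for the photo-privacy focus (hypothetical choice).
KEYWORDS = ("photo", "picture")


def matches_search(text, hashtags=SEARCH_HASHTAGS, keywords=KEYWORDS):
    """True if the tweet has one of the hashtags and mentions photos."""
    lowered = text.lower()
    tags = set(re.findall(r"#(\w+)", lowered))
    has_tag = bool(tags & hashtags)
    has_keyword = any(word in lowered for word in keywords)
    return has_tag and has_keyword


def refine(texts):
    """Keep only the tweets that match the refined search."""
    return [text for text in texts if matches_search(text)]


if __name__ == "__main__":
    examples = [
        "Do not post photos of your kids #onlineprivacy",
        "New phone day! #socialmedia",
        "Lovely picture of the beach",
    ]
    print(refine(examples))
```

Requiring both a hashtag and a keyword keeps the results narrow, which is the point of moving from the broad search to a minor issue.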


It was interesting to note that I found 97 tweets about photo privacy. The top tweets related to parents publishing their children’s photos on social media without the children’s permission, and to how government and intelligence agencies are constantly hacking and collecting our photos through social media. There is an enormous reaction online, where most people are really concerned about this issue.


The Twitter search also shows the most recent top issue, in which an 18-year-old woman is suing her parents for publishing her photos on Facebook. Before she opened her own social media account, there were already 500 pictures of her on Facebook, showing her sitting on the toilet or lying in her cot (Huggler, J 2016). Every moment of her childhood was made public through these photos. She kept asking her parents to delete them, but they refused, so after she turned 18 she took legal action.

In France, there are strict laws against posting photographs on social media without permission. The fine can be up to US$38,00 to US$45,000. In the UK, a recent survey shows that the average parent posts at least 1500 pictures of their child before the child turns five. The survey also finds that parents are not concerned about online social media privacy: they never check their privacy settings or believe that someone could easily hack their accounts and take their photos. It shows that 79% of parents believed those photographs can’t be seen by strangers.




The research also shows close to 4600 votes agreeing that parents shouldn’t post photographs of their children online without the children’s permission.

Throughout the whole research, it is clear how people react to and view online social media privacy. It is an important and critical issue in our society. If both the teen and senior generations are not aware of these issues, it creates a gaping divide.

I believe if I can visualise an emotional response, either in a humorous manner or a more serious tone, to raise awareness for online privacy, as a designer, I will feel satisfied that I have had a hand in a small contribution to this society and future generations.

by Ayesha Mira


Wikipedia n.d., Wikipedia, viewed 3 September 2016.

Johnson, M. 2013, The History of Twitter, viewed 3 September 2016.

Emma Watson 2014, ‘Even worse than seeing women’s privacy violated on social media is reading the accompanying comments that show such a lack of empathy’, 1 September, Twitter post, viewed 11 September 2016.

BBC Cambridgeshire 2016, ‘Street moo: Online bovine privacy protected at Google Street View blurs face of Cambridge cow’, 17 September, Twitter post, viewed 17 September 2016.

Supreme Dark Lord 2016, ‘Do NOT post pictures of your children on social media. It is a violation of their privacy’, 14 September, Twitter post, viewed 17 September 2016.

Huggler, J. 2016, Austrian teenager sues parents for ‘violating privacy’ with childhood Facebook pictures, viewed 11 September 2016.


