Post 6: Scraping the web for data

The internet and social media play a particular role in the LGBT community. They are often used as a retreat by LGBT people because they provide safety and support that may not be available in the real world. Social media is commonly used for self-affirmation, a process of bringing awareness to what is important to oneself. This process allows individuals to become more open-minded and less defensive.

Twitter is a social media platform that allows registered users to share a message with one or many other users. Its central feature is the follower model, which lets users customize their feed to match their interests. A tweet is Twitter’s term for a message, and it is the essential feature of the platform: it lets users share or send messages to fellow Twitter users. By default, tweets are publicly visible to all users. However, users can change their settings to make their tweets private.

Twitter’s unique quality is its 140-character limit on tweets. It does not try to be an all-in-one social platform, but rather focuses on simple, real-time, text-based communication. It has become one of the best social media outlets for venting and live blogging. As a result, Twitter has become the fastest social platform for breaking news. These characteristics were my main reasons for choosing Twitter for data scraping on my chosen issue, LGBT youth.

Screenshot: Twitter Archiver extracting tweets associated with LGBT Youth
Screenshot: Filtered by most retweets
Screenshot: Filtered by most followers

In this exercise I wanted to extract all LGBT Youth related tweets and divide them according to their involvement with the four main stakeholders I identified in previous posts. However, to gain some basic insight into Twitter users’ attitudes towards LGBT youth, I first used Twitter Archiver to find the most retweeted LGBT Youth related tweets. Following that, I extracted the most followed Twitter accounts that associate themselves with LGBT. My data scraping process consisted of identifying the who, what, why and how.
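
The two filters described above (most retweeted, most followed) could be sketched in Python roughly as follows. This is only an illustration and not how Twitter Archiver works internally: the rows, the field names (user, text, retweets, followers) and the top_by helper are all made up for the example, and a real export may use different column names.

```python
# Made-up rows standing in for an exported sheet of tweets.
# The field names are assumptions, not Twitter Archiver's real column names.
sample_tweets = [
    {"user": "account_a", "text": "example tweet 1", "retweets": 12, "followers": 300},
    {"user": "account_b", "text": "example tweet 2", "retweets": 45, "followers": 150},
    {"user": "account_c", "text": "example tweet 3", "retweets": 7, "followers": 9000},
]


def top_by(rows, field, n=2):
    """Return the n rows with the largest value in the given field."""
    return sorted(rows, key=lambda row: row[field], reverse=True)[:n]


# Same data, two different rankings.
most_retweeted = top_by(sample_tweets, "retweets")
most_followed = top_by(sample_tweets, "followers")

for row in most_retweeted:
    print(row["user"], row["retweets"])
```

Ranking the same rows by two different fields gives two different views: one shows which messages spread furthest, the other which voices have the largest audience.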

Flow chart graphic
Visualised example of the flow chart process

I wanted my scraped data categorized as positive, negative or questioning, the categories offered in Twitter’s advanced search. This first step gave me access to individuals’ emotional opinions towards LGBT, sorted as positive, negative or questioning (confused). The outcome of this process motivated me to dig deeper to understand the cause and reasoning behind the emotion in each tweet.
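
A minimal Python sketch of this sorting step, assuming the advanced search categories correspond to the text markers :) for positive, :( for negative and ? for questioning. The classify function, the order in which it checks the markers and the sample tweets are hypothetical, and matching on markers alone is a very rough stand-in for reading the emotion behind a tweet.

```python
def classify(text):
    """Sort a tweet into a rough category based on simple text markers."""
    # Assumed priority: a sad face wins over a happy face, which wins over "?".
    if ":(" in text:
        return "negative"
    if ":)" in text:
        return "positive"
    if "?" in text:
        return "questioning"
    return "unsorted"


# Made-up tweets for illustration only.
samples = [
    "So proud of our youth group today :)",
    "Another day of being ignored at school :(",
    "Where can LGBT youth find support?",
    "Youth centre opens on Monday",
]

labels = [classify(tweet) for tweet in samples]
print(labels)
```

Tweets that carry none of the markers fall into an "unsorted" group, which is a reminder that many tweets would still need to be read and sorted by hand.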



By April Bae
