Over the past decade, there has been an exponential surge in the online activity of people across the globe. The volume of posts made on the web every second runs into the millions. On top of this, the rise of social media platforms has led to a flood of content on the internet.

Social media is no longer just a place where people talk to each other; it has grown vast and serves many more purposes. It has become a medium where people:

  • Express their interests.
  • Share their views.
  • Share their displeasures.
  • Compliment companies for good service and criticize them for poor service.

So in this article, we are going to learn how to analyze what people are posting on a social network (Twitter) and build an application that helps companies understand their customers.

Before we dive further, let’s look at the table of contents of this article.

Table of contents:

  • From people’s emotions to how customers feel about the product
  • How to create the Twitter app
  • Sentiment analysis using Twitter tweets
    • Why sentiment analysis?
  • Challenges in performing sentiment analysis on Twitter tweets
  • Implementing sentiment analysis application in R
    • Extracting tweets using Twitter application
    • Cleaning the tweets for further analysis
    • Getting sentiment score for each tweet
    • Segregating positive and negative tweets
  • Conclusion

From people’s emotions to how customers feel about the product

Social networks have grown from mere chatting platforms into a storehouse of data that could help companies solve many problems.

This data could help companies understand their customers better, see what competitors are doing, and learn what customers are saying about them.

Although, prima facie, it looks like a storehouse of insights, extracting the relevant information from unstructured text is not easy. Analyzing textual data is difficult because of the informal and inconsistent ways in which people write their posts.

Nevertheless, posts made by people on social media can be very expressive and help us understand their sentiments and emotions. Twitter, one of the most popular social media platforms, is a place where people often express their emotions and sentiments about a brand, a product or a service.

How to create the Twitter app?

Twitter has made the task of analyzing tweets posted by users easier by developing an API which people can use to extract tweets and underlying metadata.

This API helps us extract Twitter data in a structured format, which can then be cleaned and processed further for analysis.

To create a Twitter app, you first need a Twitter account. Once you have created an account, visit Twitter’s app page and create an application.

Enter the basic details, such as the application name, a description, and a website name. Any test website name will do. Once you have entered these details, you will get four keys and access tokens:

  1. Consumer Key (API Key)
  2. Consumer Secret (API Secret)
  3. Access Token
  4. Access Token Secret

These keys and tokens will be used to extract data from Twitter in R.

Sentiment Analysis Using Twitter tweets

Before going a step further into the technical aspects of sentiment analysis, let’s first understand why we even need sentiment analysis.

Why sentiment analysis?

Let’s look at this from a company’s perspective and understand why a company would want to invest time and effort in analyzing the sentiment of posts. Analyzing each post and understanding the sentiment associated with it helps us find out which key topics or themes resonate well with the audience.

If the sentiment around a post is very positive, then people want to talk about the topic of that post. The topic could be a product, a service, a social message or anything else. Understanding this can help us decide what kind of posts the company should put on social media platforms to increase user engagement.

Also, analyzing the sentiment around a company over a period could help us relate its sales data to the overall sentiment. It helps us address questions such as:

  • Was there a negative campaign at some point that resulted in negative sentiment around the company, and thereby in a decline in sales during that period?
  • Was there a huge spike in positive sentiment because a celebrity talked about the company’s product?
  • Did that positive spike result in higher sales?
  • What common themes run through the posts with negative sentiment?
  • Is customer service a common topic among posts with strongly negative emotion?

Answering these questions could help us understand how customers perceive the company: what they are saying about its products, what they like and what they dislike.

I am sure, you will agree with me if I say, “Sentiment analysis of tweets or social media posts can help companies better analyze customer feedback and opinion, and better position their strategy.”

Challenges in performing sentiment analysis on Twitter tweets

For all the use cases of sentiment analysis, there are a few challenges in analyzing tweets. The first one is data quality. The Twitter application helps us overcome this problem to an extent: after basic cleaning of the data extracted through the Twitter app, we can use it to generate a sentiment score for each tweet.

The second problem is understanding and analyzing the slang used on Twitter. People write in different ways, and when posting on Twitter they pay little attention to correct spelling and often use slang terms that are not proper English words but are common in informal conversation.

There is a lot of research going on in this area, and many people have developed slang dictionaries that map these terms to their meanings. We won’t focus on this part in this article; we will use the standard dictionaries and packages available in R for sentiment analysis.

The third and biggest problem in sentiment analysis is decoding sarcasm. Since sentiment analysis works on the meanings of individual words, it struggles to detect when a post is sarcastic.

Implementing sentiment analysis application in R

Now, we will try to analyze the sentiment of tweets made by a Twitter handle. We will develop the code in R step by step and see the practical implementation of sentiment analysis.

The code is divided into the following parts:

  1. Extracting tweets using Twitter application
  2. Cleaning the tweets for further analysis
  3. Getting sentiment score for each tweet
  4. Segregating positive and negative tweets

Extracting tweets using Twitter application

We will first install the packages that we need. To extract tweets from Twitter, we need the ‘twitteR’ package.

The ‘syuzhet’ package will be used for sentiment analysis, while the ‘tm’ and ‘SnowballC’ packages are used for text mining and analysis.
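A minimal sketch of this setup step is shown here. It assumes the four packages are available from CRAN and installs only those that are missing; the variable names `pkgs` and `missing_pkgs` are illustrative, not part of the original code.

```r
# Packages named in the article: twitteR for extraction, syuzhet for
# sentiment scoring, tm and SnowballC for text mining.
pkgs <- c("twitteR", "syuzhet", "tm", "SnowballC")

# Install only the packages that are not already present.
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}

# Attach every package for the rest of the session.
invisible(lapply(pkgs, library, character.only = TRUE))
```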

Next, we will invoke the Twitter API using the app we created and the keys and access tokens we obtained through it.
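The following is a hedged sketch of this step using the ‘twitteR’ functions `setup_twitter_oauth`, `userTimeline` and `twListToDF`. The environment variable names and the request size `n = 200` are assumptions made for illustration; substitute your own keys and the number of tweets you want. The API call runs only if all four credentials are set.

```r
library(twitteR)

# Assumption: the four credentials are stored in environment variables
# with these hypothetical names, so they never appear in the script.
consumer_key    <- Sys.getenv("TWITTER_CONSUMER_KEY")
consumer_secret <- Sys.getenv("TWITTER_CONSUMER_SECRET")
access_token    <- Sys.getenv("TWITTER_ACCESS_TOKEN")
access_secret   <- Sys.getenv("TWITTER_ACCESS_SECRET")

creds <- c(consumer_key, consumer_secret, access_token, access_secret)

if (all(nzchar(creds))) {
  # Authenticate the R session against the Twitter app.
  setup_twitter_oauth(consumer_key, consumer_secret,
                      access_token, access_secret)

  # Pull recent tweets from the handle (n = 200 is an assumed value).
  tweets <- userTimeline("realDonaldTrump", n = 200)

  # Convert the list of status objects into a data frame.
  tweets_df <- twListToDF(tweets)
  str(tweets_df)
} else {
  message("Twitter credentials not set; skipping the API call.")
}
```

Keeping the keys out of the script is a design choice rather than a requirement; pasting them in as strings works the same way but makes the script unsafe to share.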

We have invoked the Twitter app and extracted data from the Twitter handle ‘@realDonaldTrump’. We will now look at the format of the extract and the steps needed to clean the data.

Cleaning the tweets for further analysis

We get a total of 16 variables using the ‘userTimeline’ function; a snapshot of the sample data is shown below.

[Image: Twitter sentiment analysis using R]

The field ‘text’ contains the tweet part, hashtags, and URLs. We need to remove hashtags and URLs from the text field so that we are left only with the main tweet part to run our sentiment analysis.

Our current text field looks like this:

It contains a lot of URLs, hashtags and other Twitter handles. We will remove all of these using the gsub function.
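As an illustration of this cleaning step, here is a small sketch built on `gsub`. The helper name `clean_tweet`, the regular expressions and the sample tweet are all assumptions made for this example; the patterns used in the original code may differ.

```r
# Hypothetical helper: strips URLs, hashtags and handles from tweet text.
clean_tweet <- function(text) {
  text <- gsub("http[^[:space:]]*", "", text)  # remove URLs
  text <- gsub("#[[:alnum:]_]+", "", text)     # remove hashtags
  text <- gsub("@[[:alnum:]_]+", "", text)     # remove Twitter handles
  text <- gsub("[[:space:]]+", " ", text)      # collapse repeated whitespace
  trimws(text)                                 # drop leading/trailing spaces
}

# Made-up sample tweet, used only to show the effect of the cleaning.
sample_tweet <- "Loving the new phone! #gadgets https://t.co/abc123 via @TechStore"
cleaned <- clean_tweet(sample_tweet)
print(cleaned)
# prints "Loving the new phone! via"
```

Because `gsub` is vectorized, the same helper can be applied to a whole column at once, for example `clean_tweet(tweets_df$text)` if the tweets are held in a data frame named `tweets_df`.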

Our output now looks like this:

Now we have only the relevant part of each tweet, and we can run our sentiment analysis on the data.

Getting sentiment score for each tweet

We will first try to get the emotion score for each of the tweets. ‘syuzhet’ scores each tweet on 10 dimensions: eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise and trust) and two sentiments (negative and positive).
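A minimal sketch of this step uses the `get_nrc_sentiment` function from ‘syuzhet’. The two sentences in `tweets_text` are made-up stand-ins for the cleaned tweets, not data from the original analysis.

```r
library(syuzhet)

# Made-up example texts standing in for the cleaned tweets.
tweets_text <- c("I am so happy with this wonderful service",
                 "This is a terrible, awful delay")

# One row per text, one column per emotion or sentiment.
emotions <- get_nrc_sentiment(tweets_text)
print(emotions)
```

Each cell counts the words in that text that the NRC lexicon associates with the column’s emotion or sentiment.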

The above output shows us the different emotions present in each of the tweets.
Now, we will use the get_sentiment function to extract a sentiment score for each of the tweets.
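The call can be sketched as follows, again on made-up example texts. The `method = "syuzhet"` argument is the package default and is written out here only for clarity; whether the original code used this method is an assumption.

```r
library(syuzhet)

# Made-up example texts standing in for the cleaned tweets.
tweets_text <- c("I am so happy with this wonderful service",
                 "This is a terrible, awful delay")

# One numeric score per text: above zero leans positive, below zero negative.
scores <- get_sentiment(tweets_text, method = "syuzhet")
print(scores)
```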

Let us see how each tweet has been scored. In all, we are evaluating 154 tweets, so there should be 154 positive or negative scores, one for each tweet.

Segregating positive and negative tweets

Now, we will segregate positive and negative tweets based on the score assigned to each of the tweets.
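One way to sketch this segregation is with nested `ifelse` calls on the score vector. The `scores` values below are hypothetical, and the "Neutral" label for a score of exactly zero is an assumption added for completeness.

```r
# Hypothetical sentiment scores, one per tweet.
scores <- c(1.5, -0.75, 0, 2.1, -1.0)

# Label each tweet by the sign of its score.
category <- ifelse(scores > 0, "Positive",
                   ifelse(scores < 0, "Negative", "Neutral"))

# Count the tweets in each category.
counts <- table(category)
print(counts)
```

With real data, `scores` would be the vector returned by `get_sentiment`, and `counts` gives the breakdown of tweets by sentiment.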

So, we have now analyzed Donald Trump’s Twitter handle and obtained the sentiment of his tweets. The breakdown of the total number of tweets by sentiment is

Conclusion

I’m sure you can now relate to the significance of sentiment analysis that I discussed at the beginning of the article.

Sentiment analysis can be extended much further, even to images. Although many tools are already available in the market, practical knowledge of how the entire process works is beneficial.

Moreover, the available tools can be expensive and may not offer the level of flexibility and customization that you can achieve with R.

