piton

Sentiment Analysis with TextBlob and Python

Sentiment Analysis with TextBlob and Python
In this lesson, we will use one of the excellent Python package - TextBlob, to build a simple sentimental analyser. We all know that tweets are one of the favorite example datasets when it comes to text analysis in data science and machine learning. This is because Tweets are real-time (if needed), publicly available (mostly) and represents true human behavior (probably). That is why tweets are usually used while doing any type of proof of concepts or tutorials related to Natural Language Processing (NLP) and text analysis.

Using TextBlob in Industry

Just like it sounds, TextBlob is a Python package to perform simple and complex text analysis operations on textual data like speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. Although there a lot more use-cases for TextBlob which we might cover in other blogs, this one covers analysing Tweets for their sentiments.

Analysis sentiments have a great practical usage for many number of scenarios:

Getting Started with TextBlob

We know that you came here to see some practical code related to a sentimental analyser with TextBlob. That is why we will keep this section extremely short for introducing TextBlob for new readers. Just a note before starting is that we use a virtual environment for this lesson which we made with the following command

python -m virtualenv textblob
source textblob/bin/activate

Once the virtual environment is active, we can install TextBlob library within the virtual env so that examples we create next can be executed:

pip install -U textblob

Once you run the above command, that's not it. TextBlob also need access to some training data which can be downloaded with the following command:

python -m textblob.download_corpora

You will see something like this by downloading the data it required:

You can use Anaconda as well to run these examples which is easier. If you want to install it on your machine, look at the lesson which describes “How to Install Anaconda Python on Ubuntu 18.04 LTS” and share your feedback.

To show a very quick example for TextBlob, here is an example directly from its documentation:

from textblob import TextBlob
text = "'
The titular threat of The Blob has always struck me as the ultimate movie
monster: an insatiably hungry, amoeba-like mass able to penetrate
virtually any safeguard, capable of--as a doomed doctor chillingly
describes it--"assimilating flesh on contact.
Snide comparisons to gelatin be damned, it's a concept with the most
devastating of potential consequences, not unlike the grey goo scenario
proposed by technological theorists fearful of
artificial intelligence run rampant.
"'
blob = TextBlob(text)
print(blob.tags)
print(blob.noun_phrases)
for sentence in blob.sentences:
print(sentence.sentiment.polarity)
blob.translate(to="es")

When we run the above program, we will get the following tag words and finally the emotions the two sentences in the example text demonstrates:

Tag words and emotions helps us to identify the main words which actually make an effect on the sentiment calculation and the polarity of the sentence provided to the. This is because that meaning and sentiment of the words change in the order they are used so all of this needs to be kept dynamic.

Lexicon based Sentiment Analysis

Any Sentiment can simply be defined as a function of semantic orientation and intensity of words used in a sentence. With lexicon based approach for identifying emotions in a given words or sentences, each word is associated with a score which describes the emotion the word exhibits (or at least tries to exhibit). Usually, most of the words have a pre-defined dictionary about their lexical score but when it comes to human, there is always sarcasm intended, so, those dictionaries are not something we can rely on 100%. The WordStat Sentiment Dictionary includes more than 9164 negative and 4847 positive word patterns.

Finally, there is another method to perform sentiment analysis (out of scope for this lesson) which is a Machine Learning technique but we cannot make use of all words in an ML algorithm as we will surely face problems with overfitting. We can apply one of the feature selection algorithm like Chi Square or Mutual Information before we train the algorithm. We will limit the discussion of ML approach to this text only.

Using Twitter API

To start getting tweets directly from Twitter, visit the app developer homepage here:

https://developer.twitter.com/en/apps

Register your application by completing the form given like this:

Once you have the all the token available in the “Keys and Tokens” tab:

We can make use of the keys to get the required tweets from Twitter API but we need to install just one more Python package which does the heavy lifting for us in obtaining the Twitter data:

pip install tweepy

The above package will be used for complete all the heavy-lifting communication with the Twitter API. The advantage for Tweepy is that we don't have to write much code when we want to authenticate our application for interacting with Twitter data and it is automatically wrapped in a very simple API exposed through the Tweepy package. We can import the above package in our program as:

import tweepy

After this, we just need to define appropriate variables where we can hold the Twitter keys we received from the developer console:

consumer_key = '[consumer_key]'
consumer_key_secret = '[consumer_key_secret]'
access_token = '[access_token]'
access_token_secret = '[access_token_secret]'

Now that we defined secrets for Twitter in the code, we're finally ready to establish a connection with Twitter to receive the Tweets and judge them, I mean, analyse them. Of course, the connection to Twitter is to be established using OAuth standard and Tweepy package will come in handy to establish the connection as well:

twitter_auth = tweepy.OAuthHandler(consumer_key, consumer_key_secret)

Finally we need the connection:

api = tweepy.API(twitter_auth)

Using the API instance, we can search Twitter for any topic we pass to it. It can be a single word or multiple words. Even though we will recommend using as few words for precision as possible. Let's try an example here:

pm_tweets = api.search("India")

The above search give us many Tweets but we will limit the number of tweets we get back so that the call doesn't take too much time, as it needs to be later processed by TextBlob package as well:

pm_tweets = api.search("India", count=10)

Finally, we can print the text of each Tweet and the sentiment associated with it:

for tweet in pm_tweets:
print(tweet.text)
analysis = TextBlob(tweet.text)
print(analysis.sentiment)

Once we run the above script, we will start getting the last 10 mentions of the mentioned query and each tweet will be analysed for sentiment value. Here is the output we received for the same:

Note that you could also make a streaming sentiment analysis bot with TextBlob and Tweepy as well. Tweepy allows to establish a websocket streaming connection with the Twitter API and allows to stream Twitter data in real time.

Conclusion

In this lesson, we looked at an excellent textual analysis package which allows us to analyse textual sentiments and much more. TextBlob is popular because of the way it allows us to simply work with textual data without any hassle of complex API calls. We also integrated Tweepy to make use of Twitter data. We can easily modify the usage to a streaming use-case with the same package and very few changes in the code itself.

Please share your feedback freely about the lesson on Twitter with @linuxhint and @sbmaggarwal (that's me!).

Remap your mouse buttons differently for different software with X-Mouse Button Control
Maybe you need a tool that could make your mouse's control change with every application that you use. If this is the case, you can try out an applica...
Microsoft Sculpt Touch Wireless Mouse Review
I recently read about the Microsoft Sculpt Touch wireless mouse and decided to buy it. After using it for a while, I decided to share my experience wi...
AppyMouse On-screen Trackpad and Mouse Pointer for Windows Tablets
Tablet users often miss the mouse pointer, especially when they are habitual to using the laptops. The touchscreen Smartphones and tablets come with m...