
Using Python for Sentiment Analysis in Tableau

This week's Makeover Monday data set was the Top 100 Songs' Lyrics. I had just returned from Tableau's annual conference and was eager to try its new feature, TabPy, so this seemed like the perfect opportunity to test it out. In this blog post, I'm going to offer a step-by-step guide on how I did it. If you haven't used Python before, have no fear - this is definitely achievable for novices - read on! 

For some context before I begin: I have limited experience with Python. I recently completed a challenging but great course through edX that I'd highly recommend if you're looking for foundational knowledge - Introduction to Computer Science and Programming Using Python. The syllabus included advanced Python topics, such as classes and algorithmic complexity. To run the analysis I did, however, it's enough to look up and understand at a high level:


  • basic for loops
  • lists
  • dictionaries
  • importing libraries
The libraries I used for this, should you want to look up additional documentation, are:
  • pandas
  • nltk
  • time (this one isn't really necessary - I just used it to test computation time differences between TabPy and local processing.)
I have a Mac, so if you're trying to reproduce this on a PC, you'll find install instructions here as well.

Part 1 - Setting Up Your Environment
  1. Make sure you are using Tableau v10.1
  2. Open a TDE (Tableau Data Extract) with the Top 100 Songs data
  3. Install TabPy
Read through the install directions. Here's my simplified version for those not comfortable with GitHub or the command line:
  • Click the green "Clone or Download" button.
  • Select Download
  • Unzip the file and save locally (I moved mine to my desktop)
  • Open your Terminal and navigate to your TabPy folder. Run these commands:
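The commands themselves were shown in a screenshot that hasn't survived. As a rough sketch - the `setup.sh` script name comes from the 2016-era TabPy README and may differ in newer releases, and the folder path is simply wherever you unzipped the download:

```shell
# Navigate to the unzipped TabPy folder (adjust the path to wherever you saved it)
cd ~/Desktop/TabPy-master

# Install script from the 2016-era TabPy README: it installs the
# dependencies and starts the TabPy server listening on port 9004.
./setup.sh
```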

If your install finishes without errors, you're all set!


Part 2 - Connecting to TabPy in Tableau

Now it's time to set up TabPy in Tableau. In Tableau 10.1, go to:

Help > Settings and Performance > Manage External Connection

and enter localhost, since you're running TabPy on your own computer. The default port is 9004, so unless you manually changed it, you should leave it at that.


Part 3 - Creating your TabPy Calculation
The TabPy GitHub page has extensive documentation on using Python in Tableau calculations that you should review. I simply repurposed one of the calcs demoed during the TabPy session at #data16 - catch the replay here.

Using the Top 100 songs data set, create this calculated field.


Everything following # is a comment just to help make sense of what the code is doing. Feel free to remove that text.
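The screenshot of the calculated field didn't survive; transcribed from the version a reader quotes in the comments below, it looks like this:

```
SCRIPT_REAL("from nltk.sentiment import SentimentIntensityAnalyzer

text = _arg1   # _arg1 references the column passed in below, in this case [Word]
scores = []    # a Python list where the scores will be stored
sid = SentimentIntensityAnalyzer()   # NLTK's sentiment analyzer class

for word in text:                    # loop through each row of the column
    ss = sid.polarity_scores(word)   # score the word with the analyzer
    scores.append(ss['compound'])    # keep the compound score

return scores
", ATTR([Word]))
```

SCRIPT_REAL is a Tableau function that returns results from an external service script: the Python code goes in the first argument, and ATTR([Word]) is passed in as _arg1.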

Now you can use this calculated field in views with [Word] to compute the sentiment score! The downside is that since this is a table calculation and also uses ATTR, you cannot use it within a level-of-detail (LOD) calculation. So unfortunately, you cannot sum the sentiment at the song level of detail using this example and data structure. With some data manipulation it is possible, but I won't be diving into that.

TabPy vs. Pre-Processing Data for Tableau

Unfortunately, you cannot publish vizzes using TabPy to Tableau Public. If you want to download the .twbx version I made using TabPy, you can do so here.

However, you could run this analysis outside of Tableau and simply import the output and build your viz that way. I did this as well, which gave me more flexibility with LODs since I was no longer using TabPy.

TabPy definitely took me less time and required less code. However, it did take ~2.5 minutes*** to process 8,668 words, whereas when I ran my code (below) outside of Tableau it took under 1 second to get the scores and write them back to a CSV.

***11/17 Update: Bora Beran made a great point; be mindful of how you're addressing your TabPy Table Calc - "If you have all your dimensions in addressing we will make a single call to Python and pass all the data at once which will be much faster. Otherwise we make one call per partition. If e.g. song title is on partitioning we would send a separate request for each song. If word is on partitioning we will send a separate request per word." 

At the time of posting this blog, I was addressing all dimensions in the view, and on a few occasions when working with this data I experienced very slow result return times, as stated. However, today this calc took the same time in Tableau as it did outside of Tableau. I don't have a clear explanation, but I was running that query on my local machine and think the slowness might simply have been due to limited resources at the time. 

This is what the code would look like outside of TabPy. You can run it in a Jupyter notebook or another IDE - I used Spyder only because I used it for my class.
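The script itself was shown as an image; here's a runnable sketch of the same pipeline. The word list and output filename are illustrative stand-ins, and the try/except fallback exists only so the sketch runs without the vader_lexicon download - the original used nltk directly:

```python
import pandas as pd

# Score each word with NLTK's VADER analyzer, keeping the 'compound' score,
# as in the original post. The fallback stub is only for environments where
# nltk or the vader_lexicon resource isn't available.
try:
    from nltk.sentiment import SentimentIntensityAnalyzer
    sid = SentimentIntensityAnalyzer()
    def score(word):
        return sid.polarity_scores(word)['compound']
except (ImportError, LookupError):
    def score(word):  # stand-in scorer when nltk/vader isn't installed
        return 0.0

words = ['happy', 'sad', 'lyrics']  # stand-in for the top_100['Word'] column

# Build a word -> compound-score dictionary, then a two-column DataFrame,
# and write the scores back out to CSV for Tableau to consume.
word_score_dict = {word: score(word) for word in words}
df = pd.DataFrame(list(word_score_dict.items()), columns=['word', 'score'])
df.to_csv('word_scores.csv', index=False)
```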


You can download my Tableau Public viz, which uses the output of the code below, to inspect further!


Here's the final viz - half of it is cut off, so be sure to view it in Tableau Public:

Comments

  1. Hi Brit,
    I think for this type of analysis, as you also said, it is a good idea to preprocess since it looks like data is not dynamic. But I was wondering.. What were you using for your addressing table calc setting on the Python calculated field?

    If you have all your dimensions in addressing we will make a single call to Python and pass all the data at once which will be much faster. Otherwise we make one call per partition. If e.g. song title is on partitioning we would send a separate request for each song. If word is on partitioning we will send a separate request per word.

    In the GIF it looked like we're sending a large number of requests. Do you mind trying with everything on addressing? This should log only one entry in your console and I would expect it to be noticeably faster.

    For the TC demo if I recall correctly we were running sentiment analysis on the fly on 18K tweets and it was less than 1.5 seconds.

    Thanks,

    Bora

    ReplyDelete
    Replies
    1. Hi Bora,

      Thanks for the comment - great point! I just double checked, and when I clocked the 2.5 minutes it was addressing all the dimensions. However, the extremely odd thing is that it's now working within Tableau at the same speed it did outside of Tableau. This is different from the behavior I observed earlier this week and I'm not sure I understand why...perhaps it was just chance that other applications were using a lot of my machine's resources and it was slow to process that query? I'm not sure - I'll update this post, though, so as not to deter others!

      Brit

      Delete
    2. Hi Bora,

      Could you please elaborate a bit more on this? What does "add everything on addressing" mean?

      Does that mean to add all the dimensions that are relevant for scoring to the second part of SCRIPT_REAL. i.e SCRIPT_REAL("",ADDRESSING?)

      Delete
    3. SCRIPT_ calculations are table calculations. If you click on the pill, you should see an option to edit table calculation. In the Table Calculation dialog you will see a list of all the dimensions in your current sheet. If you check all the boxes next to the names of dimensions you will be adding everything to addressing. TabPy GitHub page has an example of this (the second Tableau screenshot on the page).

      https://github.com/tableau/TabPy/blob/master/TableauConfiguration.md

      In this example, you will see that CustomerID is the only item checked hence being used as addressing. Category and Segment are not checked which means they are being used for partitioning. Because of this Tableau will make a separate request to Python for every Category-Segment combination such that you get the correlation coefficient for each pane e.g. Technology-Consumer, Technology-Corporate, Office Supplies-Corporate and so on.

      I hope this helps.

      Bora

      Delete
  2. Hi Brit,

    Very interesting blog. I am new to python. Could you please explain the below lines of code

    1) word_score_dict[words[i]] = scores[i]

    2) Why are you using list and .iteritems while creating the below dataframe? Can't we just pass word_score_dict as is?
    df = pd.DataFrame(list(word_score_dict.iteritems()), columns=['word','score'])

    Floyd

    ReplyDelete
    Replies


    1. Hi Floyd - thanks for the questions! This was my first time using pandas, so I did have to do some Googling to figure out how to create the data frame, and I welcome any feedback to improve! With that said, here are my responses:

      1. At this point in the code I have two lists - one that contains my words and one that contains the scores. Since Python lists are ordered, I know that the first word in my Word list has its score at the first position in my Score list, and so on. So that line of code is essentially iterating through those two lists and creating a Python dictionary of key:value pairs. I'm going to put a link at the bottom of this comment where you can see this visually!

But - to be honest, what I did wasn't that elegant. It works, but a better, more concise way would be to build the dictionary from the get-go instead of making two lists and then creating the dictionary from them. That code would instead be:

text = top_100['Word']
sid = SentimentIntensityAnalyzer()
word_score_dict = {}

for word in text:
    ss = sid.polarity_scores(word)
    word_score_dict[word] = ss['compound']

      2. The issue I had with passing word_score_dict was that it caused a ValueError: "If using all scalar values, you must pass an index". When I did some searching I came across this:

      http://stackoverflow.com/questions/17839973/construct-pandas-dataframe-from-values-in-variables


      http://pythontutor.com/visualize.html#code=words%20%3D%20%5B%22happy%22,%20%22sad%22%5D%0Ascores%20%3D%20%5B0.57,-0.48%5D%0Aword_score_dict%20%3D%20%7B%7D%0A%0Afor%20i%20in%20range(len(words%29%29%3A%0A%20%20%20%20word_score_dict%5Bwords%5Bi%5D%5D%20%3D%20scores%5Bi%5D%0A%20%20%20%20%0Aprint(word_score_dict%29&cumulative=false&heapPrimitives=false&mode=edit&origin=opt-frontend.js&py=2&rawInputLstJSON=%5B%5D&textReferences=false
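For Python 3 readers (where iteritems() became items()), here is a minimal sketch of the ValueError and the fix, using the same toy values as the visualization link above:

```python
import pandas as pd

word_score_dict = {'happy': 0.57, 'sad': -0.48}

# pd.DataFrame(word_score_dict) raises
#   ValueError: If using all scalar values, you must pass an index
# because each dict value is a scalar rather than a column. Converting to a
# list of (key, value) pairs sidesteps this. (.iteritems() was the Python 2
# spelling; in Python 3 use .items().)
df = pd.DataFrame(list(word_score_dict.items()), columns=['word', 'score'])
```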

      Delete
  5. Hi Brit, which one is your calculated field? I couldn't find it in your workbook. Where did you store the following?

    #SCRIPT_REAL is a function in Tableau which returns a result from an external service script. It's in this function we pass the python code.

    SCRIPT_REAL("from nltk.sentiment import SentimentIntensityAnalyzer

    text = _arg1 #you have to use _arg1 to reference the data column you're analyzing, in this case [Word]. It gets word further down after the ,
    scores = [] #this is a python list where the scores will get stored
    sid = SentimentIntensityAnalyzer() #this is a class from the nltk (Natural Language Toolkit) library. We'll pass our words through this to return the score

    for word in text: # this loops through each row in the column you pass via _arg1; in this case [Word]
        ss = sid.polarity_scores(word) #passes the word through the sentiment analyzer to get the score
        scores.append(ss['compound']) #appends the score to the list of scores

    return scores #returns the scores
    "
    ,ATTR([Word]))

    ReplyDelete
