Using Tableau to create word clouds with ease.
A word cloud, also known as a tag cloud, is a visual representation of text data, typically used to depict keyword metadata (tags) on websites or to visualize free-form text [Wikipedia]. Word clouds are a popular type of infographic that shows the relative frequency of words in our data. The frequency can be depicted by either the size or the color of the chosen fields in the data. They are an effective way to draw attention to your presentation or story.
Objective
Tableau is a data analytics and visualization tool widely used in the industry today. Tableau provides a native feature to create word clouds with a few mouse clicks. This is going to be a pretty short article that focuses on the steps required to create a word cloud in Tableau. In case you want a more detailed introduction to Tableau, make sure you go through my article Data Visualisation with Tableau first.
Even though this article is focused on word clouds, I will also mention some visual best practices for using them. Word clouds look cool, but there are better substitutes that convey the same message more clearly and accurately.
Data
The data covers the top 20 movies of 2018 in the US, ranked by domestic box office earnings in billions of dollars. The data also contains the Metacritic score for each movie. Metacritic is a website that aggregates reviews of media products, including movies.
All the worksheets and Tableau workbooks can be accessed from the article's associated repository here.
Creating a Word Cloud in Tableau
Movies as per their Domestic Gross Earnings
- Open Tableau Desktop and connect to the data source. You can choose any data format, but here we are using an Excel file that holds the desired data.
- Drag the desired dimension to Text on the Marks card. Here I am going to drag Movie Title to Text, since I want to know which movies performed well in terms of box office collections.
- Drag Domestic Gross earnings onto Size on the Marks card.
- Now drag Domestic Gross earnings onto Color on the Marks card, since we want the color to reflect the earnings pattern as well.
- Change the Mark type from Automatic to Text.
- Next, you can hide the title and change the view and the background to your liking, and your word cloud is ready.
Movies as per their Metacritic Score
The steps remain the same as above except that we use the Metacritic Score instead of the earnings.
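To make the Size encoding a little more concrete, here is a minimal Python sketch of the general idea: each value is scaled linearly between a smallest and a largest font size. This is only an illustration and not a description of how Tableau computes sizes internally. The function name `scale_sizes`, the point sizes, and the sample figures are all hypothetical placeholders, not values from the movie dataset.

```python
def scale_sizes(values, min_pt=10, max_pt=48):
    """Map each numeric value to a font size between min_pt and max_pt (linear scaling)."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    sizes = {}
    for label, value in values.items():
        # If every value is identical, fall back to the smallest size
        fraction = (value - lo) / span if span else 0.0
        sizes[label] = round(min_pt + fraction * (max_pt - min_pt))
    return sizes


# Placeholder figures for illustration only, not taken from the dataset
sample = {"Movie A": 3, "Movie B": 2, "Movie C": 1}
print(scale_sizes(sample))
```

The same function works for the Metacritic variant: pass the scores instead of the earnings and the largest score gets the largest text.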
Improving the Word Cloud
The above examples deal with a simple, clean dataset with a limited number of fields. What if the data contained a paragraph or a passage from a book and we were required to create a word cloud for it? Let’s see an example of such a case.
Dataset
For demonstration purposes, I have taken the entire text of one of my Medium articles. I copied the whole text, irrespective of the content, and placed it in a text.txt file. Then I ran a small Python script to save the words and their frequencies into a CSV file. You can use any dataset of your choice.
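import csv
from collections import Counter

def word_count(fname):
    # Count how often each whitespace-separated word occurs in the file
    with open(fname) as f:
        return Counter(f.read().split())

my_dict = word_count("text.txt")
print(my_dict)

# Write one "word,frequency" row per word to text.csv
with open("text.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for key, count in my_dict.items():
        writer.writerow([key, count])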
text.csv is the file that contains our dataset, with one word and its frequency per row.
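One possible refinement, sketched here under assumptions rather than taken from the original script: splitting on whitespace alone treats "Cloud", "cloud" and "cloud." as three different words. The hypothetical helper `normalized_word_count` below lowercases the text and treats punctuation as a separator. It assumes plain English text, so accented or non-Latin characters would also be treated as separators.

```python
import re
from collections import Counter


def normalized_word_count(text):
    """Count words after lowercasing and splitting on punctuation and whitespace."""
    # Keep runs of ASCII letters, digits and apostrophes; anything else separates words
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


print(normalized_word_count("The cloud, the Cloud and the word cloud."))
```

The resulting counter can be written to a CSV file in the same way as before.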
Now switch to Tableau.
- Create a word cloud as explained above using the words in the text.csv file.
- If you want to limit the number of entries, you can use the word count as a filter and show only the words that occur at least a chosen minimum number of times.
- Remove the most common words: even after filtering by the word count, we see that words like ‘the’, ‘in’ etc., which do not hold much significance, still appear all over the worksheet. Let’s get rid of them. We will begin by creating a list of common words in English, which can be accessed from here. Each word in the list has a rank associated with it, which we will use as a measure for filtering.
- Now let’s add this sheet to our workbook. The two sheets will be blended on the Words column, since that is common to both sources.
- Create a new parameter and name it “Words to be excluded” with the following settings:
- Show the Parameter Control and exclude the most common words from the cloud by filtering.
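The filtering logic in the steps above can be sketched in Python as well. This is not what Tableau runs internally, only an illustration of the same two rules: keep a word if it occurs often enough, and drop it if its rank in the common-word list falls within the excluded range. The function name `filter_words`, its parameters, and the tiny sample inputs (including the ranks) are hypothetical.

```python
from collections import Counter


def filter_words(counts, common_ranks, min_count=2, exclude_top=100):
    """Keep words seen at least min_count times that are not among the exclude_top most common."""
    kept = {}
    for word, count in counts.items():
        # rank is None when the word does not appear in the common-word list
        rank = common_ranks.get(word.lower())
        is_common = rank is not None and rank <= exclude_top
        if count >= min_count and not is_common:
            kept[word] = count
    return kept


# Made-up counts and ranks, for illustration only
counts = Counter({"the": 40, "in": 25, "tableau": 12, "cloud": 9, "rare": 1})
common_ranks = {"the": 1, "in": 6}
print(filter_words(counts, common_ranks, min_count=2, exclude_top=10))
```

Here `exclude_top` plays the role of the “Words to be excluded” parameter: raising it removes more of the common words from the cloud.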
Now adjust the settings and you can have a better-looking word cloud with filter options.
When not to use a Word Cloud
Marti A. Hearst’s guest post “What’s up with Tag Clouds” is worth a read when discussing word clouds. Word clouds are definitely eye-catching and provide a sort of overview or first insight, and since they are quite popular, people usually have one or two in their presentations.
On the other hand, unlike a bar chart, a word cloud does not clearly differentiate between words of similar sizes. Also, words belonging to the same category may lie far apart from each other, and the smaller ones may be overlooked.
Alternatives to Word Cloud
- Treemap
A treemap may give a better, though still not the best, picture of the data than a word cloud. Treemaps are sometimes regarded as the rectangular cousins of the pie chart and may not be ideal for displaying detailed information.
- Bar Chart
A sorted bar chart conveys the same information more accurately, since all the bars share a common baseline for comparison.
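To illustrate the point outside Tableau, here is a small Python sketch that prints word frequencies as a sorted, text-only bar chart. The function `text_bar_chart` and the sample counts are hypothetical. It assumes a `Counter` of positive counts such as the one produced by the word-count script earlier.

```python
from collections import Counter


def text_bar_chart(counts, top_n=10, width=40):
    """Return the lines of a sorted horizontal bar chart drawn with '#' characters."""
    top = counts.most_common(top_n)  # sorted by count, largest first
    if not top or top[0][1] <= 0:
        return []
    longest = max(len(word) for word, _ in top)
    biggest = top[0][1]
    lines = []
    for word, count in top:
        # Every bar starts at the same column, so lengths are directly comparable
        bar = "#" * max(1, round(count / biggest * width))
        lines.append(f"{word.ljust(longest)} | {bar} {count}")
    return lines


sample = Counter({"tableau": 12, "cloud": 9, "word": 6})  # placeholder counts
print("\n".join(text_bar_chart(sample, width=12)))
```

Because the bars are sorted and aligned, small differences between words with similar counts are easier to see than in a cloud.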
Conclusion
Word clouds are certainly catchy and help a presentation stand out, but when it comes to serious data analysis there are better tools to try. However, the main aim of this article was to show how to create word clouds in Tableau with minimal effort, so go ahead and try building your own word clouds with data of your choice.