Proportion Plots in Tableau
Last week, the incredible Stephanie Evergreen shared a blog post discussing a new chart she calls a “Proportion Plot” which “help us compare the share of a population between two metrics. It uses length on the left and right side of the chart and connects the lengths by a band in the middle that swoops a lot if there is disproportionality and stays pretty even if the proportions are the same.” Here’s an example from her blog:

Chart by Stephanie Evergreen
A few days after she shared the blog, the Chart Chat team
discussed the chart, as well as some alternatives (if you haven’t had a chance
to watch, be sure to watch the video). This new proportion chart is
the hot data viz topic at the moment, so
why not share a blog about creating it in Tableau?
Interestingly, this is not the first time I’ve seen this
chart, nor the first time I’ve built it. In September 2018, one of my favorite
people, Vince Baumel, reached
out to me with the following idea.

I was so intrigued by the idea that I worked on it that same
evening, then messaged Vince the following morning.
This was exactly what Vince was looking for and he was able
to implement it using the template I provided. After helping Vince, I published
this on Tableau Public and had planned to write a blog about it, but the truth was
that I just wasn’t sure whether it had many use cases. I felt like there were
other charts that might be just as effective and much easier to build so, like
many projects, I added it to my ever-growing list and moved onto other things.
Every once in a while, I’d come back to it and think about a blog, but I just
never pulled the trigger. A few weeks ago, I decided to ask Kevin for his
thoughts on it and, after some brief conversation, I made the decision to
finally write a blog and share a reusable template for creating these in Tableau.
We’ll get to that shortly, but one of the key parts of that conversation included the need for an exploration of some alternatives to this type of chart.
So, before I share the template, I’m going to share a handful of alternatives,
including some that were mentioned in Chart Chat.
Alternatives
For these different alternatives, I’m going to be using data
that compares the racial makeup of the NBA (from www.theundefeated.com) to the US
population of males 20-39 years of age (from www.census.gov).
Let’s start by looking at the proportion chart.

I see a couple of issues with this chart: one
has to do with readability and the other with its creation.
Let’s start with the latter—building this in Tableau (or just about any
platform, for that matter) is really hard. It’s conceptually quite simple, but
the curves require data densification and a fair number of table calculations. Building
this from scratch would take quite a bit of time, so I wondered if we could
create a version that trades the curves for straight lines.

While this looks quite similar to the first proportion chart, it’s quite a bit easier to create. The left and right sides use simple stacked bar charts, while the center is an area chart. (On Twitter, Rody Zakovich also shared a technique for building this using a single sheet.)
I will say that there’s just something appealing about the smoothness of the transitions in the curvy version, but for an easy-to-create option, this works really well. However, both versions have one issue: because the two measures do not share a common axis, it can be difficult to compare the values. In the example above, black is at the bottom and white is at the top, so we can simply compare the height of the two bars to understand the differences. This type of comparison is not that easy with Other because the bars start at different vertical positions. To address this, we can add some interactivity that allows you to send one dimension to the bottom (I like this technique from Ryan Sleeper: How to Reorder Stacked Bars on the Fly in Tableau), which will allow you to better compare that particular dimension.
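To make the common-axis issue concrete, here is a minimal sketch in Python (not Tableau) of how the stacked positions work. The function names and the percentages are made up purely for illustration; they are not the NBA or census figures, and this is not how the Tableau technique itself is implemented.

```python
def stack(shares, order):
    """Return {category: (bottom, top)} for one 100% stacked bar drawn in `order`."""
    bands, running = {}, 0
    for category in order:
        bands[category] = (running, running + shares[category])
        running += shares[category]
    return bands


def send_to_bottom(order, category):
    """Move one category to the front of the stacking order (the bottom of the bar)."""
    return [category] + [c for c in order if c != category]


# Made-up percentages for illustration only (hypothetical categories A, B, C).
left = {"A": 50, "B": 30, "C": 20}
right = {"A": 20, "B": 30, "C": 50}
order = ["A", "B", "C"]

# With the default order, B starts at a different height on each side.
before = (stack(left, order)["B"], stack(right, order)["B"])

# After sending B to the bottom, both sides start at zero.
reordered = send_to_bottom(order, "B")
after = (stack(left, reordered)["B"], stack(right, reordered)["B"])

print(before)  # ((50, 80), (20, 50))
print(after)   # ((0, 30), (0, 30))
```

In the first case, B has the same share on both sides, but that is hard to see because the bars start at 50 and 20. Once B sits at the bottom, both bars start from zero and the comparison is immediate.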
While that technique is definitely helpful, I still wonder
if there are any other alternatives that might be more effective. On Chart
Chat, Jeff showed a great example of a Bar Mekko chart. Here’s a version using
my data.

The height of the bar indicates the share of NBA players
while the width indicates the share of US population. The Chart Chat team had
some really good discussion about this chart, so be sure to watch the video. I
will say that I tend to agree with Amanda Makulec that data viz people may understand this chart fairly well, but lay
people generally do not. I personally find that I have to work a little harder to
compare the widths and heights while remembering what each represents. I simply
find the proportion charts to be much more intuitive, despite their flaws. 
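If it helps to see how the two measures map onto the Bar Mekko, here is a small Python sketch of the geometry, assuming one measure drives the width and the other the height. The `mekko_rects` helper and the numbers are hypothetical and only illustrate the layout.

```python
def mekko_rects(rows):
    """Lay bars side by side: each bar's width is one measure, its height the other.

    `rows` is a list of (category, width_share, height_share) tuples.
    """
    rects, x = [], 0
    for category, width, height in rows:
        rects.append({"category": category, "x0": x, "x1": x + width, "height": height})
        x += width  # the next bar starts where this one ends
    return rects


# Made-up percentages for illustration only (hypothetical categories A, B, C).
rows = [("A", 20, 50), ("B", 30, 30), ("C", 50, 20)]
rects = mekko_rects(rows)

for rect in rects:
    print(rect)
```

A bar that is narrow but tall (like A here) is over-represented in the height measure relative to the width measure, which is exactly the comparison the reader has to keep in their head.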
One of my go-to’s when comparing two different measures for
a single dimension is a barbell (or dumbbell) chart, so I created one with this
data.

This chart plots each of the two measures as a shape along
the axis then draws a line between them. I’ve used icons to indicate the two measures—a
basketball for the share of NBA players and a globe for the share of the US
population. While I typically like this chart, I find it somewhat less intuitive
than I had expected. I can clearly see that there are wide differences between
the proportions for black and white players, but it takes me a few minutes to
realize that the two are polar opposites. I tried numerous alternatives such as
comet charts, gradient colors, etc., but just couldn’t find a great way to add
clarity. 
The next chart I tried was inspired by the work of Ryan Soares. Ryan created the
following viz on Premier League transfer spending which compared spend to sales.

I absolutely love this viz. It’s easy to understand and
beautifully designed, like all of Ryan’s work. The charts compare spend to
sales using what are essentially lollipop charts with an area chart between
them. So, I tried to apply this concept to my data.

I have to say that I quite like this option. It does a great
job of showing the difference in each of the measures and it solves the barbell
issue as the slope of each plot clearly indicates which of the two measures is
larger. And, since the dimensions aren’t stacked, both measures share an axis,
so we can easily compare the measures for any of the dimensions without having
to shuffle dimensions to the bottom. I think that this could be a great
alternative to the proportion chart. Perhaps the only issue with this is that I
cannot easily see the proportions within a specific measure. The 100%
stacking of the proportion chart makes it really easy to see that part-to-whole
relationship, whereas this chart does not.
My final attempt is one of my favorite types of charts for
this kind of data—a bar-in-bar chart. 

I love this chart and use it frequently to visualize two
measures that share a common scale, such as percentages. This is a very simple
visual, which is both easy to build and easy to understand. We can clearly see
that the proportion of black NBA players is much larger than the proportion of
the population; and we can see that the opposite is true with white players. 
But it’s boring, you might say! I disagree. Bar charts are probably
the most versatile chart we have and, with a little bit of good design, they
can be incredibly beautiful too. Don’t believe me? Check out the blog by Eva Murray, which shows some useful bar chart variations that are both insightful and lovely. 
At this point, I’ve shared curvy and straight-lined proportion
plots as well as four alternatives for visualizing this type of data. So, having
done this, what do I think of the proportion chart? I actually think it’s quite
useful. You can look at it and instantly get an overall feel for the
differences in the measures as well as the proportions within each measure. While
it does sacrifice a bit in its ability to precisely compare the two measures, that
can be mitigated through some basic sorting. So I think it deserves a place in
our data viz arsenal. As I’ve done here, it can often be quite
valuable to explore a variety of alternatives then choose the one that best addresses
your data, your audience, and the story you’re trying to tell.
The Template
OK, now that we’ve explored a variety of options, let’s talk
a bit about the reusable template I’ve created. Like most of our chart templates,
this one includes two components—an Excel spreadsheet and a Tableau workbook.
The Excel spreadsheet has two sheets, Data and Model. Model
is used to handle the data densification needed to draw the curves. You don’t
need to worry too much about this sheet—just make sure it’s in your
spreadsheet. Data is where you enter your own data. It contains columns for
your dimension and two measures.
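For readers curious about what data densification means here, the rough idea is that every row of your data gets repeated once for each point along the curve. Below is a minimal Python sketch of that idea. It assumes the Model sheet is essentially a list of evenly spaced points, which is my simplification for illustration and not a description of the actual sheet; the `densify` function, the column names, and the values are all hypothetical.

```python
from itertools import product


def densify(data_rows, n_points):
    """Cross join each data row with n_points evenly spaced positions t in [0, 1]."""
    model = [i / (n_points - 1) for i in range(n_points)]  # stand-in for a Model sheet
    return [{**row, "t": t} for row, t in product(data_rows, model)]


# Made-up rows for illustration only (column names are assumptions).
data = [
    {"Dimension": "A", "Measure 1": 50, "Measure 2": 20},
    {"Dimension": "B", "Measure 1": 30, "Measure 2": 30},
]

dense = densify(data, n_points=5)
print(len(dense))  # 10 rows: 2 data rows x 5 points
print(dense[0])
```

Each original row now has several marks to draw, one per value of `t`, which is what gives a curve enough points to look smooth.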
Once you’ve populated the spreadsheet
with your data, download the Tableau template. Edit the data source and connect
it to your Excel file. The workbook should update automatically to reflect your
data. 
The workbook comes with three different
curve types—Sigmoid, Sine, and Cubic (thanks to Chris DeMartini for his work on different curve types). By default, the curve is set to use Sine, but you can change it using
the Curve Type parameter. In addition, I've included a Whitespace parameter which allows you to make adjustments to the spacing between each curve.
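To give a feel for how the three curve types differ, here is a short Python sketch using common textbook forms of each curve. These are assumptions for illustration only: the calculations in the workbook may well differ, and the function names and the steepness value `k` are hypothetical.

```python
import math


def sine_ease(t):
    """Half a cosine wave, rescaled to go from 0 to 1."""
    return (1 - math.cos(math.pi * t)) / 2


def cubic_ease(t):
    """Cubic 'smoothstep' polynomial: 3t^2 - 2t^3."""
    return t * t * (3 - 2 * t)


def sigmoid_ease(t, k=10.0):
    """Logistic curve centered at 0.5, rescaled so it hits exactly 0 and 1."""
    def raw(x):
        return 1 / (1 + math.exp(-k * (x - 0.5)))
    lo, hi = raw(0.0), raw(1.0)
    return (raw(t) - lo) / (hi - lo)


def curve_y(t, y_left, y_right, ease):
    """Height of a band edge at horizontal position t (0 = left bar, 1 = right bar)."""
    return y_left + (y_right - y_left) * ease(t)


# Trace one band edge that runs from 80 on the left to 50 on the right.
for ease in (sine_ease, cubic_ease, sigmoid_ease):
    points = [round(curve_y(i / 4, 80, 50, ease), 2) for i in range(5)]
    print(ease.__name__, points)
```

All three start and end at the same heights and pass through the midpoint halfway across; they differ only in how sharply they swoop in between.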
To allow you to dynamically push a
dimension to the bottom (so that both measures share the same axis), I’ve used
Ryan Sleeper’s tip as well as a selection tip by Brian Moore (to prevent the item from remaining selected). When using the workbook,
you can click on any of the elements on the chart and that will send that
dimension to the bottom.
And that’s pretty much it. From here,
you can do whatever you like with the chart (change the colors, add filters,
update tooltips, etc.), just as you normally would. 
I’ve placed all the files in the
following publicly-accessible location. There are two workbooks
Wrap-Up
Well, that was fun! I really enjoyed
exploring this new chart type and a variety of alternatives. If you decide to use
my proportion plot template, as always, I’d just suggest that you carefully
consider your data, your audience, and the message you’re trying to convey, as
well as considering other alternatives that may be more effective and/or easier
to build. Thanks for reading!
Ken Flerlage, February 1, 2021
I was just thinking to myself, 'I wonder if that promised proportion plot walkthrough is up yet?' Nice!
Haha! I had to strike while people were still interested.
Thanks Ken for the amazing blog.
Hi Ken, I am glad I found your interesting site.
I am not an expert in data presentation, so you may find my question redundant, but here it goes:
Is there a chart which incorporates both proportion and Sankey plots together?
For example, let's imagine the flow of assets over time for an individual keeping his assets in 3 banks: A, B, and C. The horizontal axis will be time; the height of the BARS will represent asset valuation and will NOT BE fixed.
Specifically,
At time t1 he kept stocks worth $1000 in Bank A, $2000 in Bank B, and no assets in Bank C.
At time t2, the value of his assets at Bank A increased to $1500 and the value at Bank B decreased to $1800, and at that time he decided to transfer assets worth $600 from Bank A to Bank C.
Is there a chart that will describe it?
The horizontal axis will be time. Instead of source and target in the Sankey plot, there will be different time events. The width of the "channels" at each time will stand for the amount of the quantity.
In this case, I think a stacked area chart or even a line chart would suffice.
I think this would be really useful for me, though I am struggling to adapt this to a server connection for a live dashboard. I have been able to recreate the Dimensions, Measure 1 and Measure 2 table correctly, but I am stuck amalgamating that together into the Model. Is this possible with a server connection? This is an excellent graphic and would prove very useful!
When you say "Server Connection", do you mean a Published Data Source? Might be best to email me. flerlagekr@gmail.com
Hi Ken, this looks amazing! I have the same issue as the previous user though: how do Data and Model talk to each other? Neither a Union nor a Join seems correct. If I do the data replace, I lose the fields I need because I can only select one of the two sheets.
Could you email me? flerlagekr@gmail.com