Proportion Plots in Tableau


Last week, the incredible Stephanie Evergreen shared a blog post discussing a new chart she calls a “Proportion Plot” which “help us compare the share of a population between two metrics. It uses length on the left and right side of the chart and connects the lengths by a band in the middle that swoops a lot if there is disproportionality and stays pretty even if the proportions are the same.” Here’s an example from her blog:


Chart by Stephanie Evergreen

 

A few days after she shared the blog, the Chart Chat team discussed the chart, as well as some alternatives (if you haven’t had a chance to watch, be sure to watch the video). This new proportion chart is the hot data viz topic at the moment, so why not share a blog about creating it in Tableau?

 

Interestingly, this is not the first time I’ve seen this chart, nor the first time I’ve built it. In September 2018, one of my favorite people, Vince Baumel, reached out to me with the following idea.

 

 

I was so intrigued by the idea that I worked on it that same evening, then messaged Vince the following morning.

 

 

This was exactly what Vince was looking for and he was able to implement it using the template I provided. After helping Vince, I published this on Tableau Public and had planned to write a blog about it, but the truth was that I just wasn’t sure whether it had many use cases. I felt like there were other charts that might be just as effective and much easier to build so, like many projects, I added it to my ever-growing list and moved on to other things. Every once in a while, I’d come back to it and think about a blog, but I just never pulled the trigger. A few weeks ago, I decided to ask Kevin for his thoughts on it and, after some brief conversation, I made the decision to finally write a blog and share a reusable template for creating these in Tableau. We’ll get to that shortly, but one of the key parts of that conversation included the need for an exploration of some alternatives to this type of chart. So, before I share the template, I’m going to share a handful of alternatives, including some that were mentioned in Chart Chat.

 

Alternatives

For these different alternatives, I’m going to be using data that compares the racial makeup of the NBA (from www.theundefeated.com) to the US population of males 20-39 years of age (from www.census.gov). Let’s start by looking at the proportion chart.

 

 

There are a couple of issues I see with this chart, one which has to do with readability and the other which deals with its creation. Let’s start with the latter—building this in Tableau (or just about any platform, for that matter) is really hard. It’s conceptually quite simple, but the curves require data densification and a fair number of table calculations. Building this from scratch would take quite a bit of time, so I wondered if we could create a version that trades the curves for straight lines.

 

 

While this looks quite similar to the first proportion chart, it’s quite a bit easier to create. The left and right sides use simple stacked bar charts, while the center is an area chart (On Twitter, Rody Zakovich also shared a technique for building this using a single sheet). 
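
For readers who think in code, the geometry of the straight-line version can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Tableau workbook is built: the function names are made up for this example and the numbers are placeholders, not the NBA or census figures.

```python
from itertools import accumulate

def stacked_bounds(shares):
    """Return (lower, upper) stack positions for each share, bottom first."""
    total = sum(shares)
    uppers = list(accumulate(s / total for s in shares))
    lowers = [0.0] + uppers[:-1]
    return list(zip(lowers, uppers))

def straight_bands(categories, left_shares, right_shares):
    """Map each category to the four corners of its connecting band."""
    left = stacked_bounds(left_shares)
    right = stacked_bounds(right_shares)
    bands = {}
    for name, (l_lo, l_hi), (r_lo, r_hi) in zip(categories, left, right):
        # x=0 is the left bar, x=1 is the right bar; corners run
        # bottom-left, bottom-right, top-right, top-left.
        bands[name] = [(0, l_lo), (1, r_lo), (1, r_hi), (0, l_hi)]
    return bands

# Placeholder values for illustration only
bands = straight_bands(["A", "B", "C"], [50, 30, 20], [20, 30, 50])
```

Each band is just a four-sided polygon joining a segment of the left stack to the matching segment of the right stack, which is why an area chart can draw it.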


I will say that there’s just something about the smoothness of the transitions in the curvy version, but for an easy-to-create option, this works really well. However, both versions have one issue: because the two measures do not share a common axis, it can be difficult to compare the values. In the example above, black is at the bottom and white is at the top, so we can simply compare the height of the two bars to understand the differences. This type of comparison is not that easy with Other because the bars start at different vertical positions. To address this, we can add some interactivity that allows you to send one dimension to the bottom (I like this technique from Ryan Sleeper: How to Reorder Stacked Bars on the Fly in Tableau) which will allow you to better compare that particular dimension.
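
The logic behind sending one dimension to the bottom is simply a sort in which the selected member comes first. The Python below is a minimal sketch of that idea, with a hypothetical `stack_order` function; it is not a reproduction of the Tableau technique linked above.

```python
def stack_order(categories, selected=None):
    """Return categories in stacking order, bottom first."""
    # The key is False (sorts first) only for the selected category.
    # sorted() is stable, so all other categories keep their original order.
    return sorted(categories, key=lambda name: name != selected)

order = stack_order(["A", "B", "C"], selected="C")
```

With the selected member at the bottom, both of its bars start at zero, so their heights can be compared directly.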

 

While that technique is definitely helpful, I still wonder if there are any other alternatives that might be more effective. On Chart Chat, Jeff showed a great example of a Bar Mekko chart. Here’s a version using my data.

 

 

The height of the bar indicates the share of NBA players while the width indicates the share of US population. The Chart Chat team had some really good discussion about this chart, so be sure to watch the video. I will say that I tend to agree with Amanda Makulec that data viz people may understand this chart fairly well, but lay people generally do not. I personally find that I have to work a little harder to compare the widths and heights while remembering what each represents. I simply find the proportion charts to be much more intuitive, despite their flaws.
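
To make the encoding concrete, here is a small Python sketch that turns two sets of shares into the rectangles of a Bar Mekko. The function name and the input numbers are placeholders chosen for this example, not the actual data behind the chart.

```python
def mekko_rectangles(categories, width_shares, height_shares):
    """Return (name, x_start, width, height) for each bar, left to right."""
    total_width = sum(width_shares)
    total_height = sum(height_shares)
    rects = []
    x = 0.0
    for name, w, h in zip(categories, width_shares, height_shares):
        width = w / total_width
        rects.append((name, x, width, h / total_height))
        x += width  # the next bar starts where this one ends
    return rects

# Widths play the role of population share, heights the role of player share
rects = mekko_rectangles(["A", "B", "C"], [20, 30, 50], [50, 30, 20])
```

The reader has to decode two lengths per category (one horizontal, one vertical), which is the extra work described above.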

 

One of my go-to’s when comparing two different measures for a single dimension is a barbell (or dumbbell) chart, so I created one with this data.

 

 

This chart plots each of the two measures as a shape along the axis then draws a line between them. I’ve used icons to indicate the two measures—a basketball for the share of NBA players and a globe for the share of the US population. While I typically like this chart, I find it somewhat less intuitive than I had expected. I can clearly see that there are wide differences between the proportions for black and white players, but it takes me a few minutes to realize that the two are polar opposites. I tried numerous alternatives such as comet charts, gradient colors, etc., but just couldn’t find a great way to add clarity.

 

The next chart I tried was inspired by the work of Ryan Soares. Ryan created the following viz on Premier League transfer spending which compared spend to sales.

 

 

I absolutely love this viz. It’s easy to understand and beautifully designed, like all of Ryan’s work. The charts compare spend to sales using what are essentially lollipop charts with an area chart between them. So, I tried to apply this concept to my data.

 

 

I have to say that I quite like this option. It does a great job of showing the difference in each of the measures and it solves the barbell issue as the slope of each plot clearly indicates which of the two measures is larger. And, since the dimensions aren’t stacked, both measures share an axis, so we can easily compare the measures for any of the dimensions without having to shuffle dimensions to the bottom. I think that this could be a great alternative to the proportion chart. Perhaps the only issue with this is that I cannot easily see the proportions within a specific measure. The 100% stacking of the proportion chart makes it really easy to see that part-to-whole relationship, where this does not.

 

My final attempt is one of my favorite types of charts for this kind of data—a bar-in-bar chart.

 

 

I love this chart and use it frequently to visualize two measures that share a common scale, such as percentages. This is a very simple visual, which is both easy to build and easy to understand. We can clearly see that the proportion of black NBA players is much larger than the proportion of the population; and we can see that the opposite is true with white players.

 

But it’s boring, you might say! I disagree. Bar charts are probably the most versatile chart we have and, with a little bit of good design, they can be incredibly beautiful too. Don’t believe me? Check out the blog by Eva Murray which shows some useful bar chart variations, which are both insightful and lovely.

 

At this point, I’ve shared curvy and straight-lined proportion plots as well as four alternatives for visualizing this type of data. So, having done this, what do I think of the proportion chart? I actually think it’s quite useful. You can look at it and instantly get an overall feel for the differences in the measures as well as the proportions within each measure. While it does sacrifice a bit in its ability to precisely compare the two measures, that can be mitigated through some basic sorting. That said, I think it can be a useful tool in our data viz arsenal. As I’ve done here, it can often be quite valuable to explore a variety of alternatives then choose the one that best addresses your data, your audience, and the story you’re trying to tell.

 

The Template

OK, now that we’ve explored a variety of options, let’s talk a bit about the reusable template I’ve created. Like most of our chart templates, this one includes two components: an Excel spreadsheet and a Tableau workbook. The Excel spreadsheet has two sheets, Data and Model. Model is used to handle the data densification needed to draw the curves. You don’t need to worry too much about this sheet; just make sure it’s in your spreadsheet. Data is where you enter your own data. It contains columns for your dimension and two measures.
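
If data densification is new to you, the general idea is to repeat every data row once for each point along the curve. The Python below is a rough sketch of that idea only. It assumes the densification works like a cross join against a list of evenly spaced positions; the actual layout of the Model sheet is not reproduced here, and the `densify` function, the `t` column, and the row values are all hypothetical.

```python
def densify(rows, points=50):
    """Cross join each data row with evenly spaced positions t in [0, 1]."""
    # Stand-in for a densification sheet: one entry per point on the curve
    model = [i / (points - 1) for i in range(points)]
    return [{**row, "t": t} for row in rows for t in model]

# Placeholder rows: one dimension and two measures, values made up
rows = [
    {"Dimension": "A", "Measure 1": 50, "Measure 2": 20},
    {"Dimension": "B", "Measure 1": 50, "Measure 2": 80},
]
dense = densify(rows, points=5)
```

Two input rows and five points yield ten output rows, which gives a charting tool enough marks to draw a smooth line for each dimension member.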

 

Once you’ve populated the spreadsheet with your data, download the Tableau template. Edit the data source and connect it to your Excel file. The workbook should update automatically to reflect your data.

 

The workbook comes with three different curve types—Sigmoid, Sine, and Cubic (thanks to Chris DeMartini for his work on different curve types). By default, the curve is set to use Sine, but you can change it using the Curve Type parameter. In addition, I've included a Whitespace parameter which allows you to make adjustments to the spacing between each curve.
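
For a feel of how the three curve types differ, here is a Python sketch using common textbook formulations of each. These are assumptions for illustration: the template's own calculations may use different formulas, and the function names and the `steepness` value are made up for this example.

```python
import math

def sine_ease(t):
    """Half a cosine wave, rising from 0 at t=0 to 1 at t=1."""
    return (1 - math.cos(math.pi * t)) / 2

def cubic_ease(t):
    """Cubic 'smoothstep' polynomial, flat at both ends."""
    return 3 * t**2 - 2 * t**3

def sigmoid_ease(t, steepness=10.0):
    """Logistic curve, rescaled so it starts at exactly 0 and ends at 1."""
    def raw(u):
        return 1 / (1 + math.exp(-steepness * (u - 0.5)))
    lo, hi = raw(0.0), raw(1.0)
    return (raw(t) - lo) / (hi - lo)

def band_edge(start, end, t, ease=sine_ease):
    """Height of a band edge at horizontal position t in [0, 1]."""
    return start + (end - start) * ease(t)

# Edge that runs from 0.5 on the left bar to 0.2 on the right bar
edge = [band_edge(0.5, 0.2, i / 4, ease=cubic_ease) for i in range(5)]
```

All three start and end flat, so the band meets each bar horizontally; they differ mainly in how sharply they swoop through the middle.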

 

To allow you to dynamically push a dimension to the bottom (so that both measures share the same axis), I’ve used Ryan Sleeper’s tip as well as a selection tip by Brian Moore (to prevent the item from remaining selected). When using the workbook, you can click on any of the elements on the chart and that will send that dimension to the bottom.

 

And that’s pretty much it. From here, you can do whatever you like with the chart—change the colors, add filters, update tooltips, etc. just as you normally would.

 

I’ve placed all the files in the following publicly-accessible location. There are two workbooks: one in 2020.3 format, which uses relationships and animation, and another in 2019.2 format, which uses the old data model. If you need an older version, please let me know.


Files Here

 

Wrap-Up

Well, that was fun! I really enjoyed exploring this new chart type and a variety of alternatives. If you decide to use my proportion plot template, as always, I’d just suggest that you carefully consider your data, your audience, and the message you’re trying to convey, as well as considering other alternatives that may be more effective and/or easier to build. Thanks for reading!

 

Ken Flerlage, February 1, 2021

 

9 comments:

  1. I was just thinking to myself, 'I wonder if that promised proportion plot walkthrough is up yet?' Nice!

    Replies
    1. Haha! I had to strike while people were still interested.

  2. Thanks Ken for the amazing blog.

  3. Hi Ken, I am glad I found your interesting site.
    I am not an expert in data presentation so you can find my question to be redundant but here it goes:
    Is there a chart which incorporates both proportion and Sankey plots together?
    For example, let's imagine the flow of assets of an individual keeping his assets in 3 Banks A, B, C, with time. The horizontal axis will be time, the height of the BARS will represent asset valuation and will NOT BE fixed.
    Specifically,
    At time t1 he kept stocks in the value $1000 in Bank A, in the value $2000 in Bank B and no assets in Bank C.
    At time t2, the value of his assets at Bank A increased to the value $1500 and correspondingly in Bank B decreased to $1800 and at that time he decided to transfer assets in the value of $600 from Bank A to Bank C.
    Is there a chart that will describe it ?

    The horizontal axis will be time. instead of source and target in the Sankey plot there will be different time events. The width of the "channels" at each time will stand for the amount of the quantity

    Replies
    1. In this case, I think a stacked area chart or even a line chart would suffice.

  4. I think this would be really useful for me, though I am struggling to adapt this to server connection for a live dashboard. I have been able to recreate the Dimensions, Measure 1 and Measure 2 table correctly, but I am stuck amalgamating that together into the Model. Is this possible with a server connection? This is an excellent graphic and would prove very useful!

    Replies
    1. When you say "Server Connection", do you mean a Published Data Source? Might be best to email me. flerlagekr@gmail.com

  5. Hi Ken, this looks amazing! I have the same issue as the previous user though, how do Data and Model talk to each other? Neither a Union or a Join seem correct - if I do the data replace I lost the fields I need cause I can only select one sheet of the two

