
More Sankey Templates: Multi-Level, Traceable, Gradient, and More!!


Note: On January 6, I posted a blog which details a new approach to drawing sankey curves which ensures that the width of each flow remains constant from start to finish (rather than narrowing in the middle). I suggest transitioning to this new method to ensure greater analytical integrity. See Equal-Width Sankey: A New Approach to Drawing Sankey Curves.

Sankey charts are often criticized in the data visualization community, largely because they are very regularly misused (perhaps more often than they are used properly), but I love them nonetheless. When used in the right way and for the right use case, they are incredibly insightful, not to mention visually stunning.


That said, these charts are pretty difficult to create in Tableau. There is no shortage of tutorials, but they inevitably get into some pretty tough table calculations and other tricky business. So, to make them a bit easier, last year I posted a template for creating sankeys, based on the polygon sankeys built by Olivier Catherin and Jeffrey Shaffer. To date, this has been one of my most popular blog posts. But I often receive questions from people asking how to make various adjustments. These have led to this post where I’m going to share six new sankey templates which attempt to address many of these questions. So let’s get started!!

Before I start, I want to say a big thanks to my brother, Kevin Flerlage (you'll hear from him directly soon) for his help in testing these templates. His feedback was critical to making these templates as useful as possible.

Quick Note: I’m going to run through these six new sankeys first before showing you how to create them. Many of them use the exact same template and process as my original template, but others require some slight adjustments. I’ll address all of this at the end of the blog. That said, if you just want to skip to the files, you can find them here: Sankey Template Files

Adjustable Whitespace
The sankey from my original post looked like this:



One problem with this is that there is a lot of space between each of the bars on the left and right sides, so a common question has been how to reduce that whitespace. I’ve generally provided a hacky solution which involves changing one of the calculated fields then adjusting the axes on all the sheets, but I’ve never loved that solution and really wanted to make it adjustable via a parameter. Unfortunately, this was not exactly a straightforward change and required some pretty fundamental changes to almost all of the calculations. But, in the end, I was able to create a sankey that allows you to adjust the amount of whitespace anywhere from zero, which will look like this:



…to 1, which, because the sankey is drawn on a scale from 0 to 1, means that it’s pretty much all whitespace (and completely useless). But you could make the bars really small, if you’d like:



In most cases, you will probably want something in between. It is important to note here that the whitespace defined in the parameter is the total amount of whitespace added to the bars on the left and right sides. That whitespace is distributed evenly between the bars. For example, if you specified a whitespace amount of 0.3 and you had four bars, the template would place 0.1 whitespace in each of the three gaps between them. But, if you only had two bars, the entire 0.3 whitespace would fill the single gap between them. This ensures that the bars’ sizes are not compressed, which would lead to changing flow sizes from the left to the right.
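To make the arithmetic concrete, here is a minimal Python sketch of how a total whitespace value could be spread across the gaps. It is an illustration only, not the template's actual table calculations: the function name `bar_positions` is hypothetical, and it assumes the bars share the remaining `1 - whitespace` of the 0-to-1 axis in proportion to their sizes.

```python
def bar_positions(sizes, whitespace):
    """Return (start, end) positions for bars stacked on a 0-to-1 axis.

    Assumptions (not taken from the template itself): the bars share the
    (1 - whitespace) portion of the axis in proportion to their size, and
    the total whitespace is split evenly across the gaps between bars.
    """
    if not 0 <= whitespace <= 1:
        raise ValueError("whitespace must be between 0 and 1")
    total = sum(sizes)
    gap_count = len(sizes) - 1
    gap = whitespace / gap_count if gap_count > 0 else 0.0

    positions = []
    cursor = 0.0
    for size in sizes:
        height = (size / total) * (1 - whitespace)
        positions.append((cursor, cursor + height))
        cursor += height + gap  # move past the bar and the gap after it
    return positions


# Four equal bars with 0.3 total whitespace: three gaps of about 0.1 each.
for start, end in bar_positions([1, 1, 1, 1], 0.3):
    print(round(start, 3), round(end, 3))
```

With only two bars, the same call puts the full 0.3 into the single gap, which matches the behavior described above.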

Okay, so this one isn’t really a new template so to speak—it’s really more of a feature I’ve added to make it more flexible. That being said, I’ve included this feature in all the additional templates to follow.

New Look and Feel
A second comment I often get is that people don’t like that, when you highlight a flow, the bars’ labels are not highlighted, so it’s difficult to see where the flow is going, as shown below:


This has a pretty simple fix—you just need to adjust the highlight action so that the bars aren’t a target. But I wanted to come up with a different solution to this problem. One day while looking around the web for sankeys, I came across this beauty by Stefania Guerra (see Visualizing Ageing - Issue mapping for an ageing Europe for this and more of her related work.)


There is so much to like about this sankey, but one thing that caught my eye was that the labels were separated and placed to the left/right of small thin colored bars. So, I decided to build a template that looks very similar to Stefania’s work.



The highlight actions are set up to not include the text as a target, so that means they’ll always be visible, thereby addressing the original problem. Plus, it just looks really nice.

Multi-Level Sankey
Another common request I receive is for help creating a multiple-level sankey so that a larger flow, with multiple steps, can be visualized. The design of the original sankey template is such that you can create a multi-level sankey by copying many of the calculated fields, copying the sheets, adjusting the table calculations, and adding them all to the dashboard. But that can prove to be a ton of work and an understanding of the table calculations used is pretty important. So I decided to create a multi-level sankey template (click the image to see the interactive version).


Those colors are pretty bold, but you’re free to change them however you like. You will notice that I chose to stop at 5 levels. I thought that would be enough, but if you need more, feel free to reach out to me and I can provide some direction on adding more.

Traceable Multi-Level Sankey
The multi-level sankey above was, admittedly, the first time I’d ever built a sankey with more than two levels and, as I was looking at it, there was one thing that bothered me about it: there’s no easy way to trace a record through the entire flow. Granted, a sankey is an aggregated chart, so its purpose is to give you an overall idea of the flow, at an aggregate level, but I just thought it would be nice to have the ability to trace a single item (or person or order or whatever) through the flow. So, to address this, I created a traceable multi-level sankey (click the image to see the interactive version).


In this sankey, every detailed record in your data set results in its own separate flow. All of the record identifiers are available in the dropdown list (a parameter). When you select an item, it is highlighted in a different color. When you hover over a specific bar on any of the levels, it will highlight the entire flow, but hovering over the flow itself will highlight the individual record through the entire flow. The combination of these features gives you the best of both worlds.

I’d also point out that, if you look closely at the image, you can see that there is a faint white border around each individual flow. These were not intended and are not visible in Desktop. But, when published to Public or Server, they show due to how HTML5 Canvas renders adjacent polygons on the web. However, I really like them in this case as they add some depth to the chart and make it clear that there are multiple “strands” in each flow. Note: Keep this in mind as it will impact our next chart.

In addition to the ability to trace a single item through the flow, I created a version that allows you to trace an entire flow. You can select which step you'd like to trace then which value. For example, below I've chosen to trace from the first step, then trace the D value.



Gradient Sankey
The final template is a gradient colored sankey. If you’re a regular reader of my blog, you probably saw my recent post on how to create gradient bar and area charts. So, after spending all this time experimenting with sankeys, I decided to see if I could do the same here.

In all of the sankeys I’ve shown previously, each flow is a single polygon in Tableau (except for the traceable sankey, of course). So, similar to my approach for creating gradient bar charts, my idea was to break each of those polygons into small slices then color each slice a slightly different color, creating the gradient effect. This required some changes to the underlying data model built into the template (I’ll get to that shortly). In the end, I was able to create this:



But there’s a problem. Notice those thin white lines between each individual polygon. This, of course, is the same problem noted earlier, but in this case, I don’t care for the way it looks. Unfortunately, we can’t eliminate the lines altogether, but there are a couple of things we can do to reduce the impact. First, we can create a slight overlap between each polygon in an attempt to cover up the white lines. So, I added a parameter that allows you to specify the amount of overlap you’d like to have. A value of zero will give you what I showed above, but you can then tune it a bit to reduce the impact of the white lines. For instance, here’s the sankey with 0.0073 overlap.
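The post doesn't spell out how the overlap parameter is applied inside the template, so the following Python sketch shows just one plausible way to think about it. The function name `slice_intervals` and the choice to extend each slice's right edge are assumptions made for illustration, and the horizontal extent of a flow is normalized to 0-to-1 here.

```python
def slice_intervals(n_slices, overlap):
    """Split a flow's 0-to-1 horizontal extent into n_slices pieces.

    Assumption: each slice keeps its left edge and has its right edge
    pushed out by `overlap`. A positive overlap covers the seam with the
    next slice, zero leaves the slices exactly adjacent, and a negative
    value opens a visible gap.
    """
    width = 1.0 / n_slices
    intervals = []
    for i in range(n_slices):
        start = i * width
        end = min(1.0, (i + 1) * width + overlap)  # never run past the flow
        intervals.append((start, end))
    return intervals


overlapping = slice_intervals(120, 0.0073)
gapped = slice_intervals(120, -0.002)
print(overlapping[0][1] > overlapping[1][0])  # True: slice 0 covers the seam
print(gapped[0][1] < gapped[1][0])            # True: a gap opens instead
```

Treating the overlap as a signed value is what lets a single parameter produce the seam-hiding overlaps as well as the wider gaps described in this section.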



That kind of pushes it the other direction, creating darker lines instead of lighter ones, but I prefer this over the light lines. But what’s nice about the flexibility of this parameter is that you can really embrace the lines. For instance, you can make them a bolder dark color:



Or you can make the value negative to create larger white gaps:



But, if you really want a smooth color transition, there’s one more option. Jeffrey Shaffer’s website, Data + Science, includes a blog detailing methods to create high resolution images from your Tableau workbooks. If you’ve never read this, then do yourself a favor and go read it right now. It’s incredibly valuable when you need a high res image, be it for an image on your blog, a desktop background, or to print yourself some wall art. My personal favorite is the pixelratio trick, which essentially increases the number of pixels displayed. Interestingly, you can use this trick to help reduce some of the impact of the thin white lines. For example, here’s the sankey with a pixel ratio of 15:



The drawback of this, however, is that it increases the load time of the chart. This sankey is already somewhat complex, so that may not always be ideal.

How to Use The Templates
Okay, now that you’ve seen all six new templates, let’s talk about how to use them. The first two templates, adjustable whitespace and new look and feel, follow the exact same process as shared in my original blog, so go check out the steps documented there. The remaining templates follow the same basic process with a few slight modifications.

Multi-Level
The multi-level sankeys require a slightly different template, which differs from the original in two ways. First, instead of two fields, Step 1 and Step 2, it now has Steps 1-5. The second difference is the addition of an ID field. This field is to be used to uniquely identify a record. The purpose of this field is to enable traceability in the traceable multi-level sankey. Strictly speaking, it isn’t required in the regular multi-level sankey, so it can be left blank, unless you’d like to have record IDs for filtering or other purposes.

If you’re building a traceable sankey, then once you’ve loaded your own data into Tableau, you’ll need to populate the Select ID parameter. You can use the Add From Field button on the parameter to populate the list from the ID field.

Before I move on to the gradient sankey, I would like to point out that my original blog suggested that you aggregate your data before populating the Excel template. This isn’t really necessary as the template is built to aggregate the data automatically. So, unless you are concerned about the number of records, feel free to populate the template with detailed data.
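If you're curious what "aggregating automatically" amounts to, here is a small Python sketch of the idea: detailed records are rolled up into one total per Step 1/Step 2 pair. This only illustrates the concept. The template does this inside Tableau, the function name `aggregate_flows` is hypothetical, and the `Step 1`, `Step 2`, and `Size` keys simply mirror the template's column names.

```python
from collections import defaultdict


def aggregate_flows(records):
    """Sum the Size of detailed records for each (Step 1, Step 2) pair."""
    totals = defaultdict(float)
    for record in records:
        key = (record["Step 1"], record["Step 2"])
        totals[key] += record["Size"]
    return dict(totals)


# Hypothetical detailed data: one row per record rather than per flow.
detail = [
    {"Step 1": "A", "Step 2": "X", "Size": 1},
    {"Step 1": "A", "Step 2": "X", "Size": 2},
    {"Step 1": "B", "Step 2": "X", "Size": 5},
]
print(aggregate_flows(detail))
```

Pre-aggregating like this outside Tableau only matters when the number of detailed records becomes a concern.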

Gradient
The process for creating the gradient sankey is the same as the original blog, but requires a slightly modified Excel template. However, the tab on which you populate your data will remain unchanged. The only difference in the template is an adjusted Model tab.

Coloring the flows with the gradient color requires a bit of explanation. If you look at the marks card, you’ll see that the flows are colored by Step 1, Step 2, and Polygon.


Step 1 and Step 2 are the to and from fields that define a flow overall; polygon is a numeric field which identifies each of the 120 polygons used to break the flow into multiple pieces which are then colored individually. To color a flow, click the color card, then click Edit Colors. You’ll see something like this:


Select all 120 polygons for a given set of Step 1/Step 2 values.


Finally, select your favorite continuous or diverging color palette then click Apply. Tableau will then automatically distribute that palette across all 120 values, giving you the nice gradient color we’re looking for.



You’ll then need to repeat this step for all of your individual flows.
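Conceptually, distributing a palette across the 120 polygons is just interpolation between colors. The Python sketch below shows the idea with a simple two-color ramp. It assumes linear interpolation in RGB, which is not necessarily how Tableau spreads a palette internally, and the function name `gradient_colors` is hypothetical.

```python
def gradient_colors(start_rgb, end_rgb, n=120):
    """Return n RGB tuples stepping evenly from start_rgb to end_rgb.

    Assumption: plain linear interpolation per channel. Tableau's own
    palette logic may differ.
    """
    colors = []
    for i in range(n):
        fraction = i / (n - 1) if n > 1 else 0.0
        colors.append(tuple(
            round(s + (e - s) * fraction)
            for s, e in zip(start_rgb, end_rgb)
        ))
    return colors


# One color per polygon in a single flow, from a light blue to a dark blue.
ramp = gradient_colors((198, 219, 239), (8, 48, 107))
print(len(ramp), ramp[0], ramp[-1])
```

Each neighboring pair of polygons differs by only a small step in color, which is why the flow reads as a smooth gradient rather than 120 separate bands.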

The Files
You can find all the Excel and Tableau templates using the following link. I’ve included the Tableau workbooks in both 2019.1 and 10.4 formats, so that you can use them even if you are not using the latest and greatest version of Tableau. If you need an even older version, then please reach out to me. 

All the Files

Using in Real-Life
Finally, before closing this out, I wanted to note that I often have people tell me they can’t use these templates because they can’t put their data in Excel. But that’s not exactly true. If you can get your data into the basic structure of the template, either through use of data prep tools, such as Tableau Prep, or custom SQL, then you can easily replace the data with your own. Not convinced? Here's a quick testimonial from my brother, Kevin Flerlage:


I recently utilized one of Ken’s Sankey templates at work, specifically the multi-level Sankey template. I simply structured my data using SQL to include the data points shown in his Excel template. I brought that into the Tableau workbook template as Custom SQL, replaced the “Data” sheet, and created a join calculation of 1 = 1. From there, all I had to do was to replace references of his column names to my column names.  The process was quite simple and it only took about 20 minutes from start to finish. 

If you need to do this and get stuck, please feel free to contact me and I’d be happy to help you out.
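The join calculation of 1 = 1 that Kevin mentions is effectively a cross join: every row of your data gets paired with every row of the template's Model sheet, which gives Tableau the extra points it needs to draw the curves. Here is a minimal Python sketch of that pairing. The function name `densify` and the tiny three-row model are made up for illustration, and the real Model sheet has more fields and rows.

```python
from itertools import product


def densify(data_rows, model_rows):
    """Pair every data row with every model row (a cross join)."""
    return [{**data, **model} for data, model in product(data_rows, model_rows)]


# Hypothetical stand-ins for the Data and Model sheets.
data_rows = [
    {"Step 1": "A", "Step 2": "X", "Size": 3},
    {"Step 1": "B", "Step 2": "X", "Size": 5},
]
model_rows = [{"t": -6}, {"t": 0}, {"t": 6}]

joined = densify(data_rows, model_rows)
print(len(joined))  # 2 data rows x 3 model rows = 6
```

The row count multiplies, so a large data set grows quickly once it is joined to the model. That is worth keeping in mind if you bring in detailed data through custom SQL.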

Different Curve Types
Okay, one last thing before you go. All of these templates use a curve type called a sigmoid, but other types of curves can be used as well and, in many cases, these other curves can work better than sigmoids. If you want to learn how to leverage these curve types in your sankeys, then check out the amazing work of Chris DeMartini: More options for your Tableau Sankey Diagram
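For anyone who hasn't worked with a sigmoid directly, here is a small Python sketch of the curve. It uses the standard logistic function, 1 / (1 + e^-t), to move from a starting height to an ending height. The function name `sigmoid_curve`, the t range of -6 to 6, and the 49 points are assumptions chosen for illustration, not values read out of the templates.

```python
import math


def sigmoid_curve(y_start, y_end, n_points=49, t_range=6.0):
    """Return (x, y) points tracing a sigmoid from y_start to y_end.

    x runs from 0 to 1. t runs from -t_range to +t_range, so the curve
    gets very close to, but never exactly reaches, its two end heights.
    """
    points = []
    for i in range(n_points):
        x = i / (n_points - 1)
        t = -t_range + 2 * t_range * x
        s = 1 / (1 + math.exp(-t))  # logistic function, between 0 and 1
        points.append((x, y_start + (y_end - y_start) * s))
    return points


curve = sigmoid_curve(0.2, 0.8)
print(curve[0], curve[len(curve) // 2], curve[-1])
```

A polygon flow uses two such curves, one for the top edge and one for the bottom, shifted vertically by the flow's size. The vertical thickness stays constant, but the thickness measured across the flow shrinks where the curve is steep, which is the mid-flow narrowing mentioned in the note at the top of this post.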


Update February 4, 2020: One question I get pretty regularly is how you can extend the template so that it has additional flows. I won't go into a lot of detail here, but the process basically entails the following:

1) Add a new "Step" to the spreadsheet--you can just call it Step 6 for consistency. 
2) In Tableau, copy the "Bar" calculations. For instance, copy the calcs in the "Bar 5" folder to create "Bar 6" calculations. Then edit each of those new calculations to refer to the "6" version. 
3) Similarly, create copies of the calcs in the "Curve 4-5" folder to create "Curve 5-6" calculations. 
4) Copy the Bar 5 sheet to one called Bar 6 and Curve 4-5 to Curve 5-6, then edit the fields used on those sheets to use the Bar 6 and Curve 5-6 calculated fields.  

A note of warning: Often, when you copy the sheets then switch the calcs to use the new ones you've created, you may find that one of the table calcs turns red, indicating an error. If you hover over the calc, it will say that a field is missing. Despite all your efforts, you won't find a way to fix it. I've found that advanced table calculations, especially those with lots of nested calcs, sometimes get errors like this even though everything looks good under the covers. My best guess is that there is some missing pointer somewhere that causes it to get confused. The only way I've found to deal with this is to right-click on the red pill then set it to compute using some field (any field will do). You then have to go back and edit the table calculation, setting all the nested calcs to compute as desired. It can be a bit of a painful process, but it works.


Ken Flerlage, April 13, 2019



68 comments:

  1. Excellent article, Ken. Not really a fan of gradient fills, but the white-gapped gradient version is quite striking.

  2. Ken, I used your last Sankey template to create a multi-level (with 3 sets of curves!), and whilst it was a bit tricky, the layout of your templated dataset made it much less troublesome than I thought it would be. Now, with this multi-level template, I can maybe relax a bit more next time.

    One point I would like to add, is that the sankey I was making did not have 100% of step 1 flowing all the way through to step 4, and actually, new outside influences came in at step 2 and 3. I had to do some creative resizing to make it work (because I am hopeless at the math behind this!) - is this something that you think could be possible to create effectively with the calcs?

    Great work on this post, the gradients look pretty cool!

    Replies
    1. Thanks Stewart. That is certainly a challenge, especially the addition of new influences. I don't have any thoughts on how to approach that at the moment.

    2. I have the same issue with new sources coming in after level 1. What I've done is made a bunch of NULL rows at Level 1, renamed it to something like "No Level 1 source" and worked through the rest as usual.

      Ken, your amazing work has made my life SO much easier :D I can trial new datasets and flows easily without tearing my hair out with new diagrams every time!

  3. Hi Ken, this is a great post, definitely helped a lot with getting a Sankey to display with multiple levels in Tableau. I'm just wondering what I would need to do in order to adjust the position of some of the nodes, i.e. reorder them vertically. I'm also wondering if you have any suggestions around how I may be able to terminate some flows once they get to a certain node?
    Lastly, how would I change the colours of a flow between two nodes in the aggregate traceable multi-sankey?
    Thanks

    Replies
    1. Glad it was helpful! Your questions would probably be better addressed offline. Would you be able to send me an email at flerlagekr@gmail.com?

    2. This too would be incredible to figure out. Especially the termination of flows on a null level 2, 3 etc.

    3. I'm actually working on something exactly like that!

  4. Hi Ken,

    This is indeed a great post. I'm so glad I came across this post when trying to wrap my head around the functioning of the Sankey diagram at work. The best part is that the example you mentioned about it being suitable for colleges to see where students go (in your original post) is exactly my use case because I work at the University of Cincinnati. I am trying to revamp an existing Sankey and upgrade it a little to allow for the nice features that you have presented in the aggregate traceable multi-level sankey. Ended up deciding to re-do it altogether using your incredible template. So easy!
    I have a couple of questions:
    1. I want to show multiple lines of flow with one line for each student, is that what is essentially happening where you have the unique identifier 'thing001' column?
    2. Also, what best way do you suggest to improve performance of the Sankey as my dataset is really large and I plan to add filters to be able to filter by each major.

    Thank you in advance.
    Roshana

    Replies
    1. That's great to hear. I'm originally from the Cincy area myself!

      I'm not quite sure I understand what you're saying on the first question. Can you explain that a bit further? For performance, that's tough because this template sort of explodes the data set's size a bit. In some cases, you can pre-aggregate (if you don't need individual students, for example). In any case, might be better to have this conversation offline. Would you be able to send me an email? flerlagekr@gmail.com

  5. Hi Ken,

    I have found your templates really helpful but have run into something with your Aggregate Traceable Multi-Level Sankey. I have all my data in and it works really well, except it doesn't seem to actually be aggregating up to the fields I am showing. In other words I have multiple lines that are actually the same paths but they aren't combining together (likely because of other fields that are not on the dashboard). Do you have any solutions for this? I tried to remove the ID to see if that would make it aggregate to a higher level but it didn't seem to work.

    Replies
    1. I haven't run into that and would probably need to see the workbook. Any chance you could send me a sample? My email is flerlagekr@gmail.com

  6. Hi Ken,

    I have date fields for my flows, and it feels like a waste of a valuable data point. How do you suggest using dates into Sankeys (if at all)? I have looked online and Sankeys with dates are not a generally used thing.

    One method I've thought of is using aggregated Date months on the Y-axis on one of the segments.

    Any suggestions would be appreciated!


    Cheers!

    Replies
    1. Probably depends on the use case and the data. I'd be happy to explore further offline. Feel free to send me an email. flerlagekr@gmail.com

  7. Hi Ken,

    Cannot thank you enough for this post! This is amazing and exactly what I was looking for. I've been asked to do a bit of customisation, including ordering step 1 and 2 differently. Can you advise?

    Replies
    1. Yes. If you need to change the sort order of Step 1, you just need to set the order of the Step 1 pill on the first bar, then do the same on the curve. Same thing for Step 2. Just remember that, whenever you change the sort order of a bar, you need to do the same thing on each of the attached curves. Give it a try. If you get stuck, feel free to contact me via email. flerlagekr@gmail.com

  8. Hi Ken! Do you have any supporting documentation on your Sankey Funnel template? Reference:
    https://public.tableau.com/profile/ken.flerlage#!/vizhome/SankeyFunnelTemplate/Sankey

    I've seen in some articles the process of duplicating the data. Was that done in this example?
    A second part to my question, what are the calculations that you ETL'd for the following columns:
    Min or Max
    Path (Model)
    Size
    t (Model)

    Replies
    1. The sankey funnel is coming soon. Not sure how soon, but I should have a blog about it at some point. In the meantime, if you'd like to know how to create it, feel free to email me at flerlagekr@gmail.com.

      In this example, data is duplicated using a cross-join to a model data set. I explain that process in a bit more detail, though generalized, here: https://www.kenflerlage.com/2019/05/intro-to-data-densification.html

      Those are all part of the model data set. If you download the Excel template, there is a sheet in there called Model which has all of these fields (except Size, which comes from the Data sheet).

      Happy to talk more via email.

  9. Hi Ken,

    I'm having a weird issue where the bars don't line up to where the record is. For example, A-B but the bar points at C. Any thoughts on how to resolve that?

    Replies
    1. I'd need to see the workbook. Any chance you could send me an email? flerlagekr@gmail.com

  10. Hello Ken, thanks for providing the amazing templates! I was able to get my viz working perfectly and easily using the Sankey layout inspired by Stefania Guerra. I do have one question that I hope you can help me with. Is there a way to have the source displayed similarly to this design here - https://public.tableau.com/en-us/gallery/flow-human-migration? I want to keep a horizontal layout but would love to see if I can have the source start from one point and flow out versus a block. Thanks in advance for any assistance you can provide.

    Replies
    1. That would probably be doable. Any chance you can send me an email so we can chat offline? flerlagekr@gmail.com

  11. I love all your detailed work and explanations, but none of your Sankey diagrams satisfy me because the flow lines do not maintain a constant width which is, I think, a defining characteristic of Sankey diagrams. Notice that Stefania Guerra's have constant width (compare to your reproduction of her graph). Is that possible within Tableau?

    Replies
    1. Yep, you are exactly correct. Jeff Shaffer wrote a great post about this here: https://www.dataplusscience.com/Sigmoid.html and gave some alternatives, such as different curve types. However, the issue exists with this method no matter what type of curve you use. It's a fundamental problem with this approach. I am in the middle of a project to address this problem. It's going to take some time to work out all the details and implement a solution, but keep an eye on my blog.

    2. Hey Nick. Check out my new blog which addresses this issue: https://www.flerlagetwins.com/2020/01/equal-width-sankey.html

  12. Hi Ken,
    Thanks so much for these templates! I was wondering if it is possible to assign different non-highlight colors to different flows in the Traceable Multilevel Sankey Template. I was using the Funnel Template before and was able to do so, but I can't figure it out here...
    Thanks in advance for your time!

    Replies
    1. I'm not entirely sure I follow you. Might be best to take this offline. Can you email me? flerlagekr@gmail.com

  13. Hi Ken! I was wondering, how did you create traceability? I am trying to do something similar in a different viz (network graph). I'm not sure where to start. I've looked into your workbook and see you've created a bunch of parameters for this?

    Replies
    1. This required a combination of data modeling and a bunch of table calcs. Happy to help if you'd like to reach out to me via email. flerlagekr@gmail.com

  14. Hello, first of all, thanks a lot for sharing all these templates! I'm using the multi-level template with the slight modification to exclude null entries:
    IF ISNULL(ATTR([Step 1])) or ATTR([Step 1])="" THEN
    NULL
    ELSE
    RUNNING_SUM([N1 Flow Size]+ [N1 Whitespace]) - [N1 Flow Size] - [N1 Whitespace]/2
    END

    I use the multi-level Sankey diagram to represent connections between different systems (5 levels). In some cases, I just need to represent a connection between level 3 and level 4 (without having a line going from level 1 all the way to level 5). I can make that part work without any problems; however, things get complicated when I have, let’s say, 1 line coming into Bar #3 and 2 lines going out, for instance:

    1 2 3 4 5 Value
    - A B - - 1
    - - B C - 1
    - - B D - 1

    In this case the size of B is the sum of all the values coming in and the ones going out, here 3.
    What should I change so that the size of B is the greater between what’s coming in and what’s going out, here 2? Is this something that can be done?
    Thanks!

    Replies
    1. This sounds like it could be tricky. It's hard to have these discussions in this comments section, so could we take this offline? Could you email me at flerlagekr@gmail.com?

    2. Sounds good. I sent you an email. Thanks!

  15. This is absolutely great. Thanks a lot!

  16. Hi Ken,
    Would love to know how to create a gradient Sankey with an additional "Step 3"... Do you think this is possible?

    Replies
    1. Yes, but I start to wonder if the juice is worth the squeeze as you are adding a lot of complexity to the chart to get these colors and I'm just not sure that they add a lot of value. Happy to help if you'd like to give it a try and run into problems. Feel free to email me at flerlagekr@gmail.com

  17. Love how this is set up, but I can't seem to reproduce it. Does this work in 2018.3? When I try to add the Curve Polygon it says I am missing a field but it looks exactly like your workbook.

    Replies
    1. Yes, this will work in 2018.3 (and lots of versions prior to that). See the "A note of warning" section at the very end of the blog. Try the suggestions there. If that doesn't work, let me know.

  18. Hi Ken, this is a great post, definitely helped a lot with getting a Sankey to display with multiple levels in Tableau. I'm wondering if you have a published answer around how I may be able to terminate flows once they get to a certain node?

    Eg.
    A B C 10 ids
    A B 5 ids
    A B D 10 ids

    I'm having difficulty with the sankey on the AB flow that ends.

    Replies
    1. I think you may find my Sankey Funnel post useful: https://www.flerlagetwins.com/2019/11/sankey-funnel.html. Let me know if that works for you. If not, please reach out to me at flerlagekr@gmail.com.

  19. Hi Ken, I am working on a Sankey diagram which requires 6-7 levels. Can you give me some direction on adding more? I tried to copy the worksheet, but always got a red pill. I also fixed table calculation, but still no luck. Any help would be appreciated! Thank you!

    Replies
    1. Please see "A note of warning" at the end of this blog. It addresses that issue. If you have problems after that, please let me know.

    2. It works!!! Thank you so much, Ken!

    3. Hi Ken! Sorry for bugging you again. I am facing another challenge. I have 2 Sankey diagrams: one is a high level lineage, one is a granular level lineage. Is it possible to connect these 2 diagrams together? Like creating a drill down in Sankey? Thanks a lot!

    4. I'd need to see it to better understand what you mean. Any chance you can email me at flerlagekr@gmail.com?

  20. Hi, I came across your templates when searching for ways of doing Sankey Diagrams in Tableau - it worked first time! Thanks for all your efforts putting this together. Just one thing is that I didn't see any tool tips and they seemed to be included in the Tableau workbook. Viz showing traffic flow through an area of London is here: https://public.tableau.com/views/KentishTownSankey/Sankey?:display_count=y&publish=yes&:origin=viz_share_link

    Replies
    1. You can add them. Just go to the sheets, click on the tooltip card, turn them on, and add your fields.

  21. Hello Ken, appreciate your work! I started studying Sankey earlier this week and whatever I learnt is coz of your efforts. Simple question for you though, 30 steps Sankey...worth the time? Trying to convert a Flowchart Data/Diagram for a warehouse into Sankey chart.

    Replies
    1. 30 Steps!! Wow. That would be a huge sankey. One thing I'd be worried about is simply how you'd fit all that on screen. That in itself could be a challenge. Honestly, whether or not it's worth it all depends. Do you think you'll get a lot of value out of it? If so, then it may be worth the effort.

  22. Hi Ken, this is really useful. I'm struggling with creating a step 6 though due to the missing field error. I followed your instructions and it semi works but the flow isn't plotted correctly. As a test, I duplicated Curve 4-5 and changed it to Compute Using, then recreated the table function; however, even on Curve 4-5 it plots the curves weirdly. Any ideas? I'm using 2018.1. Thanks

    1. My guess is that you haven't updated all of the nested table calcs. There is a dropdown box at the top--you'll need to make sure to update all of those. Happy to help if you could send me an email. flerlagekr@gmail.com

    2. Thank you very much Ken, the pointers in your mail really helped me out and I have extended this to 7 levels. A key thing was that I hadn't appreciated the multiple nested table calcs in the dropdown.

    3. Yep, that is something that is often missed. Glad you got it working!

  23. Hi Ken, I have been playing around with your template, it's incredibly useful. Thank you so much for sharing! I am stuck on what I hope will have a simple solution- any idea how to calculate the percentages of each path coming from a source? For example, in the Multilevel Sankey, the percentages of Source A that flow to E, F, and G. Any help is greatly appreciated!

    1. Hi Samantha, I actually wanted to display percentages as well and found the following formula helpful:

      sum([Size])/sum({ FIXED: SUM([Size]) })

      Depending on how your data is structured and the purpose of the Sankey you may have to add some additional fields to the FIXED calculation.

      Best of luck!
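To sanity-check what a per-source version of the formula above should return, the same arithmetic can be sketched outside Tableau. This is only an illustration in Python, assuming one row per path with a size; the `flows` data and the `share_of_source` helper are hypothetical and are not part of the template.

```python
from collections import defaultdict

# Hypothetical flows: (source, target, size). Values are illustrative only.
flows = [
    ("A", "E", 30),
    ("A", "F", 50),
    ("A", "G", 20),
    ("B", "E", 10),
    ("B", "G", 30),
]


def share_of_source(rows):
    """Return {(source, target): fraction of that source's total size}."""
    path_size = defaultdict(float)
    source_total = defaultdict(float)
    for source, target, size in rows:
        path_size[(source, target)] += size  # numerator: size summed per path
        source_total[source] += size         # denominator: total held fixed per source
    # Assumes every source has a positive total; a zero total would divide by zero.
    return {path: size / source_total[path[0]] for path, size in path_size.items()}


shares = share_of_source(flows)
for (source, target), share in sorted(shares.items()):
    print(f"{source} -> {target}: {share:.0%}")
```

With this sample data, Source A splits 30%, 50%, and 20% across E, F, and G. If the Tableau calculation shows different proportions for an equivalent data set, the FIXED expression is probably missing the source dimension.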

    2. Sorry, I missed your question from yesterday. Daniel's formula should work. Otherwise, could you send me an email? flerlagekr@gmail.com. It's just too difficult to address these questions in this comments section.

  24. Hi Ken, I'm using your Multilevel Sankey for a report I'm working on and so far it looks great! My only concern at this point is performance. I understand that data duplication is necessary to produce those nice curves, but even reducing the duplication to 4x instead of 98x doesn't drastically boost performance. For this reason, I believe the lack of performance is due more to the calculations themselves and the impact that filtering and other dashboard interactions have on those calculations.

    If you were to fine tune some of the calculations used in the Sankey, which calculations are the most resource intensive and is there a way to reduce some of that intensity while still maintaining a multi-level Sankey? Any suggestions would be helpful.

    Thanks,
    Daniel

    1. Performance issues with these sankeys tend to be due to the number of records. While this template will work with non-aggregated data, I highly recommend pre-aggregation of your data. If you can find ways to reduce the amount of data by aggregating, that should make a huge difference.
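As a rough illustration of the pre-aggregation suggested above, here is a minimal Python sketch that collapses record-level rows into one row per distinct path with a count. The `records` data, the three-step layout, and the `pre_aggregate` helper are all hypothetical; the same result could come from a GROUP BY in SQL or an aggregate step in Tableau Prep.

```python
from collections import Counter

# Hypothetical record-level data: one tuple of step values per record.
records = [
    ("A", "E", "X"),
    ("A", "E", "X"),
    ("A", "F", "Y"),
    ("B", "E", "X"),
    ("A", "E", "X"),
]


def pre_aggregate(rows):
    """Collapse identical paths into one row each, appending a size (row count)."""
    counts = Counter(rows)
    return [(*path, size) for path, size in sorted(counts.items())]


aggregated = pre_aggregate(records)
for row in aggregated:
    print(row)
```

Because every row that reaches Tableau is then duplicated by the join to the model, shrinking five rows to three here reduces the densified row count by the same ratio.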

    2. The data set that I'm working with is randomly generated sample data and only ~1k records. Even with the duplication this is only ~100k records for the Sankey, which usually isn't a lot for Tableau to handle. This is why I believe the performance issues are due to the calculations themselves and not necessarily the volume of data.

      If it makes any difference, I'm relating the model to the data set rather than joining them together. Joining them at the start doesn't seem to make sense, though, since it would impact the rest of the dashboard, which doesn't need the duplication.

    3. The calculations for this are super-complex, there's no doubt about that, but I still think it should perform pretty well. Is there any chance you could share a sample workbook with me via email? flerlagekr@gmail.com

  25. Awesome templates! Exactly what I need. I so appreciate that the Tableau community shares its resources. Quick question: Do you know if anybody has been able to associate and display conversion rates on the curves? i.e. X% of Source A went to Target D

    1. Yes, this is definitely possible. Please email me. flerlagekr@gmail.com

  26. Thanks for the templates. I'm having difficulty with one though, the Multi-Level Sankey. I'm using Prep to clean up my data from a SQL database and was able to replace the template data source with mine. However, the Curve pages are giving me trouble and an error on the Curve1-2 Polygon pill. When I edit the table calc, I'm seeing a selection for the Path, but that's not in the template as a choice. What are the formulas for Path(model) and t(model) in your template? Maybe I can add those to my Prep flow and then it will work. Thanks!

    1. That's a secondary data source that gets joined to your business data. Take a look at the Excel template included above and it'll show you the structure of this data. You'll then need to either join the Excel "Model" tab to your SQL data or create a table in SQL with this structure.

    2. That worked great, Thanks!

  27. Hi Ken, I love these templates! I used a combination of these to create a mash up of Stefania's Sankey and a multi-level traceable Sankey with two curves. I have it set up such that the user can select categories from Bar 1 and Bar 2 through the use of two parameters. The curves and thin bars would only highlight those that fall in those specific categories. However, I'm having trouble getting Bar 3 to highlight, since it is dependent on the parameter selections. I would appreciate any insights you might have in getting that last bar to highlight correctly. Thanks!!

    1. Hmmm. I'd probably need to see an example in order to better understand what you're trying to do. Any chance you could send me an email? flerlagekr@gmail.com
