Introducing the Snowflake Chart
In August, I wrote a post about visualizing hierarchies (you can find the original post here). I began that post by talking about a beautiful infographic I saw a number of years ago, which showed a hierarchical breakdown of world religions. Here’s the infographic.
I was intrigued by this both because I have always been interested in the world’s different religious faiths and, of course, because I love data. I went on to talk about a number of different ways to visualize hierarchies: rectangular treemaps, sunbursts, packed bubbles, circle packing, dendrograms, and a very interesting implementation of a Voronoi treemap called “foamtree.” I wrapped up the post by theorizing about the possibility of a visualization similar to the World of Religion infographic shown above, something more like a “snowflake”:
Most of the visualizations above are “inward”—showing increasingly smaller subdivisions of the whole in an inward fashion. But, they can be very difficult to view because we’re packing increasingly smaller sets of data (and associated labels) into increasingly smaller spaces. As the number of hierarchical levels increases, this becomes very problematic. Of course, tools like foamtree help to mitigate this problem by allowing you to zoom into each level. But I wonder if there is another method that might work better. What if we could create something like the “World of Religion” infographic, where we start in the middle and grow outward in sort of a snowflake pattern? This type of visualization would certainly help to reduce the problem noted above (though I wonder if we’d have the opposite problem where we simply start to run out of space horizontally and vertically). I have personally not yet seen anyone create a visualization like this, but if you know of a way to create it, I’d love to hear about it.
Since that post, I have continued to look for such a visualization. I’ve found a few things that came pretty close, including a force layout diagram like the one below, which was created using D3.js:
But D3 requires writing code, whereas I’d really like to be able to create my visualization directly in Tableau, Power BI, or similar tools.
I also found something called Galaxy Charts (www.galaxycharts.com). Here’s one I created with the World of Religions data set:
Not bad, but it has some flaws as well. The free version only allows you to create image files—you cannot interact with it, rearrange the nodes, etc. The paid version allows you to create editable objects in Excel, but that’s still a painful process and, like the D3 chart, cannot be created in Tableau or other visualization packages.
I also looked around the Tableau community to see if anyone had created similar charts. There were some implementations of Force diagrams, but none seemed fully capable of meeting my needs.
So I was stuck. I really wanted to try to recreate the World of Religions visualization, but there just didn’t seem to be any automated way to create it. But, I suppose this is the beauty of products like Tableau—you can basically create anything you could possibly imagine. So, I set to work on creating a completely custom World of Religion chart in Tableau. I’m pretty pleased with the results and think it looks pretty darn similar to the original infographic:
I did not want to clutter the visualization, so I hid the labels for some of the smaller nodes and moved the descriptions to the tooltips. Of course, the real value comes in actually interacting with the visualization, which you can do here.
Though I was really satisfied with the results, this visualization was a lot of work as it required me to manually determine the X and Y coordinates of each node. But the bigger problem is that it is not even remotely reusable—creating a similar visualization for a different set of hierarchical data would take just as much time. I still wanted to find an automated approach so I decided to see if I could invent a new type of chart which could satisfy this need. The result is my new “snowflake” chart.
The bulk of the work is done within an Excel template. Here are the basics of how it works:
- The template first determines the number of child nodes connected to each parent. Based on that number, it calculates the angle at which each child will “hang off” of the parent (dividing up the 360 degrees available around the parent). For example, if there are 8 child nodes, they would be spaced 45° apart and placed at 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°.
- Based on a configurable line length and the angle, an imaginary right triangle is drawn. Trigonometry is then used to calculate the “length” of the sides of that triangle, which translate into the X and Y coordinates of the child node. (Admittedly, I have not used trigonometry since I took it in high school and had to look up the formulas, but they worked perfectly and just as I remembered!)
- The template repeats the process for each parent/child relationship.
- Slight adjustments are made to angle calculations to account for the line connecting a node to its parent (we don’t want to put a node right on top of this line).
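For readers who think better in code than in spreadsheet formulas, here is a rough Python sketch of the steps above. It is not the Excel template itself: the function names are made up for illustration, and the way it keeps children off the line back to the parent (reserving one extra evenly spaced slot for that line) is only my assumption about one reasonable way to make that adjustment.

```python
import math

def child_angles(n_children, back_to_parent_deg=None):
    """Return evenly spaced angles (in degrees) for a parent's children.

    back_to_parent_deg is the direction of the line leading back to the
    parent's own parent. ASSUMPTION: when it is given, one extra slot is
    reserved for that line so no child lands on top of it.
    """
    if back_to_parent_deg is None:
        # Root node: the full 360 degrees is shared by the children.
        step = 360.0 / n_children
        return [i * step for i in range(n_children)]
    # Non-root node: n_children + 1 slots, the first one is the incoming line.
    step = 360.0 / (n_children + 1)
    return [(back_to_parent_deg + (i + 1) * step) % 360.0
            for i in range(n_children)]

def child_xy(parent_x, parent_y, angle_deg, line_length):
    """Right-triangle trigonometry: the line is the hypotenuse, and the
    two other sides are the X and Y offsets from the parent."""
    rad = math.radians(angle_deg)
    return (parent_x + line_length * math.cos(rad),
            parent_y + line_length * math.sin(rad))

# Eight children around a root at the origin, each on a line of length 10.
root_children = [child_xy(0.0, 0.0, a, 10.0) for a in child_angles(8)]
```

Repeating these two calls for every parent, using each child's new coordinates as the next parent position, gives the coordinates for the whole hierarchy.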
From here, it’s just a matter of visualizing it in Tableau. A dual axis chart is used with one axis handling the nodes and the other handling the connecting lines.
There is certainly much more to it and I don’t want to bore you with the details in this post, but if you are interested, I’d be happy to provide a more detailed explanation of both the Excel and Tableau templates. That being said, I have tried to design the templates to be usable even without understanding all the underlying details. It’s really just a matter of plugging your data into Excel and then feeding the spreadsheet into Tableau. The following fields are required in the Excel template:
- ID – Unique identifier of each node.
- Level – Level where the node exists in the hierarchy. The main node will be level 0, its children will be level 1 and so on.
- Parent – The node’s parent ID.
- Magnitude – The size of the node. This should be based on the data you’re visualizing. Using the World of Religion example, the magnitude is the number of adherents.
- Line Length – Configurable length for each line. By making this configurable, you can tune the look and feel of your visualization and ensure there is no overlap in your nodes.
- Label – The label that appears on each node.
I’ve also included a number of other fields which will allow you to enrich your data set and add details to tooltips. They include name, category, and ten user-defined “Info” fields into which you can put any data you wish.
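If it helps to see the required fields as data, below is a small Python sketch of what a few rows might look like, along with a sanity check you could run before loading anything into the template. The `Node` class, the `validate` function, and the sample values are all hypothetical, and I am assuming the top-level node simply has no parent; the template itself does not include anything like this.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    id: int                # ID: unique identifier of the node
    level: int             # Level: 0 for the main node, 1 for its children, ...
    parent: Optional[int]  # Parent: ID of the parent (ASSUMED None for level 0)
    magnitude: float       # Magnitude: drives the size of the node
    line_length: float     # Line Length: length of the line to the parent
    label: str             # Label: text shown on the node

def validate(nodes: List[Node]) -> List[str]:
    """Return a list of problems found in the hierarchy (empty if none)."""
    errors = []
    by_id = {n.id: n for n in nodes}
    if len(by_id) != len(nodes):
        errors.append("duplicate ID found")
    for n in nodes:
        if n.level == 0:
            continue  # the main node has no parent to check
        parent = by_id.get(n.parent)
        if parent is None:
            errors.append(f"node {n.id}: parent {n.parent} not found")
        elif parent.level != n.level - 1:
            errors.append(f"node {n.id}: level {n.level} does not follow "
                          f"parent level {parent.level}")
    return errors

# Made-up placeholder rows, not taken from the World of Religion data.
sample = [
    Node(1, 0, None, 100.0, 0.0, "Root"),
    Node(2, 1, 1, 60.0, 10.0, "Child A"),
    Node(3, 1, 1, 40.0, 10.0, "Child B"),
    Node(4, 2, 2, 25.0, 5.0, "Grandchild A1"),
]
problems = validate(sample)
```

A check like this catches the most likely data entry mistakes, such as a child pointing at a parent ID that does not exist or a Level value that skips a step.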
So, with the new snowflake chart designed, it was time to give it a test run. I plugged my World of Religion data into the Excel template, then fed the spreadsheet into Tableau. Here’s the result:
Granted, this is not quite as visually engaging as the fully custom visualization, but as was my goal, it is much more automated, allowing me to easily create similar visualizations with other hierarchical data, regardless of the number of levels, the number of nodes, etc.
If you think this new “snowflake” visualization could be of use to you, then feel free to check it out. The Excel template can be found here and the Tableau template can be found here. If you have any questions or would like more detail about how this works, let me know and I’d be happy to give you further information.
Ken Flerlage, December 5, 2016
Website: www.kenflerlage.com
Tableau Public: https://public.tableau.com/profile/ken.flerlage#!/
Thanks for introducing to us this snowflakes chart. I have able to see some of the good features from this material and it looks to be more efficient use for learning.
I am your fan <3 love from Pakistan!
Wow thank you!
Hi Ken, thanks for providing such reusable template to create the chart!
I would like to ask if it is possible to show a child node that has more than 1 parent node?
- Shin
Currently, not possible, but I'm sure it could be modified to work that way.
And also, I notice there's a Path Order and Flow Amount fields in the template. May I understand how these fields are being used?
I'm not using these. Looks like these were remnants of some earlier experimentation and I just forgot to remove them from the template.
Thanks for this! Do you have a blog post with more details on how you created the Excel sheet? Love your work!
hey, I am also looking for same. were you able to solve it?
Is the Excel template still available? The link above took me to an empty folder.
Oops. I changed my cloud provider for these files recently and must have missed this one. I just updated the link.