Small Multiples: Just a Matter of Long Division


    

Small Multiples: Just a Matter of Long Division


The Tableau community has released a ton of great material regarding small multiples.  I’d like to add my two cents to this knowledge base, but direct it to those folks that might be relatively new to Tableau or have never created small multiples.  As part of this, I will go in depth with how the calculations themselves actually work so you fully understand them and can recreate them on your own.  I will also throw in a very handy trick that I use when creating my own small multiples.  

WHAT ARE SMALL MULTIPLES?

First, what are small multiples?  Small multiples (otherwise known as a panel chart) are just multiple charts displaying the same data but for different dimensions.  As an example, Andy Kriebeldoes an incredible job of using small multiples with his NHL Attendance visualization.  You’ll see that his line charts each measure attendance, but each one is specific to a different team.


Another example is my Ryder Cup #SportsVizSunday viz that looks at US wins versus Europe wins (yeah, I know…pie charts, but it was a fun way to visualize the data).  Each “golf ball chart” measures US wins versus Europe wins, but each one represents a different year.



Small multiples are a great way to show a lot of information in one view while providing context.  Basically, you can quickly provide comparisons between different dimensions of your data.

WHAT DO I NEED TO KNOW BEFORE CREATING SMALL MULTIPLES?

In order to understand how to create small multiples, lets first talk about X and Y coordinates.  Ken Flerlage does an excellent job explaining X and Y coordinates in his Beyond Show Me blog.  The blog describes how pretty much anything can be drawn in Tableau by using points on a Cartesian plane ( x / y plane).  Take a look at his example using The Periodic Table.  In that example, he basically plots out the points for each element using a Cartesian plane.

Small multiples will be constructed in a very similar way.  We are essentially going to choose the specific locations of our charts (or panels) by plotting them.  (As a side note, you could create small multiples by creating a bunch of charts and laying them next to each other on a dashboard.  However, this can be quite time consuming and is not very flexible.  This method will describe how to build them in a single sheet).  

Let’s stick with the sports theme for an example.  Assume you were looking at concession sales for the <insert your favorite NFL team> (as long as it’s not the Pittsburgh Steelers) for the past nine years (2010 – 2018).  You want to show the changes over time and decide to use small multiples, which means you’ll show nine charts, one for each year. In this case, you might decide to plot the small multiples with three in each row and three in each column (3 x 3).  To do so, you would plot points that look something like the following where point #1 would represent the 2010 chart, point #2 would represent 2011, and so on:
If your data set was in Excel, then you could simply add columns for your X and Y coordinates and fill out each row manually.  However, this is rarely something we are able to do in the real world and it is not flexible in any way.  So, it is best that this work be done within Tableau.

There are a few terms and functions that we must first understand before attacking this within Tableau.  These are Modulo and INDEX.

Modulo returns the remainder in a division problem and ignores the quotient (now you are probably starting to understand the title of this blog post – small multiples and long division J).  For example, if you were interested in 9 divided by 4, then the quotient would be 2 (4 goes into 9 two times for a total of 8) with a remainder of 1.  In this problem, the modulo is the remainder of 1 (and we don’t care at all about the quotient).  How about one more example.  If you were interested in 42 divided by 10, then the quotient would be 4 and the remainder (or modulo) would be 2.  Again, the quotient is meaningless to this function; we are only concerned with the remainder. 

In Tableau, modulo is represented by the dividend, a percentage symbol, and the divisor.  So in our example of 9 divided by 4, we would write it as 9%4.  This means that we want to obtain the modulo of that equation and as we discussed, that result is 1.  For the other problem we introduced (42 / 10), we would write it as 42%10 and the result would be 2. 

Next, we must introduce INDEX. Tableau says the following about INDEX: “Returns the INDEX of the current row in the partition”.  A very basic way to look at this is that INDEX will number the rows of data for you.  (Please note that INDEX is a lot more complex than this and has so many applications.  For more on INDEX, I encourage you to watch Lilach Manheim's Tableau Fringe Festival presentation: "One Table Calc to Rule them All").  Okay, let’s take a look at an example using Superstore data.  If you look first at Order Date using Years only, then bring in INDEX, you will yield the following result:


It’s pretty simple.  INDEX numbers the first row shown as 1 and the second as 2.  Now let’s show both year and quarter:


You’ll now see that INDEX changes.  It is still numbering the rows, but it is doing it differently than it did when only years were showing.  Now, let’s add month as well:


Again, it is numbering the rows for us, but doing so based on this particular view.  The reason it changes is that INDEX is actually a table calculation.  Don’t freak out! 

In Tableau, INDEX is represented as “INDEX()”.  Since it is a table calculation, the way in which it is calculated can be changed (which will be handy for later).  To do change this, edit the calculation to bring up the calculated field window.  Next, click on the “Default Table Calculation” link at the bottom right:



From here, you will be provided with another pop-up window.  This window will allow you to change the value it is using to compute the INDEX.  Use the “compute using” drop down to see your options.  To tie back to our date example above, let’s choose “Order Date” from the drop down. 


You are now provided with more options in the two other drop down menus.  If you look at “at the level”, you can see that we can choose to calculate at year, month, etc.  We are going to ignore these additional drop-down menus. 

HOW DO WE CREATE SMALL MULTIPLES?

By now, I’m sure you are sick of all the setup…let just get to it already!  I couldn’t agree with you more.  For this example, I will be using Tableau’s sample superstore data.  The intention will be to build small multiple line charts tracking sales over time for each sub-category (sub-category will be used for the small multiples).  I encourage you to follow along by creating it yourself or by downloading the sample workbook.

Step 1 - build the original viz: bring Order Date onto Columns and set it as Month of Order Date.  Right-click and change this date to Discrete.  Bring Sales onto the Rows.  Then bring Sub-Category onto the Detail card.  For now, let’s just assume we are only interested in Furniture and Technology and filter out Office Supplies (we could easily do it with office supplies, but I am trying to remove some clutter for this example).  It should look something like the following:


Step 2 - create the calculations needed for the small multiples: earlier, we discussed how you could draw anything in Tableau using X and Y coordinates and that we will use that same concept for small multiples.  In the above example, we talked about a 3 x 3 grid.  As a starting point for our small multiples, we will need to figure out how we want this laid out (note that later in this blog post, I will reference a great video by Andy Kriebel that talks about how to do this automatically and I will also provide a nice trick to allow you to easily change your layout).  Furniture and Technology each have four sub-categories tied to them for a total of eight sub-categories.  So, we may decide to construct our small multiples as a 4 x 2 grid.  (You can do it however you like, but calculations will change based on your decision.  I will explain this in detail).  We now know we want two rows of four - four panels in the first row and four more panels in a second row.

So let’s start by determining what the X coordinate will be.  You’ll recall that we previously discussed INDEX and Modulo…well guess what…we need to use them both right now!!!  As we discussed, the INDEX of the first row will be 1, then 2, then 3, etc.  Well, couldn’t we just use INDEX for the X coordinate?  It would place the first panel at X coordinate of 1, the second at 2, the third at 3, the fourth and 4, the fifth at 5...wait a minute, we don’t want the fifth to be at five, we want it to start back at 1, but on the second row.  Well, how do we get it to start over?  If you were thinking modulo…well then you guessed it!  Let’s give it a try! 

What number do we need to be the divisor?  Well, it really simple.  You want four panels in a row, then we will use four as the divisor.  That guarantees that you only have 4 options of a resulting remainder (modulo).  Those options are 0, 1, 2, and 3.  For example, take a look at the table and the resulting Modulo figures:


So, the modulo result of ANY number divided by four is 0, 1, 2, 3.  So, we will place our panels at X coordinates corresponding to those modulo figures of 0, 1, 2, and 3.  Make sense?

So, what numbers do we take the Modulo of in order to get to 0, 1, 2, 3.  The resulting numbers we need are listed above: 0, 1, 2, 3, 4, 5, etc.  Well we have INDEX, but those are 1, 2, 3, 4, 5, 6, etc. rather than the desired 0, 1, 2, 3, 4, 5, etc.  What if we just subtract 1 from the INDEX?  I think that will work.  I’ve added INDEX and (INDEX – 1) to my table:


As you can see, (INDEX – 1) and Number from the previous table are the same.  So we can use (INDEX – 1).

As mentioned before, if we wanted to get the modulo of 9 divided by 4, we would write it as 9%4.  So in our case, to get to X coordinates of 0, 1, 2, and 3, we would write our calculation as “(INDEX() – 1)%4”.  So, for our first panel, the INDEX will be 1 and our calculation will look like the following (1-1)%4 or 0%4.  The result as shown above is 0.  For our second panel, our INDEX will be 2 and the calculation will look like the following (2-1)%4 or 1%4.  The result as shown above is 1.  For the heck of it, let’s try our 7thpanel which would be the third to last panel on the second row (remember our multiples are set to be placed in a 4 x 2 pattern).  This means it should have an X coordinate of 2 which is the third out of the list (0, 1, 2, 3).  So (7-1)%4 or 6%4 = 2.  I think we have it!  So go ahead and create a calculated field called “X Coordinate” and the calculation should be: (INDEX() – 1)%4.

Now that we have the X coordinate, we now need to find the Y coordinate.  This one is a bit trickier to think about logically.  Let’s talk through it.  First, we will only have two rows.  So similarly to the X Coordinate, we will try to get those to start with 0 and 1 (2 and 3 are not necessary since there are only two rows).  The first four panels will need to be placed at a Y coordinate of 0 and the next four at a Y coordinate of 1.  So how do we write a calculation to get 0 and 1?  Here’s a small hint: we will again use INDEX, but will not use modulo. 

Okay, so we will have INDEX of 1, 2, 3, 4 that need to be set at 0 (remember the first four panels will be need to be at a Y coordinate of 0) and INDEX of 5, 6, 7, 8 to be set at 1.  Since we want them in groups of four (four on each row), what if we divided the INDEX by four then round down to the nearest integer?  Let’s try it. 


We needed INDEX of 1, 2, 3, 4 to be 0.  1, 2, and 3.  1, 2, and 3 provide the correct result of 0, 4 does not…it results in 1.  That means we would have three panels on the first row, then the fourth would fall to the second row.  We certainly don’t want that.  So why don’t we use the trick we used in the calculation for the X Coordinate – let’s subtract 1 from the Index then do the Calculation. Below are two tables, one showing just INDEX and one showing (INDEX – 1).


Here you can see that INDEX by itself did not work, but (INDEX – 1) worked exactly like we wanted it to.  So now we have the (INDEX()-1)/4 piece, but how do we get Tableau to round down to the nearest integer?  There is no “rounddown” function like you may have used in Excel, but the integer function - INT() - will round down to the nearest integer.  So, we will simply add INT to the beginning of our calculation: INT((INDEX()-1)/4).  So, go ahead and create a calculated field called Y Coordinate with the calculation of INT((INDEX()-1)/4).

Step 3 – implement the small multiples in Tableau: we now have the X and Y coordinates in which to place our small multiples…and now we are dangerous!!!!  So let’s finish this off.  First, right click on X Coordinate and convert it to discrete.  Do the same with Y Coordinate. 

Now, you’ll recall that Index is a Table Calculation, so we need to make sure it is calculating in the way we want it to.   Don’t fret, this is very simple.  You will set it to calculate based on how you want your small multiples laid out.  In Andy’s NHL viz, he laid them out by team.  In my Ryder Cup viz, I laid them out by year.  In this Superstore example, we are laying them out by Sub-Category.  So, let’s make sure it is calculating properly.  Select X Coordinate and edit the calculation.  In the calculated field window, you should see highlighted at the top: “Results are computed along the Table (across)”.  As mentioned, we need this to be computed along “Sub-Category”.  Click on the “Default Table Calculation” link at the bottom right to bring up the Table Calculation window.  In the first drop-down, choose Sub-Category as follows:


Click OK and then your main calculation window should now read at the top: “Results are computed along Sub-Category”.  Follow the same exact steps with the Y Coordinate calculation.

Before moving onto the last step, you need to go to Analysis -> Table Layout -> and make sure Show Empty Rows and Show Empty Columns is checked.  If you do not do this, then certain chart types (like a bar chart) will try to "fill in" the missing data with other data.  


The last step is to move the X Coordinate to the Columns shelf and the Y Coordinate to the Rows shelf and make sure they are first on the shelf.  And that should be it!  Your worksheet should now look like the following (I took the liberty to put sub-category on color to break it up a bit, just to make it easier to see):


Please note, there are times where there are issues with data from certain values showing with other values.  In this case, you'll need to update the X and Y table calculations.  Check out this forums post where I address this exact need.  The same can be applied to your problem.

From there, you can clean it up by removing grid lines (you may want to keep the panes), removing headers, adding labels, etc.  If you have any issues, you can download and my sample workbook from Tableau Public.

Once you’ve learned how to do this and understand the calculations behind it, then it becomes fairly easily to replicate without too many issues.  Your X Coordinate is always (Index – 1) % and the number of columns.  Y Coordinate is always Int ((Index – 1)/ the number of columns).  Then these must be set to compute along the dimension in which you wish to display your multiples (in this case, sub-category). 

As you know, the above technique forces you to choose the layout, i.e. 3 x 3, 4 x 2, etc.  In most cases, people have an idea of what will look good, but you are not always right.  One cool trick that I ALWAYS IMPLEMENT WHEN DOING SMALL MULTIPLES is instead of hard-coding the number of columns (in this example, that number was four), I like create a parameter that you can modify to see how you like the layout of your panels.  I’ve done this with a separate X and Y coordinates calculations (X Coordinate with Parameter and Y Coordinate with Parameter) in my sample workbook.  Basically, my parameter replaces the hard-coded 4 number in all of my calculations.  Try changing the values to watch the small multiple layout change.


Andy Kriebel has also recorded an incredible video that shows how to allow Tableau to automatically choose the layout for you.  It can be viewed here and I certainly recommend watching it. 

Okay, last but not least...if you came here ONLY for the calculations, here they are:

  • Create a "Number of Columns" parameter.
  • X Coordinate: (INDEX()-1)%[Number of Columns]
  • Y Coordinate: INT((INDEX()-1)/[Number of Columns])
Thanks for reading this post and I hope you enjoyed it.  As always, if you have any questions, feel free to contact me at any time. 

Kevin Flerlage, January 14, 2019




4 comments:

  1. Very interesting and helpful post. Just to highlight the mathematical beauty of this process, I would just point out that the Y coordinate could also be defined as the quotient (or its integer part, more precisely) of the division of INDEX()-1 by the number of columns (the number of panels in each row).

    That is, once you have chosen the number of columns, when you divide each INDEX()-1 by that number, the quotient of this division will give you the Y coordinate and its reminder will give you the X coordinate.

    In yet other words, when INDEX()-1 is the dividend and the number of columns is the divisor, the quotient becomes the Y coordinate, and the reminder becomes the X coordinate.

    Also, I am not a Tableau user, but I looked it up and it seems that Tableau has a function to return the quotient, or the integer part of a division ("integer division"): the "DIV" function. So Y could also be calculated with the expression DIV(INDEX()-1, Number_Of_Columns).

    Great post.

    ReplyDelete
  2. Really helpful post. Always great to really understand the why and the logic behind. Thanks for that

    ReplyDelete
  3. I noticed that the dimension is being alphabetized vertically along the columns rather than horizontally along the rows. This is probably not the most elegant solutions but I fixed this by simply swapping the x and y coordinate fields and replaced the number of rows in those calculations with the number of columns. Worked perfectly

    ReplyDelete

Powered by Blogger.