In this post in the QGIS for Tableau users series, we’re going to tackle the classic problem of ‘what is nearby?’ While there are some features built into Tableau to help with this (e.g., field calculations for buffers and distance, as well as the spatial intersection join type), there are some types of proximity analysis that they can’t do. But some nice data prep in advance can help you calculate many of the attributes you might need to unlock some special spatial functionality in your Tableau workbook.

This post covers:

Spatial joins - assign attributes from one spatial dataset to another based on how they overlap in space. Tableau has point and polygon spatial joins built in, but what if you need to know which polygons every polygon overlaps? That isn’t in Tableau before version 2021.3.

Voronoi or Thiessen polygons - make some new polygon geometry to show which area is closest to any given point

Lookup table for adjacent polygons - tell me every polygon that this polygon touches?  I want to select a county and see the sum of sales for all adjacent counties to compare! (I won’t actually use QGIS for this, but will show you a super fast way of calculating this with another neat little tool).

The Real Basics (Your Series Disclaimer Message!)

This post goes into how-to for specific tasks. If you need to take a step back and see where to even start (setting up QGIS, basics of adding and working with files, etc.) please refer back to QGIS for Tableau Users #1: Getting Started

There are also several great QGIS tutorials that will provide a broader (non-Tableau focused) introduction to the power of the software, such as QGIS Tutorials and Tips.

Now, we’ll get back to our regularly scheduled blog post message about awesome stuff you can do with QGIS!

Why Analyze What’s Nearby?

Maps are great for making sense of spatial patterns, and your eyes are the first line of attack in the exploration. But sometimes you need attributes to back up what you’re seeing with your eyes. For instance, in the image below we can see the relationship between Madison County and its neighbors...but we don’t have any attribute that would let us calculate anything about those neighbors. Maybe we want to know the total number of customers in that county compared to the average number in the neighboring counties. To get that type of result, we need something that explicitly links these locations together. So, even though we can see that they are neighbors, Tableau can’t tell us that without help.

When You Don’t Need QGIS:

Most of today’s post is on topics that you really can’t do in Tableau at this point in time, not even with some cheating. The things that you can do (or at least kinda) in Tableau are:

• Use the Tableau distance and buffer functions to find nearby locations (see this post on Proximity analysis in Tableau)

• Assign spatial attributes based on overlap using the Spatial Intersection join type in Tableau (2018.2 to 2021.2 support point to polygon intersections only; 2021.3+ supports combinations of points, lines, and polygons)

• Make a table of distances from every point to every other point (a distance matrix): just join the table to itself in Tableau with 1=1 as the join calculation and then do the distance calculation

Otherwise, QGIS and other awesome spatial tools are your friend!
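If it helps to see what that 1=1 self-join produces, here is a minimal sketch of the same idea in Python. The three sample points and the haversine helper are my own assumptions for illustration (this isn’t how Tableau calculates distance internally); the point is just that every row gets paired with every other row, and then a distance is calculated for each pair.

```python
import math
from itertools import product

# Hypothetical sample points: (name, latitude, longitude) in decimal degrees.
points = [
    ("A", 0.0, 0.0),
    ("B", 0.0, 1.0),
    ("C", 1.0, 0.0),
]

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# product(points, points) pairs every row with every row, like a 1=1 join.
distance_rows = [
    (o_name, d_name, haversine_km(o_lat, o_lon, d_lat, d_lon))
    for (o_name, o_lat, o_lon), (d_name, d_lat, d_lon) in product(points, points)
]

for origin, dest, km in distance_rows:
    print(f"{origin} -> {dest}: {km:.1f} km")
```

Note that the result has n × n rows (including each point paired with itself), so this gets big quickly for large tables.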

What I’ll Be Sharing

In this post, I’ll walk through a few ways to set up your data to get it ready for additional analysis and visualization in Tableau:

• Assign attributes from one dataset to another based on location (Spatial joins)
• Make polygons to find the region closest to any point in your dataset (Voronoi or Thiessen polygons)
• Make a lookup table with lists of all adjacent polygons

Spatial Joins

As of Tableau 2021.3, Tableau supports spatial joins between just about any combination of points, lines, and polygons. If you are working with an earlier version of Tableau and need to do a spatial join between polygons and polygons, or lines and polygons, you will have to do that work outside of Tableau. Since I’m writing away on fun things to do with QGIS, I’ll document how to do that here in case you’re working with an earlier version of Tableau. It’s worth at least a quick read, because I’ll include ways to split your dataset at the intersections. Tableau will just look for the intersection and mark the entire feature as overlapping or not. With QGIS you can do the same, but also break your lines or polygons based on where they overlap (‘cracking’ the lines or polygons into separate features). This comes in handy if you want to be able to calculate area or length of just the overlapping segment, for instance.

Essentially, what we’re looking to do with the spatial join is just do a spatial check of whether or not points, lines, and polygons share space...and if so, assign the attributes based on their overlap.

As an example, say I had a set of rivers of the US from the National Weather Service and wanted to know which counties each river overlapped. I can see it on the map, but I want each river to have an attribute so that I can select rivers based on the counties they cross:

To assign a county name attribute to each river, we have a few options:

Vector > Geoprocessing Tools > Union

Using Union you will combine multiple datasets to create a new dataset with separate features for the overlapping and non-overlapping parts. In the case of our rivers and counties, the rivers will be split at county boundaries returning a separate segment for each county that they overlap. As part of the union, each feature will have the attributes of both files in the union. Here is what it would look like for a highlighted line segment that crosses into two counties. The original table had ONE line feature and no county name attribute. The unioned result has TWO line features and the county name for each county that the line overlaps.

Union is great for anywhere that you want to ensure that you’ll return separate geographies for each area of overlap!
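To make the ‘cracking’ idea a little more concrete, here is a rough sketch of splitting one line at county boundaries. This is not what QGIS runs under the hood: I’m assuming two made-up rectangular “counties” and a single straight river segment so that a simple Liang-Barsky clip is enough, and I only keep the overlapping pieces (the real Union tool also keeps the non-overlapping parts). All names and coordinates are hypothetical.

```python
import math

# Hypothetical rectangular "counties": name -> (xmin, ymin, xmax, ymax).
counties = {
    "County A": (0.0, 0.0, 10.0, 10.0),
    "County B": (10.0, 0.0, 20.0, 10.0),
}

# One hypothetical river as a straight segment crossing both counties.
river = {"name": "Example River", "start": (5.0, 5.0), "end": (15.0, 5.0)}


def clip_segment(p0, p1, box):
    """Liang-Barsky clip: the part of segment p0-p1 inside box, or None."""
    (x0, y0), (x1, y1) = p0, p1
    xmin, ymin, xmax, ymax = box
    dx, dy = x1 - x0, y1 - y0
    t_enter, t_exit = 0.0, 1.0
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0), (-dy, y0 - ymin), (dy, ymax - y0)):
        if p == 0:
            if q < 0:
                return None  # parallel to this edge and outside the box
            continue
        t = q / p
        if p < 0:
            t_enter = max(t_enter, t)  # segment is entering the box
        else:
            t_exit = min(t_exit, t)  # segment is leaving the box
    if t_enter >= t_exit:
        return None  # no overlap with any length
    return (
        (x0 + t_enter * dx, y0 + t_enter * dy),
        (x0 + t_exit * dx, y0 + t_exit * dy),
    )


# One output feature per county the river overlaps, carrying both sets of attributes.
pieces = []
for county_name, box in counties.items():
    clipped = clip_segment(river["start"], river["end"], box)
    if clipped is not None:
        pieces.append({
            "river": river["name"],
            "county": county_name,
            "length": math.dist(*clipped),
        })

for piece in pieces:
    print(piece)
```

One river feature goes in and two features come out, each with a county name and the length of just the segment inside that county.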

Vector > Data Management Tools > Join attributes by location

Using Join attributes by location, you can choose (1) the specific type of geographic relationship between the two layers (e.g., should rivers fall completely within the counties? or just overlap? or intersect?) and (2) how you want the result to be returned: do you make a new feature for every match (e.g., a river that crosses multiple counties would return multiple records), or just take the county that overlaps the most, or just the first county that QGIS encounters when it’s searching for overlaps?
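The difference between those three result options is easier to see with a tiny example. This is just a sketch on a made-up list of overlaps that I’m assuming have already been measured (the river names, county names, and lengths are all hypothetical); QGIS does the geometry work for you, and this only shows how the three choices shape the output table.

```python
from collections import defaultdict

# Hypothetical overlaps already measured: (river, county, overlap length in km),
# listed in the order the counties were encountered.
overlaps = [
    ("Blue River", "Adams", 12.0),
    ("Blue River", "Baker", 30.5),
    ("Clear Creek", "Baker", 8.0),
]

# Option 1: one-to-many, a separate record for every match.
one_to_many = [(river, county) for river, county, _ in overlaps]

# Group the matches per river for the two one-to-one options.
by_river = defaultdict(list)
for river, county, km in overlaps:
    by_river[river].append((county, km))

# Option 2: one-to-one, keep only the first county encountered.
first_match = {river: matches[0][0] for river, matches in by_river.items()}

# Option 3: one-to-one, keep only the county with the largest overlap.
largest_overlap = {
    river: max(matches, key=lambda match: match[1])[0]
    for river, matches in by_river.items()
}

print(one_to_many)
print(first_match)
print(largest_overlap)
```

For Blue River, the one-to-many result has two records, the first-match result keeps Adams, and the largest-overlap result keeps Baker. Which one you want depends on the question you’re asking in Tableau.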

Voronoi or Thiessen Polygons

What if you have a bunch of points and need polygons representing the entire region that is closest to any point?  That’s an easy one—you need Voronoi polygons (or you can call them Thiessen polygons...same thing)!

I’ll demonstrate how to make these polygons with the Boston public schools (x/y in projected coordinates SRID:2249) dataset that I’ve used earlier in this series of posts.

Just open your point dataset in QGIS and click on Vector > Geometry Tools > Voronoi Polygons...

You’ll need to pick your buffer region (e.g., how big of a bounding box around the extent of the data do you want...or should the polygons just go up to the edges of the bounding box around all points in the dataset), but otherwise you can just click run and see what happens!

If the result looks like what you want, just right click and export to use in Tableau. Each of the polygons will have all of the attributes of the point that it was based on. Now you can use these polygons for any further analysis! Since these are the polygons around schools, maybe you’d do a spatial join in Tableau to find all of the students in the district in each of the polygons around each school to check if they are assigned to the school that is geographically closest.
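If you’re wondering what the polygons are really encoding, it’s just “which point is closest?” Here is a small sketch of that same check done by brute force, without any polygons at all. The school and student coordinates and the assignments are made up for illustration, and I’m assuming everything is in the same projected x/y units so that straight-line distance makes sense.

```python
import math

# Hypothetical schools and students with projected x/y coordinates (same units).
schools = {
    "North School": (0.0, 10.0),
    "South School": (0.0, -10.0),
    "East School": (12.0, 0.0),
}
students = {
    "student_1": (1.0, 7.0),
    "student_2": (-2.0, -3.0),
    "student_3": (9.0, 1.0),
}

# Hypothetical current assignments to compare against.
assigned = {
    "student_1": "North School",
    "student_2": "North School",
    "student_3": "East School",
}


def nearest_school(location):
    """Name of the school with the smallest straight-line distance to location."""
    return min(schools, key=lambda name: math.dist(location, schools[name]))


# The closest school is the one whose Voronoi polygon would contain the student.
closest = {student: nearest_school(xy) for student, xy in students.items()}

# Students whose assigned school is not the geographically closest one.
mismatched = [student for student in students if assigned[student] != closest[student]]

print(closest)
print(mismatched)
```

The brute-force version is fine for a handful of points; the Voronoi polygons are nice because you calculate them once and then a spatial join in Tableau does the rest.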

Lookup Table for Adjacent Polygons

Periodically on the Tableau forums, we get questions like this one about needing to find out all of the polygons that touch or are adjacent to a selected location. That’s a fun spatial question! If you’re working with Tableau 2021.3 or later, you can actually do this with the spatial intersection feature: just join the polygon dataset to itself and you’ll be able to identify which polygons share an edge (the intersection will catch polygons that overlap, but also polygons that share edges/have borders that overlap somewhere).

If you’re working with earlier Tableau versions, though, this isn’t something you can do easily with just the Tableau tools. Luckily, there are easy ways to make tables of adjacency.

When I say “table of adjacency,” I’m talking about a table with one row for every pair of polygons that touch. If a polygon has three neighbors, there are three rows in the table with that polygon’s ID and the ID of each of the three polygons it touches. So, in this example from Oklahoma, there are duplicates of each origin county (ORIGIN_CTY) to provide a list of each neighboring county (NEIGHBOR_CTY). In the table here, I have two types of location identifier: county/state name and the FIPS code (a unique identifier that you can use to map in Tableau without using the county name field).

While this series has been intended as a set of QGIS tutorials, QGIS is not my favorite tool for this particular type of analysis. So, we’re going to jump to a new and exciting option for a really quick calculation of adjacency for any set of polygons! As with all fun things geospatial, it’s all about finding the right tool to get the job done (and to do it in a reasonably easy and accurate way).

Note: If you happen to just be working with US counties, the Census has a nice table of county adjacency that you can just download—you may need to do a little cleaning to get it to fit your exact need, but the adjacency is already calculated and ready to go! So, maybe you don’t even need to use any special geo tools if that is your use case!

I realize that Python isn’t everyone’s preferred option, but this is a nice quick way to get some complex and useful results. For this problem, my tools of choice are the PySAL and GeoPandas Python libraries. PySAL does the heavy lifting and GeoPandas makes it easier for me to manipulate the result and dump it out into a table.

Here is the quick script example that demonstrates how to make the matrix. With this little script we generate the matrix and then export it as a table for use in Tableau.

from libpysal.weights import Queen
import geopandas as gpd

# where is the spatial file
shp_path = "tl_2019_53_tract.shp"

# read it into a geopandas GeoDataFrame
gdf = gpd.read_file(shp_path)

# use a named ID variable (GEOID is the unique ID in my test file)
w_queen_id = Queen.from_dataframe(gdf, idVariable='GEOID')

# run through some results and make a table!
# maybe cleaner ways to do this, but this was fast for me
rows = []
for key in w_queen_id.neighbors:
    # one row per (origin, neighbor) pair
    for neigh_id in w_queen_id.neighbors[key]:
        rows.append([key, neigh_id])

# write the results to a csv
df = gpd.GeoDataFrame(rows, columns=["originID", "neighID"])
df.to_csv(r"neighbor_list.csv")

Using the Table in Tableau

It’s just a quick relationship between the tables in Tableau! All I have to do is take my spatial file and connect it to the list of neighbors generated with PySAL. The link between them is the ID that I used in generating the neighbor list (GEOID) and the originID field in my neighbor_list table.

After that, it’s just setting up the interaction in Tableau. I have a workbook on Tableau Public that you can download to check it out!
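If you want to sanity check the neighbor table before building the interaction, here is a minimal sketch of the kind of calculation it unlocks: pick one polygon and compare its value to the total and average of its neighbors. The IDs and customer counts are made up for illustration; the only thing I’m assuming from the steps above is the (originID, neighID) shape of the neighbor list.

```python
from statistics import mean

# Hypothetical rows in the shape of neighbor_list.csv: (originID, neighID).
neighbor_rows = [
    ("001", "002"), ("001", "003"),
    ("002", "001"), ("002", "003"),
    ("003", "001"), ("003", "002"), ("003", "004"),
    ("004", "003"),
]

# Hypothetical measure per polygon ID (e.g., number of customers).
customers = {"001": 120, "002": 80, "003": 100, "004": 40}


def neighbor_summary(selected_id):
    """Compare the selected polygon's value with its neighbors' total and average."""
    neighbor_ids = [neigh for origin, neigh in neighbor_rows if origin == selected_id]
    values = [customers[neigh] for neigh in neighbor_ids]
    return {
        "selected": customers[selected_id],
        "neighbor_ids": neighbor_ids,
        "neighbor_total": sum(values),
        "neighbor_average": mean(values),
    }


summary = neighbor_summary("003")
print(summary)
```

This is the same lookup that the Tableau relationship does for you: filter the neighbor list to the selected origin, then aggregate whatever measure you care about across the matching neighbor IDs.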

Wrap-Up

And those are some basic (or maybe kinda advanced) techniques that I use to explore spatial proximity/relationships between my data to help with my analytics in Tableau.

For now, this is the last in my little series on using QGIS to enhance the spatial analytics that you can do in your Tableau workbook. I’m happy to talk more and think about other fun ways to manipulate spatial data! Feel free to reach out on the Tableau Community Forums or to follow more of the random Tableau spatial thoughts that I share on Twitter (@mapsOverlord)...or to share the great maps that you’re making in Tableau!

Sarah Battersby

November 29, 2021