The Power of Place: Unleashing Census Data in Your Tableau Analytics

Kevin and I are incredibly excited to have Sarah Battersby join us today for another guest blog (if she keeps writing for us, we may have to make her an honorary Flerlage Twin!!). Sarah has been a member of Tableau Research since 2014. Her primary area of focus is cartography, with an emphasis on cognition. Her work is focused on helping everyone to visualize and use spatial information more effectively, without the need for an advanced degree in geospatial. Sarah holds a PhD in GIScience from the University of California at Santa Barbara. She is a member of the International Cartographic Association Commission on Map Projections, and is a past President of the Cartography and Geographic Information Society (CaGIS). And, perhaps her greatest skill: she can identify Kevin and Ken correctly 50% of the time! Sarah can be contacted at sbattersby@tableau.com or on Twitter @mapsOverlord.

 

Problem: You have a ton of US data—and it has a spatial aspect to it. It’s easy to map your data in Tableau, but you want to know more detail about the geographic locations where you have lots of data. Maybe you want to know about the total number of people and their race or socioeconomic status? Or the median income or age? Or how about level of education? The US Census tracks all sorts of social, economic, housing, and demographic details about the US population, but it can be hard to collect it all and get it into Tableau. Especially if you want the data at fine detail (e.g., Census Tract) and for large areas (e.g., more than one state).

 

Let’s fix this problem! In this post, I will discuss methods for collecting Census attributes and geographies to use in Tableau. We’re going to get into a bunch of different technologies along the way, but don’t let this stress you out. In most cases, you won’t need all of this. My goal is simply to give you everything you might possibly need in order to collect and map this data.

 

Collecting Attributes

You can always download attributes as individual tables from the census website but, if you want a lot of data, it might be easier to use the Census API and some code to automate the data collection. The Census API provides a programmatic method for accessing a number of their data products. It lets you directly tap into datasets by specifying the product, the geography, and the attributes. When you use Python (or the language of your choice) to automate your calls to the API, you can collect large quantities of data, combine it together, and write it to an output file (e.g., csv file) in one fell swoop. To that end, I’ve written a Jupyter Notebook (basically an interactive file that contains code, images, and narrative text) which contains all the code as well as detailed descriptions of what we are doing at each step. The basic process that the notebook takes is to iterate through calls to the API until we collect everything we need, after which we write the data to a csv file, which we’ll use in Tableau.
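To make the idea concrete, here is a minimal sketch of that loop's building blocks. It is not the code from the notebook, just an illustration under a few assumptions: the function names are made up for this example, the dataset path acs/acs5 and the variable B01003_001E (ACS total population) are only one possible choice, and the API is assumed to return a JSON list of rows whose first row is the header.

```python
import csv
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.census.gov/data"


def build_query_url(year, dataset, variables, geography, within=None, key=None):
    """Build a Census API query URL (hypothetical helper; no network access here)."""
    params = [("get", ",".join(variables)), ("for", geography)]
    if within:
        params.append(("in", within))
    if key:
        params.append(("key", key))
    # Keep ':', '*' and ',' readable instead of percent-encoding them.
    return f"{BASE_URL}/{year}/{dataset}?{urlencode(params, safe=':*,')}"


def fetch(url):
    """Download one response and parse it as JSON (this is the only network call)."""
    with urlopen(url) as response:
        return json.load(response)


def rows_to_records(payload):
    """Turn [header, row, row, ...] into a list of dicts keyed by column name."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


def write_csv(records, handle):
    """Write the records to an open text file, header first."""
    writer = csv.DictWriter(handle, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
```

In practice you would call build_query_url once per state (for example with geography "tract:*" and within "state:01"), pass each URL to fetch, collect the records from every call into one list, and write that list to a file opened with newline="" so the csv module controls the line endings.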

 

 

Mapping State/County in Tableau

If you’ve run through the Jupyter notebook, then you now have all your attributes in a csv file. If those attributes are at the county or state level, you can map those directly in Tableau using the built-in geocoding. We won’t need any fancy spatial files or anything like that.

 

 

If you have states, then all you need is the state name. Counties, on the other hand, can be mapped with a county name/state name combination OR with the 5-digit FIPS code. This code is just a combination of the 2-digit state FIPS code (e.g., 01 is the code for Alabama) and the 3-digit county FIPS code (e.g., 001 in Alabama is Autauga County). You should have all of this information with the data you downloaded. The trick is to make sure the FIPS codes are defined as strings (or else leading zeroes will be truncated), and then to combine them into a single 5-digit FIPS code (e.g., 01001 is the 5-digit code for Autauga County, AL). Once this is in Tableau, make sure the field’s geographic role is set to “County”.
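If the codes have already lost their leading zeroes (for example, after a spreadsheet treated them as numbers), a small helper can rebuild the 5-digit code before the data goes into Tableau. This is just a sketch; county_fips is a name invented for this example.

```python
def county_fips(state_code, county_code):
    """Combine state and county codes into a zero-padded 5-digit FIPS string."""
    state = str(state_code).strip().zfill(2)    # 1 -> "01"
    county = str(county_code).strip().zfill(3)  # 1 -> "001"
    if not (state.isdigit() and county.isdigit()):
        raise ValueError(f"non-numeric FIPS part: {state_code!r}, {county_code!r}")
    if len(state) != 2 or len(county) != 3:
        raise ValueError(f"FIPS part too long: {state_code!r}, {county_code!r}")
    return state + county


print(county_fips(1, 1))        # Autauga County, AL
print(county_fips("01", "001"))  # same county, codes already padded
```

Both calls produce the same string, so the helper is safe to run on data that is already clean.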

 

Different Levels of Detail

If you are mapping at another level of detail (e.g., census block, block group, tract, etc.) you will need to download the spatial boundary files from the Census. Fortunately, they’ve made all of this data available online. If you just need a few files, it’s easy to download them from the Cartographic Boundary Files page. There will often be different files for different years (census tracts, for example, change with each decennial census) so make sure the year you download aligns with the year of your data.

 

Another thing to consider is the level of spatial detail (generalization) of the spatial file. For instance, you can download US Counties as a 1:500k, 1:5m, or 1:20m file.

 

 

The more generalized, or simplified, the geographies, the smaller the file: the 1:20m file is less than 1MB in size, while the less generalized/more detailed file at 1:500k is 11MB. Tableau will have to work harder to process the more detailed files, so choose the one that meets the level of detail you need in your visualization.

 

Using the Census FTP Site

If you need a lot of spatial files—for instance, all Census tracts for 10 different states— you can use the Census FTP site to collect all of the files you need at once, rather than downloading them individually. Here’s how:

 

Connect to the Census FTP server and find the data. The 2019 TIGER data can all be found here. For Census tracts (TIGER 2019 data), I use my Windows file explorer to add a network location of ftp://ftp2.census.gov/geo/tiger/TIGER2019/TRACT/ 

 

 

This takes me to a directory with zip files for the tracts for each state. I can select all of the states that I want and copy all of the files I need to my local drive. You will need to know the state FIPS codes as the end of each file contains that code, rather than the state name (you can find a list of FIPS codes here).
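If you would rather build the list of files than pick them by eye, here is a small sketch. It assumes the zip files in that directory follow the pattern tl_<year>_<state FIPS>_tract.zip, so check the directory listing before relying on it. The function names and the short lookup table are illustrative only and cover just a few states.

```python
# Directory from the step above.
TRACT_DIR = "ftp://ftp2.census.gov/geo/tiger/TIGER2019/TRACT/"

# A few state FIPS codes for illustration; extend from the full FIPS list.
STATE_FIPS = {"Alabama": "01", "Alaska": "02", "Arizona": "04", "California": "06"}


def tract_zip_name(state_fips, year=2019):
    """File name for one state's tracts, under the assumed naming pattern."""
    return f"tl_{year}_{str(state_fips).zfill(2)}_tract.zip"


def tract_zip_locations(state_names, base=TRACT_DIR):
    """Full locations of the tract zip files for the given state names."""
    return [base + tract_zip_name(STATE_FIPS[name]) for name in state_names]


for location in tract_zip_locations(["Alabama", "California"]):
    print(location)
```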

 

 

With the zip files on your local drive, select them all and unzip them. I like 7-Zip since it lets me easily unzip lots of files at once, but you can use whatever software you like, including the one that is built into Windows. Whatever you use, you’ll most likely end up with a separate folder for the geometry for each unzipped file. To make your life easier, it’s good to move all of the files into the same parent folder (that makes combining/unioning them easy in Tableau or GIS).

 

I use Windows, so I will tell you how I do this from a Windows command line. I’m sure you can do the same on a Mac and I’m sure that Mac users will tell me how infinitely better that method is :) All I can say is that if you use a Mac, just use Google to figure out how to move a bunch of files easily.

I unzipped all of my data into a single directory. Then, using the command line, I first change to that directory. Next, I use a simple statement to recursively walk through each folder in the directory and move the files to a new directory located at C:\temp\Census - tiger\tracts_all\.

 

FOR /R ".\tracts" %i IN (*) DO MOVE "%i" "C:\temp\Census - tiger\tracts_all"

 

We now have a nice, single location with every freakin’ part of every freakin’ shapefile for Census tracts in every freakin’ state. That’s 357 files!  

 

 

Note: If you’ve used the Jupyter notebook, then you may already be adept with Python. In that case, feel free to do the above in Python if you wish.
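For anyone taking the Python route, here is one possible sketch using only the standard library. The function name flatten_into is invented for this example, and it also works on a Mac since it does not depend on the Windows command line. It refuses to overwrite a file if two folders contain the same file name.

```python
import shutil
from pathlib import Path


def flatten_into(source_dir, target_dir):
    """Move every file found under source_dir (recursively) into target_dir."""
    source = Path(source_dir).resolve()
    target = Path(target_dir).resolve()
    target.mkdir(parents=True, exist_ok=True)

    moved = []
    # sorted() collects the full listing first, so moving files does not disturb the walk.
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        if target in path.parents:
            continue  # already in the target folder (when it sits inside source)
        destination = target / path.name
        if destination.exists():
            raise FileExistsError(f"{destination} already exists")
        shutil.move(str(path), str(destination))
        moved.append(destination)
    return moved


# Example call (paths match the command-line example above):
# flatten_into(r".\tracts", r"C:\temp\Census - tiger\tracts_all")
```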

 

Combining the Files in Tableau

To combine all of these files, you can add the shapefiles to Tableau and union them together using the spatial union capability, which is new in Tableau 2020.3. Just create a wildcard union, then create a hyper extract. It’s important to note that these files can be quite big when you’re working with small geographies over a large space (for instance, an extract containing all Census tracts for all US states is a 200MB file!).

 

Combining the Files in GIS

If you’re still on a pre-2020.3 version of Tableau, then you can use a GIS, such as the free/open source software QGIS, to union the files together. Just use the Vector Data Management tools to merge vector layers. You can click on the three dots next to ‘Input Layers’ and add all of the *.shp files in the directory where you unzipped your shapefiles.

 

 

Small geometries come in as fairly detailed polygons, and as really big files, so you may wish to simplify them a bit. A great tool for simplifying the geometry is the online software MapShaper. Once you have your geometries collected (and combined together, if needed), you can use MapShaper to quickly and easily simplify them.

 

So, we now have all of our attributes and we’ve combined and simplified all of our geographies. The only thing left to do is join them together in Tableau. Once we’ve done that, we’re ready to map away!!
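Before joining, it can help to confirm that the keys on both sides actually line up. The sketch below assumes your attribute rows carry separate state, county, and tract codes and that the tract shapefiles carry an 11-digit GEOID field (2-digit state + 3-digit county + 6-digit tract); check your own files for the actual field names. The function names are made up for this example.

```python
def tract_geoid(state, county, tract):
    """Build the 11-digit tract identifier from its three zero-padded parts."""
    return str(state).zfill(2) + str(county).zfill(3) + str(tract).zfill(6)


def unmatched_keys(attribute_keys, geometry_keys):
    """Return (keys only in the attributes, keys only in the geometries)."""
    attributes = set(attribute_keys)
    geometries = set(geometry_keys)
    return sorted(attributes - geometries), sorted(geometries - attributes)


# Illustrative identifiers only, not real results.
attribute_ids = [tract_geoid("01", "001", "020100"), tract_geoid(1, 1, 20200)]
geometry_ids = ["01001020100", "01001020300"]
only_attributes, only_geometries = unmatched_keys(attribute_ids, geometry_ids)
print(only_attributes, only_geometries)
```

If either list comes back non-empty, the usual suspects are leading zeroes lost on one side or boundary files from a different year than the attributes.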

 

 

Wrap-Up

Okay, I know that I just threw a lot at you—we talked about Python, Jupyter notebooks, APIs, FTP sites, command line, and GIS tools. Fortunately, it’s likely that you won’t have to use all of this every time you need to collect census attributes. For instance, smaller projects may only require the Python code and Tableau. However, if and when you find yourself in a situation where you need lots of different attributes for lots of different geographies, I hope that this post acts as a helpful guide for the tools and steps you’ll need to take to go from idea to amazing Tableau map! Thanks for reading!

 

Sarah Battersby

March 8, 2021

2 comments:

  1. Hello Ken, can we get spatial files of another countries?

    1. This blog is focused on getting info from the US Census website, so the above is focused on the United States. However, other countries likely have similar sites you can use to obtain spatial data.
