Ads Top

Auto-Refreshing File-Based Data Sources in Tableau Server

 

 
Before jumping into this blog, I wanted to give a shout out to Michael Perillo for his help on this blog. Michael is a Tableau Ambassador, co-leader of the Phoenix Tableau User Group, and all-round great guy. My background is mostly in Windows so I reached out to Michael for some help ensuring that I didn’t gloss over the Linux world. While this blog is still has a Windows-bent, with Michael’s help, we’ve provided some resources to help with Linux environments as well. Michael also gave some great feedback that definitely helped to increase the quality of the blog. Thanks Michael!

From my time answering questions on the Tableau Community Forums, one question I have seen time and again is how to set up file-based data sources (Excel, csv, etc.) to automatically refresh on Tableau Server. This is particularly difficult for a number of reasons. So, in this blog, I’m going to discuss the basic steps required to auto-refresh your file-based data sources and, along the way, I’ll try to explain why all this is necessary.

File-Based Sources
Before we start, let’s define file-based data sources. These differ from database data sources in that they are physical files that reside on your computer (or some shared network-based drive). Since these are physical files, they can be emailed, moved, copied, etc. just like any other file on your computer. To connect to them in Tableau, you simply choose from the list of file connectors, then select your file.


Note: Selecting “More” will provide you with some additional options:


Databases are different as you connect to them using some sort of virtual “connection string” and, once connected, you can use any of the tables, views, etc. that are contained in the database. While databases are ultimately made up of physical files, you neither know nor care where those files are located.

In my personal opinion, it’s always best to get your data from a real database whenever possible, as those databases tend to be the “system of record” for the data. Files that have been exported from the database are out-of-date almost immediately. However, sometimes you have no choice but to work with a file-based data source. Unfortunately, as noted above, there are a number of steps required to ensure Tableau Server can automatically update these data sources. So let’s talk about those now.

The 3 Keys of Auto-Refresh
There are three main things you must do when attempting to set up Tableau Server to auto-refresh file-based data sources:

1) Store the files on a network-based file share.
2) Ensure Tableau Server has access/permissions to the files.
3) Leverage the Universal Naming Convention (UNC) path of the files when connecting via Tableau.

Let’s discuss each of these separately.

Network File Share
Note: This section is focused on Windows-based file servers. If your company uses Linux-based file, these steps will be a bit different.

Most companies set up a file server which allows you to store files. One big advantage of using these file servers is that they are typically backed up frequently so that, if a file is lost or damaged, you can easily restore it. While it has also grown common for personal computers to be backed up, file server backups are generally much more resilient and reliable. Another advantage is that, with the correct permissions, file servers allow you to share files with others. And that’s the critical reason why we need to leverage them when auto-refreshing Tableau Server. Your files need to reside in a location that is both accessible to you and Tableau Server. If a file resides on your local computer then, generally, Tableau Server won’t be able to communicate with it.

Note: Strictly speaking, you can set up your computer so that Tableau Server can see it by sharing that specific file or the directory in which it resides (a shared directory is called a “file share” and it’s pretty much the same concept as is used on file servers). The problem with this is that your computer isn’t a server so it won’t be as fast, reliable, or resilient as a file server. And, in the case of laptops, we shut down our computers and take them with us at the end of the day. Tableau Server certainly cannot communicate with your computer if it’s powered down. So, while this is technically possible, I would suggest that you avoid this option at all costs as there are simply better ways to do this.

If you are going to be refreshing multiple files, then I generally recommend setting up a single location for each department or client you’re working with. You don’t want to set up one folder for everyone as those people will be able to access all of those files. But you also don’t want to set up a separate folder for each individual file, as that will require a lot of maintenance. So, work to find the right balance between manageability and security.

Note: Before moving on, I should note that there are some other alternatives to storing your files on a file share, including various cloud-based services for which Tableau has pre-built connectors.

Tableau Server Access
Note: For this section, you’ll need to know whether your Tableau Server runs on Windows or Linux, as the steps are slightly different. We’ll start with Windows-based Tableau Servers.

Now that your file resides on the file server, we need to make sure Tableau Server can access it. That’s where the Run As service account comes into play. This is basically a network user that runs the Tableau Server service. Since it runs the service, Tableau Server has the same permissions as the Run As account. So, if this account can access a file, then so can Tableau Server. Thus, we just need to ensure that the Run As account has access to the file share on which you’ve placed your file. Generally speaking, you’re going to need some help with this. Your Tableau Server administrator will need to provide the Run As username and your file server administrator will need to grant the access. Read-only access should be sufficient.

I don’t have much experience with Linux-based Tableau Servers, so that’s where I called in the help of Michael Perillo. The setup is a bit different and pretty technical, so you’re going to need to get your Tableau Server administrator on board. Be sure to clearly explain what you are trying to do. If you’re lucky, you administrator will already know the solution. If not, then the following document on provides detailed, step-by-step instructions on how to configure Tableau Server on Linux to connect to shared directories.

Use the UNC Path
Note: The following is Windows specific, but the same basic concepts should also apply to Macs.

Your computer will generally have a number of drive letters. For instance, C: is typically your hard drive and D: is usually a DVD-ROM drive. But many of your drive letters are “mapped network drives” which essentially point to a share on a network file server.

So, let’s assume that you’ve placed your files in the following folder:

N:\Departments\Sales\Tableau Server\

You open Tableau and connect to your files using this path. You build your data model then publish the data source and set up an automatic refresh. But, the first time it attempts to refresh, it fails. You’ve placed your files on a file share and you’ve ensured that Tableau Server has access to the share—so why did it fail?

The problem is that this N drive is just a sort of a pointer to another location on the network. It’s a shortcut that makes it easier for you to access your files. Generally, your co-workers will have the same set of mapped drives, so they’ll be able to access files using this path. But servers are different for a couple of reasons. First, mapped drives are user-specific, so to map those drives on the Tableau Server, someone would need to log on as the Run As account, then map those drives. Second, mapping drives simply isn’t something that is typically done on a server. That being the case, when Tableau Server tries to connect to N:\Departments\Sales\Tableau Server\, it will fail because, from its perspective, no such drive exists.

To correct this problem, we need to make sure to use the full Universal Naming Convention (UNC). A UNC path will generally start with \\ followed by the server name, then the folders. 

There are a couple of methods for translating a drive letter to the UNC. One option is to find the drive letter (in Windows, use Windows Explorer). It should look something like this:


Notice the N drive at the bottom. The first part of this, “Share”, is the name of the file share on the server. The second part, which falls within the parentheses, is the server name. To turn this into a UNC path, we start with \\, then add the server name, then the share, followed by the rest of our file path. So…

N:\Departments\Sales\Tableau Server\

becomes

\\DESKTOP-C3FKLQO\Share\Departments\Sales\Tableau Server

Another option is to open a command prompt (Windows - hold the Windows Key and press R,  then type cmd and click OK). In the command prompt, simply type net use. This should provide an output that looks something like this:

Local     Remote                  
----------------------------------------------

N:        \\DESKTOP-C3FKLQO\Share

From here, follow the same steps I documented above to replace the drive letter on your path.

To make sure you have the right path, I generally recommend that you paste this final path into Windows Explorer and make sure you see your files.

Now, in Tableau Desktop, connect to your files using the UNC path instead of the drive letter. By doing this, you’ll be ensuring that Tableau Server is able to communicate with the files when it attempts the refresh. The rest of the process is pretty much the same as any other published data source. You’ll just specify a refresh interval and Tableau Server will do the rest. And, if you’ve done everything correctly, it should refresh without any errors!

IMPORTANTWhen publishing, Tableau will give you the option to "Include External Files". This option will cause Tableau to push a copy of the file up to temporary folder on Tableau Server, which it will then use in the future. Since we're setting up Tableau Server to automatically refresh the data source, we do not want to do this. Leave this option unchecked.

Windows vs Linux
As I said earlier, that this tutorial is definitely more focused on Windows environments. If you work in a pure Linux environment or a hybrid environment (Tableau Server and File Servers are a mix of Windows and Linux), then you may require a slightly different solution.

Accessing files from Windows and Linux is akin to speaking two different but similar languages. In some cases, the words and references are the same; in other cases, they are completely different. Being bi-lingual can be a great advantage, but if you only speak one language (Windows or Linux), seek help from those who are fluent. In such cases, I highly recommend communicating with your Tableau Server and File Server administrators. If you clearly explain what you need to do, then they should be able to help you find a viable solution. And, when you do find a solution, be sure to clearly document it for the next time you need to do something similar.

In addition to involving your system admins, there are many other great resources to help you succeed, including the Tableau Help docs, Community Forums and Knowledgebase. You can also learn a lot about other types of environments by networking with other Tableau users. A great way to do this is to join your local Tableau User Group. Attending these meeting can put you in touch with local experts who work in a variety of industries and environments, which can really help you to up your Tableau game.


I hope this blog has been valuable. If you have any questions, feel free to leave them in the comments section. Thanks again to Michael Perillo for his help.
Ken Flerlage, June 22, 2020


No comments:

Powered by Blogger.