Kirk Munroe: Tableau Next Part 2: Data Cloud & Tableau Semantics

 

Kevin and I are pleased to welcome back a regular contributor, Kirk Munroe. Kirk lives in Halifax, Nova Scotia, Canada and is a business analytics and performance management expert. He is currently one of the owners and principal consultants at Paint with Data, a visual analytics consulting firm, along with his business partner and wife, Candice Munroe, a Tableau User Group Ambassador and former board member for Viz for Social Good.

 

Kirk is also the author of Data Modeling with Tableau, an extensive guide, complete with step-by-step explanations of essential concepts, practical examples, and hands-on exercises. The book details the role that Tableau Prep Builder and Tableau Desktop each play in data modeling. It also explores the components of Tableau Server and Cloud that make data modeling more robust, secure, and performant.

 

This is the second in a three-part series about Tableau Next, in which Kirk is going to introduce us to Tableau Next and try to remove some of the confusion that exists in the Tableau community.

 

In Part 1, we set the stage. Now we’re going to dive into one of the biggest shifts in Tableau Next: the native data lakehouse (Salesforce Data Cloud) that sits underneath it (“Data Layer” in the image below), along with Tableau Semantics, the business layer that ties everything together (“Semantic Layer” in the image).

 

https://www.tableau.com/products/tableau-next

 

What is a Data Lakehouse?

First things first — what’s a “data lakehouse”? To answer that, let’s take a trip back in time…

 

Data warehouses date back to the 1980s, with Bill Inmon widely credited as their father. They were popularized in the 1990s and remained the gold standard for storing data for BI and analytics for decades. They lived in relational databases, spoke SQL fluently, and were tailor-made for tools like Tableau. But they were rigid (changing the structure was slow and painful), and they almost always lived on-premises. That rigidity, combined with the fact that they weren’t cloud-first, eventually opened the door to something new.

 

Enter the data lake, which started gaining traction in the early-to-mid 2010s. Data lakes flipped the script: cloud-first, quick to adapt, and able to store both structured and unstructured data (something SQL-based warehouses aren’t great at). The trade-off? They were slow — painfully slow for interactive analysis. If you’ve ever tried using Tableau against Hive or a similar layer on top of Hadoop, you probably remember the frustration.

 

The data lakehouse came along in the early 2020s to bring the best of both worlds: the flexibility of a data lake with the performance of a data warehouse. Databricks popularized the term “data lakehouse”, and now most major cloud platforms offer their own take on it.

 

So where does Tableau Next fit in?

 

Here’s the key difference: Tableau Next always comes with Data Cloud under the hood. While Data Cloud isn’t a pure lakehouse in the Databricks sense, for a Tableau analyst or developer it might actually be better. It comes preloaded with Salesforce data, and it’s easy to bring in other sources.

 

Let’s compare this to classic Tableau (Desktop, Server, Cloud). What’s great about Tableau is that it can connect to almost anything. However, if the data source isn’t structured well or doesn’t have what the business needs, you’re stuck. Sure, you can extract into Hyper for performance, and Hyper is a full-fledged database, but (the Hyper API aside) Tableau has never let us really work with Hyper directly. It’s more of a “black box” storage layer.

 

With Tableau Next, Data Cloud isn’t just a back-end — it’s a core part of the platform, built to make data usable from the start.

 

Data Cloud

Now that we’ve set the stage, let’s take a closer look at the components of Data Cloud. Data Cloud packs a ton of capability — predictive and generative models, calculated insights (think complex calculated fields), customer segments, segment activations, identity resolution, and more. We might get into those in a future post, but for this series we’re going to zero in on the three capabilities that matter most for understanding Tableau Next: Data Streams, Data Lake Objects, and Data Model Objects.
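As a quick taste of one of those before we move on: a calculated insight is defined in SQL over your Data Cloud objects, which is why “complex calculated field” is a fair analogy. Here is a minimal hypothetical sketch; every object and field name below is made up, and the details are simplified:

    -- Hypothetical calculated insight: lifetime spend per customer.
    -- Calculated insights are written in SQL and aggregate a measure by a dimension;
    -- the names below are illustrative, not real Data Cloud objects.
    SELECT
        Customer__dlm.Id__c AS customer_id__c,
        SUM(Sales__dlm.Amount__c) AS lifetime_spend__c
    FROM Sales__dlm
    JOIN Customer__dlm
        ON Sales__dlm.CustomerId__c = Customer__dlm.Id__c
    GROUP BY Customer__dlm.Id__c;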

 

Data Streams

If you’re a Tableau user, think of a Data Stream as something like a data connector. It creates a connection to your source system and can either ingest the data into Data Cloud (extract) or leave the data where it is and query it in place using zero copy (live connection). The big differences from Tableau are:

 

1) There are connectors for a lot more sources.

 

2) You can transform your data in the stream itself — before it’s even ingested (though you can still transform after ingestion too).

 

Data Lake Objects (DLOs)

A DLO is a container for data — either data stored in Data Cloud or a pointer to an external table via zero copy. It’s easiest to think of a DLO as a table, but keep in mind it doesn’t have to come from a relational database. A DLO can be built from a flat file, or even unstructured data like a PDF, audio file, or video.

 

Data Model Objects (DMOs)

DMOs don’t store data. They’re a virtual layer built on top of one or more DLOs, where you map fields and create a business-friendly view of your model. If DLOs are your raw tables, DMOs are your logical layer — the abstraction that makes raw data easier to work with and understand.
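To make the DLO/DMO distinction concrete, here is a minimal sketch of what querying each layer might look like in SQL. Data Cloud distinguishes the two with object-name suffixes (__dll for DLOs, __dlm for DMOs), but every table and field name below is hypothetical:

    -- DLO: a source-shaped landing table (or zero-copy pointer); names are hypothetical
    SELECT bk_id__c, bk_ttl__c
    FROM Bookstore_Book__dll;

    -- DMO: the harmonized, business-friendly view mapped on top of one or more DLOs
    SELECT BookId__c, Title__c
    FROM Book__dlm;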

 

If you’re a Salesforce Sales, Service, or Marketing Cloud customer, here’s a nice bonus: Data Cloud comes with DLOs (your Salesforce objects/tables) already mapped to a C360 DMO for you. That’s your Customer 360 model out of the box, which can save you hundreds of hours of modeling work.

 

Tableau Semantics

Next up is the semantic layer, Tableau Semantics. My observation is that this is the part of Tableau Next causing the most anxiety in the #datafam. Here are two key points to consider:

 

1) If you are already using relationships and published data sources in Tableau Server/Cloud, you have nothing to worry about: you are already creating semantic models in the Tableau ecosystem, and Tableau Semantics is going to feel really natural to you.

 

2) If you embed your data sources in Tableau Desktop/Server/Cloud and they are based on joins or a single, denormalized table, it would be a really good idea to learn how to build relationship-based models. They are much better for data governance and trust. You need them to use Tableau Pulse. They also perform better and can answer the “negative” questions that are so important to the business (e.g., which products haven’t sold in the last month), as the sketch below shows. For more on relationships, see Tableau's New Data Model & Relationships and Relationships, Joins, Blends & When to Use Them.
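To see why joins struggle with those negative questions, it helps to look at the SQL involved. A data source built on an inner join drops products with no matching sales before Tableau ever sees them, while a relationship can generate an outer join on demand. A minimal sketch with hypothetical products and sales tables (date syntax varies by database):

    -- "Which products haven't sold in the last month?" needs an anti-join:
    SELECT p.product_name
    FROM products p
    LEFT JOIN sales s
        ON s.product_id = p.product_id
        AND s.sale_date >= DATEADD(month, -1, CURRENT_DATE)
    WHERE s.product_id IS NULL;
    -- An inner join would have discarded exactly the rows we want to find.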

 

Since the Tableau Next development team could build the platform from scratch, they had the advantage of making Tableau Semantics the basis for all things analytics. What do I mean? If we know that agents, metrics, and dashboards are all coming from the same source (with business logic and terms embedded), we can have a much higher level of trust in the answers we are getting. It is also much more efficient when people don’t have to connect to the same data source over and over again, embedded differently in each workbook/dashboard.

 

The Tableau Semantics UX

There are a number of places to start a semantic model in Tableau Next. The method I’m going to show is creating one from scratch inside a Workspace. Think of a Workspace as a Project on Tableau Cloud/Server—a place to group related objects. In Tableau Next, there are currently four asset types you can add to a blank Workspace:

 

1) New Dashboard

2) New Visualization

3) New Semantic Model

4) New Data

 

We will look at visualizations and dashboards in part 3 of this series. For now, think of them as analogous to Sheets and Dashboards in classic Tableau. The interesting thing is there isn’t a concept of a workbook. This might be a good thing! Given that data sources are all the equivalent of published data sources, what is left in a workbook except sheets (visualizations) and dashboards? I like the “removal” of an unnecessary object. (Currently, all authoring is done in the browser; there isn’t a Tableau Desktop version of Tableau Next. If that ever changes, this could all change too, of course.) In time, I would also expect the equivalent of Tableau Prep “flows” to live in a workspace as well.

New Data: selecting this option allows you to upload files directly into Tableau Next. What actually happens is that a data stream is automatically created for you and your file is dumped into a DLO. Nice touch for “one-off” projects and demos! In production, you would want to create your own data stream so you could schedule refreshes and transform data, if needed. As of this writing, I believe you can only upload CSV files (and maybe other flat files) this way; you definitely can’t upload Excel files.

 

Let’s look at the Semantic Model option. I’ve already uploaded the Bookstore data files that Tableau Tim and I used in our data modeling masterclass. For this post, I’m not going to build that entire model; I will just build enough to show how it works.

 

Once we select New Semantic Model, we get a dialog to pick our data tables from existing DMOs, DLOs, or Calculated Insights:

 

 

Using the bookstore data source, let me bring in Sales Q1 to start:

 


I can only bring in a single DLO/DMO/Calculated Insight to start (same as Tableau), but after that I can bring in multiple at once to add to the model:

 

I select the Book, Edition, Info, and Publisher DLOs and get the following:

 

 

You should notice two things: 1) there is an AI agent that suggests the fields on which to create relationships, and 2) I can add multiple tables (DLOs in this case) to the canvas before I create relationships between them. Neither can be done in classic Tableau.

As of this writing, I haven’t had much luck with these suggestions, as you can see when I let Tableau Next take a pass:

 

 

There are two things to consider, of course: 1) it might do a good job on native Salesforce objects where it is likely to have had more thorough training (I haven’t had a chance to test yet, but will!) and 2) like all things AI, this could get better in short order!

It is very straightforward to correct these suggestions or create the relationships manually, and the process should feel pretty familiar to Tableau users:

 

For those who have used this data or watched the video with Tim and me, you will know that the Info table has the Book ID broken into two fields. The interface to handle that is also very similar to classic Tableau:

 

 

Once our model is built, we still have a number of options. In the top right-hand corner of the model, we can jump right into building a viz (again, we will do this in part 3) or we can test our model. I think this feature is going to be a big hit with people who create models. The UX feels similar to any number of query analyzer tools. This is very different from, say, creating a published data source in the web client on Tableau Cloud/Server. In that case, you have a Scratchpad where you can test your model by dragging and dropping fields, which is very comfortable for a Tableau user but super foreign to a “database person”. When I work with people who are new to Tableau and we start to build a model, they immediately want to jump back to their database tools to test the queries. This UX should feel a lot more familiar to them.

 

 

The UI even lets you copy the SQL query it generates! How many times have you wanted or been asked for that?
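For a sense of what that looks like, a simple sales-by-title test against this model would produce something roughly shaped like the query below. The table and column names are my assumptions based on the Bookstore dataset, not Tableau Next’s actual output:

    -- Plausible shape of a generated sales-by-title query; names are assumed
    SELECT b.Title, SUM(s.Sales) AS Total_Sales
    FROM Sales_Q1 s
    JOIN Edition e ON e.ISBN = s.ISBN
    JOIN Book b ON b.BookID = e.BookID
    GROUP BY b.Title;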


We can also add other components to our model:

 

 

We will look at Metrics in detail in part 3 of this series. For now, think of them as “Tableau Pulse for Tableau Next.” What I really like from a UX perspective is that you can create them directly from the semantic model. If you haven’t used Tableau Pulse, it requires a published data source, but you can’t create a metric from within the published data source UI, which leads to a somewhat disjointed experience.

 

Calculated Fields and Logical Views
I also really like that calculated fields are created and shared from within the model. In classic Tableau, I find it very typical that people who create published data sources still create their calculated fields in the workbook and not in the data source. I’m often guilty myself (I blame the UX!). It makes so much more sense from a governance perspective to “share” them from within the data source… the semantic model.

The calculated field UX is comprehensive and intuitive. And my experience with AI-generated calculations has been really good, if you know how to phrase the request properly. Take the following:

 

 

It wrote:

 

 

This brings me to a point that I’ve been making for the last couple of years: in my opinion, being great at the semantics of a calculation language is going to matter much less for a data analyst. Knowing how the business works (the vocabulary, the goals, the applications that generate the data, and the questions people really want answers to) is going to be gold!

Check out the question asked in a slightly different way:

If I ask, “when did a book first sell”, I get:

 

 

In my opinion, the AI nailed it both times. Knowing how to ask the question matters.
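Whether the AI writes it or you do, “when did a book first sell” boils down to the earliest sale date per book. In SQL terms the logic is roughly the following, again with assumed Bookstore-style names:

    -- First sale date per book; table and column names are assumptions
    SELECT e.BookID, MIN(s.SaleDate) AS First_Sale_Date
    FROM Sales_Q1 s
    JOIN Edition e ON e.ISBN = s.ISBN
    GROUP BY e.BookID;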

The rest of the calculated field dialog is also nice because it brings together in a single dialog a lot of things that require multiple dialogs in classic Tableau.




 

It is important to put in really good descriptions because this is what the Concierge agent will use (again, part 3—I really want you to read that post too!).

 

To this point, it probably sounds like I like (love?) Tableau Semantics a lot more than published data sources in Tableau Cloud/Server. In a lot of ways that is true. The Tableau Next team had the large advantage of taking 20 years of learning from the existing Tableau team/product and being able to start from scratch! One area I don’t love, however (though I expect it to get better with time), is how joins and unions work.

To create a join or union, you create a Logical View. A logical view takes your data and either unions or joins DLOs together in what I assume is a temporary view. I currently find the joins buggy, and the unions don’t auto-match field names, so it takes a lot of manual work to get a union working.
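To illustrate the union pain point: with no auto-matching, every column has to be lined up by hand, even when the naming differences are trivial. A sketch with hypothetical quarterly tables:

    -- Without auto-matching of field names, each column mapping is manual,
    -- even for trivial differences (hypothetical quarterly tables)
    SELECT ISBN, SaleDate, Qty
    FROM Sales_Q1
    UNION ALL
    SELECT ISBN, Sale_Dt AS SaleDate, Quantity AS Qty
    FROM Sales_Q2;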

 

To be fair, I think the better option for a production application would be to transform your data into a single DLO in Data Cloud and bring that into the semantic model. Sometimes, it is nice to have a fast and easy option though.

AI First
Finally, to close out this post, let’s look at how Tableau Semantics is “AI first”. I expect your experience with LLMs (large language models such as ChatGPT) has been an alternating combination of “wow, that is incredible” and “huh, that doesn’t make sense.” The quality of the response is a combination of the model behind the LLM and your ability to ask unambiguous questions. Salesforce has done a really nice job of helping on both fronts in the semantic model.

First, there is a section for “business preferences”. You can think of this as a place to help with the ambiguity of questions: somewhere to map all the vernacular and acronyms in your company so the AI can disambiguate the inevitable questions business users will ask.




 

Second, on “training the model”, you have to certify that you have followed best practices before you can even turn on the Concierge Agent:

 



I can’t tell you how often I see field names like “rev_dol” instead of “Revenue (in $)”, field comments and descriptions left blank, and so on. How is the agent supposed to answer your questions accurately in that case?

Well, that’s it for part 2. As I’ve noted several times, I’ll be back very soon for the third and final part of this series! Thanks for reading. If you have any comments or questions, please let me know in the comments.

 

Kirk Munroe, September 8, 2025

