Ads Top

A Beginner’s Guide to IF Statements in Tableau

IF statements are a fundamental part of Tableau, as well as other analytics platforms, programming languages, and even tools like spreadsheets. Understanding what they do and how they work is critical when building calculated fields. So, in this blog, I’m going to take you through the basics of IF statements, touching on their components, some rules that govern them, the order of operations, etc. While this blog is targeted at Tableau users, particularly new users, most of what I’ll be sharing is applicable to any other platform or programming language that uses IF statements—for the most part, the only difference will be the syntax.

 

Components

An IF statement in Tableau looks something like this:

 

// Group the sales into three categories.

IF [Sales]<=500 THEN

    "Small"

ELSEIF [Sales]>500 AND [Sales]<=2000 THEN

    "Medium"

ELSE

    "Large"

END

 

If we break this down, we’ll see a handful of components. Let’s start with the first two lines:

 

IF [Sales]<=500 THEN

    "Small"

 

In plain English, this would read, “If the Sales are less than or equal to 500 then return the string value, “Small”. Note: I like to write the IF statement on one line then the return value on the next, prefixed by an indent as I think this makes it much easier to read.

 

The IF keyword tells Tableau we’re about to perform an IF statement. THEN indicates that we’re about to specify some return value. In between IF and THEN is what is referred to as a conditional expression. The conditional expression is essentially just a boolean statement—a comparison that returns either TRUE or FALSE. In this case, that statement is [Sales]<=500. This expression leverages a comparison operator to compare the two values, Sales and 500. There are six comparison operators available to us:

 

Operator

Meaning

Returns True If…

= or ==

Equal To

Both terms in the expression have exactly the same values.

<> or !=

Not Equal To

The terms have different values.

> 

Greater Than

The first term has a larger value than the second term.

< 

Less Than

The first term has a smaller value than the second term.

>=

Greater Than or Equal To

The first term has either a larger value or the same value as the second term.

<=

Less Than or Equal To

The first term has either a smaller value or the same value as the second term.

 

In our example, we are checking to see if Sales is less than or equal to 500. If it is, the IF statement will be TRUE. If not, it will be FALSE. TRUE and FALSE are the only possible results. If the boolean statement is true, then the value specified after THEN—in this case, “Small”—will be returned. If not, then we’ll continue onto the next line:

 

ELSEIF [Sales]>500 AND [Sales]<=2000 THEN

    "Medium"

 

We’ve already written IF so our next conditional expression must be prefixed with ELSEIF. In the above example, our conditional expression is [Sales]>500 AND [Sales]<=2000. Unlike the first example, this expression contains two separate comparisons, [Sales]>500 and [Sales]<=2000, separated by AND. In this case, AND is what’s called a logical operator. The most common logical operators are shown below:

 

Operator

Definition

AND

In order for the entire conditional expression to be true, the comparisons on the left and right side of the AND must both be true. If either of them is false, then the entire statement is false.

OR

In order for the entire conditional expression to be true, at least one of the comparisons on the left or right side of the OR must be true. The entire statement will only be false if both comparisons are false.

NOT

Unlike AND and OR, NOT is not used to combine multiple expressions. Instead, it is used on a single expression and basically returns the opposite of the expression’s result. In other words, if the expression is TRUE, NOT will make it FALSE and vice versa. NOT can often be avoided by using different types of comparison operators.

 

Note: It can get very complicated when you start to chain ANDs and ORs together in conditional expressions. We’ll deal with this later on in the Order of Operations section.

 

So, in order for our example statement to be true, the Sales must be both greater than 500 and less than/equal to 2000. If both are true, then the IF statement will return “Medium”. If not, it continues onto the next statement.

 

ELSE

    "Large"

 

When we use the keyword, ELSE, we do not need a conditional expression. ELSE simply means that, if none of the earlier conditional expressions were true, then we’ll return this value. So, in our case, if the Sales are greater than 2000 then the statement will return “Large”.

 

Finally, in order for the IF statement to be completed, it must be closed out with END.

 

 

Other than the basic operations detailed above, there are a few key things you should understand about IF statements, which are detailed below.

 

First Wins!

When it comes to IF statements, first always wins! In other words, IF statements end as soon as a conditional expression is found to be true. Let’s look at a modified version of our earlier statement.

 

// Group the sales into three categories.

IF [Sales]<=500 THEN

    "Small"

ELSEIF [Sales]<=2000 THEN

    "Medium"

ELSE

    "Large"

END

 

We’ve taken the following:

 

ELSEIF [Sales]>500 AND [Sales]<=2000 THEN

 

and changed it to:

 

ELSEIF [Sales]<=2000 THEN

 

While it might seem like this will change the results, it will not. To illustrate this, let’s assume that Sales = 100. Both IF [Sales]<=500 THEN and ELSEIF [Sales]<=2000 THEN are technically true, but because of the rule noted earlier, the statement will end as soon as it reaches one true expression. Since the first expression is true, the IF statement will return “Small” and end—it will never go on to evaluate the second expression. If, for example, Sales = 600, then the first statement would be false, so the IF statement would continue on to the second expression. Since we now know that Sales is greater than 500, it’s redundant to include it in the conditional expression, so we can exclude that comparison.

 

Single Row/Partition Only

Another important thing to understand is that an IF statement will only operate on a single row. Okay, this is technically not true as Tableau calculated fields and IF statements can compute over a larger partition of data when some sort of aggregation is being used. In that case, we then get into the viz level-of-detail and other more complex topics than I want to discuss here. So, for the sake of keeping this simple, we’re not going to do any sort of aggregation and, because of that, our IF statements will always operate at the row level.

 

Take, for example, this small sample data set.

 

Actor

Movie

Category

Adam Sandler

Billy Madison

Comedy

Mark Wahlberg

Ted

Comedy

Mark Wahlberg

Shooter

Action

 

and the following calculated field:

 

New Category

// Create single value for actors doing both action and comedy.

IF [Category]="Comedy" AND [Category]="Action" THEN

    "Comedy/Action"

ELSE

    [Category]

END

 

You might be tempted to think that Mark Wahlberg will result in a value of “Comedy/Action” since he has both the “Comedy” and “Action” categories. But this is not the case, since this IF statement will operate at the row-level only—it will only ever see the Category from that current row, resulting in the following.

 

Actor

Movie

Category

New Category

Adam Sandler

Billy Madison

Comedy

Comedy

Mark Wahlberg

Ted

Comedy

Comedy

Mark Wahlberg

Shooter

Action

Action

 

You could, of course, create a calculated field to combine the genres of movies that Mark Wahlberg stars in, but that would require some more complex calculations than we’re going to get into here.

 

Same Data Type

IF statements must always return the same data type. Take, for example, this version of our calculated field:

 

// Group the sales into three categories.

IF [Sales]<=500 THEN

    1

ELSEIF [Sales]<=2000 THEN

    "Medium"

END

 

If you were to try this in Tableau, it would give you the error, “Expected type integer, found string. Result types from ‘IF’ expressions must match.” While this is a somewhat cryptic error message, it’s simply stating that the first IF statement returned an integer, but the following ELSEIF returns a string. We can’t do this as both must be the same data type—we must choose either an integer return value or a string, not both.

 

There’s Always an ELSE

A final thing you should understand is that IF statements always include an ELSE, whether you explicitly include them or not. In our earlier example, we explicitly included an ELSE, but what if we excluded it as shown below?

 

// Group the sales into three categories.

IF [Sales]<=500 THEN

    "Small"

ELSEIF [Sales]<=2000 THEN

    "Medium"

END

 

In this case, any row with Sales greater than 2000 would return a value of NULL. Excluding the ELSE is essentially like writing ELSE NULL.

 

// Group the sales into three categories.

IF [Sales]<=500 THEN

    "Small"

ELSEIF [Sales]<=2000 THEN

    "Medium"

ELSE

    NULL

END

 

NULL is a special value that indicates that there is no value. It can also be any data type, which means that it’s always a valid return value.

 

Of course, there is nothing stopping you from explicitly writing ELSE NULL. In fact, I do this sometimes to make it easier for others to understand the logic of a particular IF statement or to make note of the fact that I realize there are other potential values, but I want those to return NULL.

 

Order of Operations

Now that we’ve explored the anatomy of IF statements as well as some of the rules that govern their usage, let’s talk a bit about some more complex logical issues we may encounter. Here’s an example I see quite frequently:

 

IF [Category]="Furniture" OR [Category]="Office Supplies" AND [Order Date]>=#01/01/2020# THEN

 

You might think that this will give you anything where the Order Date falls on/after January 1, 2020 and has a category of either “Furniture” or “Office Supplies”. But you’d be wrong as these different logical operators have a specific order of operations. This order of operations is just like the one used in equations in mathematics. The following graphic shows the order of precedence for logical statements:

 

 

In other words, the first thing to be computed in a logical statement are the parentheses, which group comparisons together. Next comes NOT, followed by AND, then OR. So, let’s use this to better understand the IF statement above. Since AND computes before OR, the two comparisons separated by AND will be evaluated first:

 

[Category]="Office Supplies" AND [Order Date]>=#01/01/2020#

 

So, this will give us anything with a category of “Office Supplies” and an order date on/after January 1, 2020.

 

The OR will then act as if the above statement is a singular expression. So, in English, the original statement will give us anything where 1) Category is “Furniture” OR 2) Category is “Office Supplies” and order date is on/after January 1, 2020. Furniture sales from 2019 will be included since they meet criteria # 1.

 

You can probably see how, as stated earlier, IF statements that mix different logical operators can get very confusing as you need to think about the order in which the operators will execute. Fortunately, we can use parentheses to group different comparisons. For example, if let’s change this statement to return anything where the Order date falls on/after January 1, 2020 and has a category of either “Furniture” or “Office Supplies”. To do this, we can add parentheses (shown in red below) to group the category conditions together.

 

([Category]=”Furniture” OR [Category]=”Office Supplies”) AND [Order Date]>=#01/01/2020#

 

These parentheses will force the IF statement to first evaluate the conditions between them—is the category “Furniture” or “Office Supplies”? Then it will evaluate this expression, as a whole, along with the expression on the other side of the AND, which is exactly what we want.

 

Since it can be difficult to think through the order of operations as you’re writing an IF statement, my personal recommendation is to always use parentheses to group your conditions any time you’re mixing logical operators. If all of your operators are ORs or all of them are ANDs, then this is not necessary, but as soon as you mix ANDs, ORs, and NOTs, it’s a good practice to explicitly insert parentheses so that you can be 100% of the order in which each condition is evaluated.

 

Variants of IF

Before I wrap this blog, I want to point out a few variants of IF that are available in Tableau and some other platforms/programming languages.

 

One such variant is IIF, which allows you perform a simple IF/ELSE statement within a concise function call. It takes three parameters—the condition, the result if true, and the result if false.

 

For example, let’s look at the following statement.

 

IF [Sales]<=500 THEN

    "Small"

ELSE

    "Medium"

END

 

We could convert this to IIF as follows:

 

IIF([Sales]<=500, "Small", "Medium")

 

While the function only supports one IF and one ELSE, you can use nested IIF statements to mimic the functionality of IF/ELSEIF/ELSE statements. However, nested IIFs can be very difficult to read, so in my opinion, you’d be much better off writing out the whole IF statement.

 

Note: If you come from an Excel background, then you’ll recognize this as it’s pretty much exactly how you write IF statements in Excel.

 

Another variant is the CASE statement. The functionality of case statements varies quite a bit between different platforms. In Tableau, it is simply a more concise, yet less powerful method for performing multiple comparisons. However, in platforms such as SQL, CASE allows you to do everything you can with an IF statement. In fact, pure ANSI standard SQL doesn’t even include IF statements at all—you must use CASE.

 

Let’s look at the following IF statement.

 

IF [Sub-Category] = 'Accessories' THEN 'Group A'

ELSEIF [Sub-Category] = 'Appliances' THEN 'Group A'

ELSEIF [Sub-Category] = 'Art' THEN 'Group B'

ELSEIF [Sub-Category] = 'Binders' THEN 'Group B'

ELSEIF [Sub-Category] = 'Bookcases' THEN 'Group C'

ELSEIF [Sub-Category] = 'Chairs' THEN 'Group D'

ELSEIF [Sub-Category] = 'Copiers' THEN 'Group D'

ELSEIF [Sub-Category] = 'Envelopes' THEN 'Group D'

ELSEIF [Sub-Category] = 'Fasteners' THEN 'Group E'

ELSEIF [Sub-Category] = 'Furnishings' THEN 'Group E'

ELSEIF [Sub-Category] = 'Labels' THEN 'Group F'

ELSEIF [Sub-Category] = 'Machines' THEN 'Group G'

ELSEIF [Sub-Category] = 'Paper' THEN 'Group H'

ELSEIF [Sub-Category] = 'Phones' THEN 'Group I'

ELSEIF [Sub-Category] = 'Storage' THEN 'Group I'

ELSEIF [Sub-Category] = 'Supplies' THEN 'Group I'

ELSEIF [Sub-Category] = 'Tables' THEN 'Group J'

END

 

This is a simple IF statement in that each IF/ELSEIF performs a simple equality comparison based on the same field, Sub-Category. In this scenario, we can replace it with a CASE statement:

 

CASE [Sub-Category]

WHEN 'Accessories' THEN 'Group A'

WHEN 'Appliances' THEN 'Group A'

WHEN 'Art' THEN 'Group B'

WHEN 'Binders' THEN 'Group B'

WHEN 'Bookcases' THEN 'Group C'

WHEN 'Chairs' THEN 'Group D'

WHEN 'Copiers' THEN 'Group D'

WHEN 'Envelopes' THEN 'Group D'

WHEN 'Fasteners' THEN 'Group E'

WHEN 'Furnishings' THEN 'Group E'

WHEN 'Labels' THEN 'Group F'

WHEN 'Machines' THEN 'Group G'

WHEN 'Paper' THEN 'Group H'

WHEN 'Phones' THEN 'Group I'

WHEN 'Storage' THEN 'Group I'

WHEN 'Supplies' THEN 'Group I'

WHEN 'Tables' THEN 'Group J'

END

 

As you can see, this requires a bit less text and is a little easier to write. However, as noted above, CASE statements in Tableau require 1) you are comparing to a single field, which is listed after the CASE keyword and 2) you are performing equality comparisons only—you cannot perform any inequality comparisons (>, <, etc.) in a CASE statement.


Note: Be sure to check out Kevin's Case Statement Generator if you need to crank out CASE statements with lots of values.

 

Conclusion

IF statements (and their variants) are incredibly powerful and an absolute requirement for anyone creating calculated fields in Tableau. But, if you don’t come from a programming or analytics background, they can be a bit daunting. While I’ve only just scratched the surface on what you can do with IF statements in Tableau (and other platforms), I’m hopeful that this short guide will help you to better understand how they work and will help you to avoid many of the pitfalls that others have encountered when first learning them.

 

Thanks for reading! If you have any comments, please feel free to leave them in the comments section below.

 

Ken Flerlage, February 15, 2021

Twitter | LinkedIn | GitHub | Tableau Public

5 comments:

  1. Hi Ken, good article, as usual.

    However, I would include that IF THEN and IIF treat NULLs differently by default and you should know that before deciding which one to use.

    The first part of this article summarizes it nicely.

    https://interworks.com/blog/tmcconnell/2015/02/10/case-statements-vs-if-statements-tableau

    One thing I would add that is not in the article is that the IIF function has a fourth argument to categorize the NULLS

    ReplyDelete
    Replies
    1. Great point. I'll add that. Thanks for the feedback!

      Delete
  2. Hola Ken. Felicitación como siempre tu ayuda es oportuna.
    Por favor para cuando tengas un tiempo, tendrás algún articulo para realizar actualización incremental deseo actualizar y remplazar solo los datos de la ultima semana. mi base de datos esta en SQL Server. y diariamente actualizo los dashboard, pero demora porque actualmente esta actualizando todos los años.

    ReplyDelete
  3. I agree about the null fields, it is important to know what formula to apply for a correct result. Excellent article.
    Greetings from Argentina @mariano35

    ReplyDelete
  4. Great article - Thanks for sharing!
    Kind regards from France ! Alex

    ReplyDelete

Powered by Blogger.