RFM Analysis Part 2: Data Prep
Welcome back to my series on Recency Frequency Monetary (RFM) analysis. In part 1, I shared some background about RFM analysis and showed you how to create the calculated fields in Tableau Desktop. But, after doing that, we ran into a problem. The calculated fields are complex, including both LODs and nested table calcs, which adds complexity, severely limits their usage, and impacts performance, while also creating several Order of Operations challenges. I concluded with my opinion that this method simply is not sustainable. The table calculations create too many limitations which will limit our ability to visualize this data and do so in a way that is performant. The solution to this problem is to push all of these calculations to the data layer, creating simple row-level data fields we can use instead of complex calculations.
In this post, I’ll share Python code
and Tableau Prep flows that perform the RFM scoring segmentation. I’ll also
share some alternative scoring methods that I think work better than the 1-5
scoring method shared in the first post.
Process
The idea here is relatively simple. Here
are the steps we’ll take:
1.
Read
a csv file of donations. Expected fields are:
·
Donor
ID
·
Donor
Segment
·
Donor
Full Name
·
Donor
Country
·
Donor
State
·
Donor
City
·
Donor
Zip
·
Gift
ID
·
Gift
Date
·
Gift
Amount
·
Channel
·
Campaign
Name
·
Fund/Designation
·
Gift
Type
2.
Enrich
the data, by calculating:
·
Last
Gift Date
·
Days
Since Last Gift
·
Total
Gifts
·
Total
Amount
3.
Perform
the Quintile calculations to get Recency, Frequency, and Monetary scores for
each donor (using similar logic to the calcs performed in Tableau Desktop in
part 1).
4.
Create
the combined score by concatenating R, F, and M scores.
5.
Label
each donor with descriptive text based on scores.
6.
Output
a file containing all the original fields plus:
·
Last
Gift
·
Days
Since Last Gift
·
Recency
Score
·
Total
Gifts
·
Frequency
Score
·
Total
Amount
·
Monetary
Score
·
RFM
Score
·
RFM
Segment
I’ve written Python code and a Tableau Prep flow and shared them
on Github. Feel free to take these and customize them to your needs.

Visualization Issues
With this nice clean data set, we no
longer have any of the issues we encountered in part 1. But we do quickly run
into some other potential issues. One thing I want to do with this data is show
all of the donors in a single plot allowing you to compare one donor with the
rest, kind of like this:

In this chart, the vertical position
doesn’t matter—it’s just using a jitter technique so that you can more clearly
see each dot. What really matters is the horizontal position, which shows the
score. And, because every donor is shown, we get an idea of the density and
distribution of scores and can easily compare the selected donor to all others.
But, remember that our current
scoring methodology assigns values 1-5 for each of the three metrics. So, when
we visualize one of the metrics, we see something more like this:

This is okay, but since there are
only 5 scores, it lacks quantitative nuance. And visualizing the overall score
is worse:

While this looks like a number, it’s
not quantitative data, but categorical data. For instance, the score 455 is
numerically less than 544, but 455 is probably a better score. So, we simply
cannot visualize the overall score in a quantitative manner.
Alternative Methodology
In order to create a bit more nuance
and ensure we can visualize this data in a quantitative way, I prefer an
alternative scoring method. Instead of binned 1-5 scores and a concatenated
overall score, we will:
1.
Score
each metric with a decimal value of 1-10 (no binning).
2.
Average
the scores to create an overall score of 1-10.
Since we are using decimal values, we
are not binning the scores into 10 values. This provides much more nuance. And
the combined average scoring will allow us to compare the data quantitatively.

You can find my modified Python code and Tableau Prep flow on Github.

Up Next
With this new scoring method, I’m
feeling pretty good about our ability to visualize this data, so in the next
and final post in this series, that’s what we’re going to do. I’ll share with
you the questions I would ask when analyzing this information, some of the key
charts I created, and a full dashboard that allows you to do a deep dive into
your donors and their RFM scores.
Thanks for reading and please be sure
to come back once part 3 is ready to go. If you have any questions, feel free
to drop them in the comments section below.
Ken Flerlage, March 9, 2026
Twitter | LinkedIn | GitHub | Tableau Public










No comments: