As Data and Business analysts, we have all encountered situations where we need to segment customers based on their engagement with the business. An RFM model has a few benefits. It enables marketers to increase revenue by targeting specific groups of existing customers called “segments”. It also gives the business the ability to send targeted messages and offers that are relevant to these segments based on data points unique to that particular set of behaviors. This leads to increased response rates, customer retention, customer satisfaction, and a better overall customer lifetime value. This blog is about dividing segments using three key deciding factors.
- How recent is a Customer
- How frequent is a Customer
- How much monetary value does a Customer hold
We will use SQL and a BI tool (Looker) to implement the following steps:
(**Prerequisites – Working knowledge of SQL and Looker)
Step 1 – Decide your key measures
[I am using the following KPIs to define RFM]
Recency – Days since last order
Frequency – Total number of orders by the customer
Monetary – Total amount of money paid by the customer
A basic SQL query would look like this if no joins are applied –
Select
user_id, --primary key
display_name,
user_email,
count(order_id) as total_orders,
datediff(day, max(order_date), current_date) as days_since_last_order,
sum(amount) as total_amount
from your_table
group by 1,2,3
I highly encourage you to include this in your database table itself, reason? Think for yourself 😛
Step 2 – Pull your fact table in Looker and create a .view file using lookerML.
Step 3 – Looker will automatically generate dimensions for you if it’s a new view
If not, manually add dimensions and measures like so –
dimension: user_id {
type: string
sql: ${TABLE}."USER_ID" ;;
primary_key: yes
}
dimension: user_email {
type: string
sql: ${TABLE}."USER_EMAIL" ;;
}
dimension: display_name {
type: string
sql: ${TABLE}."DISPLAY_NAME" ;;
}
## hiding the below dimension because we will create measures out of it.
dimension: days_since_last_order {
type: number
sql: ${TABLE}."DAYS_SINCE_LAST_ORDER" ;;
hidden: yes
}
dimension: orders {
type: number
sql: ${TABLE}."TOTAL_ORDERS" ;;
hidden: yes
}
dimension: amount {
type: number
sql: ${TABLE}."TOTAL_AMOUNT" ;;
value_format_name: usd_0
hidden: yes
}
## add the below measures in your visuals to have detailed data for the dimension
attributes
measure: total_amount {
type: sum
sql: ${amount} ;;
value_format_name: usd_0
}
measure: total_orders {
type: sum
sql: ${orders} ;;
}
measure: total_days_since_last_order {
type: sum
sql: ${days_since_last_order} ;;
}
Step 4 – Create dimensions for recency, frequency and monetary
You can define threshold values as per your own requirements. I have considered 3 Median values across all the records. How to find median? Refer this page. You can use your own statistical methods to define these threshold values.
Step 5 – Define score – Score is a combination of the recency, frequency and monetary.
So if a user has ranks like 3- recency, 2-frequency and 2- monetary then score would be 322.
dimension: score {
description: "A combination of recency frequency and monetary in this order"
type: string
sql: concat(${recency},${frequency},${monetary}) ;;
}
Step 6 – Based on the scores we will define the user segments.
**Tip – You can add many more!!
Step 7 – Build visuals and dashboards based on the above metrics.
Step 8 – You can do more than just user segmentation.
Deep dive into the similarities between a particular segment type to find a pattern and then work on it to enhance your business. For more reading or doing similar analysis on excel check out this article which inspired me in writing mine.
Thanks and Happy Analysing!
- Common pitfalls in ML projects and how to avoid them. - January 5, 2024
- Data Observability with Elementary - November 5, 2022
- Data Deduplication with ML - November 2, 2022