All About Machine Learning Clustering

May 3, 2021

Delve Goes Deeper: Data-Driven Segmentation

Deepen your understanding of Machine Learning Clustering, Phase 2 of DELVE’s approach to customer segmentation.

This is the fourth post in our Data-Driven Segmentation series, exploring the methodology and business applications behind DELVE’s approach to customer segmentation. 

In Data-Driven Segmentation, variables are fed into a Machine Learning clustering algorithm that uncovers an underlying logic to the data and uses this logic to define customer segments without marketer input. Customers who look similar to each other (based on all available data) are grouped together. Dissimilar customers (once again, based on all available data) are grouped into different segments.

Why is Machine Learning clustering advantageous to your marketing strategy? It allows significantly more variables to be factored into segment creation, which produces a more robust customer segmentation and, in turn, a more mature personalization and targeting strategy.

Clustering for Data-Driven Segmentation is performed using the K-means algorithm, a machine learning algorithm that partitions records into a set number of groups so that each record sits as close as possible to the center of its own group. The algorithm groups customers based on the input variables: customers who are more similar are located closer together, and customers who are more different are located farther apart. The image below shows a simple example of K-means clustering output based on two input variables: age and income.

[Image: scatter chart of K-means clustering output by age and income]
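To make the two-variable example concrete, here is a minimal sketch of K-means written with only the Python standard library. The customer records are invented for illustration, the function names are our own, and the deterministic "farthest-first" starting centroids are a simplification chosen so the result is repeatable (production libraries typically use randomized starts). Age and income are rescaled first, since otherwise income would dominate the distance calculation.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) or 1.0 for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stdevs)) for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first_init(points, k):
    """Assumed simplification: start from the first point, then repeatedly
    add the point that is farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iterations=100):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its members, until stable."""
    centroids = farthest_first_init(points, k)
    labels = []
    for _ in range(max_iterations):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(mean(dim) for dim in zip(*members))
    return labels, centroids

# Invented (age, income) records, for illustration only.
customers = [
    (22, 28000), (25, 31000), (27, 30000),
    (45, 82000), (48, 90000), (51, 87000),
    (67, 41000), (70, 39000), (72, 44000),
]

labels, centroids = kmeans(standardize(customers), k=3)
for (age, income), label in zip(customers, labels):
    print(f"age={age:>2} income={income:>6} -> segment {label}")
```

On this toy data the three younger, lower-income customers land in one segment, the three mid-career, higher-income customers in a second, and the three older customers in a third.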

In practice, Data-Driven Segmentation generally involves far more than two variables. Different combinations of the variables selected in the first phase (up to 20) are fed into the K-means algorithm to identify the subset that leads to the most robust and useful segmentation results. The optimal number of segments is also identified algorithmically, using either the elbow method or the X-means algorithm. This number varies widely from brand to brand, but generally ranges from 4 to 15.
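The elbow method mentioned above can be sketched as follows: run K-means for a range of segment counts, record the within-cluster sum of squared distances (often called inertia) for each, and look for the point where adding another segment stops helping much. The elbow is usually read off a chart; the "largest second difference" rule used here is one simple way to automate it and is our assumption, not necessarily the rule DELVE applies. The data, function names, and deterministic starting centroids are likewise illustrative, and X-means is a separate technique not shown here.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) or 1.0 for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stdevs)) for row in rows]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_inertia(points, k, max_iterations=100):
    """Run a basic K-means and return the within-cluster sum of squares."""
    # Assumed simplification: deterministic farthest-first starting centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = []
    for _ in range(max_iterations):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(mean(dim) for dim in zip(*members))
    return sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))

def elbow(inertias):
    """inertias[i] holds the inertia for k = i + 1.
    Return the k where the curve bends most sharply (largest second difference)."""
    bends = {
        k: inertias[k - 2] - 2 * inertias[k - 1] + inertias[k]
        for k in range(2, len(inertias))
    }
    return max(bends, key=bends.get)

# Invented (age, income) records, for illustration only.
customers = [
    (22, 28000), (25, 31000), (27, 30000),
    (45, 82000), (48, 90000), (51, 87000),
    (67, 41000), (70, 39000), (72, 44000),
]

points = standardize(customers)
inertias = [kmeans_inertia(points, k) for k in range(1, 7)]
best_k = elbow(inertias)

for k, inertia in enumerate(inertias, start=1):
    print(f"k={k}: inertia={inertia:.3f}")
print("elbow at k =", best_k)
```

For this toy data the inertia falls steeply up to three segments and barely changes afterward, so the sketch reports an elbow at three. Real customer data rarely bends this cleanly, which is one reason the choice is usually checked against a chart and business judgment.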

At the conclusion of Machine Learning Clustering (Phase 2), all customer records are assigned to one of the defined segments and prepared for further analysis. In our next article in this series, we discuss how this data helps uncover customer personas, behavioral trends and more.
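As a rough illustration of that final step, the sketch below assigns each customer record to its nearest segment center and then summarizes each segment for further analysis. The centroids, scaling factors, customer IDs, and field names are all hypothetical stand-ins for whatever a fitted model and customer database would actually supply.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical segment centers as (age, income), e.g. from a fitted K-means model.
centroids = [(24.7, 29667.0), (48.0, 86333.0), (69.7, 41333.0)]

# Hypothetical per-variable scales so age and income contribute comparably.
scales = (18.5, 24500.0)

# Hypothetical customer records: (customer_id, age, income).
records = [
    ("C001", 24, 29500),
    ("C002", 49, 88000),
    ("C003", 69, 40000),
    ("C004", 30, 35000),
]

def nearest_segment(age, income):
    """Return the index of the closest centroid in scaled (age, income) space."""
    def scaled_sq_dist(centroid):
        return sum(((v - c) / s) ** 2 for v, c, s in zip((age, income), centroid, scales))
    return min(range(len(centroids)), key=lambda j: scaled_sq_dist(centroids[j]))

# Step 1: tag every record with a segment.
assignments = {cid: nearest_segment(age, income) for cid, age, income in records}

# Step 2: build a simple per-segment profile for downstream analysis.
members = defaultdict(list)
for cid, age, income in records:
    members[assignments[cid]].append((age, income))

summary = {
    segment: {
        "customers": len(rows),
        "mean_age": mean(age for age, _ in rows),
        "mean_income": mean(income for _, income in rows),
    }
    for segment, rows in members.items()
}

for segment in sorted(summary):
    print(segment, summary[segment])
```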

Ready to take your ads, and your business, to the next level? Get in touch with the DELVE team today.