How to use the Cluster Tool
Published on
14 August 2023
Daniel Hernández
Market Researcher

Easily segment your audience based on preference data and other survey answers with Conjointly's Cluster Tool.


Conjointly’s Cluster Tool automatically groups participants who gave similar responses, helping you uncover potential market segments, identify patterns, and design strategies tailored to participants’ needs. With this tool, you can:

  • Automatically group participants based on shared preferences.
  • Uncover hidden market segments.
  • Design targeted strategies based on participants’ unique needs and preferences.
  • Make data-driven decisions with comprehensive market insights, optimizing product offerings and marketing campaigns.


How to use the Cluster Tool

1: Prepare your data set

Please ensure input data is in CSV format with the first column labelled participant_id. You can include any desired questions for the clustering exercise, except for Gabor-Granger results and open-ended responses.

In the example below, columns BB to BF can be included in the clustering exercise, whereas columns BH and CJ need to be excluded.

Exclude Gabor-Granger and open-ended questions

Additionally, you can incorporate individual preferences in the clustering exercise. Please note that for multiple-choice questions, it’s necessary to include a column with text responses instead of binary columns.

For illustration, in the following data set, columns AH to AL need to be excluded as they contain binary responses, whereas column AM can be included in the clustering exercise.

Exclude binary columns

In summary:

  • Start the first column with participant_id.
  • Include any desired questions for the clustering exercise (except for Gabor-Granger results and open-ended responses).
  • You have the option to incorporate individual preferences in the clustering exercise.
  • Use a column with text responses instead of binary columns for multiple-choice questions.
  • Save your data set in .csv format.

The final data structure should look like:

Final data structure
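
The checklist above can be sketched as a small pre-upload check. This is a minimal illustration using only Python's standard library, not part of the Cluster Tool itself. The function name `check_cluster_input`, the sample column names, and the rule for spotting binary columns (every non-empty value is 0 or 1) are assumptions made for this example.

```python
import csv
import io

def check_cluster_input(csv_text, excluded_columns=()):
    """Return (header, rows) with excluded and binary columns removed."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    if header[0] != "participant_id":
        raise ValueError("first column must be labelled participant_id")
    rows = list(reader)
    keep = []
    for i, name in enumerate(header):
        # Skip columns named by the caller (e.g. Gabor-Granger, open-ended).
        if name in excluded_columns:
            continue
        values = {row[i] for row in rows if row[i] != ""}
        # Assumed rule: a non-id column holding only 0/1 is a binary column.
        if i > 0 and values and values <= {"0", "1"}:
            continue
        keep.append(i)
    return [header[i] for i in keep], [[row[i] for i in keep] for row in rows]

# Hypothetical data set: two binary columns, one text column, one open end.
sample = (
    "participant_id,brand_a,brand_b,brands_text,open_end\n"
    "1,1,0,Brand A,I like it\n"
    "2,0,1,Brand B,Too pricey\n"
)
header, rows = check_cluster_input(sample, excluded_columns={"open_end"})
print(header)  # ['participant_id', 'brands_text']
```

A genuine numeric question that only ever takes the values 0 and 1 would also be dropped by this rule, so review the removed columns before uploading.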

2: Upload your data set

Open the Cluster Tool and upload your .csv file using the browse button. Once the file is uploaded, you can start performing the cluster analysis.

Upload your data set

3: Select the parameters for the analysis

The Conjointly Cluster Tool offers three independent methodologies to compute solutions based on your data and research objectives.

The available methods are:

  • Gaussian Mixture Models (GMM): GMMs are statistical models that treat the data as a mixture of Gaussian distributions and assign each participant a probability of belonging to each cluster. This method is ideal for complex and overlapping groups.
  • K-means: The most popular algorithm. K-means splits data into K clusters by assigning points to the nearest centroids. It is efficient for large datasets but limited to spherical groups. Its results depend on the input features, so use domain knowledge and feature importance scores to identify influential features. Experimenting iteratively with different feature combinations can lead to more meaningful and interpretable clusters.
  • Hierarchical Clustering: This algorithm builds a dendrogram of clusters by merging groups based on similarity measures. It’s well-suited for exploring customer segmentation in retail based on purchasing behaviour, identifying distinct customer personas in marketing based on demographics and preferences, and grouping similar user profiles in social networks based on interaction patterns.
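
To make the K-means description concrete, here is a minimal textbook sketch in plain Python. It is not Conjointly's implementation. It assumes numeric inputs (text survey answers would need encoding first), and it uses a deterministic farthest-first initialisation chosen only to keep the example reproducible. The names `sq_dist` and `kmeans` are illustrative.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, n_steps=20):
    # Farthest-first initialisation (an assumption made for determinism):
    # start from the first point, then add the point farthest from all
    # centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(n_steps):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Two well-separated groups of hypothetical respondents.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because every point is pulled towards a single mean, the method works best when groups are roughly round and similar in size, which is the "spherical groups" limitation noted above.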

Each methodology offers distinct advantages, allowing you to tailor your clustering approach to best suit the nature of your data. By default, the tool employs GMM to group participants based on their shared features.

GMM to group participants based on their shared features

4: Set up additional parameters for the analysis

You can also adjust the number of iterations used to compute cluster solutions. The algorithm is run repeatedly, and each participant's final classification is the group they were assigned to most often across iterations (i.e. the mode obtained after n iterations).
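
The idea of taking the mode across iterations can be sketched as follows. This is an illustration only and assumes that cluster labels mean the same thing in every run (in practice, labels from separate runs may need to be aligned first). The function name `consensus_labels` is hypothetical.

```python
from collections import Counter

def consensus_labels(runs):
    """runs: one list of labels per iteration, aligned by participant."""
    # For each participant, pick the label assigned most often across runs.
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*runs)]

# Three iterations over four participants (hypothetical labels).
runs = [
    [1, 1, 2, 2],
    [1, 2, 2, 2],
    [1, 1, 2, 1],
]
print(consensus_labels(runs))  # [1, 1, 2, 2]
```

If two labels tie for a participant, this sketch keeps the one that appeared first across the runs.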

In addition, you can adjust the number of clusters when using the K-means and hierarchical methods (GMM automatically computes the optimal number of groups). The platform also suggests a maximum number of clusters for your data, calculated as the average of more than ten different indices. This is only a recommendation: you are free to exceed it if you wish.

Set number of clusters

5: Perform the cluster analysis

The first output is a doughnut chart showing the clusters’ distribution, that is, the membership of each group within the overall sample. This chart will help you understand your data’s composition, identifying dominant segments, potential outliers, and areas that may require further exploration or targeted strategies.

This initial output can help you comprehensively explore your data’s underlying patterns and insights.

Doughnut chart output
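
The figures behind a chart like this are simply each cluster's share of the sample. A minimal sketch with hypothetical labels (the function name `cluster_shares` is illustrative):

```python
from collections import Counter

def cluster_shares(labels):
    """Return each cluster's share of the overall sample."""
    counts = Counter(labels)
    total = len(labels)
    return {cluster: counts[cluster] / total for cluster in sorted(counts)}

labels = [1, 1, 2, 1, 3, 2, 1, 1]  # hypothetical cluster memberships
shares = cluster_shares(labels)
for cluster, share in shares.items():
    print(f"Cluster {cluster}: {share:.1%}")
```

A very small share, such as cluster 3 here, is the kind of segment worth checking as a potential outlier group.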

In addition, the mosaic plot lets you profile your respondents. It is designed for categorical variables (numeric values are not supported) and highlights differences between groups by comparing the observed distribution with the one expected if the variables were statistically independent. In the example, the “monthly or less frequent” category is coloured red and blue. Blue (cluster 2) indicates that more respondents in that cluster selected this option than would be expected with no cluster relationship (where groups are evenly distributed across categories); red (cluster 1) indicates fewer. Use the mosaic plot to understand how categorical variables interact within your dataset.

Mosaic plot output
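
Mosaic plots are commonly shaded using Pearson residuals from an independence model. Whether the Cluster Tool uses exactly this statistic is an assumption; the sketch below only illustrates the comparison between observed counts and the counts expected if cluster and answer were unrelated. The data and the function name `pearson_residuals` are hypothetical.

```python
from collections import Counter
from math import sqrt

def pearson_residuals(clusters, answers):
    """Residual per (cluster, answer) cell under statistical independence."""
    n = len(clusters)
    observed = Counter(zip(clusters, answers))
    row_totals = Counter(clusters)
    col_totals = Counter(answers)
    residuals = {}
    for c in row_totals:
        for a in col_totals:
            # Expected count if cluster and answer were independent.
            expected = row_totals[c] * col_totals[a] / n
            residuals[(c, a)] = (observed[(c, a)] - expected) / sqrt(expected)
    return residuals

clusters = [1, 1, 1, 1, 2, 2, 2, 2]
answers = ["weekly", "weekly", "weekly", "monthly",
           "monthly", "monthly", "monthly", "weekly"]
residuals = pearson_residuals(clusters, answers)
for cell, value in sorted(residuals.items()):
    print(cell, round(value, 3))
```

A positive residual corresponds to the blue case described above (more respondents than expected), and a negative residual to the red case.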

6: Export cluster solutions

Finally, the Cluster Tool lets you download either the complete dataset or only the cluster solutions, so you can conduct further analysis. When you download the cluster solutions, you get a CSV file containing all the clusters computed by the tool, which you can use to explore additional insights and run customized analyses.
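
As one possible follow-up, the exported cluster solutions can be joined back to your own data by participant. The sketch below assumes the export contains a `participant_id` column and a column named `cluster`; both names, and the function `attach_clusters`, are assumptions for illustration, so check the actual headers of your download.

```python
import csv
import io

def attach_clusters(data_csv, clusters_csv):
    """Join cluster memberships onto survey rows by participant_id."""
    clusters = {
        row["participant_id"]: row
        for row in csv.DictReader(io.StringIO(clusters_csv))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(data_csv)):
        # Participants missing from the export keep their original columns.
        merged.append({**row, **clusters.get(row["participant_id"], {})})
    return merged

# Hypothetical survey data and hypothetical export layout.
data_csv = "participant_id,frequency\n1,weekly\n2,monthly\n"
clusters_csv = "participant_id,cluster\n1,1\n2,2\n"
merged = attach_clusters(data_csv, clusters_csv)
print(merged[0])  # {'participant_id': '1', 'frequency': 'weekly', 'cluster': '1'}
```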

Further analysis

A great way to get more insight from these cluster memberships is to use them to define segments in a Conjointly report.

Do you have any feedback or ideas on how you would use the Cluster Tool in your research? You are welcome to Book a call to discuss with our team.

