We all know how overwhelming it can be for respondents to evaluate long lists of items. So how do we make it easier and more relevant for respondents without missing out on any insights?
That’s where Relevant Items MaxDiff comes in: this approach lets respondents select the items they care about the most and then focus the MaxDiff trade-offs only on those particular items.
The result?
- Smaller exercises which provide a more engaging experience for the respondent.
- More focused results for the researcher.
Relevant Items MaxDiff: A Tailored Approach
Relevant Items MaxDiff starts with a simple idea: only ask respondents to evaluate the items that are actually relevant to them. Before the MaxDiff exercise begins, respondents complete a quick screener that flags which items matter and which don’t. This can be done by asking them directly which items they consider relevant, or by deriving relevance from other variables, such as the quota group or region they belong to. The resulting MaxDiff tasks are then built only from the items marked as relevant.
Because each respondent sees a different subset of items, designs are generated on the fly during the survey: rather than pre-building fixed MaxDiff sets, each respondent’s tasks are created in real time from their unique list of relevant items. Even though a new design is generated for every respondent, each respondent’s Relevant Items MaxDiff exercise remains balanced, with every relevant item shown a similar number of times.
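To make the on-the-fly generation concrete, here is a minimal sketch in Python of how a balanced design could be assembled from one respondent’s relevant items. The function name, the “show the least-shown items first” balancing rule, and the task counts are illustrative assumptions, not the algorithm of any particular platform; real design generators typically also balance how often items appear together.

```python
import random
from collections import Counter

def build_maxdiff_design(relevant_items, n_tasks, items_per_task, seed=None):
    """Illustrative sketch: build per-respondent MaxDiff tasks from the items
    that respondent flagged as relevant, keeping item appearances balanced."""
    rng = random.Random(seed)
    shown = Counter({item: 0 for item in relevant_items})
    tasks = []
    for _ in range(n_tasks):
        # Prefer the least-shown items so every relevant item appears a similar number of times.
        pool = sorted(relevant_items, key=lambda i: (shown[i], rng.random()))
        task = pool[:items_per_task]
        rng.shuffle(task)  # randomize on-screen order within the task
        for item in task:
            shown[item] += 1
        tasks.append(task)
    return tasks

# Example: a respondent kept 8 items from a longer list; 6 tasks of 4 items shows each item 3 times.
relevant = ["Item 2", "Item 5", "Item 9", "Item 11", "Item 14", "Item 20", "Item 23", "Item 27"]
for task in build_maxdiff_design(relevant, n_tasks=6, items_per_task=4, seed=1):
    print(task)
```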
Much like a standard MaxDiff, we can calculate utility scores via HB (Hierarchical Bayes) or Aggregate Logit analysis—and even calculate scores on-the-fly!
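As a rough illustration of what an on-the-fly score can look like, one common shortcut is a counts-based score: how often an item was chosen as best, minus how often it was chosen as worst, divided by the number of times it was shown. This is only a simplified stand-in, not the HB or logit estimation itself, and the data layout below is an assumption.

```python
from collections import Counter

def counts_scores(responses):
    """Counts-based MaxDiff scores: (times best - times worst) / times shown.
    `responses` is a list of tasks, each a tuple (items_shown, best_item, worst_item)."""
    shown, best, worst = Counter(), Counter(), Counter()
    for items, best_item, worst_item in responses:
        shown.update(items)
        best[best_item] += 1
        worst[worst_item] += 1
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}

tasks = [
    (["Item 1", "Item 4", "Item 7", "Item 9"], "Item 4", "Item 9"),
    (["Item 1", "Item 2", "Item 4", "Item 6"], "Item 1", "Item 6"),
]
print(counts_scores(tasks))  # e.g. Item 4 scores 0.5, Item 9 scores -1.0
```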
However, when analyzing a Relevant Items MaxDiff, we must also consider how we want to handle utility calculation for items that were not shown.
Handling Missing Items in Relevant Items MaxDiff
There are two main ways to handle missing items:
1. Missing at Random
When used: If items are dropped at random (as in Express MaxDiff).
How it works: No assumptions are made about the importance of dropped items. They provide no respondent-specific info, but HB may still impute a utility using population-level estimates.
Implication: “Missingness” does not affect population means; utilities are "borrowed" from others who saw the item.
2. Missing Inferior
When used: If items are dropped because the respondent marked them as unimportant or less relevant (i.e., they did not select them as relevant in the selection question that precedes the MaxDiff).
How it works: A “threshold item” is introduced to act as a reference. Each dropped item is paired against this item and loses, while included items win, as illustrated in the sketch after this list. This informs the model that dropped items are systematically worse.
Implication: This method constrains the utilities of dropped items to be lower but avoids extreme negative values, maintaining a robust model.
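To see how the Missing Inferior treatment can be encoded, here is a hedged sketch of the threshold-item idea: every item the respondent kept beats a hypothetical threshold item, and every item they screened out loses to it. The item name, data layout, and function are illustrative assumptions; the actual encoding used by an estimation engine may differ.

```python
THRESHOLD = "threshold_item"  # hypothetical reference item, never shown to respondents

def threshold_comparisons(all_items, relevant_items):
    """Build synthetic paired comparisons against the threshold item.
    Each tuple is (items_in_pair, chosen_item): kept items win, dropped items lose."""
    pairs = []
    for item in all_items:
        if item in relevant_items:
            pairs.append(([item, THRESHOLD], item))       # kept item beats the threshold
        else:
            pairs.append(([item, THRESHOLD], THRESHOLD))  # dropped item loses to the threshold
    return pairs

all_items = ["Item 1", "Item 2", "Item 3", "Item 4", "Item 5"]
relevant = {"Item 1", "Item 2", "Item 4"}
for pair in threshold_comparisons(all_items, relevant):
    print(pair)
```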
Understanding Reach (TURF) in Relevant Items MaxDiff
TURF (Total Unduplicated Reach and Frequency) is a method used to identify the combination of items that maximizes reach across a population. In a Relevant Items MaxDiff context, this means finding the set of items that ensures as many people as possible are “reached” by at least one item that passes a particular threshold.
Running TURF analysis on a traditional MaxDiff is straightforward: all respondents evaluate the same set of items, so every item has a chance to reach everyone in the study. With Relevant Items MaxDiff, however, we must take a few things into account.
Because each respondent only sees a personalized subset of the full item list (based on what’s relevant to them), some items are missing for certain respondents, and those missing items should not influence the TURF result for those individuals.
To address this, we can treat any item not seen by a respondent as having zero reach for that individual. Behind the scenes, this is handled by assigning a very low utility (–30 for example), which effectively removes the item from consideration. Without this adjustment, TURF might wrongly assume a missing item could still contribute some reach, even though the respondent never evaluated it.
However, this adjustment isn't always necessary. For example, if items were missing at random, or if they weren’t shown for reasons unrelated to relevance (i.e., not removed due to being indicated as irrelevant), you won't need to treat them as having zero reach. It’s important to understand why items were missing to choose the right approach for TURF.
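Here is a minimal sketch of how “zero reach for missing items” can be applied before a TURF search. The –30 stand-in utility mirrors the adjustment described above; the reach threshold, the data layout, and the exhaustive search over portfolios are illustrative assumptions rather than any product’s actual implementation.

```python
from itertools import combinations

NOT_SEEN_UTILITY = -30.0  # effectively removes unseen items from consideration
REACH_THRESHOLD = 0.0     # assumed rule: an item "reaches" a respondent if its utility exceeds this

def apply_zero_reach(utilities, seen_items):
    """Overwrite the utility of any item the respondent never saw with a very low value.
    (HB can impute utilities for unseen items, which is why the overwrite is needed.)"""
    return {item: (u if item in seen_items else NOT_SEEN_UTILITY)
            for item, u in utilities.items()}

def reach(respondents, portfolio):
    """Share of respondents reached by at least one item in the portfolio."""
    hit = sum(any(r[item] > REACH_THRESHOLD for item in portfolio) for r in respondents)
    return hit / len(respondents)

def best_portfolio(respondents, items, size):
    """Exhaustive search for the portfolio of a given size with the highest reach."""
    return max(combinations(items, size), key=lambda p: reach(respondents, p))

# Two respondents; the second only saw "Item 3", so the other items get zero reach for them.
r1 = apply_zero_reach({"Item 1": 1.2, "Item 2": 0.4, "Item 3": -0.8}, {"Item 1", "Item 2", "Item 3"})
r2 = apply_zero_reach({"Item 1": -0.5, "Item 2": 0.2, "Item 3": 0.9}, {"Item 3"})
print(best_portfolio([r1, r2], ["Item 1", "Item 2", "Item 3"], size=2))
```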
Real-World Example: Ice Cream Flavours
Let’s say you’re researching preferences for different ice cream flavours. Instead of forcing respondents to evaluate options they’d never eat, Relevant Items MaxDiff starts with a simple question:
"Which of these flavors would you consider eating?"
From this first selection question, only the chosen flavours are pulled into the MaxDiff task. So, if a respondent selects Peach, Raspberry, and Mango, the MaxDiff task will only feature trade-offs between those.
Relevant Items MaxDiff can also be set up the other way around: we ask respondents to select only the flavours they would never consider, and those are simply removed from the MaxDiff exercise.
Once we have collected our data, and before running our HB model, we must mark the missing items as either inferior or missing at random. In this example, the items excluded from the MaxDiff are the ones the respondent did not select as relevant, so we would mark them as inferior.
Once we run HB, we can generate utilities for each item, just as with any standard MaxDiff, and see the relative preference for each ice cream flavour ranked from most to least preferred.
With those utilities generated, we can then switch to our TURF simulator, where we define the number of items we want to include in each portfolio and the total number of portfolios we want.
Before starting the simulation, make sure to select the checkbox for “Missing items have zero reach.” This tells the TURF simulation to treat any item a respondent did not see as having no chance of reaching that respondent.
However, if the unseen items were excluded from the MaxDiff because they were missing at random, we can keep the traditional reach calculation by simply leaving the “Missing items have zero reach” checkbox unchecked.
How It Compares: Express and Sparse MaxDiff
When handling high numbers of items in your MaxDiff, there are two other commonly used methods which, depending on the context of your research, can also serve as great alternatives.
Express MaxDiff
While Relevant Items MaxDiff is ideal when you have items that may not be relevant to all respondents, what if all your items are potentially relevant, but there are simply too many of them?
Express MaxDiff handles this by randomly selecting a subset of items (say, 30 out of 60) for each respondent and showing each of those items 2–3 times. This yields more stable individual-level estimates without overloading the respondent with the many tasks that showing every item multiple times would require.
When doing this, we need to ensure that “Missing at random” is selected as the item treatment for any items not seen by the respondent. With this setting, HB will still estimate a utility value for the missing item. However, this estimate is based on population-level average preferences and covariances. As a result, if most respondents rate the item highly, a respondent who didn’t see it will receive an imputed utility estimate similar to the average respondent’s preference.
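For contrast with the relevance-based screener, here is a hedged sketch of how an Express-style allocation could work: draw a random subset of items for each respondent and spread each drawn item across the tasks a couple of times. The function and the round-based layout are illustrative assumptions, not the design algorithm of any specific platform.

```python
import random

def express_allocation(all_items, subset_size, appearances, items_per_task, seed=None):
    """Illustrative Express MaxDiff allocation: one respondent sees a random subset
    of the full list, with each drawn item appearing `appearances` times."""
    rng = random.Random(seed)
    subset = rng.sample(all_items, subset_size)
    tasks = []
    # Each round shows every drawn item exactly once, so an item never repeats within a task;
    # when items_per_task divides subset_size evenly, every task is full-size.
    for _ in range(appearances):
        order = subset[:]
        rng.shuffle(order)
        tasks += [order[i:i + items_per_task] for i in range(0, len(order), items_per_task)]
    rng.shuffle(tasks)
    return tasks

full_list = [f"Item {i}" for i in range(1, 61)]  # 60 items in total
tasks = express_allocation(full_list, subset_size=30, appearances=2, items_per_task=5, seed=7)
print(len(tasks), "tasks;", tasks[0])            # 12 tasks of 5 items
```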
Sparse MaxDiff
If the idea of having respondents not see all items doesn’t align well with your research goals, but you have far too many items to show each one 2–3 times, then Sparse MaxDiff can be an excellent option.
Sparse MaxDiff takes a full-coverage, low-burden approach where each respondent sees all or most items only once. This allows you to cover a lot of ground across your full sample with a minimal amount of task burden. However, as the data we collect in this method is particularly sparse at the respondent level (with each item shown only once), the individual-level results of the HB analysis are likely to be highly influenced by the population-level averages.
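The pull toward population-level averages under sparse data can be illustrated with a simple precision-weighted shrinkage formula. Real HB estimation is a full Bayesian model across all respondents and items, so treat this only as intuition; the prior_strength parameter and the numbers are illustrative assumptions.

```python
def shrunk_estimate(individual_signal, n_observations, population_mean, prior_strength=10.0):
    """Toy shrinkage illustration: with few observations per item, the individual-level
    estimate is pulled strongly toward the population mean."""
    weight = n_observations / (n_observations + prior_strength)
    return weight * individual_signal + (1 - weight) * population_mean

# The same individual signal, seen once (Sparse-style) vs. three times (Express-style).
print(shrunk_estimate(individual_signal=2.0, n_observations=1, population_mean=0.0))  # ~0.18
print(shrunk_estimate(individual_signal=2.0, n_observations=3, population_mean=0.0))  # ~0.46
```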
If you want to dive deeper into how Relevant Items MaxDiff can help you manage long item lists, then you can build your own Relevant Items MaxDiff by simply creating a free Discover demo account here.