Adaptive Pooled Learning via MaxDiff

Last updated: 16 Jun 2022

Pooled Learning in MaxDiff

We can expand our insights capabilities and reduce survey data collection costs with pooled (group-based) adaptive learning.  This involves learning from early respondents so that we can customize the MaxDiff questions shown to later respondents for greater precision.  This adaptive approach allows us to tackle more ambitious insights problems than previously thought possible, with reasonable sample sizes and yet remarkably high precision. 

Quick Background on MaxDiff 

MaxDiff (best-worst scaling) has emerged over the last 20 years as a better way to measure people’s preferences and opinions than traditional rating scales. We can use it to ask people about: 

  • preferences for products/services, features, flavors, etc. 
  • the importance of features regarding products or services, etc. 
  • agreement with attitudes, psychographic profiles, political platforms, etc. 

Researchers often study around 16 to 24 total “items” in a MaxDiff, shown typically 4 or 5 at a time to a respondent, such as: 

[Figure: an example MaxDiff question]

A respondent typically completes 8 to 15 such MaxDiff questions, where across the questions the respondent sees each item typically 2 to 3 times.  Analysis leads to individual-level scores that can be scaled to sum to 100. 
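The scaling to 100 can be illustrated with a short sketch. The raw utilities below are hypothetical, and the rescaling shown (exponentiate and normalize) is just one common convention, assumed here for illustration rather than taken from Sawtooth Software's exact formula:

```python
import math

# Hypothetical raw logit utilities for five items (illustrative only;
# real MaxDiff utilities come from an estimation step such as HB logit).
raw_utilities = [1.2, 0.4, -0.3, 0.9, -1.0]

# One common rescaling: exponentiate each utility and normalize so the
# resulting scores are all positive and sum to 100.
exp_utils = [math.exp(u) for u in raw_utilities]
total = sum(exp_utils)
scores = [100 * e / total for e in exp_utils]
```

Because the transformation is monotonic, the item with the highest utility also receives the highest score, and the scores are easy to read as shares of preference.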

Adaptive Pooled Learning 

Sawtooth Software’s MaxDiff within the Lighthouse Studio platform gives the enterprising researcher some amazing tools for pooled adaptive learning.  If you use the “Bandit MaxDiff” option to create customized lists of items for your MaxDiff or for other questions in your survey, you are pooling the collective wisdom of past respondents to customize the tradeoffs for the next respondents who take the questionnaire. 

For example, after each respondent takes your MaxDiff survey, Lighthouse Studio stores those preferences for the items…and that information can be leveraged in clever ways for targeting the next respondents with more relevant questions, to: 

  • Gain much higher precision regarding the more preferred items 
  • Ask follow-up questions (such as open-ends) regarding the more preferred items 

We typically ask each respondent to evaluate around 20 to 30 of the total number of items (since we don’t want to burden any one respondent with evaluating all 120+ items in a very large MaxDiff study).  Again, each respondent gets typically 8 to 15 MaxDiff questions.  After surveying a few respondents, we are just beginning to understand the population’s preferences, so the strength of the pooled wisdom is still relatively weak.  But, after 30 or 50 respondents, the shared knowledge is gaining strength.  After 100 or 200 respondents, we typically have learned a great deal about population preferences and are in a position to exploit that knowledge to greatly accelerate the precision of the remaining data collection.  Subsequent respondents are focusing their attention on comparing mainly the items that are rising to the top. 

(Warning, some technical jargon ahead!)  The statistical procedure that makes this happen is called "Thompson Sampling," a respected algorithm for solving what statisticians call multi-armed bandit problems.  (For more details, please see the reference under More Reading below.) 
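As a rough illustration of the Thompson Sampling idea (not Sawtooth Software's actual implementation, which works with full MaxDiff choice data rather than the simple win/loss tallies assumed here), imagine each item keeps a count of "wins" (times chosen best) and "losses." To pick the items to show next, we draw one random sample from each item's Beta posterior and keep the highest draws, so strong items are shown more often while weaker items still get occasional chances:

```python
import random

def thompson_pick(wins, losses, k):
    """Pick k items to show next by sampling each item's Beta posterior.

    wins/losses map item -> count; Beta(wins+1, losses+1) is the
    posterior under a uniform prior.
    """
    draws = {item: random.betavariate(wins[item] + 1, losses[item] + 1)
             for item in wins}
    return sorted(draws, key=draws.get, reverse=True)[:k]

# Hypothetical pooled counts after some number of early respondents:
wins = {"claim A": 90, "claim B": 8, "claim C": 6}
losses = {"claim A": 10, "claim B": 92, "claim C": 94}
shown = thompson_pick(wins, losses, k=2)
```

The randomness is the point: early on, the posteriors are wide and all items get sampled; as pooled evidence accumulates, the draws concentrate and the top items dominate the questions shown.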

Three Examples for Motivation 

Examples 1a and 1b: In one case you've got 120 product claims; in another, 1000 (gasp!) different graphics for package design.  In both cases you want to know which top three or so motivate the market.  Normally, you would think you'd need extremely large sample sizes (perhaps 1000 respondents in the first case or 30000 in the second) to get good answers.  But, with adaptive MaxDiff (Bandit MaxDiff), you can do this with MUCH smaller sample sizes than you'd probably guess. 

We’ve found that if the goal is to identify the top few items for the population among 120 items or more, adaptive Bandit MaxDiff can reduce the required sample size by 5x or more compared with standard MaxDiff.  That’s because standard MaxDiff shows each item an equal number of times across respondents, whereas adaptive MaxDiff leveraging pooled wisdom dramatically oversamples the items most preferred by the population.  So, later respondents generally focus their attention on comparing top items vs. top items. 

To put things more concretely, adaptive Bandit MaxDiff can accurately identify (with 90% accuracy) the true best three items out of 120 items with a sample size of right around 200 respondents.  It can identify the true best three items out of 300 items using right around 1000 respondents (again with 90% accuracy).  And, it can identify the true top few items out of 1000 with 90% accuracy using about 5000 respondents.  (For more details, please see the reference under More Reading below.) 

Example 2: You want to obtain solid scores for all 24 items in a study for each respondent, while boosting the precision of the top few items.  Again, adaptive Bandit MaxDiff provides a solution…in the form of what we’ve referred to in our software documentation as “boosted Bandit MaxDiff”. 

With this example, each respondent sees each item at least 2x.  But, the top few items as judged by previous respondents are evaluated 3x to 5x by each later respondent.  Each respondent doesn’t do any more work than for standard level-balanced MaxDiff, yet you get even more precise results for the population’s top few items of preference thanks to adaptive learning.  (For more details, please see the reference under More Reading below.) 
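The boosted exposure logic can be sketched in a few lines. The item names, the base of 2, and the boost of 4 below are all hypothetical choices for illustration; the real design algorithm in Lighthouse Studio also balances which items appear together in each question:

```python
def boosted_exposures(items, top_items, base=2, boost=4):
    """Exposures per item for one respondent: `base` for every item,
    `boost` for items that pooled learning has flagged as top performers."""
    return {item: (boost if item in top_items else base) for item in items}

items = ["claim A", "claim B", "claim C", "claim D"]
# Suppose earlier respondents have flagged claim B as a top performer:
plan = boosted_exposures(items, top_items={"claim B"})
```

Every item still appears at least twice for every respondent (so you get solid individual-level scores for all items), while the pooled favorites appear more often and accumulate extra precision.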

Example 3: You are studying 30 items using a standard MaxDiff, and you want to ask respondents to give open-end written opinions about the top few items as judged by the population.  Importantly, you don’t want to waste much time by asking respondents to type thoughtful open-end responses about items that are of interest only to a minority of respondents.   

In this case, we can employ standard MaxDiff that shows each item an equal number of times.  (Or we could employ the “boosted” Bandit MaxDiff as described in Example 2.)  At any point in the questionnaire, we can ask Lighthouse Studio to create a dynamic list of (say) two items, drawn from among the top performers of the 30 items as judged by previous respondents.  We then ask respondents to type open-end responses about those top few items.  A twist on this same approach is to wait until 100 respondents have already completed the questionnaire before asking for open-end responses. 
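The dynamic-list step could be sketched like this (a simplified illustration with hypothetical names and scores; Lighthouse Studio builds such constructed lists with its own list-building functions, and the 100-respondent threshold is the optional twist just described):

```python
def open_end_targets(pooled_scores, completes, k=2, min_completes=100):
    """Return the k items with the highest pooled scores for open-end
    follow-ups, or no items until enough respondents have completed."""
    if completes < min_completes:
        return []
    return sorted(pooled_scores, key=pooled_scores.get, reverse=True)[:k]

# Hypothetical pooled mean scores from earlier respondents:
pooled = {"item 1": 12.0, "item 2": 3.5, "item 3": 9.1}
targets = open_end_targets(pooled, completes=150, k=2)
```

Early respondents (before the threshold) simply skip the open-end section; later respondents only write about the items the pooled data has pushed to the top.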

By leveraging pooled adaptive learning, open-end responses in this example are focused much more heavily on the items that are rising to the top.  You’re wasting comparatively little respondent effort on open-end responses about the items falling to the bottom of the list in terms of group preference. 

Summary and Conclusion 

You can make your insights questionnaires much more efficient with adaptive algorithms for MaxDiff in Sawtooth Software’s Lighthouse Studio.  Based on the pooled insights from earlier respondents, later respondents’ questionnaires oversample the items rising to the top for the population.  This means we’re wasting relatively little time asking later respondents about inferior items for the population.   

Bandit algorithms in MaxDiff can help you tackle massive lists of items with practical sample sizes.  You can even leverage these algorithms to improve the accuracy of the top few items in the context of standard MaxDiff studies involving, say, 16 to 24 items. 

More Reading: 

Fairchild, Kenneth, Bryan Orme, and Eric Schwartz (2015), “Bandit Adaptive MaxDiff Designs for Huge Number of Items,” 2015 Sawtooth Software Conference Proceedings, pp. 105-118.