MaxDiff Sample Size Calculation Best Practices

Last updated: 30 Sep 2024


What is Sample Size?

Researchers strive to understand the populations they study, but reaching every individual in a target population is rarely feasible. For example, a marketer of soft drinks may want to learn more about soft drink consumers in the United States. Interviewing the hundreds of millions of consumers would be extremely costly and time-consuming. This is where understanding sample size becomes essential. Sample size refers to the number of respondents chosen from the larger population to participate in a study. By carefully selecting a representative sample, researchers can still gain meaningful insights about the entire population without needing to interview everyone. A properly calculated sample size allows for accurate, reliable data that reflects the larger group’s behaviors, attitudes, and preferences. In this article, we will explore how to determine the right sample size for a MaxDiff survey to ensure your results are both valid and cost-effective.

Get Started with Market Research Today!

Ready for your next market research study? Get access to our free survey research tool. In just a few minutes, you can create powerful surveys with our easy-to-use interface.

Start Market Research for Free or Request Product Demo

Sample Size for MaxDiff: Simple Rule of Thumb

In some ways, sizing samples for MaxDiff may be easier than for other designed experiment methods like conjoint analysis. This is because MaxDiff scales down (and up) very nicely. MaxDiff can give you accurate results even for an interview with a single respondent. So, when we think about sample size, we (usually) don’t need to worry too much about our ability to estimate the model itself (unless we have a very large number of items we want to measure in the MaxDiff, like >30). For most MaxDiff studies, the general rule that you should have at least 300 respondents in total and at least 200 for every separately reportable subgroup works just fine. When you need to push the limits by having more than 30 or so items, you may want to increase this rule of thumb accordingly (e.g., if you have 60 items instead of 30, you might want to double the sample sizes above). 
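If it helps to see that rule written down explicitly, here is a minimal sketch in Python. This is our own illustrative helper (not a Sawtooth Software function), and it assumes each separately reportable subgroup needs its own 200-respondent minimum:

```python
# Illustrative encoding of the rule of thumb above: at least 300 respondents in
# total, at least 200 per separately reportable subgroup, and scale those floors
# up proportionally once the item count passes roughly 30.

def rule_of_thumb_sample(num_items: int, num_subgroups: int = 1) -> int:
    """Return a rough minimum total sample size for a MaxDiff study."""
    scale = max(1.0, num_items / 30.0)   # e.g., 60 items -> double the floors
    total_floor = 300 * scale            # minimum for the total sample
    subgroup_floor = 200 * scale         # minimum per reportable subgroup
    return int(round(max(total_floor, subgroup_floor * num_subgroups)))

print(rule_of_thumb_sample(num_items=25, num_subgroups=1))  # 300
print(rule_of_thumb_sample(num_items=25, num_subgroups=3))  # 600
print(rule_of_thumb_sample(num_items=60, num_subgroups=1))  # 600
```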

Considerations for Right-sizing your MaxDiff Sample

Sometimes we want to be more statistically rigorous. We want to select the right size of sample that balances the cost of our research with its quality. The cost of your MaxDiff sample is easily expressed in whatever currency you’re using to pay the bill. When it comes to quality, however, the two forms of “currency” we use are confidence and power.

Confidence

When we want to express the accuracy of our MaxDiff utilities in terms of a margin of error, we’re talking about statistical confidence. For example, we might want to have 95% confidence that our MaxDiff utility estimate for a given item is accurate to within 0.25 utility points. In other words, we want to be 95% confident in a precision of 0.25 utility points.
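As a rough sketch of what that statement implies, assuming the usual large-sample (normal) confidence interval around an estimated utility (here written as u-hat, with z = 1.96 for 95% confidence):

```latex
\[
  \text{margin of error} \;=\; z \times SE(\hat{u}) \;=\; 1.96 \times SE(\hat{u}),
  \qquad\text{so}\qquad
  SE(\hat{u}) \;\le\; \frac{0.25}{1.96} \;\approx\; 0.128
\]
```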

Power

While confidence is all about how precise our confidence intervals are, power has to do with our ability to detect significant differences in statistical testing. It’s common in research to seek 70% or 80% power, just as 95% confidence is usually the target for precision.


Calculating MaxDiff Sample Size for Confidence

Assume our goal is to have a certain level of confidence that our utility estimate for a given item has a given margin of error. In that case we follow these steps: 

  1. Decide on the level of confidence we want. For this example, let’s use the common 95% confidence level. 
  2. Decide on the size of the margin of error we care about. Let’s say we care about when an item is 25% more likely to be selected than some other item. Using the logit choice rule, we find that when item A has a utility 0.223 higher than that of item B, it has a 25% greater chance of being chosen. (If you want the math, that’s exp(x + 0.223) = 1.25 × exp(x), because ln(1.25) ≈ 0.223.)
  3. Create your MaxDiff design and generate random responses from 400 artificial respondents. In Sawtooth Software’s Lighthouse Studio program you can automate this in the Data Generator function.
  4. Estimate the utilities from those 400 artificial respondents. 
  5. Now we look at the standard error from those 400 respondents. Let’s say it’s 0.16. 
  6. So we want the radius of our 95% confidence interval to be no more than 0.223. That means we want our standard error to be no more than 0.223 divided by 1.96 (the Z-value corresponding to 95% confidence), or 0.114.
  7. To find out how much sample we need, we multiply the 0.16 from step 5 above by the square root of 400/n, where n is the sample size we seek. The “square root of 400/n” part just allows us to take what we learned for n=400 and adjust it up or down for larger or smaller sample sizes.
  8. Solving for n, we find that a sample of 788 will give us the desired standard error of 0.114 for the 95% confidence interval. In other words, a sample of n=788 will give us a 95% confidence interval for the utility with a margin of error of +/- 0.223. (A short calculation sketch follows this list.)
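Here is the same arithmetic as a short Python sketch, assuming (as in the steps above) a standard error of 0.16 observed from 400 artificial respondents; substitute the standard error you observe from your own test design:

```python
import math

# Confidence-only sample size sketch. The 0.16 standard error is the value
# observed from the 400 artificial respondents in steps 3-5; swap in your own.
confidence_z = 1.96                            # z-value for 95% confidence
margin = math.log(1.25)                        # utility difference implied by the logit rule, ln(1.25) ≈ 0.223

se_at_400 = 0.16                               # standard error observed at n = 400
target_se = round(margin / confidence_z, 3)    # ≈ 0.114, as in step 6

# Standard errors shrink with the square root of sample size:
#   se_at_400 * sqrt(400 / n) = target_se  ->  solve for n
required_n = 400 * (se_at_400 / target_se) ** 2
print(round(required_n))                       # ≈ 788
```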

Calculating MaxDiff Sample Size for Stat Testing

Now our goal is a little different. In statistical testing we want to balance the risks of false positives and the risks of false negatives. The confidence level (the 95% above) gives us our limit on the risk of a false positive (5%). The risk of a false negative is measured by power, and we usually choose 70% or 80% power, which correspond to 30% and 20% chances of a false negative, respectively.

To balance both confidence and power (that is, the costs of both false positive and false negative results), follow these steps:

  1. Decide on the level of confidence and the level of statistical power that we want. For this example, let’s use the common 95% confidence level and the common 80% level of power. 
  2. Decide on the size of the difference we want our statistical test to be able to detect. Let’s say we again care about when an item is 25% more likely to be selected than some other item so the difference we want to be able to detect is a utility of 0.223. 
  3. We can use that same artificial data set we generated above and the standard error of 0.16 we discovered. 
  4. Now, because we care not only about the confidence level (whose Z-value for 95% we saw above was 1.96) but also about power, we need to add the Z-value corresponding to 80% power, which turns out to be 0.84. Now we want our standard error to be no more than 0.223 divided by 2.80 (2.80 = 1.96 from confidence and 0.84 from power). In other words, we want our standard error to be no more than 0.08.
  5. Just as in the confidence example above, to find out how much sample we need, we multiply our 0.16 observed standard error by the square root of 400/n and adjust n so that the result is 0.08. We find this happens at n=1,600. This means if the standard error around our utility is 0.16, we need a sample size of 1,600 to be able to detect a utility difference of 0.223 with 95% confidence (i.e., a 5% chance of a false positive) and 80% power (i.e., a 20% chance of a false negative). (A short calculation sketch follows this list.)
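And the same calculation with power added, again assuming the 0.16 standard error observed at n = 400:

```python
import math

# Confidence-plus-power sample size sketch, using the same assumed 0.16
# standard error observed with 400 artificial respondents.
confidence_z = 1.96                    # z-value for 95% confidence (5% false-positive risk)
power_z = 0.84                         # z-value for 80% power (20% false-negative risk)
margin = math.log(1.25)                # detectable utility difference, ≈ 0.223

se_at_400 = 0.16
target_se = round(margin / (confidence_z + power_z), 2)   # ≈ 0.223 / 2.80 ≈ 0.08, as in step 4

# Same square-root scaling: se_at_400 * sqrt(400 / n) = target_se
required_n = 400 * (se_at_400 / target_se) ** 2
print(round(required_n))               # 1600
```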

Get Started with Your Survey Research Today!

Ready for your next research study? Get access to our free survey research tool. In just a few minutes, you can create powerful surveys with our easy-to-use interface.

Start Survey Research for Free or Request Product Demo