Boosting Survey Efficiency with Relevant Items MaxDiff

Podcast

In this episode we dive into making surveys more efficient with the use of Relevant Items MaxDiff. Guest speaker Chris Chapman, founder and president of the Quantitative User Experience Association, shares his extensive experience and insights. Chris discusses the motivation behind Relevant Items MaxDiff, its implementation, and how it addresses issues in data quality, respondent experience, and actionable results. Using examples from his work at Google and a live demonstration involving movie ratings, Chris illustrates how the method can improve survey accuracy and focus. Learn about different MaxDiff options, the importance of pretesting, and how Sawtooth Discover can facilitate this process. Ideal for researchers looking to enhance their survey methodologies.

About Our Guest(s)

Chris Chapman is founder and president of the Quantitative User Experience Association, an industry educational nonprofit organization that offers classes and an annual conference, Quant UX Con. Previously he worked in user research and marketing analytics for 24 years at Google, Microsoft, and Amazon. Chris's books include R for Marketing Research and Analytics with Elea Feit (Springer, 2019) which has been used as a textbook in more than 160 colleges and universities and translated into Chinese and Japanese; Python for Marketing Research and Analytics with Jason Schwarz and Elea Feit (Springer, 2020); and most recently, Quantitative User Experience Research with Kerry Rodden (Apress, 2023).


Transcript

Automatically transcribed

Vanessa: Quick disclaimer before we start, this episode was originally recorded as a webinar and edited for podcast format. You can find the original webinar recording on our website at sawtooth.com or on our YouTube channel.

Now back to the show.

Brian McEwan: All right. We're really excited for today's topic. We've got a great guest with us, Chris Chapman. I've interacted with Chris for a number of years. He's been a frequent attendee at the Sawtooth Conference that we run. He has done a lot of great work for a lot of years, previously at places like Google, Microsoft, and Amazon. Currently he's the founder and president of the Quantitative User Experience Association, Quant UX.

They put on an absolutely excellent conference. Lots of people, lots of topics. It's a really great resource for researchers. And he has also written some nice books on using tools like R and Python for doing lots of really great types of analytical work. So, really great resources as well.

So I'm gonna go ahead and end now and turn the time over to Chris. Uh, take it away.

Chris Chapman: Thank you, Brian. Well, I'm delighted to be here today, and thank you everyone for joining. And also, I'll put in a thank you: Brian mentioned our annual conference at the Quant UX Association, which is an online hybrid, low-cost conference we do in November.

And we're delighted that Sawtooth is one of our platinum sponsors for the conference. So thank you to Sawtooth for that. Let me share my slides here.

Okay, so I'm gonna talk about Relevant Items MaxDiff: how it came about, the motivation for it, some options to run it, and a demonstration of how it works in Discover, both from the end-user respondent side as well as how you would set one up on your own. And I'll give a shout out to my former colleague at Google for more than a decade, Eric Bahna.

Eric was a product manager at Google and is now a group product manager there. And it was some questions from Eric about how to collect information from customers that originally spurred the ideas here. And so we will talk about some of that, and some of the slides go back to research that I did with him.

Okay, let's see. So the problem for product management is we need to prioritize moderate to long lists of features that we might implement, initiatives, messages to customers, preferences, needs, use case scenarios, whatever it is. And we often are looking at many of these, dozens perhaps, of things that we could work on.

And it's very difficult to ask customers about all of these. And so what often happens is we want to create a ranked list. So in this case, we're seeing some feature requests: number two, number one, number four, and so forth. And the product managers need to prioritize these into very high priority, a P0; moderate priority, a P1; or something to do later, which might be a priority two.

And we get sparse data from customers about this. So in talking to customer A, they might tell us three things they want, and customer B is focused on one thing in a support call, and so forth. So we typically get very sparse data from our information sources in product management.

But we would like to have dense data. So we think that out there, in the head of customer A, there's some rank of priority across these features. And customer B would have a different rank. And even though perhaps they're talking to us about feature number two today as the top, it's actually a low preference for them, if we could get inside their head in some way.

So we wanna get denser data from our customers. And one way to do this is to use MaxDiff. I'm sure many folks here are familiar with MaxDiff. The general concept is we take this long list of items, often between a dozen, 20, 30, maybe up to 40, although there's no limit in principle, and we break it into something that respondents answer a few at a time.

So this is an actual survey that I did for the Quant UX Association. We were prioritizing 14 classes and we said, well, among only these five at a time, consider these: which would you find most interesting and least interesting? And this gave us great data from our attendees of the conference about what they would like us to offer classes on.

And so we used it to prioritize those. MaxDiff randomizes these sets to avoid order effects, and it repeats a few times so people answer multiple screens. We collect more data, and we can model the preference statistically, both for within-respondent consistency as well as getting the distribution across the entire group of respondents.
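As a rough illustration of how sets like these can be randomized and repeated, here is a minimal Python sketch. It is not Sawtooth's design algorithm: the function name `make_screens`, the count of nine screens, the placeholder class labels, and the simple rule of showing the least-seen items first are all assumptions for illustration. Only the 14 classes shown five at a time come from the survey described above. Production designs typically also balance how often pairs of items appear together, which this sketch does not attempt.

```python
import random

def make_screens(items, per_screen=5, n_screens=9, seed=None):
    """Build randomized MaxDiff screens with roughly equal item exposure."""
    rng = random.Random(seed)
    shown = {item: 0 for item in items}  # times each item has appeared so far
    screens = []
    for _ in range(n_screens):
        # Prefer the least-shown items; a random tiebreak varies the sets.
        ranked = sorted(items, key=lambda item: (shown[item], rng.random()))
        screen = ranked[:per_screen]
        rng.shuffle(screen)  # randomize on-screen order to limit order effects
        for item in screen:
            shown[item] += 1
        screens.append(screen)
    return screens

# Hypothetical labels standing in for the 14 classes.
classes = [f"Class {n}" for n in range(1, 15)]
screens = make_screens(classes, per_screen=5, n_screens=9, seed=42)
for number, screen in enumerate(screens, start=1):
    print(number, screen)
```

On each screen the respondent would pick one "most" and one "least" item; the greedy exposure rule keeps any two items within one appearance of each other.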

So it's a very powerful technique, and it has some statistical advantages over, say, simply ranking. It's easier for respondents to do than trying to rank 20 things, and it also gives you better data. So what you get out of MaxDiff, briefly to recap, is a metric-scale kind of report of these.

And so we're seeing in this case, for the classes that we offered in the association, segmentation was the most highly desired class. And then we had a few that were more or less tied in kind of second place: psychometrics, choice modeling, and a survey masterclass. And I'll put in a plug: those two, choice modeling and the survey masterclass, we'll be teaching next month in Seattle.

So we get this long list. The anchor option in MaxDiff tells us whether everything is important or not. And so it could be that there's a rank order, but in fact very few of the things are things that customers actually want.

Or it may be, as in this case, that folks told us almost every class had positive interest. And so we can then know that probably any class we would offer would have some people interested, and then we can prioritize that versus other business requirements, like how much it costs, do we have instructors, and so forth.

So that's MaxDiff, but there are some problems with standard MaxDiff. I'll talk about each of these: one is data quality, another is the respondent experience, and the third is results that can be difficult to act on. So for data quality, what we found, and I'll show a bit more about this later, is that often in MaxDiff, in the B2B space especially but sometimes in the consumer space as well, people see an item and they say, I don't know.

Someone else does that, or I just don't know. I don't know how to answer. But MaxDiff forces people to state a preference, whether they know about the item or not. This is actually an advantage of MaxDiff in most cases, that it forces a respondent to choose, so they can't just say everything's important. They have to pick and choose.

And there was one I did recently around some initiatives to prioritize, and all the initiatives were good, and so it was very difficult for respondents to have to choose a worst. That's actually a good thing. It's good to force them into the trade-off. And in fact, the respondents in this case said that they appreciated being forced into the trade-off.

But if it's something they don't know about, then it's low quality data. And in the B2B space, we often see that tasks vary by roles, especially as companies get larger. And so we may want to prioritize things in a B2B platform, for example, across an enterprise as a whole. But we'll find that engineers, sales, finance, operations, and security see different sets of things, but they're not necessarily totally separate sets of things.

So sometimes salespeople also do finance, or operations may also do security, or management may also be engineering. And so there can be overlap among these things where we can't simply separate them into uniquely different surveys, but we need a way to adapt it to the roles. A second problem with MaxDiff is that it can be tedious.

These surveys can be long, and respondents can complain about that. And they would often tell us, when I worked at Google, that it would be nice to be able to say, I don't know, or no option, to a particular set. The results can be difficult. It can feel like we're wasting participants' time, again especially if we're dealing with high-value participants such as enterprise decision makers or developers, physicians, folks like that.

But even with consumers, we want to spend their time wisely and keep them engaged. And also we know, and this is something that has occurred in a lot of MaxDiff research, that it's often more important to differentiate among the items at the top rather than those at the bottom. So we get a stack rank.

But as product managers, generally we're not gonna spend a lot of time talking about whether the worst is actually the absolute worst. Occasionally there are reasons to discuss that, but mostly we're trying to prioritize good things at the top around how important they are. So we wanted a way to focus more on that.

To briefly mention a couple of other MaxDiff options, this is mostly for reference for you later. Adaptive MaxDiff allows a tournament-style progression. Things you like kind of get carried forward. It doesn't solve the problem of relevance of the item, whether I know how to rank an item. Express MaxDiff solves the length problem by subsetting the items and asking about a subset.

And Sparse MaxDiff does the same kind of thing. Rather than showing an item multiple times, it will show an item a small number of times, possibly only once, or maybe even not at all, per respondent. So it makes the survey shorter by getting sparser data per respondent. But that's a different problem than addressing the relevance of the items.

Okay, so now I'm gonna talk about relevant items. Eric Bahna and I originally called this Constructed, Augmented MaxDiff, which is the reason I do not work in product branding. So Bryan, I believe it was Bryan Orme at Sawtooth, suggested Relevant Items was a better name than Constructed, Augmented MaxDiff.

And I completely agree with that. So the first case where we used this was a B2B study. This was working in Google Cloud, specifically Compute Engine of Google Cloud, where Eric was the product manager and also ran the product backlog and prioritization across a number of teams.

So we wanted IT administrators to assess the importance of various features that we could work on, both new features that would be offered as well as features to improve Google Cloud for IT administrators and developers, but only ones that were relevant to their roles. So if they work on networking, then they should answer about networking.

But if they work on some other aspect of compute, then perhaps they wouldn't know about networking. So we needed to know: if they didn't know about networking and they said, that's not important to me, was it not important because it's not important to them, or because it's not important to the product?

And so in classic MaxDiff, there's no way to disambiguate those two. So we wanted to focus in on items that were relevant to them. And because we had a fairly long list of potential features, to save time we wanted to focus on the ones that were at least somewhat important to them. And so Relevant Items screens this list before conducting the MaxDiff.

And so what we did in this study was to ask, is this relevant to you? What you're seeing here is from the survey, but we disguised the items. So for feature 24, do they have visibility into it or not? And if they said that they did have visibility, then we add it to the list to go into the MaxDiff.

But another option is to ask, is it important to you at all? And so if they said they have visibility, we had a second task that said, is it at least somewhat important to you? And if it was somewhat important or more, we put it into the MaxDiff, because we wanted to prioritize things at the top. If it was not important to them, we didn't ask them about it.

Again, it was kept out of the MaxDiff to save time and to allow focus. But we coded the data; that's the augmented part of our infelicitous name here. And then we did the MaxDiff for the things that were tailored to the list of relevant items that they had. So it focused the task and made sure that we were collecting data from them on the things that they knew about.
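The two-stage screen just described (visibility first, then at least somewhat important) can be sketched in Python as below. This is an illustrative assumption of how the coding might look, not the code used in the Google study: the function name `construct_items`, the status labels, and the example answers are all hypothetical, and the feature names are placeholders in the spirit of the disguised items.

```python
def construct_items(master, has_visibility, is_important):
    """Split a master list into MaxDiff items and coded screen-outs."""
    status = {}
    for item in master:
        if item not in has_visibility:
            status[item] = "not_relevant"    # respondent has no visibility
        elif item not in is_important:
            status[item] = "not_important"   # visible, but below the cut
        else:
            status[item] = "maxdiff"         # carried into the MaxDiff
    maxdiff_items = [item for item in master if status[item] == "maxdiff"]
    return maxdiff_items, status

# Hypothetical respondent: placeholder feature names and made-up answers.
master = [f"Feature {n}" for n in range(1, 7)]
visible = {"Feature 1", "Feature 2", "Feature 4", "Feature 6"}
important = {"Feature 2", "Feature 6"}
maxdiff_items, status = construct_items(master, visible, important)
print(maxdiff_items)
print(status)
```

Keeping the per-item status, rather than only the final list, is what lets the screened-out answers be coded and used later in estimation.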

So the result was that 55% of the items in this study of IT administrators were irrelevant to the median respondent. And so what you're seeing here on the right-hand side, if it's red, that meant that they were saying that the item was not relevant to them. And you can see that for most of these folks, a majority of the items were not relevant, but it differed highly.

Some items were irrelevant to basically everyone, but other items were relevant to most of the respondents. And so we can use that data to understand respondent roles as a separate analysis, but it really tells us that our MaxDiff became more focused by tailoring this.

And we also looked at how many items were shown that were important to them. And we found that if we did this construction and relevant items task, then of the items that they saw on the MaxDiff, a higher number were things that they had rated as important. And so we boosted our ability to focus in on the items at the top and understand what was most important to these customers.

This led to changes in business priorities. So we felt like it got better data from the MaxDiff. And I have some slides in the appendices, and you can check our original Sawtooth research conference paper to compare with and without the relevant items task. But one thing I wanna highlight here is that getting higher quality data, and understanding really well the things that were at the top of importance for the customers, had a significant impact for us.

And I wanted to highlight one item in particular, feature request number six. Feature request number six was the second most important in overall priority, maybe tied with items four and 30 here, and behind item three, feature request number three.

But item six was the least expensive in developer implementation time of all of the features being considered, whereas number three, which was in first place, was quite an expensive feature to develop. And item six, as I recall, was something like a 10th or a 20th of the cost to develop of what item number three would be.

So by understanding the importance here and then adding in our resources and effort needed to implement it, we identified a clear winner. So this immediately went to the number one thing that the team would do, and then we prioritized the other items, again by cost and importance and ability to deliver, down the list.

And so having this higher quality data and focus, we thought, gave us much better insight into the top. And we heard from respondents that the surveys felt easier. Some of these were respondents who had taken them in multiple quarters over time, and they said that it was shorter and easier, and that they enjoyed the fact it was implemented.

And the executives liked this. So when I was working at Google, this led to an initiative to spread this method, both MaxDiff as well as relevant items, across multiple product areas. And so this led to teaching more than 25 internal classes about MaxDiff.

And so it became a very popular method among not only fellow analysts and researchers, but we had people doing it internally who were developers, product managers, folks from internal operations teams and so forth, who were doing this for their own problems. Sometimes just collecting data from their own team members to prioritize things.

Okay, so now I'm gonna talk about how to implement this in Sawtooth Discover. At a high level, there are two approaches, which we've already seen briefly here. One is to prescreen for relevance, which is, does someone have understanding or experience or insight into something? And so, if we think about movies for example, we might say, well, which of these movies have you seen?

If I'm the Academy of Motion Pictures and I want to have people vote on the Oscars, well, I want them to have seen the movie before they can vote on it, perhaps. Or, as in the B2B case that I showed: which of these tasks do you perform? Before you rank them, I want to know if you understand it. But we might also, on a different note, prescreen for importance, which is an expectation or a preference or a liking or something like this.

And so in the movie space, I might say, which of these movies do you believe would be good? Maybe you haven't seen it, but you have an expectation or some knowledge about it. Which of these features would you like? Which of these features are at least somewhat important to you, even if you don't have experience, perhaps?

Which destinations would you consider visiting for holiday, and so forth? So this is a way of pre-screening the list according to a cutting point, and then focusing the MaxDiff on the things above it. And I recommend choosing one of these paths when you're doing this. It's possible to put them together.

I'll talk about that later. There's a cognitive difference in how you're focusing the task and how respondents will answer it. And so, as I just mentioned, you can do both, but there are some challenges to that, which I'll get to. So to drill into the movie ratings a bit, the reason I chose this task was that Sawtooth shared a survey a few months ago where they asked about favorite Star Wars characters.

And you took a MaxDiff. It wasn't a relevant items MaxDiff, it was just a standard MaxDiff of who's your favorite Star Wars character and who's your least favorite Star Wars character?

And I took the survey and, as a resident of the Star Trek universe, I had no idea. I knew maybe five Star Wars characters, and I could not take the MaxDiff, because on the first screen I was immediately hit with, I think, three characters who I didn't even know. And I felt like I couldn't do it.

So relevance here would be asking which of these characters do I know, or maybe which Star Wars movies have I seen, and so forth. But you could also ask me, among all movies, which I believe are best, whether I've seen them or not. So that'd be a different way to do this. I've created a survey. I took some Oscar best picture winners from an online spreadsheet.

And in the survey, one branch is to ask about relevance: which of the following movies have you seen? Then we're only gonna ask for the best and worst among the movies that people know. And then another branch is importance: for each of the movies below, whether you've seen it or not, do you think it's good or not so good?

And so I might believe that a movie is not good, and that's why I have not seen it. And so this is a valid way to collect data, but it's a different question, and we would probably use it for a different analytic decision that we wanna make. So at this point, I've got a survey here. I'm gonna invite you to try it live, and I'm gonna pause for about six minutes to give folks time to do this.

And it implements both of these in Sawtooth Discover. So it starts with a relevance screen for movies and then an importance screen for movies. And so it'll be a live demonstration of both, which I hope will inspire some questions. So I'm gonna give us, at this point, another six minutes to take that.

And then we'll talk about some of the details. And eventually we will look at your data live.

I also went ahead and posted the link in the chat as well, as another avenue for people to get there. Oh, thank you, Brian.

Okay, so in Discover, to create a Relevant Items MaxDiff task, there are kind of two steps, and each one has three parts. So I'm gonna briefly go through that. The first thing is we're gonna create a master list that we'll select items from. So in this case, all 15 movies that I showed. And then we'll create a survey item that does the pre-screen to select things from that list, and then move those items to a dynamic list.

And so in Discover there's a list manager in the tools. Basically you create a list, and in this case I called it movie list, and then I took 15 movies and pasted them into the movie list. So that's the master list. Then we add a survey item. So in this case, I use two, but it could be a multiple selection.

Any number of columns are fine here, because you're gonna add some logic to select from them. But I added an item that says, for this movie list, have they seen it or not? And then I create a dynamic list. So back to the list manager, and I add a dynamic list, and it says: from the movie list, which is the master, I'm gonna carry items forward to the new dynamic list based on this question of have they seen it or not, and which columns they've selected.

So in this case, it's only a single column: yes, I have seen it. So if they say they've seen it, then we've got a new dynamic list called movie list seen. Next I add a MaxDiff exercise to use that list. And I write the question text: among the following films that you have seen, which one did you like most and which did you like least?

Then, rather than pasting in the items, I tell it to use this dynamic list that I created previously, movie list seen. Then go to the advanced tab for MaxDiff and tell it that it's a relevant items exercise. That gets into some of the backend.

For example, it will know to skip the MaxDiff exercise if there are not enough items selected. Some of you may have seen that, where it just skipped it. And it also gets into some of the estimation logic we'll talk about later. So that's how to do it. Basically, create this list and then a dynamic version of it, and then feed that into MaxDiff.
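Outside of Discover, the master list, dynamic list, and skip rule can be mimicked in a few lines of Python. This is only a sketch under assumptions: Discover is configured through its list manager rather than code, the helper names are invented here, and the minimum of three items is a hypothetical threshold, since the talk does not state what Discover uses. The movie titles are ones mentioned in the session; the answers are made up.

```python
MIN_ITEMS = 3  # hypothetical threshold; Discover's actual rule is not stated

def build_dynamic_list(master_list, answers, keep="seen"):
    """Carry forward master-list items whose pre-screen answer matches `keep`."""
    return [item for item in master_list if answers.get(item) == keep]

def should_run_maxdiff(dynamic_list, min_items=MIN_ITEMS):
    """Skip the exercise when too few items were carried forward."""
    return len(dynamic_list) >= min_items

movie_list = ["Oppenheimer", "Parasite", "Nomadland", "CODA", "The King's Speech"]
answers = {  # made-up pre-screen answers for one respondent
    "Oppenheimer": "seen",
    "Parasite": "seen",
    "Nomadland": "not seen",
    "CODA": "not seen",
    "The King's Speech": "seen",
}
movie_list_seen = build_dynamic_list(movie_list, answers)
print(movie_list_seen, should_run_maxdiff(movie_list_seen))
```

A respondent who selects nothing gets an empty dynamic list, so `should_run_maxdiff` returns `False` and the exercise would be skipped.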

Now, after you collect data, the estimation settings for the HB estimation depend on which of these two paths you've chosen. So if you've done it for relevance, where people only answer about things they know, then I recommend to set the HB estimation to treat the unselected items as missing. The question is, for any respondent, what do we do about the items they didn't select?

So if I answered about six movies that I've seen and there are another nine that I haven't seen, what should we do about the nine that I haven't seen? Well, if I said I haven't seen them and we only wanna know what's best among the things I've seen, then I should say I just don't know about them at all.

So they're missing at random; we have no data from me. Maybe I would like them, maybe I wouldn't. We don't know. It's random. So just ignore them for purposes of estimating my preferences. However, if we've asked about importance, movies that I believe are good, then for the ones that I said are not good, we know, because I said this in the response, that the ones at the top, the ones the MaxDiff asked about, are putatively better in my estimation.

In this case, we set the analysis to use those as inferior to the ones that were asked about, and so it's gonna add some implicit choice tasks where I said that those movies all lost to the movies that I said were better. And so you have to think through, in designing the survey, which of these paths you want to go down, and then set the HB estimation appropriately.
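To make the difference between the two paths concrete, here is a hedged Python sketch of how screened-out items might be turned into data before estimation. The function name `implicit_tasks`, the task dictionary format, and the choice of one synthetic task per retained item are assumptions for illustration; the talk does not describe how Discover codes these tasks internally, and the respondent's answers below are made up.

```python
def implicit_tasks(asked_about, screened_out, path):
    """Return synthetic choice tasks implied by the pre-screen answers."""
    if path == "relevance":
        # Unseen items are treated as missing: nothing is implied about them.
        return []
    if path == "importance":
        if not screened_out:
            return []
        # Each item above the cut is coded as beating the screened-out set.
        return [
            {"shown": [winner, *screened_out], "best": winner}
            for winner in asked_about
        ]
    raise ValueError(f"unknown path: {path!r}")

# Hypothetical respondent.
good = ["Oppenheimer", "Parasite"]
not_good = ["Nomadland", "CODA"]
print(implicit_tasks(good, not_good, "relevance"))
for task in implicit_tasks(good, not_good, "importance"):
    print(task)
```

On the relevance path nothing is added, so the model learns about the unseen movies only from other respondents; on the importance path the extra tasks push the screened-out movies below the ones the respondent rated as good.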

Okay. So at this point, at the risk of a live demonstration, I am going to switch over to Discover and we will take a look at what folks have said in this survey. Okay. And so here's the survey that people have taken live. You can see the task flow and how it worked. I've got 98 complete respondents.

Thank you. That is fantastic. And 42 who have not finished the survey. And I switch to the analysis tab. So first, let's see, among the movies that people have seen. By the way, I will say, in asking these prescreening questions, you collect other interesting data besides MaxDiff, so we can look at which movies people have seen the most.

And for this audience, it looks like Oppenheimer is the movie seen most, then Everything Everywhere All at Once and Parasite, for example, whereas relatively few people have seen Nomadland or CODA. And so we get that data, which is interesting. Now I switch to the MaxDiff relevance task.

This is gonna take a minute to run HB. So while it runs that, I'm going to go ahead and get it started on the other task. And let's recalculate there and, oops, error. Brian McEwan, something to work on there. Okay, so here we go. So on the relevance, it looks like there were 46 out of our 98 folks who had seen enough of the movies to go into the MaxDiff task.

So the other 52 folks, I suppose, had not seen enough of the movies to be able to do that. And so it just skipped over the task for them. So among the movies that folks have seen in this audience, it looks like Parasite is their favorite, then Everything Everywhere All at Once and Oppenheimer.

And as always in Discover, we could add confidence intervals. If I had asked something like, are you a member of the Academy of Motion Pictures or not, and we didn't have anything like that, we could segment it and break these out by roles here.

And now switching over to the importance. So, what people believed would be good movies, whether they've seen them or not. We'll see how similar the results might be. Ah, so in this case, we're seeing somewhat different results. Oppenheimer is the movie that people believe is good. We have more folks who've answered, 78 as opposed to 46.

So folks have awareness of what they think are good movies. We have effectively a tie here among The King's Speech, Parasite, Everything Everywhere All at Once, and 12 Years a Slave. And then we have, as is typical for MaxDiff, kind of a winner, a few things more or less tied, and then kind of a long tail of things relatively less preferred. Very common MaxDiff pattern.

But we see the difference in asking, through exactly the same movie list, about them in two different ways. And relevant items allows us to really drill in very well into exactly which way we want to ask about this list. So thank you for the live data.

So, a few discussion points. There's a trade-off here. As I've argued, relevant items allows a focused and more enjoyable MaxDiff, shorter surveys and fewer tasks, and higher quality data, depending on your question. However, there are two costs to obtain those benefits. One is that the screening task itself may become a bit long, and I'll talk about that on the next slide.

Then the second is you need a survey platform that does this. And as far as I know, Sawtooth is the only survey platform where this is even possible. So if you wanted to do it otherwise, it would require custom programming and some R estimation code. So what if this list of prescreening things is too long?

One possibility is to break it into chunks and only rate a few at a time. So I might say, here are five movies. Have you seen 'em or not? Here are five more. Have you seen 'em or not? Cat photo to refresh 'em, you know? And then a few more. Or, you say you work in network administration. Well, now which of these tasks do you do?

You said you're not interested in security auditing, so we'll just skip those items. So that's another way to break it into chunks and group them programmatically, the fourth item here, and then kind of assign them programmatically. You could pretest the item list and trim it, if you need to shorten it from, say, a hundred down to 30 or something. You could randomly screen subsets of items.

That gets into some questions about whether you're collecting enough data, and it's kind of a sparse version here where, if we had a hundred movies, we might ask, have you seen these 20? And then feed the answer to that into the MaxDiff. So you can do it various ways, according to your design goals here.

And one that I like, assuming that you believe the groups can be separated in a clean way, is to just have a higher-order aggregate task where, for example, if your role is security, then I add the eight security items. And you could do that by scripting the dynamic list. Okay.

So, a few questions. One, what if a respondent selects zero items? Here, Discover will skip the MaxDiff if there are not enough items, if it's a relevant items type. What if I have some that I want to appear every time? So I always want to rate my new feature that we're super excited about, and I wanna make sure that appears regardless of whether they say it's important or not.

Yes, you can add that. You can see in the list instructions that I have here on the lower right that you can add items, remove items, set the list length, and so forth. So there are a number of things besides simply selecting them. Can I add random items to ensure coverage? So what if nobody ever selects my item because they don't know about it?

You can use the tool to do that, to add some specific items. I won't get into all the logic of how to do this, but a general approach is, let's see, you would add all the items, randomize it, cut it to a number like two, then add the items that they say are important.
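The list logic just described (add everything, randomize, cut to two, then add the important items) can be sketched in Python as follows. The function name `coverage_list` and the choice to drop duplicates when a random pick was also marked important are assumptions; how Discover's list instructions handle such duplicates is not covered in the talk, and the feature names and answers are placeholders.

```python
import random

def coverage_list(master, important, n_random=2, seed=None):
    """Random coverage items first, then everything marked important."""
    rng = random.Random(seed)
    shuffled = list(master)       # step 1: add all the items
    rng.shuffle(shuffled)         # step 2: randomize
    result = shuffled[:n_random]  # step 3: cut to a small number
    for item in master:           # step 4: add the important items
        if item in important and item not in result:
            result.append(item)   # assumed: skip duplicates
    return result

# Hypothetical respondent: placeholder features and made-up answers.
master = [f"Feature {n}" for n in range(1, 11)]
important = {"Feature 2", "Feature 5", "Feature 9"}
final_list = coverage_list(master, important, n_random=2, seed=7)
print(final_list)
```

Because the random picks are drawn from the full master list, every item has some chance of being shown, even one that no respondent marks as important.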

And so now you've got two randomly selected items, and then all the things they said are important. So I'm thinking through the dynamic list logic here. You can do that. Then the question is, well, can I screen for both relevance and importance? So get all the tasks that are relevant, set a cut line for the things that are important, and then only do the things above the cut line.

I would say first, you're probably better off to reconsider and simplify the question to go down one path or the other. They're cognitively and procedurally, and in terms of the estimation logic, somewhat different. So I like to simplify whenever possible.

or in our original white paper to estimate it, or you can do it in Discover and use an anchored relevant items MaxDiff. So you do the relevant items, so they only see the things that they say are important to them, and then you add an anchored task after the MaxDiff, which says, among a few of the things at the top, middle, and bottom, is it important to you or not?

And that helps set this anchor line for what's important. It doesn't make the list shorter. The anchor comes after the MaxDiff. And so the MaxDiff will be about the relevant items, but then we add an anchor afterwards to determine this cut point. So for example, in the movies: which of these movies have you seen?

And then we might say, relatively best and worst. Maybe a given respondent believes they're all great movies. So we add an anchored item at the end that says, was this movie good or not? Or maybe they saw all the movies, but they only thought one of them was good. And so with the stack rank alone, we don't know what it's compared to.

So the anchor lets us add that kind of absolute comparison. So that's another approach here to do that. If you're saying, wow, that sounds a bit confusing and kind of tricky to write as a survey, I get to the most important point, which is to pretest, and pretest live, even if this just means that you have three people in a sample of convenience, like your partner or family members or some random person in your office. Pretest it.

And I have never written a survey that I didn't change in some way after pretesting to improve it. It's difficult to get the wording right in these relevant items.

And it's easy to make mistakes in these constructed lists and putting them together, which columns feed into what, and so forth.

So really pretest this, and do it with live respondents, even if they're just convenient folks. By the way, before we get to the end, I'll say follow us at the Quant UX Association. You can visit quantuxa.org and join our mailing list. We have a conference in November and various classes, including classes on MaxDiff and conjoint analysis and other related topics.

So to review today, and how you might choose among some of these MaxDiff options: if respondents understand every concept, then great, use standard MaxDiff. What are your favorite ice cream flavors? We can assume anyone can decode an ice cream flavor. Fine, standard MaxDiff. What if they don't understand one or two things?

So we've got our new feature we're excited about and we want to prioritize it against existing features. You can use standard MaxDiff, but you want to add an educational task, whether that's an explainer screen or a concept video. I often do this live, using MaxDiff in focus groups, and in the focus group we make sure the people understand the concept space and any new things we want to show them before they take the MaxDiff. That's kind of an expensive way to field things, but it gets extremely high quality data and qualitative insight as well as the quant.

If they shouldn't rate things that don't apply to them, so we don't want them to put something at the bottom just because they don't know about it, then use relevant items with the relevance approach.

If you want a shorter survey because the item list is too long, you have several choices: Sparse MaxDiff, where they see a small subset; Express MaxDiff, where again they see a subset of items; or relevant items, where they pre-select. What if you have a very long list and you want to identify the top items from it?

So, an example that I remember Sawtooth showed some years ago was among 300 cities to potentially host a conference. We wanna identify the top cities, but we don't wanna prematurely exclude anyone. There's a method called Bandit MaxDiff, which progressively hones in. It starts with everything, and then as the survey goes, it learns from respondents and focuses more and more of the tasks on the things near the top. So it never drops anything totally, but it zeroes in on estimation at the top. And so if you have a super long list, Bandit's a good way to go.

If your respondents are tired of reporting about things at the bottom, they say, "Look, I told you it's not important to me. Stop asking about it. The MaxDiff is tedious and too long. I want to tell you about the things that I really care about." Relevant items with the importance approach is a good way to do that. So, a number of these different things. I should have put a link here: Bryan Orme has a great paper that's called something like "Bandit, Express, Sparse: What to Do About All Those Different MaxDiffs."
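As a rough recap of the choices reviewed here, the decision logic can be written down as a small lookup. This is a sketch only: the function name `suggest_maxdiff_variants` and its flags are invented for illustration, and the rules simply restate the guidance from the talk rather than any official selection procedure.

```python
def suggest_maxdiff_variants(understands_all_items=True,
                             items_may_not_apply=False,
                             item_list_too_long=False,
                             find_top_of_very_long_list=False,
                             tired_of_unimportant_items=False):
    """Map survey conditions to the MaxDiff options mentioned in the talk.

    Hypothetical helper: flag names are assumptions for this example.
    """
    suggestions = []
    if not understands_all_items:
        # One or two unfamiliar concepts: teach them first.
        suggestions.append("Standard MaxDiff plus an educational task")
    if items_may_not_apply:
        # Respondents should not rank things that do not apply to them.
        suggestions.append("Relevant Items MaxDiff (relevance approach)")
    if item_list_too_long:
        # Several ways to shorten the survey.
        suggestions.extend(["Sparse MaxDiff", "Express MaxDiff",
                            "Relevant Items MaxDiff"])
    if find_top_of_very_long_list:
        # Keep every item in play but focus on the top.
        suggestions.append("Bandit MaxDiff")
    if tired_of_unimportant_items:
        # Respondents only want to talk about what they care about.
        suggestions.append("Relevant Items MaxDiff (importance approach)")
    # With no special conditions, the standard design is fine.
    return suggestions or ["Standard MaxDiff"]


print(suggest_maxdiff_variants(items_may_not_apply=True,
                               tired_of_unimportant_items=True))
```

In practice these conditions overlap, so the output is a list of candidates to weigh rather than a single answer.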

You can do a search for that paper. It's a white paper that explains these in more detail. Conclusion for relevant items: I believe you get higher quality data when respondents rate the items that are relevant to them. It keeps them more engaged and focused. You get more data, kind of depending on the estimation path.

And some of the details I won't go into here, but the data end up being augmented by more implicit choice tasks, in the importance branch anyway. For things that they said were unimportant, you don't have to ask about them again, but the tasks get effectively boosted by that. And respondents are happier.

The items are more relevant, the surveys are shorter. You probably saw that in the movie survey: in just a few minutes, you can answer two different MaxDiffs about movies in two different approaches and have a relatively good experience. Got a couple links here.

website out there. So when you get these slides later, you can find these, and there's an original technical white paper that I did with Eric Bahna that was at the Sawtooth Research Conference back in 2018. And now you've seen this approach. It's a great thing about their conference: people present and talk about things, and then when they prove out over time, they may end up in the platform.

So it's a great kind of partnership with industry. I've got my email address here, chris@uxa.org, so feel free to shoot me an email and I'll try, but I think for now that covers it.

Brian McEwan: Well, we really appreciate your time, Chris. As you all have seen, Chris is one of the nicest guys in market research. He's always willing to help people like me come to better understandings of things.

We really appreciate his willingness to share that with all of us. Thanks again, Chris, and we'll see you around.

Chris Chapman: Well, thanks for having me. Thanks everyone for listening today.