Latent Class shows different outputs than HB Analysis for different groups

Hello everyone,

We have a question. For our university project we analyzed survey data with both latent class (LC) analysis and CBC/HB. We took the group memberships from the LC run and copied them to the HB output (individual utilities, zcdiffs), then sorted the HB respondents into those groups and calculated the average utilities, average importances, and standard deviations for each group (2- and 3-group solutions).
Then we compared the outcomes. In general, the average utilities differed somewhat, but more or less the same attribute characteristics were preferred (e.g., the attribute characteristic "Ongoing Help" still had the highest utility). The average importances, on the other hand, showed major differences (e.g., Latent Class for Group 1: 80% importance; HB for Group 1: 50%).
Now we were wondering why there are those major differences. Can anyone help us?

We hope our problem is clear.

Thank you in advance.
asked Jun 26, 2018 by Rafaela, Bene, Jonas

2 Answers

0 votes
When you obtain utilities and importances from Latent Class, these are group utilities for each class, where respondents have probabilities of belonging to each class.  For example, a respondent might be 80% in class 1 and 20% in class 2.

With just a 2- or 3-group solution, there is a lot of smashing people together into groups who probably do not have identical preferences.

Now, if you take the group memberships implied by the latent class run, where each respondent is fully assigned to the class with the highest probability of membership, and then cut your HB results by those segments, you should expect to sometimes see results that differ from latent class, especially for the importance scores.

Why?  Because importance scores under HB analysis are computed for each respondent individually (to make them sum to 100% within each respondent) and then are averaged across respondents.  But for latent class, within the same 2- or 3-class segment, there can be quite strong differences in preference between people (difference in rank-order of levels within attributes).  This especially takes hold and makes a difference for non-ordered attributes, such as brand or color.  If within the same segment different respondents have different preference order for brand, then this can accentuate the differences between importance scores from the HB utilities vs. the averaged latent class run.

As a quick example, imagine that two respondents belong nearly 100% to the same latent class.  Also, imagine that there are just two levels of brand and the two respondents don't agree on which of the two brands is best.  Their brand preferences will cancel one another out on average, leading to a brand importance score of nearly 0 when the latent class importance calculation is based on the average of those two respondents.  However, under HB, the importance scores are computed separately for these two respondents, so there is no canceling out and the brand importance comes through strongly, irrespective of the fact that the two respondents don't agree on brand preference order.
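To make the two-respondent example concrete, here is a minimal Python sketch. The utility values, the second attribute ("price"), and the function names are all made up for illustration. It also assumes that averaging the two respondents' utilities is a fair stand-in for the group-level utilities a latent class run would estimate for that class, and that importances are computed as each attribute's utility range divided by the sum of ranges.

```python
def importances(utils):
    """Importance (%) per attribute: utility range divided by the sum of ranges."""
    ranges = {attr: max(levels) - min(levels) for attr, levels in utils.items()}
    total = sum(ranges.values())
    return {attr: 100.0 * r / total for attr, r in ranges.items()}


def mean_utilities(respondents):
    """Average the part-worth utilities level by level across respondents."""
    n = len(respondents)
    return {
        attr: [sum(r[attr][i] for r in respondents) / n
               for i in range(len(respondents[0][attr]))]
        for attr in respondents[0]
    }


# Hypothetical part-worths: the respondents disagree on brand, agree on price.
resp_a = {"brand": [2.0, -2.0], "price": [1.0, -1.0]}
resp_b = {"brand": [-2.0, 2.0], "price": [1.0, -1.0]}
respondents = [resp_a, resp_b]

# HB-style: compute importances per respondent, then average them.
individual = [importances(r) for r in respondents]
hb_style = {attr: sum(imp[attr] for imp in individual) / len(individual)
            for attr in resp_a}

# Group-style (stand-in for latent class): average utilities first,
# then compute importances from the averaged utilities.
lc_style = importances(mean_utilities(respondents))

print("HB-style importances:   ", hb_style)
print("Group-style importances:", lc_style)
```

With these made-up numbers, brand keeps about two thirds of the importance under the per-respondent calculation, but drops to zero once the opposing brand utilities are averaged before the ranges are taken.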

So, you will often see the importances for non-ordered attributes be accentuated (larger) under HB than for latent class for the same segment of respondents.

If you run a much higher-dimension latent class solution (such as a 24-group solution), then you will find that the importances for HB and latent class segments become much more similar.
answered Jun 26, 2018 by Bryan Orme Platinum Sawtooth Software, Inc. (174,415 points)
Thank you so much for your quick response. It has helped us a lot.
0 votes
Use LC to understand the groupings, but drop its utilities; use the HB utilities instead to crosstabulate and report.

Or, better, re-run HB with the LC group memberships as covariates.  I know this will be condemned as "double-dipping" into heterogeneity.  But in fact any other covariate brought into the analysis is an uninformed guess or expert opinion, which should be condemned for the same reasons.  LC as a covariate is an informed, data-driven guess.

I would rather rely on data-driven covariates than on expert opinions about what the covariates should be.
answered Jun 27, 2018 by furoley Bronze (885 points)