With Relevant Items (Constructed List) MaxDiff, items on the pre-defined (master) list are sometimes missing from a respondent's MaxDiff questions. For example, we might drop items the respondent said (in a previous question) are not at all important. Or we might drop certain role-specific items depending on whether the respondent works in an accounting or a sales function.
When items are dropped from (not included in) a respondent's MaxDiff questionnaire, we have to tell the utility estimation procedure how to treat the missing items. You can specify the treatment of missing data as a general (global) setting, or on a Custom item-by-item basis (i.e., some missing items may be treated differently from others).
For the purposes of utility estimation, there are three treatments in our software for dropped items: Missing at Random, Missing Inferior, and Missing Unavailable. The three treatments may be applied whether using HB (individual-level) estimation, latent class MNL, or aggregate logit. In all cases, we obtain an imputed utility estimate for missing items.
1) Missing at Random: If the item was dropped at random from a respondent's list of items to evaluate in the MaxDiff questions (e.g., Express MaxDiff), there is no systematic reason to assume the item should have a worse or better score because it is missing. Rather, the respondent provides no information regarding the item's score, and the overall population (mean) preference should not be affected. No information is coded in the respondent's design matrix for missing items. Note that if employing HB estimation (not necessarily recommended in the case of missing-at-random items), HB will estimate a utility value for a missing item for the respondent, based on draws from the population-level mean preferences and covariances. Therefore, if the majority of respondents who saw the item judged it as highly important relative to other items, a respondent who did not see that item in the questionnaire would likely receive a relatively high imputed utility estimate (via HB estimation).
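For intuition, here is a minimal sketch of that imputation idea, assuming a simplified HB upper-level model with a hypothetical population mean vector `alpha` and covariance matrix `D` (both made up for illustration). With no individual-level information about the missing item, each draw for it comes from the population distribution, so the imputed utility gravitates toward the population mean:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical upper-level HB parameters (illustration only):
# population mean utilities for 4 items and their covariance.
alpha = np.array([1.2, 0.4, -0.3, -1.3])   # population mean betas
D = np.diag([0.5, 0.5, 0.5, 0.5])          # population covariance

# Suppose this respondent never saw item index 2. Absent any
# individual-level signal, draws for that item follow the
# population distribution.
draws = rng.multivariate_normal(alpha, D, size=1000)
imputed = draws[:, 2].mean()
print(f"Imputed utility for unseen item: {imputed:.2f} "
      f"(population mean {alpha[2]:.2f})")
```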
2) Missing Inferior: If the item was dropped from a respondent's list of items to evaluate in the MaxDiff questions because the respondent previously indicated it is worse (or less important) than the items carried into the MaxDiff questions, then we should inform utility estimation that the dropped items are inferior to the included items for this respondent. Otherwise, our model will be biased.
We use data augmentation to inform utility estimation that dropped items should have lower utility than included items for the respondent. We add a new reference item (a threshold item) to the design matrix, so an extra parameter is estimated. We then add new paired-comparison choice tasks to each respondent's data matrix: one task for every item on the master pre-defined list. Dropped items are compared to the threshold item and lose; included items are compared to the threshold item and win (see the sketch below). This strongly influences (but does not constrain) the dropped items to be worse than the included items for a given respondent. The fact that the respondent dropped an item from the MaxDiff exercise also influences the pooled population estimates, by informing the estimation that this respondent sees dropped items as relatively less preferred. The utilities for missing items are restrained by the prior means and variances, so rather than plummeting toward negative infinity they tend to be quite low on the logit scale, but not extremely so.
Although the threshold item is included as a parameter during utility estimation, it is not meaningful for the final analysis, so it is not displayed in the utility reports or reported as a new parameter in the utility run.
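The following sketch illustrates the augmentation under assumed, simplified coding; the item indices, the `THRESHOLD` index, and the task format are hypothetical stand-ins, not the software's internal representation:

```python
# A minimal sketch of the Missing Inferior data augmentation.
n_items = 6                 # size of the master pre-defined list
included = {0, 1, 3}        # items shown to this respondent
THRESHOLD = n_items         # index of the extra threshold parameter

augmented_tasks = []
for item in range(n_items):
    # Each synthetic task is a paired comparison: item vs. threshold.
    # Included items beat the threshold; dropped items lose to it.
    winner = item if item in included else THRESHOLD
    augmented_tasks.append({"alternatives": (item, THRESHOLD),
                            "choice": winner})

for task in augmented_tasks:
    print(task)
```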
3) Missing Unavailable to be Chosen: This approach only applies to HB and latent class MNL, where individual-level utilities are estimated and stored in the utility run. It follows the same approach as Missing at Random, with one large difference: after utility estimation is done, the scores for missing items are replaced with a large user-specified negative utility, such as -20. This value is low enough that simulated choices (such as TURF or Share of Preference simulations) should give the respondent a near-zero likelihood of choosing a missing item within a set that includes non-missing competitive items. Select this method only if the fact that an item is missing for a respondent shouldn't influence other respondents' estimates via latent class MNL or HB's upper-level model, and if the likelihood of selecting a missing item should be near zero for the respondent.
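To see why a value like -20 is low enough, here is a sketch of a logit (Share of Preference) computation with made-up utilities, where the third item was missing for this respondent and has been replaced with -20:

```python
import numpy as np

# Hypothetical individual-level utilities; item index 2 was missing
# and replaced with the user-specified value of -20.
utilities = np.array([1.1, 0.3, -20.0, -0.6])

# Standard logit shares of preference.
shares = np.exp(utilities) / np.exp(utilities).sum()
print(shares.round(6))   # the missing item's share is effectively zero
```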
Custom: This approach allows you to customize the missing item treatment by item, or by respondent and item. For global treatment of items (the same across all respondents), you can specify the missing item treatment per item. If different respondents should be treated differently regarding missing items, you can specify a variable for each item containing a value that indicates how to treat the missing item for that respondent: 1=missing at random, 2=missing inferior, or 3=missing unavailable.
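As a sketch of that coding convention (the respondent IDs and data layout are hypothetical), a per-respondent vector of codes, one per item, maps to treatments like this:

```python
# 1=missing at random, 2=missing inferior, 3=missing unavailable
TREATMENTS = {1: "missing_at_random",
              2: "missing_inferior",
              3: "missing_unavailable"}

# One code per item; the code applies only if that item is
# missing from the respondent's questionnaire.
respondent_codes = {
    "resp_001": [1, 2, 2, 3],
    "resp_002": [1, 1, 2, 3],
}

for rid, codes in respondent_codes.items():
    print(rid, [TREATMENTS[c] for c in codes])
```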
Independently of the missing item treatment, some researchers may want to set missing items back to missing (blank) fields in a data export of the HB utility scores. Therefore, irrespective of which missing item treatment you specify for HB or latent class MNL estimation, you can subsequently choose to export the utility data with missing items set back to blank fields at the individual level.
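A minimal sketch of such an export step, assuming a hypothetical respondent-by-item availability indicator `seen`, with NaN standing in for a blank field:

```python
import numpy as np

# Hypothetical exported utilities (rows = respondents, cols = items);
# -20 marks items treated as Missing Unavailable during estimation.
utilities = np.array([[1.1,  0.3, -20.0, -0.6],
                      [0.2, -0.1,   0.9, -20.0]])

# True where the respondent actually saw the item.
seen = np.array([[True, True, False, True],
                 [True, True, True, False]])

# Blank out (NaN) items the respondent never evaluated.
export = np.where(seen, utilities, np.nan)
print(export)
```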