External variables (data)

Introduction

Importing External variables enables you to add existing data from external sources to your Discover survey for analysis. This is useful if you already have a database of respondent information and want to append it to each record, avoiding the need to ask respondents for the same details again in a survey question.

Note that the variables you import are only accessible in the Analysis area (e.g., in crosstabs and segmentation) and cannot be used in the Create area for survey authoring.

Importing external variables

To begin importing External variables:

Step 1 - Navigate to the Data > Records area of Discover and click the Define variables button on the External variable card

Step 2 – Click the Import variable button (Plus/Add button icon) in the External variable manager to open the import dialog.

Import External Variable Manager

Step 3 – Indicate which respondent ID type (Record ID or Unique pass-in ID) you are using in your import file and select the Excel (.xlsx) file from your hard drive. Then press the Import button.

Note: See the file formatting section below for instructions on how to properly prepare your file for upload.

Import External Variable Files dialog in Discover

Step 4 – If the file imports successfully, you'll return to the External variable manager where you can see the imported variables in the left-hand sidebar. If desired, you can provide Internal labels for the variable values imported. The Internal labels will appear in analysis reports and downloads.

External Variable Manager with imported variables in Discover

Data and file formatting

The data that you import is treated as categorical data; meaning all data values will be bucketed or put into categories of similar data. For example, if the values 1, 2, 4, and 1 were imported across four respondents, the respondents with the value of 1 are bucketed together for that variable.

For the file to be read and imported correctly, it must follow the correct formatting. There are three primary rules that need to be followed:

  1. The top row of the file must contain the names of the variable(s) you wish to import, each column containing a new variable.
  2. The first column must contain the unique respondent identifier variable corresponding with the ID specified in the import dialog. If you indicate that you’re using Discover’s internal record ID in the import dialog, the IDs would correspond with the “Record ID” variable. If you indicate that you’re using your own Unique pass-in variable as the respondent ID, the name of the column would be the variable name that you defined in the Variable manager found in the Create or Share tabs.
    1. Each respondent in the survey should be included in the import file, identified by their unique ID and containing the data for each external variable to be imported.
  3. The data file must be less than or equal to 100,000 total cells. See the troubleshooting section below for more information. 

Troubleshooting

There are a small number of errors that may occur during the import process. Each error will result in an error dialog to help you diagnose the problem. See below: 

Additive imports

The nature of importing external data is additive. Meaning, if a file is imported with a variable name that matches the name of a previously imported variable the new data will be appended to the existing data. A confirmation dialog displays in this case and asks if you would like to continue and merge the existing variable.

Examples of additive imports:

  1. Importing a set of 1,000 respondent records for a defined external variable, then another set of 1,000 (different) respondent records will merge the two sets, not replace the existing records.
    1. However, note that if any records are repeated (have matching record IDs from the first set) in the second set then the existing data for those records will be overwritten by the new data.
  2. Importing a set of records with one set of variable values (i.e. 1, 2, 3, etc) then adding another set with different (or changed) values (i.e. One, two, three, etc) will merge both sets, not replace the existing values.

Unmatched IDs

The Unmatched IDs report indicates the number of unmatched IDs found, specifying whether they were in the imported variable file or the recorded survey database.

To address this error, remove the unmatched IDs from the import file and/or add the missing survey record IDs, then reupload the file.

Alternatively, as the dialog indicates, you can choose to do nothing. The software will ignore missing IDs in the import file and categorize missing IDs in the survey records as 'other' during analysis.

Formatting errors

There are a few formatting errors to be aware of: 

  1. Missing unique identifier column: If your import file does not include a unique respondent identifier column, the import is stopped, and an error dialog informs you of the missing column. To fix this, add the unique identifier column as described in the file formatting section above, then reimport.
  2. Missing respondent ID: If your import file contains a row of data without a respondent ID, the import is stopped, and an error dialog tells you which row contained the missing ID. To fix this, provide the missing respondent ID and reimport.
  3. Duplicate respondent IDs: If your import file contains multiple rows with the same respondent ID, the import is stopped, and an error dialog tells you which rows contained the duplicate IDs. To fix this, remove or merge one or both of the duplicates and reimport.
  4. Missing variable name: If your import file contains a column of data without a variable name in the first row, the import is stopped, and an error dialog tells you which column contained the missing name. To fix this, provide the missing variable name and reimport.
  5. Oversized file: If the file is too large an error dialog tells you to reduce your file size. Files must be less than 100,000 total cells, otherwise server connection time outs may occur. Examples of supported file sizes might be a file with 1 variable and 100,000 respondent records, or a file with 10 variables and 10,000 respondent records, or any variation in between. You may need to break up the data into multiple subsets to import all variables for all respondent records. As mentioned previously, importing variables is an additive action. This means, for example, if you have 20,000 respondents and 10 variables then you'd need to upload two files: one with the first 10,000 respondents and another with the second 10,000 respondents for the same variables. 

File reading error

If the file cannot be read due to a connection or technical issue, the import is stopped, and an error dialog informs you of the file reading error. To resolve this, try importing the file again. If the issue persists, please contact our technical support team.