Sample Size Calculator

Valid survey research requires samples that represent the population and that are large enough to allow the researcher to draw valid conclusions. The former is the topic of a separate document on sampling methods.

A well-calculated sample size not only bolsters the credibility of the research findings but also fortifies the decision-making process, ensuring that strategies and policies are based on robust results. Thus, calculating sample size is critical for researchers aiming to produce high-quality, dependable insights.


What is Sample Size?

Sample size refers to the number of individuals or observations included in a study or survey. Correctly calculating sample size avoids two opposing problems. A sample that is too small may produce unreliable results that fail to represent the population accurately, while an unnecessarily large sample wastes resources and time.

Factors Influencing Sample Size Calculation

When calculating sample size, it's important to grasp the key factors that influence this calculation. These factors not only affect the precision and accuracy of your research findings but also dictate the level of confidence you can have in your survey results. Let's delve into these components:

Confidence Level: This indicates how certain we are that the population parameters fall within the range of the estimated values. Typically, researchers opt for a 95% or 99% confidence level, which conveys that if the same survey were repeated multiple times, the range of estimated values (the confidence interval) would contain the true population value 95% or 99% of the time, respectively.
Margin of Error: The margin of error, on the other hand, represents the amount of deviation from the actual population parameter that one is willing to tolerate. It is directly tied to the confidence interval, reflecting the range within which the true value is expected to lie. A smaller margin of error, signifying more precise results, requires a larger sample size. For instance, a margin of error of ±3% means you believe, with the prescribed confidence level, that the true population parameter lies within 3 percentage points of your sample estimate, either above or below.
Power: When planning to conduct statistical tests, a researcher usually needs larger samples than when merely sizing for precision. The reason lies in the logic of statistical testing, in which we try to manage two types of error: the false positive (controlled by the confidence level) and the false negative (controlled by the power). Whereas we typically want 90%, 95% or 99% confidence, the rule of thumb is to aim for 70% or 80% power.
Population Size: Only in rare cases (when you plan to sample more than 5% of the total population) does population size figure into sample size calculations. An illustration of why this is the case appears below.
Standard Deviation (Response Distribution): This statistic measures the variability or diversity of responses in your data. A higher standard deviation indicates a wider dispersion of responses, which, in turn, requires a larger sample size to accurately capture the population's characteristics. Understanding the variability in your data helps in tailoring the sample size to your specific research needs.

Each of these factors influences the sample size that balances precision, confidence, power and resource allocation. Ignoring these elements can lead to unreliable conclusions.

Methods for Calculating Sample Size

Navigating the complexities of sample size calculation can seem daunting, but various methods and tools are available to simplify the process. Let's explore some of the most effective techniques:

Manual Sample Size Calculation Using Formulas: For those who prefer a hands-on approach, manual calculation offers insight into the mechanics behind the numbers. The formula incorporates the z-scores for the level of confidence and (for statistical testing) the desired level of power, plus the standard deviation and the margin of error. While this method demands a deeper understanding of statistical principles, it provides flexibility and a thorough comprehension of the underlying processes.
Sample Size Calculators in Excel: Tools like Sawtooth Software’s free Excel sample size calculator streamline the calculation process, making it accessible to a broader audience. After you input your desired levels of confidence and power, the margin of error, and the estimated population standard deviation, the calculator computes your sample size. The Sawtooth Software Excel sample size calculator computes sample sizes for means and proportions, and for differences in means and proportions, for both precision and power. To learn more about how to use this calculator, watch our sample size webinar.
Online Sample Size Calculators: Online sample size calculators also provide instant calculations. These tools are designed to accommodate various research scenarios, offering tailored inputs for confidence level, margin of error, population size, and more. The online calculator from Sawtooth Software computes sample sizes for means and proportions for precision, but not for power. Like most other online calculators, it is therefore more limited than our Excel calculator.

Whether you favor manual calculations or the convenience of automated tools, these options will allow you to right-size your samples.
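As a rough illustration of the manual approach, the sketch below applies the standard normal-approximation formula n = ((z_confidence + z_power) × SD / margin)², dropping the power term when sizing only for precision. The function names, and the framing of the power case as a one-sample test that must detect a difference equal to the margin, are assumptions made for illustration. They are not the formulas built into any particular calculator.

```python
from math import ceil, sqrt
from statistics import NormalDist
from typing import Optional


def z_for_confidence(confidence: float) -> float:
    """Two-sided z-score, about 1.96 for 95% confidence."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_mean(sd: float, margin: float,
                     confidence: float = 0.95,
                     power: Optional[float] = None) -> int:
    """Sample size for a mean, given the SD and the tolerated margin.

    With power=None the result sizes for precision only. With a power
    value it adds the z-score for that power (assumed one-sample test).
    """
    z = z_for_confidence(confidence)
    if power is not None:
        z += NormalDist().inv_cdf(power)
    return ceil((z * sd / margin) ** 2)


def sample_size_proportion(p: float, margin: float,
                           confidence: float = 0.95) -> int:
    """Precision-only sample size for a proportion; SD is sqrt(p(1-p))."""
    return sample_size_mean(sqrt(p * (1 - p)), margin, confidence)


print(sample_size_mean(sd=10, margin=2))             # 97
print(sample_size_mean(sd=10, margin=2, power=0.8))  # 197
print(sample_size_proportion(p=0.5, margin=0.05))    # 385
```

In this example, asking for 80% power roughly doubles the sample needed for the same standard deviation and margin, which is why sizing for statistical tests usually calls for larger samples than sizing for precision.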

Importance of Sampling Method Quality

All sample size calculations in this document assume we are drawing sample elements randomly from the population. To the extent that our samples are not random (and hence not representative of the population), they can be biased regardless of sample size. The quality of how we draw the sample is therefore an equally important topic, but one that is separate from sample size calculations. Please see our document about sampling methods.

Why Population Size (Usually) Doesn’t Matter

To illustrate this concept, let's consider a practical example. A sample of 400 people can provide essentially the same precision for a country with a population of 250,000,000 as it would for a city of 50,000, assuming the same sampling methodology is applied. This counterintuitive principle is grounded in statistical theory, which shows that the accuracy of estimates from a sample depends more on the sample size itself than on the overall size of the population.

The table below shows the sample size required for various population sizes at different confidence levels and margins of error. It assumes a simple random sample is being taken from the population.

Population Confidence Level Table

Confidence Level	90%	90%	95%	95%	99%	99%
Margin of Error	5.00%	2.50%	5.00%	2.50%	5.00%	2.50%
Population Size	Required Sample Size
10	10	10	10	10	10	10
25	23	25	24	25	25	25
50	43	48	45	49	47	50
75	59	71	63	72	68	73
100	74	92	80	94	88	97
150	97	132	109	137	123	143
200	116	169	132	178	154	187
250	131	204	152	216	182	229
300	143	236	169	252	207	270
350	153	265	184	286	230	310
400	162	293	197	318	250	348
450	170	319	208	349	269	385
500	176	343	218	378	286	421
750	200	444	255	505	353	585
1,000	214	520	278	607	400	727
5,000	257	890	357	1,176	586	1,734
10,000	264	977	370	1,333	623	2,098
25,000	268	1,038	379	1,448	647	2,400
50,000	270	1,060	382	1,491	655	2,521
100,000	270	1,071	383	1,514	660	2,586
500,000	271	1,080	384	1,532	663	2,640
1,000,000	271	1,082	384	1,535	664	2,647
2,500,000	271	1,082	385	1,536	664	2,652
10,000,000	271	1,083	385	1,537	664	2,654
100,000,000	271	1,083	385	1,537	664	2,654
250,000,000	271	1,083	385	1,537	664	2,654

This table simplifies the process of determining an appropriate sample size for your research project. It illustrates that, for most practical purposes, the concern isn't the total population size but rather ensuring your sample size is sufficient to achieve your desired confidence level and margin of error.

Remember, the key takeaway is that a well-chosen sample of a few hundred can be highly representative of a population in the millions, provided that the sampling method is sound and biases are minimized. This principle allows market researchers to conduct studies that are both cost-effective and statistically reliable.
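The way required sample size levels off as the population grows can be reproduced with the usual finite population correction. The sketch below assumes a 50% response distribution (the most conservative case) and rounds up. Under those assumptions it matches the table entries checked here, though the function name and defaults are illustrative rather than taken from any Sawtooth Software tool.

```python
from math import ceil
from statistics import NormalDist


def sample_size_finite(population: int, margin: float,
                       confidence: float, p: float = 0.5) -> int:
    """Sample size for a proportion, corrected for a finite population."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Size needed if the population were effectively infinite.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Finite population correction shrinks n0 when the population is small.
    return ceil(n0 / (1 + (n0 - 1) / population))


for population in (1_000, 50_000, 250_000_000):
    print(population, sample_size_finite(population, 0.05, 0.95))
# 1000 278
# 50000 382
# 250000000 385
```

Between a population of 50,000 and one of 250,000,000, the required sample at 95% confidence and a 5% margin of error changes by only three respondents.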

Example: Sample Size in Action

Suppose you're conducting a survey to gauge customer satisfaction among users of a digital service platform. Whether your target population is 50,000 or 50 million users, a sample size of 400 respondents might be sufficient to achieve a 95% confidence level with a 5.0% margin of error, assuming the sample is randomly selected and represents the population well.

This example highlights the importance of focusing on the quality of your sampling process and the size of your sample, rather than being overly concerned with the total size of the population from which the sample is drawn.
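To check an example like this one, the calculation can be run in reverse, from a given sample size to the margin of error it delivers. This minimal sketch assumes a large population, a simple random sample and a 50% response distribution. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n: int, confidence: float = 0.95, p: float = 0.5) -> float:
    """Margin of error for a proportion estimated from n respondents."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


print(f"{margin_of_error(400):.1%}")  # 4.9%
```

With 400 respondents the margin of error at 95% confidence comes out just under 5%, consistent with the example above.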

Calculating Sample Sizes for Different Research Methods

The calculation of sample size varies significantly across different research objectives. The calculators discussed above apply only to confidence intervals and power for means, proportions, differences between means and differences between proportions. More complex objectives have different requirements that influence how sample size is determined. Understanding these differences is crucial for researchers to ensure the validity of their findings. Let's explore how sample size calculation varies across several research methods:

Regression Analysis/Driver Analysis

Common advice for regression analysis or driver analysis is to have at least 10 observations for each variable included in the model. This minimum applies only when you have well-conditioned data (i.e., when your predictor variables are not correlated among themselves; correlation among predictors is known as multicollinearity). To the extent you have multicollinearity, you will want a larger sample size to untangle the interdependencies among variables.

Logit Analysis

Logistic regression (logit analysis) models binary outcomes. It requires more sample than regression analysis because, instead of estimating the slope of a straight line, it models an S-shaped curve. A common recommendation is a sample size of at least ten times the number of variables, divided by the proportion of cases in the less frequent of the two outcomes. For example, with 10 predictors and a 60/40 split in responses, the minimum sample size would be 10 × 10 / 0.40 = 250.
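The rule stated here (ten cases per variable, divided by the share of the less frequent outcome) works out as in the sketch below. The function name and the choice to pass the minority share as a percentage are illustrative assumptions.

```python
from math import ceil


def logit_min_sample(n_predictors: int, minority_percent: float,
                     per_predictor: int = 10) -> int:
    """Minimum sample so the rarer outcome supplies per_predictor cases
    for every predictor in the model."""
    return ceil(per_predictor * n_predictors * 100 / minority_percent)


print(logit_min_sample(10, 40))  # 250
```

Applied literally, the rule gives 250 for 10 predictors and a 60/40 split. A rarer outcome pushes the requirement up quickly: the same model with a 90/10 split would need 1,000 respondents.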

Segmentation Analysis

Segmentation analysis has seen an evolution in recommendations regarding optimal sample size. The current best practice suggests aiming for 100 respondents for each “basis” variable (each variable input to the segmentation analysis). For instance, employing 20 basis variables in your segmentation analysis would necessitate a robust sample size of 2,000 respondents. You may also want to think backwards from any expectation you may have about the number of segments that will result, so that you have enough sample size for powerful comparisons of the differences between segments.

Factor Analysis

Factor analysis requires a nuanced approach to sample size, with general guidelines suggesting that fewer than 100 respondents is "poor," 200 is "fair," and 300 or more is "good." If in doubt, err on the side of a larger sample size to account for the unpredictability and complexity inherent in the data.

Tree-Based Segmentation

Tree-based segmentation, characterized by its iterative process of creating segments through successive splits in the dataset, typically demands a larger sample size. To accommodate the method's need for multiple levels of pairwise splits and ensure the stability of the segments created, we recommend a minimum of 1,000 respondents.

Conjoint Analysis/MaxDiff

For conjoint analysis or MaxDiff, establishing the right sample size is critical for achieving the desired level of precision in preference or difference measurements. A general guideline is to have at least 300 respondents, or 200 per reportable subgroup if the study aims to make subgroup comparisons. The specific sample size, however, often depends on the number of attributes, the levels per attribute, the profiles per question and the number of questions, as well as on the precision needed for estimating shares or preferences and the desired level of power. Researchers should tailor their sample size to the requirements of their particular study or analysis.

Each of these research methods brings its own set of considerations for calculating sample size, underscoring the importance of a methodical approach to ensure the accuracy and reliability of research findings.
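For quick planning, several of the rules of thumb in this section reduce to simple arithmetic. The helpers below are a hypothetical sketch and only starting points. In particular, reading the conjoint guideline as "the larger of 300 or 200 per subgroup" is an interpretation, not a rule stated in the text.

```python
def regression_min_sample(n_variables: int) -> int:
    """At least 10 observations per variable (well-conditioned data)."""
    return 10 * n_variables


def segmentation_min_sample(n_basis_variables: int) -> int:
    """About 100 respondents per basis variable."""
    return 100 * n_basis_variables


def conjoint_min_sample(n_subgroups: int = 0) -> int:
    """At least 300 overall, or 200 per reportable subgroup (assumed max)."""
    return max(300, 200 * n_subgroups)


# Minimum recommended above for tree-based segmentation.
TREE_SEGMENTATION_MIN = 1_000

print(regression_min_sample(8))     # 80
print(segmentation_min_sample(20))  # 2000
print(conjoint_min_sample(3))       # 600
```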

For further information about sample size considerations for conjoint analysis see this white paper: Sample Size Issues for Conjoint Analysis

And for information about sample size considerations for MaxDiff, see our MaxDiff sample size calculator page.

Conclusion

In conclusion, calculating the appropriate sample size is a critical step in ensuring the validity of your research findings. Whether you're comparing means or proportions, conducting regression analysis, logit analysis, segmentation analysis, factor analysis, tree-based segmentation, or conjoint analysis/MaxDiff, understanding the specific requirements and best practices for sample size calculation is essential. By applying the guidelines outlined in this article, researchers can enhance the accuracy, confidence and power of their results and make better-informed decisions.

We encourage readers to explore Sawtooth Software’s comprehensive tools and resources, designed to support you in conducting effective surveys and research endeavors. Making informed decisions regarding sample size calculation is within your reach, and with the right tools and knowledge, you can achieve high quality results in your research projects.
