Sample Size: Stratified Random Samples

The precision and cost of a stratified design are influenced by the way that sample elements are allocated to strata.

How to Assign Sample to Strata

One approach is proportionate stratification. With proportionate stratification, the sample size of each stratum is proportionate to the population size of the stratum. Strata sample sizes are determined by the following equation:

nh = ( Nh / N ) * n

where nh is the sample size for stratum h, Nh is the population size for stratum h, N is total population size, and n is total sample size.
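As an illustration, proportionate allocation can be sketched in a few lines of Python. The function name and the three-region population below are hypothetical, chosen only to show the arithmetic; the results are left unrounded.

```python
def proportionate_allocation(stratum_sizes, total_sample):
    """Return nh = (Nh / N) * n for each stratum, unrounded."""
    population = sum(stratum_sizes.values())  # N, the total population size
    return {name: total_sample * size / population
            for name, size in stratum_sizes.items()}


# Hypothetical population: three regions of unequal size (not from the lesson).
sizes = {"north": 5000, "south": 3000, "west": 2000}
allocation = proportionate_allocation(sizes, total_sample=100)
print(allocation)
```

Each stratum's share of the sample matches its share of the population, so the largest region receives the largest sample.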

Another approach is disproportionate stratification, which can be a better choice (e.g., less cost, more precision) if sample elements are assigned correctly to strata. To take advantage of disproportionate stratification, researchers need to answer such questions as:

Although a consideration of all these questions is beyond the scope of this tutorial, the remainder of this lesson does address the first two questions. (To answer the other questions, as well as the first two questions, consider using the Sample Size Calculator.)

Sample Size Calculator

Stat Trek's Sample Size Calculator can help you find the right sample allocation plan for your stratified design. You specify your main goal (maximize precision, minimize cost, stay within budget, etc.). Based on your goal, the calculator prompts you for the necessary inputs and handles all computations automatically. It tells you the best sample size for each stratum, creates a summary report that lists key findings (including the margin of error), and describes analytical techniques. The calculator is free. You can find the Sample Size Calculator in Stat Trek's main menu under the Stat Tools tab.

How to Maximize Precision, Given a Stratified Sample With a Fixed Budget

The ideal sample allocation plan would provide the most precision for the least cost. Optimal allocation does just that. Based on optimal allocation, the best sample size for stratum h would be:

nh = n * [ ( Nh * σh ) / sqrt( ch ) ] / [ Σ ( Ni * σi ) / sqrt( ci ) ]

where nh is the sample size for stratum h, n is total sample size, Nh is the population size for stratum h, σh is the standard deviation of stratum h, ch is the direct cost to sample an individual element from stratum h, and Σ denotes a sum over all strata i. Note that ch does not include indirect costs, such as overhead costs.
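A minimal sketch of optimal allocation follows, assuming each stratum is described by a (Nh, σh, ch) tuple. The function name and the two strata are hypothetical and serve only to show how the cost term shifts the allocation.

```python
import math


def optimal_allocation(strata, total_sample):
    """strata maps name -> (Nh, sigma_h, c_h); returns unrounded nh."""
    # Each stratum's weight is Nh * sigma_h / sqrt(c_h).
    weights = {name: size * sd / math.sqrt(cost)
               for name, (size, sd, cost) in strata.items()}
    total_weight = sum(weights.values())
    return {name: total_sample * w / total_weight
            for name, w in weights.items()}


# Hypothetical strata: same size and spread, but "rural" elements
# cost four times as much to sample as "urban" elements.
strata = {"urban": (10_000, 8.0, 1.0), "rural": (10_000, 8.0, 4.0)}
allocation = optimal_allocation(strata, total_sample=90)
print(allocation)
```

Because cost enters through its square root, quadrupling the unit cost halves the stratum's weight rather than quartering it. Here the cheaper stratum receives twice as many sample elements as the expensive one.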

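The effect of the above equation is to sample more heavily from a stratum when the population size of the stratum is large, the variability within the stratum is large, or the cost to sample an element from the stratum is low.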

How to Maximize Precision, Given a Stratified Sample With a Fixed Sample Size

Sometimes, researchers want to find the sample allocation plan that provides the most precision, given a fixed sample size. The solution to this problem is a special case of optimal allocation, called Neyman allocation.

The equation for Neyman allocation can be derived from the equation for optimal allocation by assuming that the direct cost to sample an individual element is equal across strata. Based on Neyman allocation, the best sample size for stratum h would be:

nh = n * ( Nh * σh ) / [ Σ ( Ni * σi ) ]

where nh is the sample size for stratum h, n is total sample size, Nh is the population size for stratum h, σh is the standard deviation of stratum h, and Σ denotes a sum over all strata i.
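The sketch below illustrates Neyman allocation and checks numerically that it matches optimal allocation when every stratum shares one unit cost. The function name, the strata, and the cost value are all hypothetical.

```python
import math


def neyman_allocation(strata, total_sample):
    """strata maps name -> (Nh, sigma_h); returns unrounded nh."""
    weights = {name: size * sd for name, (size, sd) in strata.items()}
    total_weight = sum(weights.values())
    return {name: total_sample * w / total_weight
            for name, w in weights.items()}


# Hypothetical strata: a large, homogeneous group and a small, variable one.
strata = {"small_firms": (8_000, 5.0), "large_firms": (2_000, 20.0)}
neyman = neyman_allocation(strata, total_sample=100)

# Optimal allocation with one shared (hypothetical) unit cost: the common
# factor 1 / sqrt(cost) cancels between numerator and denominator.
common_cost = 7.5
opt_weights = {name: size * sd / math.sqrt(common_cost)
               for name, (size, sd) in strata.items()}
opt_total = sum(opt_weights.values())
optimal = {name: 100 * w / opt_total for name, w in opt_weights.items()}

print(neyman)
print(optimal)
```

In this example the small stratum receives as much sample as the large one, because its standard deviation is four times greater.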

Test Your Understanding

This section presents a sample problem that illustrates how to maximize precision, given a fixed sample size and a stratified sample. (In a subsequent lesson, we revisit this problem and see how stratified sampling compares to other sampling methods.)

Problem 1

At the end of every school year, the state administers a reading test to a sample of 36 third graders. The school system has 20,000 third graders, half boys and half girls. The results from last year's test are shown in the table below.

Stratum    Mean score    Standard deviation
Boys       70            10.27
Girls      80            6.66

This year, the researchers plan to use a stratified sample, with one stratum consisting of boys and the other, girls. Use the results from last year to answer the following questions:

Solution: The first step is to decide how to allocate sample in order to maximize precision. Based on Neyman allocation, the best sample size for stratum h is:

nh = n * ( Nh * σh ) / [ Σ ( Ni * σi ) ]

where nh is the sample size for stratum h, n is total sample size, Nh is the population size for stratum h, σh is the standard deviation of stratum h, and Σ denotes a sum over all strata i. By this equation, the number of boys in the sample is:

nboys = 36 * ( 10,000 * 10.27 ) / [ ( 10,000 * 10.27 ) + ( 10,000 * 6.66 ) ] ≈ 21.84

Therefore, to maximize precision, the total sample of 36 students should consist of 22 boys and (36 - 22) = 14 girls.
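The allocation can be checked with a short Python sketch that uses the stratum sizes and standard deviations from the table in the problem. The variable names are illustrative, and rounding the boys' share to the nearest whole student (with girls taking the remainder) is an assumption about how to handle fractional sample sizes.

```python
# Stratum -> (population size Nh, standard deviation sigma_h), from the table.
strata = {"boys": (10_000, 10.27), "girls": (10_000, 6.66)}
n = 36  # fixed total sample size

# Denominator of the Neyman formula: sum of Nh * sigma_h over both strata.
total_weight = sum(size * sd for size, sd in strata.values())

boys_size, boys_sd = strata["boys"]
boys_exact = n * boys_size * boys_sd / total_weight

boys = round(boys_exact)  # nearest whole student
girls = n - boys          # remainder keeps the total at 36

print(f"boys: {boys_exact:.2f} -> {boys}, girls: {girls}")
```

Boys receive the larger share because their scores vary more, even though the two strata are the same size.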

The remaining questions can be answered during the process of computing the confidence interval. Elsewhere on this website, we described how to compute a confidence interval. We employ that process below.