Sampling -- Stratified Sample
A "Stratified Sample" is a sample that is purposefully constructed
to ensure the inclusion of items from identified "slices" or "partitions" of the population.
For example, consider the situation where we have a manufacturing
plant that produces a particular part.
Over the years the demand for this part has increased. In order to meet that demand,
over time we have acquired new machines to make this part. At the moment we have
five (5) different machines all making the same part. We want to get a sample of size 60 from
the products produced in one week. All of the output from the week is stored in our
warehouse. We could get a "simple random sample" by numbering all of those products and then selecting
60 random numbers from 1 to the number of products. However, if we did this there is
no certainty that our sample will contain products from each of our five machines.
We could generate a "stratified sample" by deciding to choose 12 random items
from the products produced by each of the five machines. By doing this we are certain that
our sample includes products produced by
each of the machines that we are using.
This approach is not the same as a "simple random sample" because in an SRS there is indeed the possibility
that our sample will not have any items that come from one particular machine.
In fact, it would be possible, though not likely, for the SRS to have all 60 items in the sample come
from one machine.
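The contrast between the two approaches can be sketched in Python. This is only an illustration under stated assumptions: the inventory is hypothetical (we assume each of the five machines produced 400 parts in the week, a figure that is not given above), and the function names are made up for this sketch.

```python
import random
from collections import Counter

# Hypothetical inventory: (machine, part_number) pairs.
# ASSUMPTION: 400 parts per machine; the text gives no production counts.
MACHINES = [1, 2, 3, 4, 5]
inventory = [(m, i) for m in MACHINES for i in range(400)]

def simple_random_sample(items, n, rng):
    """Choose n items at random, ignoring which machine made them."""
    return rng.sample(items, n)

def stratified_sample(items, per_machine, rng):
    """Choose per_machine items at random from each machine's output."""
    sample = []
    for m in MACHINES:
        stratum = [item for item in items if item[0] == m]
        sample.extend(rng.sample(stratum, per_machine))
    return sample

rng = random.Random(2015)  # fixed seed so the run is repeatable
srs = simple_random_sample(inventory, 60, rng)
strat = stratified_sample(inventory, 12, rng)

# Count how many sampled items came from each machine.
srs_counts = Counter(m for m, _ in srs)
strat_counts = Counter(m for m, _ in strat)
print("SRS counts by machine:       ", dict(sorted(srs_counts.items())))
print("Stratified counts by machine:", dict(sorted(strat_counts.items())))
```

The stratified counts are 12 for every machine on every run. The SRS counts change with the seed and are generally unequal; nothing in the SRS procedure prevents a machine from being left out entirely.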
Our "stratified sample" ensures that the sample will contain items from each machine.
This has a great appeal to us. As long as our selection of items within those produced by each machine
is truly random, this is not a terrible approach. It does, however, open us to some
new problems. For example, in our situation, we chose to take
12 items from the products produced by each of the five machines. Taking equal
sample sizes from each machine treats the five machines as if each produced an equal share of the output.
If, on the other hand, one of our machines made half the products while another
made a fifth of the products, and the last three machines each made a tenth of the products,
then we should sample in the same proportion,
taking a sample of size 30 from the first machine's
output, a sample of size 12 from the output of the second machine, and samples of size 6 from the
output of each of the remaining machines.
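The arithmetic behind those sample sizes can be sketched as follows. The function name `proportional_allocation` is made up for this illustration, and the rounding rule it uses (largest remainder, which only matters when the shares do not divide the total evenly) is an assumption added here, not something the example above requires.

```python
from fractions import Fraction

def proportional_allocation(shares, total):
    """Split a total sample size across strata in proportion to their shares.

    ASSUMPTION: leftover items from rounding down are given to the strata
    with the largest fractional parts, so the sizes always add up to total.
    """
    exact = [share * total for share in shares]
    sizes = [int(x) for x in exact]  # round each non-negative share down
    leftover = total - sum(sizes)
    order = sorted(range(len(shares)),
                   key=lambda i: exact[i] - sizes[i], reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes

# Shares of the week's output from the example: 1/2, 1/5, and three at 1/10.
shares = [Fraction(1, 2), Fraction(1, 5),
          Fraction(1, 10), Fraction(1, 10), Fraction(1, 10)]
print(proportional_allocation(shares, 60))  # [30, 12, 6, 6, 6]
```

Exact fractions are used instead of decimals so that shares such as 1/10 do not pick up floating-point error before rounding.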
As a second example, consider getting a sample of the 11,413 credit students
at the community college in the fall term. We could select the students at
random, but we are concerned because we want to be sure that we have
some older students, age 45 and above, in our sample. With a simple random sample we may
have older students in the sample but we cannot be sure that we will. We could adopt a
stratified sample methodology to be sure that we have the distribution that we want in the
sample.
We happen to know that 90% of the credit students at the college are younger than 45.
Therefore, to get a stratified sample with a total size of 60, we
randomly select 54 students (90% of 60) from the students who are
younger than 45, and 6 students (the remaining 10%)
from the students who are 45 years old or older.
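A minimal sketch of that selection, assuming we had a roster of students with their ages. The roster below is invented for illustration: we assume 10,272 students under 45 and 1,141 students aged 45 or older, which is roughly a 90% and 10% split of the 11,413 students, and the ages themselves are random placeholders.

```python
import random

# Hypothetical roster of (student_id, age) pairs.
# ASSUMPTION: 10,272 students under 45 and 1,141 aged 45 or older;
# the actual counts and ages are not given in the text.
rng = random.Random(45)  # fixed seed so the run is repeatable
roster = [(i, rng.randint(17, 44)) for i in range(10272)]
roster += [(i, rng.randint(45, 80)) for i in range(10272, 11413)]

# Split the population into the two age strata.
younger = [s for s in roster if s[1] < 45]
older = [s for s in roster if s[1] >= 45]

total = 60
n_younger = round(total * 0.90)  # 54 students younger than 45
n_older = total - n_younger      # 6 students aged 45 or older

# Draw a simple random sample within each stratum and combine them.
sample = rng.sample(younger, n_younger) + rng.sample(older, n_older)
print(len(sample), "students,",
      sum(1 for _, age in sample if age >= 45), "aged 45 or older")
```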
©Roger M. Palay
Saline, MI 48176 December, 2015