Hypothesis Testing: Known Population Standard Deviation

Revised November, 2014
Some images on this page have been generated via AsciiMathML.js.
For more information see: www.chapman.edu/~jipsen/asciimath.html.

For this page we will assume that we have a large population and that we can take some measurement on that population. If you need some context for this we could say that the population is all current WCC male students and the measurement is the percent fat of the students as measured by a Beurer Glass Body Analysis Scale. [As a small note: "percent body fat" measurements from such a scale may or may not be all that accurate, but that is not a concern here. What is a concern is that this is a measurement and it is a reproducible measurement. That is, if you measure your percent body fat, get off the scale and get back on, it will give you the same measurement.]

We have a null hypothesis that the mean percent body fat for current WCC male students is 24 and an alternative hypothesis that the mean percent body fat for current WCC male students is greater than 24. We could get all 6,000 plus current WCC male students to come in, get measured, and from that data we could determine the exact mean percent fat of that population. However this is not practical. Instead, we will take a random sample of 38 such students, take percent body fat measurements on those students, using the Beurer Glass Body Analysis Scale, and then look at the sample mean of those measurements. Based on that sample mean we will see if we have evidence to reject the null hypothesis in favor of the alternative hypothesis.

We will assume that the standard deviation of percent fat of current male WCC students is 3.5 and that we somehow know this to be true.

First Version:
For this version we will set our level of significance to be 5%. That is, as we judge the null hypothesis we are willing to be wrong in rejecting the null hypothesis 5% of the time.
So far we have:
`H_0: mu=24`
`H_1: mu>24`
`sigma=3.5`
`alpha=0.05`
`n=38`

Although we know we are going to take one random sample of 38 current male students, we also know that if we took repeated random samples of size 38 from the population and computed the sample mean percent body fat for each of those samples, then we would have a data list of many such sample mean values. Furthermore, since our samples are of size 38, which is larger than 30, we can feel assured that the distribution of those sample mean values will be approximately Normal with a mean that is essentially equal to the population mean and a standard deviation that is approximately equal to the population standard deviation divided by the square root of the sample size, in this case, 38.
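The claim about repeated samples can be checked with a small simulation. The following is only a sketch, not part of the original page: it is written in Python rather than for the calculator, and it assumes, purely for illustration, that the population is Normal with mean 24 and standard deviation 3.5. The seed, the number of repeated samples, and all variable names are our own choices.

```python
import random
import statistics
from math import sqrt

# Assumed values for illustration: a Normal population with the mean from H_0.
MU, SIGMA, N = 24.0, 3.5, 38
NUM_SAMPLES = 5000  # number of repeated samples (an arbitrary choice)

rng = random.Random(160)  # fixed seed so the run is repeatable

# Draw many samples of size N and record the mean of each one.
sample_means = [
    statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_sd = SIGMA / sqrt(N)

print(f"mean of sample means: {mean_of_means:.4f}")
print(f"sd of sample means:   {sd_of_means:.4f}")
print(f"sigma / sqrt(n):      {theoretical_sd:.4f}")
```

The first printed value should land close to 24 and the second close to the third, which is the behavior described in the paragraph above.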

Critical Value Method:
Therefore, since we know, for this version of the problem, that the standard deviation of the underlying population is 3.5, we know that the standard deviation of the distribution of the sample means will be approximately `3.5/sqrt(38)` or about `0.5677`, or in symbols `sigma_barx=0.5677`. Since these sample means are Normally distributed, we can ask "What value of a distribution that is `bbN(24,0.5677)` will have `5%` of the area under the probability density curve to the right of that value?" On our calculator we use the command invNorm(0.95,24,0.5677) to find that value. The result is approximately `24.9338`. This becomes our critical value.

Now we actually draw our random sample of size 38 from the population. We get the measurements of percent fat from each of these 38 current male WCC students. We compute the sample mean, `barx`, and we find that `barx=25.032`. This value is more extreme than our critical value of `24.9338` indicating that getting such an extreme value for the sample mean would happen, by chance, for less than 5% of the samples of size 38 that we might draw from the population. Thus, if our null hypothesis, `H_0: mu=24`, were true then it would be unlikely (happen less often than our previously determined level of significance) for us to get a sample such as the one we just took. Therefore, we reject the null hypothesis in favor of the alternative hypothesis.
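For readers without the calculator at hand, the critical value method can be sketched in Python. This is an illustration under our own naming, not the method used on the original page; it relies on `statistics.NormalDist` from the Python standard library, whose `inv_cdf` method plays the role of invNorm(.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 24.0      # mean claimed by the null hypothesis
sigma = 3.5      # known population standard deviation
alpha = 0.05     # level of significance
n = 38           # sample size
x_bar = 25.032   # observed sample mean

sigma_xbar = sigma / sqrt(n)   # standard deviation of the sample means

# Value with 5% of the area to its right; the counterpart of
# invNorm(0.95, 24, 0.5677) on the calculator.
critical_value = NormalDist(mu_0, sigma_xbar).inv_cdf(1 - alpha)

reject_h0 = x_bar > critical_value
print(f"sigma_xbar = {sigma_xbar:.4f}")
print(f"critical value = {critical_value:.4f}")
print("reject H_0" if reject_h0 else "do not reject H_0")
```

Because this sketch uses the unrounded value of `3.5/sqrt(38)`, its output can differ from the calculator figures above in the last decimal place.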

Attained Significance Level Method:
Again, since we know, for this version of the problem, that the standard deviation of the underlying population is 3.5, we know that the standard deviation of the distribution of the sample means will be approximately `3.5/sqrt(38)` or about `0.5677`, or in symbols `sigma_barx=0.5677`. Knowing this we can draw our random sample of 38 current male WCC students, measure the percent body fat for those students, and compute the sample mean, `barx`, for those values. Then we can ask "How strange is it to get a sample mean that is this extreme or even more extreme for a distribution that is `bbN(24,0.5677)`?"

It turns out that our sample gives `barx=25.032`. We can use our calculator to answer our question by using the command normalcdf(25.032,99999,24,0.5677), with the result being approximately `0.0346`. This tells us that for samples of our size from a population with known standard deviation, `sigma=3.5`, if the null hypothesis, `H_0: mu=24`, is true, then we would expect to get a sample mean of `25.032` or higher in only about `3.5%` of such samples. Because this is less than our stated level of significance, namely, `alpha=0.05`, we reject the null hypothesis in favor of the alternative hypothesis.
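The attained significance can be computed the same way outside the calculator. The sketch below is our own Python illustration, with our own variable names; the `cdf` method of `statistics.NormalDist` gives the area to the left of a value, so the area to the right is one minus that.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 24.0, 3.5, 38
alpha = 0.05
x_bar = 25.032

sigma_xbar = sigma / sqrt(n)
sampling_dist = NormalDist(mu_0, sigma_xbar)

# Area to the right of x_bar; the counterpart of
# normalcdf(25.032, 99999, 24, 0.5677) on the calculator.
p_value = 1 - sampling_dist.cdf(x_bar)

print(f"p-value = {p_value:.4f}")
print("reject H_0" if p_value < alpha else "do not reject H_0")
```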

Doing the calculation above is relatively straightforward. It amounts to formulating the normalcdf( command after computing the standard deviation of the sample means, `sigma_barx`. The calculator also has a different command, the Z-Test... command, that merely asks for values for `mu_0, sigma, barx, n,` and the choice of `mu: !=mu_0 <mu_0 >mu_0`. Here are the images of setting up and running that test:
Note that the result screen shows the attained significance, p, has the value `0.0346`, the same value that we got as the result of our earlier command.
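To make the inputs of such a test concrete, here is a hypothetical Python function, `z_test`, that takes the same five pieces of information. The function name, its parameter names, and the strings used for the three alternatives are our own inventions for illustration; this is a sketch of the standard one-sample z-test, not a description of how the calculator is implemented.

```python
from math import sqrt
from statistics import NormalDist


def z_test(mu_0, sigma, x_bar, n, alternative="greater"):
    """Return (z, p) for a one-sample z-test with known sigma.

    `alternative` is "two-sided", "less", or "greater", playing the role of
    the calculator's choices  !=mu_0  <mu_0  >mu_0.
    """
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        p = 1 - standard_normal.cdf(z)
    elif alternative == "less":
        p = standard_normal.cdf(z)
    elif alternative == "two-sided":
        p = 2 * (1 - standard_normal.cdf(abs(z)))
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p


z, p = z_test(mu_0=24, sigma=3.5, x_bar=25.032, n=38, alternative="greater")
print(f"z = {z:.4f}, p = {p:.4f}")
```

The function standardizes the sample mean first and then works with the standard Normal distribution, which gives the same area as working directly with the distribution of the sample means.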

Second Version:
The only thing we will change for the second version of the problem is our declared level of significance. For this version we say that we are only willing to incorrectly reject the null hypothesis in favor of the alternative hypothesis `1%` of the time. Everything else remains the same for this version so:
`H_0: mu=24`
`H_1: mu>24`
`sigma=3.5`
`alpha=0.01`
`n=38`
Critical Value Method:
Just as above, we know that `sigma_barx=sigma/sqrt(n)=3.5/sqrt(38)` which evaluates to about `0.5677`. Our question becomes "What value of a distribution that is `bbN(24,0.5677)` will have `1%` of the area under the probability density curve to the right of that value?" On our calculator we use the command invNorm(0.99,24,0.5677) to find that value. The result is approximately `25.3207`. This becomes our critical value.

We take our sample and find that the sample mean, `barx`, is `25.032`. This value, although larger than the `24` of our `H_0: mu=24`, is not as extreme as our critical value. That is to say, we are only willing to be wrong `1%` of the time, and we know that, if the null hypothesis is true, the sample means follow `bbN(24,0.5677)`, so `99%` of all samples of our size will have a sample mean that is less than or equal to `25.3207`. A sample that has `barx=25.032` falls into that group of `99%` of all samples. It is just not extreme enough to reject the null hypothesis. Therefore, based on our sample, we can not reject the null hypothesis in favor of the alternative hypothesis. That is, we fail to reject the null hypothesis; the sample does not give us enough evidence against it.
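The same comparison can be made on the z scale, by standardizing instead of passing the mean and standard deviation to the inverse Normal function. The Python sketch below is our own illustration with our own variable names; it assumes nothing beyond the values given in this version of the problem.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 24.0, 3.5, 38
alpha = 0.01
x_bar = 25.032

sigma_xbar = sigma / sqrt(n)

# z-score with 1% of the standard Normal area to its right.
z_critical = NormalDist().inv_cdf(1 - alpha)

# Convert back to the scale of the sample means.
critical_value = mu_0 + z_critical * sigma_xbar

# Equivalent comparison on the z scale.
z_observed = (x_bar - mu_0) / sigma_xbar

print(f"z critical = {z_critical:.4f}, critical value = {critical_value:.4f}")
print(f"z observed = {z_observed:.4f}")
print("reject H_0" if z_observed > z_critical else "do not reject H_0")
```

Comparing `z_observed` with `z_critical` always gives the same decision as comparing the sample mean with the critical value, because the two scales differ only by a shift and a positive scale factor.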

Attained Significance Level Method:
Just as above, we know that `sigma_barx=sigma/sqrt(n)=3.5/sqrt(38)` which evaluates to about `0.5677`. We take our sample and we find that the sample mean is `25.032`. Then we can ask "How strange is it to get a sample mean that is this extreme or even more extreme for a distribution that is `bbN(24,0.5677)`?"

We can use our calculator to answer our question by using the command normalcdf(25.032,99999,24,0.5677), with the result being approximately `0.0346`. [Note that this is exactly the calculation that we did in the first version above.] This tells us that for samples of our size from a population with known standard deviation, `sigma=3.5`, if the null hypothesis, `H_0: mu=24`, is true, then we would expect to get a sample mean of `25.032` or higher in only about `3.5%` of such samples. Because this is greater than our stated level of significance, namely, `alpha=0.01`, we can not reject the null hypothesis in favor of the alternative hypothesis. That is, we fail to reject the null hypothesis.

Alternatively, we could have run the Z-Test..., but the input data for that would have been identical to the run for version 1 and the results would have been the same, namely, the attained significance is 0.0346, which is not less than our stated 1% and, therefore, we would not reject the null hypothesis.
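The point that the attained significance is the same in both versions, and only the decision changes, can be seen in a short Python sketch. As before, this is our own illustration, with our own variable names, using `statistics.NormalDist` from the standard library.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, x_bar = 24.0, 3.5, 38, 25.032

# The attained significance depends only on the sample, not on alpha.
p_value = 1 - NormalDist(mu_0, sigma / sqrt(n)).cdf(x_bar)

decisions = {}
for alpha in (0.05, 0.01):
    decisions[alpha] = "reject H_0" if p_value < alpha else "do not reject H_0"
    print(f"alpha = {alpha:.2f}: p = {p_value:.4f} -> {decisions[alpha]}")
```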

Third Version:

For the third version of this problem we will return to our original problem but change the sample size from 38 to 87. Thus, the relevant values are:
`H_0: mu=24`
`H_1: mu>24`
`sigma=3.5`
`alpha=0.05`
`n=87`
Critical Value Method:
The change from a sample of size 38 to one of size 87 means that we need to get a new value for the standard deviation of the sample mean, `sigma_barx=sigma/sqrt(n)=3.5/sqrt(87)~=0.375`. Our question becomes "What value of a distribution that is `bbN(24,0.375)` will have `5%` of the area under the probability density curve to the right of that value?" On our calculator we use the command invNorm(0.95,24,0.375) to find that value. The result is approximately `24.617`. This becomes our critical value.

We take our sample and find that the sample mean, `barx`, is `25.032`. This value is more extreme than our critical value. That is to say, we are only willing to be wrong `5%` of the time, and we know that, if the null hypothesis is true, the sample means follow `bbN(24,0.375)`, so `95%` of all samples of our size will have a sample mean that is less than or equal to `24.617`. A sample that has `barx=25.032` falls outside of that group of `95%` of all samples. It is just too extreme. If the null hypothesis were true, we would get such a random sample less than `5%` of the time. Therefore, based on our sample, we reject the null hypothesis in favor of the alternative hypothesis.

Attained Significance Level Method:
With this new sample size we know that `sigma_barx=sigma/sqrt(n)=3.5/sqrt(87)~=0.375`. We take our sample and we find that the sample mean is `25.032`. Then we can ask "How strange is it to get a sample mean that is this extreme or even more extreme for a distribution that is `bbN(24,0.375)`?"

We can use our calculator to answer our question by using the command normalcdf(25.032,99999,24,0.375), with the result being `0.00296`. This tells us that for samples of our size from a population with known standard deviation, `sigma=3.5`, if the null hypothesis, `H_0: mu=24`, is true, then we would expect to get a sample mean of `25.032` or higher in only `0.296%` of such samples. Because this is less than our stated level of significance, namely, `alpha=0.05`, we reject the null hypothesis in favor of the alternative hypothesis.

Alternatively, we could have used the Z-Test..., as before, but with the sample size set to 87, as in the first of the two images:

Notice that the attained significance is almost identical to that found above. The difference is the result of our using 0.375 as the standard deviation of the sample means and the Z-Test... using the complete value of `3.5/sqrt(87)`.
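The effect of rounding the standard deviation of the sample means can be reproduced in a few lines of Python. This is a sketch under our own naming, not the calculator's own procedure: one computation uses the rounded `0.375` and the other uses the unrounded `3.5/sqrt(87)`.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, x_bar = 24.0, 3.5, 87, 25.032

rounded_se = 0.375            # value typed into normalcdf( by hand
full_se = sigma / sqrt(n)     # unrounded value of 3.5/sqrt(87)

p_rounded = 1 - NormalDist(mu_0, rounded_se).cdf(x_bar)
p_full = 1 - NormalDist(mu_0, full_se).cdf(x_bar)

print(f"full standard error = {full_se:.6f}")
print(f"p with 0.375        = {p_rounded:.5f}")
print(f"p with full value   = {p_full:.5f}")
```

The unrounded standard deviation is slightly larger than `0.375`, so the sample mean is slightly less extreme relative to it and the attained significance comes out slightly larger.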

Fourth Version:

For the fourth version of this problem we will change the third version to use a level of significance set to `1%`. As in the third version, we will still have a sample size of 87. Thus, the relevant values are:
`H_0: mu=24`
`H_1: mu>24`
`sigma=3.5`
`alpha=0.01`
`n=87`
Critical Value Method:
The value for the standard deviation of the sample mean remains `sigma_barx=sigma/sqrt(n)=3.5/sqrt(87)~=0.375`. Our question becomes "What value of a distribution that is `bbN(24,0.375)` will have `1%` of the area under the probability density curve to the right of that value?" On our calculator we use the command invNorm(0.99,24,0.375) to find that value. The result is approximately `24.872`. This becomes our critical value.

We take our sample and find that the sample mean, `barx`, is `25.032`. This value is more extreme than our critical value. That is to say, we are only willing to be wrong `1%` of the time, and we know that, if the null hypothesis is true, the sample means follow `bbN(24,0.375)`, so `99%` of all samples of our size will have a sample mean that is less than or equal to `24.872`. A sample that has `barx=25.032` falls outside of that group of `99%` of all samples. It is just too extreme. If the null hypothesis were true, we would get such a random sample less than `1%` of the time. Therefore, based on our sample, we reject the null hypothesis in favor of the alternative hypothesis.

Attained Significance Level Method:
With this sample size we know that `sigma_barx=sigma/sqrt(n)=3.5/sqrt(87)~=0.375`. We take our sample and we find that the sample mean is `25.032`. Then we can ask "How strange is it to get a sample mean that is this extreme or even more extreme for a distribution that is `bbN(24,0.375)`?"

We can use our calculator to answer our question by using the command normalcdf(25.032,99999,24,0.375), with the result being `0.00296`. This tells us that for samples of our size from a population with known standard deviation, `sigma=3.5`, if the null hypothesis, `H_0: mu=24`, is true, then we would expect to get a sample mean of `25.032` or higher in only `0.296%` of such samples. Because this is less than our stated level of significance, namely, `alpha=0.01`, we reject the null hypothesis in favor of the alternative hypothesis.
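The four versions so far differ only in the sample size and the level of significance, so they can be run side by side. The Python sketch below is our own summary, with our own variable names; it also checks that the critical value method and the attained significance method agree in every case.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, x_bar = 24.0, 3.5, 25.032

results = {}
for n in (38, 87):
    sampling_dist = NormalDist(mu_0, sigma / sqrt(n))
    p_value = 1 - sampling_dist.cdf(x_bar)
    for alpha in (0.05, 0.01):
        critical_value = sampling_dist.inv_cdf(1 - alpha)
        reject = x_bar > critical_value
        # The two methods must always lead to the same decision.
        assert reject == (p_value < alpha)
        results[(n, alpha)] = reject
        print(f"n={n:2d} alpha={alpha:.2f} critical={critical_value:.4f} "
              f"p={p_value:.5f} reject={reject}")
```

Only the combination of the smaller sample, `n=38`, with the stricter level, `alpha=0.01`, fails to reject the null hypothesis, which matches the four conclusions reached above.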

Fifth Version:

For the fifth version of this problem we will return to our original problem but change the sample size from 38 to 14. Note that having a sample size less than 30 brings into question the extent to which the population of sample means (from repeated sampling of size 14) conforms to a normal distribution. If we have reason to believe that the underlying population is normally distributed, then we can use such a small sample size. We will continue using the assumption that the underlying population really does have a normal distribution of the percent fat of current male WCC students. Thus, the relevant values are:
`H_0: mu=24`
`H_1: mu>24`
`sigma=3.5`
`alpha=0.05`
`n=14`
Critical Value Method:
The change from a sample of size 38 to one of size 14 means that we need to get a new value for the standard deviation of the sample mean, `sigma_barx=sigma/sqrt(n)=3.5/sqrt(14)~=0.9354`. Our question becomes "What value of a distribution that is `bbN(24,0.9354)` will have `5%` of the area under the probability density curve to the right of that value?" On our calculator we use the command invNorm(0.95,24,0.9354) to find that value. The result is approximately `25.5386`. This becomes our critical value.

We take our sample and find that the sample mean, `barx`, is `25.032`. This value is not more extreme than our critical value. That is to say, we are only willing to be wrong `5%` of the time, and we know that, if the null hypothesis is true, the sample means follow `bbN(24,0.9354)`, so `95%` of all samples of our size will have a sample mean that is less than or equal to `25.5386`. A sample that has `barx=25.032` falls inside of that group of `95%` of all samples. It is just not extreme. If the null hypothesis were true, we would get such a random sample as part of the `95%` of all samples that have a mean less than our critical value, `25.5386`. Therefore, based on our sample, we can not reject the null hypothesis in favor of the alternative hypothesis. That is, we fail to reject the null hypothesis.

Attained Significance Level Method:
With this new sample size we know that `sigma_barx=sigma/sqrt(n)=3.5/sqrt(14)~=0.9354`. We take our sample and we find that the sample mean is `25.032`. Then we can ask "How strange is it to get a sample mean that is this extreme or even more extreme for a distribution that is `bbN(24,0.9354)`?"

We can use our calculator to answer our question by using the command normalcdf(25.032,99999,24,0.9354), with the result being `0.135`. This tells us that for samples of our size from a population with known standard deviation, `sigma=3.5`, if the null hypothesis, `H_0: mu=24`, is true, then we would expect to get a sample mean of `25.032` or higher in about `13.5%` of such samples. Because this is not less than our stated level of significance, namely, `alpha=0.05`, we can not reject the null hypothesis in favor of the alternative hypothesis. That is, we fail to reject the null hypothesis.
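Both methods for this small-sample version can be sketched together in Python. This is our own illustration with our own variable names; like the discussion above, it rests on the assumption that the underlying population is normally distributed, since `n=14` is too small to rely on the sample size alone.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 24.0, 3.5, 14
alpha = 0.05
x_bar = 25.032

sigma_xbar = sigma / sqrt(n)                 # uses n = 14
sampling_dist = NormalDist(mu_0, sigma_xbar)

critical_value = sampling_dist.inv_cdf(1 - alpha)
p_value = 1 - sampling_dist.cdf(x_bar)

print(f"sigma_xbar = {sigma_xbar:.4f}")
print(f"critical value = {critical_value:.4f}")
print(f"p-value = {p_value:.4f}")
print("reject H_0" if p_value < alpha else "do not reject H_0")
```

With the smaller sample the standard deviation of the sample means is much larger, so the same sample mean of `25.032` is no longer unusual enough to count as evidence against the null hypothesis.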

As before, we could have done the same thing using the Z-Test... as shown in the following images:


©Roger M. Palay
Saline, MI 48176
November, 2014