Explore Confidence Interval, Population Mean, σ known
The script below provides a way to
- Create a population of values with a specified mean and standard deviation.
- Verify the mean and standard deviation of that population.
- Specify the size of a sample to be taken from the population.
- Specify the confidence level to use.
- Specify the number of times to take such samples.
- Perform the sampling and, for each sample, generate a confidence interval
for the population mean based on the sample mean.
- Keep track of the number of times that the generated confidence interval
actually contains the true mean.
- Report that count.
- Report the standard deviation of the collection of the sample means.
By asking for a large number of samples, say 10,000 (the script defaults to num_reps <- 100), we can see that
the proportion of successes really does come close to the specified confidence level.
Furthermore, we get a confirmation that the standard deviation of the
sample means that we found is really close to the
predicted value, which is the population standard deviation divided by the square root of the sample size.
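As a rough illustration of that prediction, the short R sketch below works out the predicted standard deviation of the sample means, the z value, and the resulting margin of error. It uses the script's default values (population standard deviation 5.3, sample size 12, confidence level 0.93). The name margin is introduced here only for illustration and does not appear in the script.

```r
# Default values taken from the script
goal_sd   <- 5.3    # population standard deviation
samp_size <- 12     # size of each sample
c_level   <- 0.93   # confidence level

# predicted standard deviation of the sample means: sigma / sqrt(n)
predicted_sd <- goal_sd / sqrt(samp_size)    # about 1.53

# z value with (1 - c_level)/2 of the area to its right
z_val <- qnorm((1 - c_level) / 2, lower.tail = FALSE)    # about 1.81

# margin of error (illustrative name): each interval is xbar - margin to xbar + margin
margin <- z_val * predicted_sd    # about 2.77

predicted_sd
z_val
margin
```

With these defaults every interval has the same width, about 2 * 2.77, because σ is known; only the center, the sample mean, changes from sample to sample.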
In the folder containing the function scripts for this course, create a new directory, copy the model.R file to that directory,
rename the file in the new directory, and double click on the file to open RStudio.
Then copy all of the text below the line and paste it into your RStudio editor pane. You can then highlight the entire
script and run it to use the default values. After that, you can go back, change parameters, and run the script again to
explore the consequences of those changes.
# We look at the confidence interval for the
# population mean when we know
# the standard deviation of the population.
# first set up a goal population
goal_mean <- 23.2
goal_sd <- 5.3
# then create the population, with
# 1000 values in it
# First generate an approximate standard normal
pop <- rnorm( 1000 )
# Then get its mean and standard deviation
mu_pop <- mean( pop )
# we want the standard deviation of the population,
# not the sample standard deviation that sd() returns,
# so we load the pop_sd() function
source("../pop_sd.R")
sd_pop <- pop_sd( pop )
# Then create the distribution we want
pop <- ( (pop-mu_pop)/sd_pop )* goal_sd + goal_mean
# finally, verify that we have the right population
mean( pop )
pop_sd( pop )
# Now set the confidence level
c_level <- 0.93
# Now we want to repeat the following process of
# getting a sample from the population,
# and then generating the confidence interval
# for the population mean.
# While we are at this, and because we know what that
# mean is, we can count the number of times
# that the true mean is inside our interval.
# Furthermore, let us keep track of the observed
# sample means so that later we can compare the
# standard deviation of those sample means to the
# predicted value
samp_size <- 12
num_reps <- 100
num_success <- 0
num_fail <- 0
true_mean <- goal_mean
predicted_sd <- goal_sd/sqrt( samp_size )
samp_means <- numeric( num_reps ) # to hold the sample means
# since the confidence level is set we can find the
# z value with (1 - c_level)/2 of the area to its right
alpha_div_2 <- (1 - c_level)/2
z_val <- qnorm( alpha_div_2, lower.tail=FALSE)
for( i in (1:num_reps) )
{ # choose a sample from pop and get its sample mean
# random indices from 1 to length(pop), chosen with replacement
index_1 <- as.integer( runif( samp_size, 1, length(pop)+1 ))
samp <- pop[ index_1 ]
xbar <- mean( samp )
samp_means[i] <- xbar
# get the confidence interval
ci_low <- xbar - z_val*predicted_sd
ci_high <- xbar + z_val*predicted_sd
in_ci <- (ci_low <= true_mean ) &&
( true_mean <= ci_high )
if( in_ci )
{ num_success <- num_success+1} else
{ num_fail <- num_fail + 1}
}
# report the number of successes
num_success
# report the standard deviation of our sample means
sd( samp_means )
# and our predicted value
predicted_sd
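To compare the count of successes directly with the confidence level, it helps to turn the count into a proportion. The sketch below is one possible compact version of the same experiment, not the course's own code. The names coverage_sim and pop_sd_sketch are hypothetical. In particular, pop_sd_sketch is only an assumed stand-in for pop_sd(), whose source is not shown here, and sample() replaces the runif() indexing used above.

```r
# Hypothetical helper: population standard deviation (divide by N, not N - 1).
# An assumed stand-in for the course's pop_sd(), whose source is not shown.
pop_sd_sketch <- function(x) sqrt(mean((x - mean(x))^2))

# Hypothetical wrapper around the steps of the script.
coverage_sim <- function(goal_mean = 23.2, goal_sd = 5.3, pop_size = 1000,
                         samp_size = 12, c_level = 0.93, num_reps = 10000) {
  # build a population with the requested mean and standard deviation
  pop <- rnorm(pop_size)
  pop <- (pop - mean(pop)) / pop_sd_sketch(pop) * goal_sd + goal_mean

  predicted_sd <- goal_sd / sqrt(samp_size)
  z_val <- qnorm((1 - c_level) / 2, lower.tail = FALSE)

  # one sample mean per repetition, sampling with replacement
  samp_means <- replicate(num_reps,
                          mean(sample(pop, samp_size, replace = TRUE)))

  # xbar +/- z_val*predicted_sd contains the true mean exactly when
  # the sample mean is within z_val*predicted_sd of the true mean
  hits <- abs(samp_means - goal_mean) <= z_val * predicted_sd

  list(coverage = mean(hits),          # proportion of successes
       sd_of_means = sd(samp_means),   # observed sd of the sample means
       predicted_sd = predicted_sd)
}

set.seed(2024)   # arbitrary seed, only so that a rerun gives the same numbers
result <- coverage_sim()
result$coverage      # should land near c_level = 0.93
result$sd_of_means   # should land near result$predicted_sd
```

Calling coverage_sim(num_reps = 100) mimics the script's default setting. With so few repetitions the proportion of successes can drift several percentage points away from the confidence level, which is why a large number of repetitions is suggested.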