Two Populations; Proportions
This is a situation where we have two populations.
Furthermore, within
those populations we can distinguish characteristics such
that we can say that the proportion of population 1
that has a specific characteristic is
p1 and the
proportion of population 2 that has the same
characteristic is
p2.
We are interested in the difference of those proportions,
that is, p1 - p2.
We will start, as we have for other situations, by
looking at the process that we need to use to create
a confidence interval, at some level of confidence,
for the difference of two proportions.
Later we will look at the process that we need to use
to test the null hypothesis,
at some significance level, that the difference of the
two proportions is zero. That translates to having the
null hypothesis state that the two proportions are equal.
The confidence interval for the difference of two proportions
We create the confidence interval for the difference of two proportions
from two samples, one from each population. We can recognize that
the samples are of size n1
and n2.
Each sample will have a number of its items that have the
specified characteristic for that population. We will say that
x1 is the number of items
in the first sample that exhibit the characteristic
of population one.
In the same way, we will say that
x2 is the number of items
in the second sample that exhibit the characteristic of
population two.
Using those values we see that we have
p̂1 = x1/n1 and p̂2 = x2/n2.
Then p̂1 is an estimate of
p1,
p̂2 is an estimate of
p2, and
p̂1 - p̂2 is an estimate of
p1 - p2.
For our confidence interval we need a point estimate of
p1 - p2
and we have that in
p̂1 - p̂2.
Then we need to know the distribution of the point estimate.
Under certain conditions we can consider p̂1 - p̂2
to be normal with mean
p1 - p2
and standard deviation, called the standard error,
√( p1(1-p1)/n1 + p2(1-p2)/n2 ).
The required conditions are
- These are random samples
- The samples are independent
- The samples are big enough to have at least 10 successes
and 10 failures in each sample
- The samples are small enough so that they represent less
than 5% of their respective populations.
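The success/failure condition can be checked mechanically. Here is a small sketch (the helper name counts_ok is our own; the counts are those of Case 1 below):

```r
# Check the "at least 10 successes and 10 failures" condition.
# counts_ok is our own helper name; the counts come from Case 1 below.
counts_ok <- function(n, x) {
  (x >= 10) && ((n - x) >= 10)   # successes and failures both at least 10
}
counts_ok(66, 43)   # sample 1: 43 successes, 23 failures -> TRUE
counts_ok(87, 68)   # sample 2: 68 successes, 19 failures -> TRUE
```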
Of course, in any particular problem, we will need to specify
the confidence level. For a 95% confidence level we will want
to allow the remaining 5% to fall outside of the confidence interval.
Because this is a normal distribution it is
symmetric.
If we let α represent that outside
5%, then we want zα/2 to be the value
in the standard normal distribution
that has α/2 of the area above it; by symmetry,
-zα/2 has α/2 of the area below it.
With all of that we have the confidence interval specified as
(point estimate) ± zα/2 * std err
or
(p̂1 - p̂2) ± zα/2 * √( p̂1(1-p̂1)/n1 + p̂2(1-p̂2)/n2 ).
Case 1: We have two large populations and they each exhibit
characteristics that we will call "successes" and "failures".
We have a random sample of size 66 from population 1, of which 43
are successes, thus there are 23 failures in this sample.
We have a random sample of size 87 from population 2, of which 68
are successes, thus there are 19 failures in this sample.
We want to construct a 95% confidence interval
for the difference in the population proportions,
p1 - p2.
Our computations become
p̂1 = 43/66 ≈ 0.6515
p̂2 = 68/87 ≈ 0.7816
p̂1 - p̂2 ≈ 0.6515 - 0.7816 = -0.1301
α = 1-0.95 = 0.05
α/2 = 0.025
zα/2 ≈ 1.96
std err = √( 0.6515(1-0.6515)/66 + 0.7816(1-0.7816)/87 ) ≈ 0.0735
margin of error = zα/2 * std err
≈ 1.96 * 0.0735 ≈ 0.144
interval = (p̂1 - p̂2) ± moe ≈ -0.1301 ± 0.144 ≈
(-0.2741, 0.0139)
Of course we could do these calculations in R
via the commands
n_one <- 66
x_one <- 43
phat_one <- x_one / n_one
phat_one
n_two <- 87
x_two <- 68
phat_two <- x_two / n_two
phat_two
pe <- phat_one - phat_two
pe
alpha=1-0.95
alphadiv2<-alpha/2
alphadiv2
z <- qnorm(alphadiv2, lower.tail=FALSE)
z
std_err <- sqrt( phat_one*(1-phat_one)/n_one +
phat_two*(1-phat_two)/n_two)
std_err
moe <- z*std_err
moe
pe - moe
pe + moe
Figure 1 gives the console view of these commands.
Figure 1
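As a cross-check, not shown on the original page, base R's prop.test produces this same interval once its continuity correction is turned off (the default correct=TRUE gives a slightly wider interval):

```r
# Wald interval for p1 - p2 via the built-in prop.test (package stats).
res <- prop.test(x = c(43, 68), n = c(66, 87),
                 conf.level = 0.95, correct = FALSE)
res$conf.int   # approximately (-0.2741, 0.0139), matching the hand computation
```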
The pattern for finding such a confidence interval
does not change. About the only wrinkle that might be added
is to start with the raw data.
Case 2:
Consider the data in Table 1 and in
Table 2. In those tables a success is an item
with the value 2, anything else is a failure.
Assuming that Table 1 and Table 2 represent
random samples from two populations that have
more than 1480 and 1780 members respectively,
we can create a 90% confidence interval via the computations
# second example
n_one <- 74
x_one <- 24
phat_one <- x_one / n_one
phat_one
n_two <- 89
x_two <- 32
phat_two <- x_two / n_two
phat_two
pe <- phat_one - phat_two
pe
alpha=1-0.90
alphadiv2<-alpha/2
alphadiv2
z <- qnorm(alphadiv2, lower.tail=FALSE)
z
std_err <- sqrt( phat_one*(1-phat_one)/n_one +
phat_two*(1-phat_two)/n_two)
std_err
moe <- z*std_err
moe
pe - moe
pe + moe
The console view of those commands is shown in Figure 2.
Figure 2
Figure 2 shows all of the computations
resulting in a 90% confidence interval of
(-0.1578, 0.0873).
At this point it is pretty clear that the computations
should be captured in a function. One such function
is
ci_2popproportion <- function(
n_one, x_one, n_two, x_two, cl=0.95)
{
phat_one <- x_one / n_one
phat_two <- x_two / n_two
pe <- phat_one - phat_two
alpha=1-cl
alphadiv2<-alpha/2
z <- qnorm(alphadiv2, lower.tail=FALSE)
std_err <- sqrt( phat_one*(1-phat_one)/n_one +
phat_two*(1-phat_two)/n_two)
moe <- z*std_err
ci_low <- pe - moe
ci_high <- pe + moe
result <- c( ci_low, ci_high, moe,
std_err, z, alphadiv2,
phat_one, phat_two)
names( result ) <-
c("ci low", "ci_high", "M of E",
"Std. Err", "z-value", "alpha/2",
"p hat 1", "p hat 2")
return( result )
}
Once this function is defined, we can load it and use it
to solve the two problems presented above
via commands such as
source("../ci_2popproport.R")
# do the first problem
ci_2popproportion(66,43,87,68,0.95)
# do the second problem
ci_2popproportion(74,24,89,32,0.90)
which produce the results shown in Figure 3.
Figure 3
As expected, the result of the function call shown in Figure 3
gives us all the information that we so carefully constructed in the
earlier figures.
This is a good place to look at the effect of having larger
samples. To do this we will use the command
ci_2popproportion(74*12,24*12,89*12,32*12,0.90)
to generate a 90% confidence interval
from samples that are 12 times the size of the
Case 2 samples but that have exactly 12
times the number of successes in each sample.
Thus, the sample proportions will not change
even though the samples are much larger.
To help look at the effect of having a larger sample
Figure 4 first repeats the output of Case 2
and then shows the result of this new command.
Figure 4
Comparing the two results we can see that having that
larger sample size reduces the
standard error, which reduces
the margin of error which results in
a narrower confidence interval.
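The narrowing is predictable: each n appears in a denominator under the square root, so multiplying both sample sizes by k (with the proportions unchanged) divides the standard error by √k. A quick sketch, using our own helper se:

```r
# Standard error of phat_one - phat_two as a function of the inputs.
se <- function(p1, n1, p2, n2) sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2)
k <- 12
ratio <- se(24/74, 74, 32/89, 89) / se(24/74, 74*k, 32/89, 89*k)
ratio   # sqrt(12), about 3.464, since the proportions are unchanged
```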
One small addition to this discussion is getting
the count of successes in the two samples.
We could do this in R
via the commands
gnrnd4( key1=956347307, key2=7943 )
# get the count, which is all we need
table(L1)
gnrnd4( key1=753758807, key2=8853 )
# get the count, which is all we need
table(L1)
These commands produce the console view shown in Figure 5.
Figure 5
Clearly, Figure 5 indicates that there were 24 instances of
a 2 in Table 1 and 32 instances of a 2
in Table 2.
Hypothesis test on the difference of two proportions
Recall that we are in a situation where we have two
populations and each population has a proportion
of its members that have some characteristic.
We say that p1 is the
proportion of the first population that has some specific characteristic,
while p2
is the
proportion of the second population that has
the same characteristic.
Here we are interested in running a statistical test
of the null hypothesis
H0: p1 = p2,
which has the equivalent and more useful form
H0: p1 - p2 = 0.
We will run this test against an alternative
hypothesis that is one of
- H1: p1 - p2 ≠ 0
- H1: p1 - p2 < 0
- H1: p1 - p2 > 0
Finally, we will run this test at some specified
level of significance.
To do this we want two independent random samples, one from
each of the populations.
In addition, in order to run this kind of test we
need
- The samples are big enough to have at least 10 successes
and 10 failures in each sample
- The samples are small enough so that they represent less
than 5% of their respective populations.
In such a situation the distribution of the
difference of the proportions will be normal.
That means that we can use the standard normal distribution
to find either critical values for the stated level of significance
or to find the attained significance of a sample statistic.
Remember that in order to make this
test we assume that the null hypothesis,
H0: p1 = p2,
is true. Under that condition our best
approximation for the proportion of successes
will be p̂ = (x1 + x2)/(n1 + n2).
This is called the pooled sample proportion.
Using that pooled sample proportion gives us the standard error
defined by √( p̂(1-p̂)/n1 + p̂(1-p̂)/n2 ).
You may note that this is
algebraically equivalent to
√( p̂(1-p̂)(1/n1 + 1/n2) ),
a formula that is often used simply because it
is a slightly more efficient computation.
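A quick numeric check of that equivalence, using the Case 3 values found below (this check is our addition):

```r
phat <- 65/175; n_one <- 92; n_two <- 83   # pooled proportion and sizes from Case 3
se_a <- sqrt( phat*(1-phat)/n_one + phat*(1-phat)/n_two )
se_b <- sqrt( phat*(1-phat)*(1/n_one + 1/n_two) )
all.equal(se_a, se_b)   # TRUE
```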
With all of this in hand we are ready to look at
using either the critical value or the
attained significance approach.
(It is worth noting that since this is a normal distribution,
and since the null hypothesis has the mean of the
difference of the proportions be 0, the two approaches
will be remarkably similar.)
Case 3: From two very large populations
we have taken two samples and found the
number of sample members with the specified characteristic
in 37 of the 92 items in sample 1
and 28 of the 83 items in sample 2.
Our null hypothesis is that the proportion of the
members with the specified characteristic in the two populations
is the same, i.e.,
H0: p1 - p2 = 0.
Our alternative hypothesis is that the proportion of the
members in the first population is greater than
the proportion in the second population,
i.e.,
H1: p1 > p2,
or its equivalent form
H1: p1 - p2 > 0.
We want to run this test at the 0.05 significance level.
Critical value approach
For a 5% significance level we need to find the z-value
that has 5% of the area under the standard normal curve to the right
of that value. We can use a table, a calculator, or the computer
to find this. It is approximately z=1.645.
From the data it is interesting to note that
p̂1 = 37/92 ≈ 0.402
and
p̂2 = 28/83 ≈ 0.337.
However, what we really want is the pooled sample proportion
p̂ = (37+28)/(92+83) ≈ 0.3714.
These computations can be done in R via the commands
# case 3 == hypothesis test
z_high <- qnorm(.05,lower.tail=FALSE)
z_high
n_one <- 92
x_one <- 37
phat_one <- x_one / n_one
phat_one
n_two <- 83
x_two <- 28
phat_two <- x_two / n_two
phat_two
phat <- (x_one+x_two) / (n_one+n_two)
phat
The result of those commands is given in Figure 6.
Figure 6
Using those results we can find the
standard error ≈ √( 0.3714(1-0.3714)(1/92 + 1/83) ) ≈ 0.0731.
Then, the critical value will be the
product of our z-value (the one with 5% above it) and this standard error,
or about 1.645*0.0731 ≈ 0.1203.
Therefore, we will reject H0 if the
difference in the sample proportions is greater than
the critical value 0.1203.
However, in this case the difference in the proportions is
about 0.0648, a value that is not greater than the
critical value and we say that we do not have sufficient evidence
to reject H0 at the 0.05 level of significance.
These last calculations can be done in R using
std_err <- sqrt( phat*(1-phat)*(1/n_one +1/n_two) )
std_err
crit_high <- z_high * std_err
crit_high
diff <- phat_one - phat_two
diff
Figure 7 holds the console image of those commands.
Figure 7
Attained significance approach
We will do almost all of the above calculations
for the attained significance approach.
In particular, we compute the
sample proportions, the
pooled sample proportion, and
the standard error.
Then we find the difference between the two
proportions and divide that difference by the
standard error. We have already seen
the computation of all those values except
for the final division. In this example, that becomes
0.0648/0.07314 ≈ 0.8862.
This quotient is a z-score and all we need to do is to
find the area to the right of that value
under the standard normal curve.
That value is about 0.1877, giving us about 18.77% of the area to
the right of 0.8862. With an 18.77% attained significance
we do not have sufficient information to reject
H0 at the 0.05 significance level.
Repeating some of the earlier commands, the following R
statements compute these values.
# redo those for the attained significance
n_one <- 92
x_one <- 37
phat_one <- x_one / n_one
phat_one
n_two <- 83
x_two <- 28
phat_two <- x_two / n_two
phat_two
phat <- (x_one+x_two) / (n_one+n_two)
phat
std_err <- sqrt( phat*(1-phat)*(1/n_one +1/n_two) )
std_err
diff <- phat_one - phat_two
diff
z <- diff/std_err
z
pnorm( z, lower.tail=FALSE)
Figure 8 holds the console view of those statements.
Figure 8
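As a cross-check on all of this (our addition, not part of the original computations), base R's prop.test runs the same pooled test; with its continuity correction disabled it reproduces the attained significance found above:

```r
# Pooled two-proportion test via the built-in prop.test (package stats).
res <- prop.test(x = c(37, 28), n = c(92, 83),
                 alternative = "greater", correct = FALSE)
res$p.value   # about 0.1878, in agreement with pnorm(z, lower.tail=FALSE)
```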
Case 4:
From two very large populations
we have taken two samples and found the
number of sample members with the specified characteristic
in 54 of the 306 items in sample 1
and 108 of the 422 items in sample 2.
Our null hypothesis is that the proportion of the
members with the specified characteristic in the two populations
is the same, i.e.,
H0: p1 - p2 = 0.
Our alternative hypothesis is that the proportion of the
members in the first population is different from
the proportion in the second population,
i.e.,
H1: p1 ≠ p2,
or its equivalent form
H1: p1 - p2 ≠ 0.
We want to run this test at the 0.05 significance level.
Critical value approach
Because this is a 2-tail test we need to split that 5% to have 2.5%
below and 2.5% above.
We need to find the z-value
that has 2.5% of the area under the standard normal curve to the right
of that value.
And we need to find the z-value
that has 2.5% of the area under the standard normal curve to the left
of that value.
Of course, since the standard normal distribution
is symmetric these two values will just be additive inverses.
We can use a table, a calculator, or the computer
to find these values. They are approximately z=1.96 and, of course,
z= -1.96.
From the data it is interesting to note that
p̂1 = 54/306 ≈ 0.1765
and
p̂2 = 108/422 ≈ 0.2559.
However, what we really want is the pooled sample proportion
p̂ = (54+108)/(306+422) ≈ 0.2226.
These computations can be done in R via the commands
# case 4 == hypothesis test 2-sided
z_high <- qnorm(.025,lower.tail=FALSE)
z_high
n_one <- 306
x_one <- 54
phat_one <- x_one / n_one
phat_one
n_two <- 422
x_two <- 108
phat_two <- x_two / n_two
phat_two
phat <- (x_one+x_two) / (n_one+n_two)
phat
The result of those commands is given in Figure 9.
Figure 9
Using those results we can find the
standard error ≈ √( 0.2226(1-0.2226)(1/306 + 1/422) ) ≈ 0.0312.
Then, the critical values will be the
products of our z-values and this standard error,
or about -1.96*0.0312 ≈ -0.0612
and about 1.96*0.0312 ≈ 0.0612.
Therefore, we will reject H0 if the
difference in the sample proportions is
less than the critical value -0.0612 or
greater than
the critical value 0.0612.
In this case the difference in the proportions is
about -0.0794, a value that is less than the lower
critical value and we say that we
reject H0 at the 0.05 level of significance.
These last calculations can be done in R using
std_err <- sqrt( phat*(1-phat)*(1/n_one +1/n_two) )
std_err
crit_high <- z_high * std_err
crit_high
crit_low <- -z_high * std_err
crit_low
diff <- phat_one - phat_two
diff
Figure 10 holds the console image of those commands.
Figure 10
Attained significance approach
We will do almost all of the above calculations
for the attained significance approach.
In particular, we compute the
sample proportions, the
pooled sample proportion, and
the standard error.
Then we find the difference between the two
proportions and divide that difference by the
standard error. We have already seen
the computation of all those values except
for the final division. In this example, that becomes
-0.07945/0.0312 ≈ -2.544.
This quotient is a z-score and all we need to do is to
find the area
in the appropriate extreme tail, and then multiply
that area by 2.
We do this because the alternative hypothesis was
H1: p1 - p2 ≠ 0
which means we are looking for the probability of being this
or more extreme. We could be this or more extreme by being in
either the lower or the upper tail. Thus, we need to multiply
the value here by 2 to account for both tails.
In this case, our z-score is negative so we will find the area
under the standard normal curve to the left of that value.
That value is about 0.00548, giving us about 0.548% of the area to
the left of -2.544.
When we double that we get about 0.01096 for the
attained significance. This is below our stated 0.05 level
of significance.
Therefore,
we reject
H0 at the 0.05 significance level.
Repeating some of the earlier commands, the following R
statements compute these values.
# redo those for the attained significance
n_one <- 306
x_one <- 54
phat_one <- x_one / n_one
phat_one
n_two <- 422
x_two <- 108
phat_two <- x_two / n_two
phat_two
phat <- (x_one+x_two) / (n_one+n_two)
phat
std_err <- sqrt( phat*(1-phat)*(1/n_one +1/n_two) )
std_err
diff <- phat_one - phat_two
diff
z <- diff/std_err
z
if ( z > 0 )
{half_area <- pnorm( z, lower.tail=FALSE)} else
{half_area <- pnorm( z )}
half_area
half_area*2
Figure 11 holds the console view of those statements.
Figure 11
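Again as an added cross-check, base R's prop.test with the continuity correction disabled reproduces this doubled tail area:

```r
# Two-sided pooled two-proportion test via the built-in prop.test.
res <- prop.test(x = c(54, 108), n = c(306, 422),
                 alternative = "two.sided", correct = FALSE)
res$p.value   # about 0.0110, matching the doubled tail area above
```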
As we have noted before, it makes little sense
to keep reconstructing these commands. We are far better off if we
capture the algorithm in a function.
Consider the following possible function definition.
hypoth_2test_prop <- function(
x_one, n_one, x_two, n_two,
H1_type, sig_level=0.05)
{ # perform a hypothesis test for the difference of
# two proportions based on two samples.
# H0 is that the proportions are equal, i.e.,
# their difference is 0
# The alternative hypothesis is
# != if H1_type =0
# < if H1_type < 0
# > if H1_type > 0
# Do the test at sig_level significance.
phat_one <- x_one / n_one
phat_two <- x_two / n_two
phat <- (x_one+x_two) / (n_one+n_two)
std_err <- sqrt( phat*(1-phat)*(1/n_one +1/n_two) )
diff <- phat_one - phat_two
if( H1_type==0)
{ z <- abs( qnorm(sig_level/2))}
else
{ z <- abs( qnorm(sig_level))}
to_be_extreme <- z*std_err
decision <- "Reject"
if( H1_type < 0 )
{ crit_low <- - to_be_extreme
crit_high <- "n.a."
if( diff > crit_low)
{ decision <- "do not reject"}
attained <- pnorm( diff, mean=0, sd=std_err)
alt <- "p_1 < p_2"
}
else if ( H1_type == 0)
{ crit_low <- - to_be_extreme
crit_high <- to_be_extreme
if( (crit_low < diff) & (diff < crit_high) )
{ decision <- "do not reject"}
if( diff < 0 )
{ attained <- 2*pnorm(diff, mean=0, sd=std_err)}
else
{ attained <- 2*pnorm(diff, mean=0, sd=std_err,
lower.tail=FALSE)
}
alt <- "p_1 != p_2"
}
else
{ crit_low <- "n.a."
crit_high <- to_be_extreme
if( diff < crit_high)
{ decision <- "do not reject"}
attained <- pnorm(diff, mean=0, sd=std_err,
lower.tail=FALSE)
alt <- "p_1 > p_2"
}
result <- c( alt, n_one, x_one, phat_one,
n_two, x_two, phat_two,
phat, std_err, z,
crit_low, crit_high, diff,
attained, decision)
names(result) <- c("H1:",
"n_one","x_one", "phat_one",
"n_two","x_two", "phat_two",
"pooled", "Std Err", "z extreme",
"critical low", "critical high",
"difference",
"attained", "decision")
return( result )
}
Once this has been placed in the parent directory
under the name hypo_2popproport.R we can load and run
that function for the data in Case 3 via the commands
source("../hypo_2popproport.R")
hypoth_2test_prop(37,92,28,83,1,0.05)
The console image of those two commands is shown in Figure 12.
Figure 12
The results shown in Figure 12 are the same as those found
above in Figures 6, 7, and 8.
To do Case 4
we use the command
hypoth_2test_prop(54,306,108,422,0,0.05)
to produce the output shown in Figure 13.
Figure 13
The results shown in Figure 13 are the same as those found
above in Figures 9, 10, and 11.
To do another example, this time a one-tail test in the other direction,
we use the command
hypoth_2test_prop(54,306,108,422,-1,0.005)
the result of which appears in Figure 14.
Figure 14
Below is a listing of the R commands used in generating this page.
n_one <- 66
x_one <- 43
phat_one <- x_one / n_one
phat_one
n_two <- 87
x_two <- 68
phat_two <- x_two / n_two
phat_two
pe <- phat_one - phat_two
pe
alpha=1-0.95
alphadiv2<-alpha/2
alphadiv2
z <- qnorm(alphadiv2, lower.tail=FALSE)
z
std_err <- sqrt( phat_one*(1-phat_one)/n_one +
phat_two*(1-phat_two)/n_two)
std_err
moe <- z*std_err
moe
pe - moe
pe + moe
# second example
n_one <- 74
x_one <- 24
phat_one <- x_one / n_one
phat_one
n_two <- 89
x_two <- 32
phat_two <- x_two / n_two
phat_two
pe <- phat_one - phat_two
pe
alpha=1-0.90
alphadiv2<-alpha/2
alphadiv2
z <- qnorm(alphadiv2, lower.tail=FALSE)
z
std_err <- sqrt( phat_one*(1-phat_one)/n_one +
phat_two*(1-phat_two)/n_two)
std_err
moe <- z*std_err
moe
pe - moe
pe + moe
ci_2popproportion <- function(
n_one, x_one, n_two, x_two, cl=0.95)
{
phat_one <- x_one / n_one
phat_two <- x_two / n_two
pe <- phat_one - phat_two
alpha=1-cl
alphadiv2<-alpha/2
z <- qnorm(alphadiv2, lower.tail=FALSE)
std_err <- sqrt( phat_one*(1-phat_one)/n_one +
phat_two*(1-phat_two)/n_two)
moe <- z*std_err
ci_low <- pe - moe
ci_high <- pe + moe
result <- c( ci_low, ci_high, moe,
std_err, z, alphadiv2,
phat_one, phat_two)
names( result ) <-
c("ci low", "ci_high", "M of E",
"Std. Err", "z-value", "alpha/2",
"p hat 1", "p hat 2")
return( result )
}
source("../ci_2popproport.R")
# do the first problem
ci_2popproportion(66,43,87,68,0.95)
# do the second problem
ci_2popproportion(74,24,89,32,0.90)
# look at the effect of having a larger sample
ci_2popproportion(74*12,24*12,89*12,32*12,0.90)
gnrnd4( key1=956347307, key2=7943 )
# get the count, which is all we need
table(L1)
gnrnd4( key1=753758807, key2=8853 )
# get the count, which is all we need
table(L1)
# case 3 == hypothesis test
z_high <- qnorm(.05,lower.tail=FALSE)
z_high
n_one <- 92
x_one <- 37
phat_one <- x_one / n_one
phat_one
n_two <- 83
x_two <- 28
phat_two <- x_two / n_two
phat_two
phat <- (x_one+x_two) / (n_one+n_two)
phat
std_err <- sqrt( phat*(1-phat)*(1/n_one +1/n_two) )
std_err
crit_high <- z_high * std_err
crit_high
diff <- phat_one - phat_two
diff
# redo those for the attained significance
n_one <- 92
x_one <- 37
phat_one <- x_one / n_one
phat_one
n_two <- 83
x_two <- 28
phat_two <- x_two / n_two
phat_two
phat <- (x_one+x_two) / (n_one+n_two)
phat
std_err <- sqrt( phat*(1-phat)*(1/n_one +1/n_two) )
std_err
diff <- phat_one - phat_two
diff
z <- diff/std_err
z
pnorm( z, lower.tail=FALSE)
# case 4 == hypothesis test 2-sided
z_high <- qnorm(.025,lower.tail=FALSE)
z_high
n_one <- 306
x_one <- 54
phat_one <- x_one / n_one
phat_one
n_two <- 422
x_two <- 108
phat_two <- x_two / n_two
phat_two
phat <- (x_one+x_two) / (n_one+n_two)
phat
std_err <- sqrt( phat*(1-phat)*(1/n_one +1/n_two) )
std_err
crit_high <- z_high * std_err
crit_high
crit_low <- -z_high * std_err
crit_low
diff <- phat_one - phat_two
diff
# redo those for the attained significance
n_one <- 306
x_one <- 54
phat_one <- x_one / n_one
phat_one
n_two <- 422
x_two <- 108
phat_two <- x_two / n_two
phat_two
phat <- (x_one+x_two) / (n_one+n_two)
phat
std_err <- sqrt( phat*(1-phat)*(1/n_one +1/n_two) )
std_err
diff <- phat_one - phat_two
diff
z <- diff/std_err
z
if ( z > 0 )
{half_area <- pnorm( z, lower.tail=FALSE)} else
{half_area <- pnorm( z )}
half_area
half_area*2
source("../hypo_2popproport.R")
hypoth_2test_prop(37,92,28,83,1,0.05)
hypoth_2test_prop(54,306,108,422,0,0.05)
hypoth_2test_prop(54,306,108,422,-1,0.005)
©Roger M. Palay
Saline, MI 48176 February, 2016