
The

The

As we might expect, if we took repeated samples of size

It would also be the case that the

There is an interplay between the size of

For example, if we know that 74% of the students in introductory engineering courses are male, then, in a random sample of 94 students in such classes, what is the probability that the sample will have 80% or more males? The 74% in the population makes

For

Then again, because we have been paying attention and because we have R, we could just form the command
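A minimal sketch of one such command, assuming the normal approximation with mean p and standard deviation sqrt(p(1-p)/n). The variable names are illustrative and are not taken from the original page:

```r
# Engineering-class example: 74% male in the population, sample of 94,
# probability that the sample proportion is 0.80 or more.
p    <- 0.74   # population proportion
n    <- 94     # sample size
phat <- 0.80   # sample proportion of interest

# Standard deviation of the sample proportion under the normal approximation
sd_phat <- sqrt(p * (1 - p) / n)

# Upper-tail probability, P(sample proportion >= 0.80)
prob <- pnorm(phat, mean = p, sd = sd_phat, lower.tail = FALSE)
print(prob)
```

The result should come out a little above 9%.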

Clearly, using the

pprop <- function( phat, p, n, lower.tail=TRUE)
{
    if(p*n < 10)
      {return("n*p < 10, will not compute this")}
    if(n*(1-p) < 10)
      {return("n*(1-p)<10, will not compute this")}
    psd <- sqrt( p*(1-p)/n)
    prob <- pnorm(phat, p, psd, lower.tail)
    return( prob )
}

And then the problem becomes one of just giving the command
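For the engineering-class question above, a call of roughly the following shape would do it. The function is repeated so the snippet runs on its own. Passing `lower.tail=FALSE` to get the "80% or more" tail is an assumption about how the call is meant to be written:

```r
# Same function as defined in the text, repeated so this block is self-contained
pprop <- function(phat, p, n, lower.tail = TRUE) {
  if (p * n < 10) {
    return("n*p < 10, will not compute this")
  }
  if (n * (1 - p) < 10) {
    return("n*(1-p)<10, will not compute this")
  }
  psd <- sqrt(p * (1 - p) / n)
  prob <- pnorm(phat, p, psd, lower.tail)
  return(prob)
}

# P(sample proportion >= 0.80) when p = 0.74 and n = 94
answer <- pprop(0.80, 0.74, 94, lower.tail = FALSE)
print(answer)
```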

You might note that the function definition uses

Were we to have a case where 74% of the students were male and we had a sample of size 36, then the value of 36*(1-0.74) would not be greater than or equal to 10. That tells us that we cannot apply the
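The arithmetic behind that check can be sketched directly. The variable names here are illustrative only:

```r
# Sample-size conditions for p = 0.74 and n = 36
p <- 0.74
n <- 36

expected_with    <- n * p        # 36 * 0.74 = 26.64, which is at least 10
expected_without <- n * (1 - p)  # 36 * 0.26 = 9.36, which is below 10

# Both expected counts must be at least 10 for the approximation to be used
conditions_met <- (expected_with >= 10) && (expected_without >= 10)
print(conditions_met)
```

Because the second expected count falls short of 10, `conditions_met` is `FALSE`.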

Some note should be made here about problem statements that do not give you the probabilities for having the characteristic or not having it, that is, they do not give you

Here are three more examples, but we will use

Figure 5 gives data taken from the Centers for Disease Control and Prevention (CDC) web page.

According to Figure 5, about 20% of people ages 18 to 24 in Michigan reported smoking every day or some days in 2013. In a random sample of 36 people who lived in Michigan in 2013 and were 18 to 24 at the time, what is the probability that 5 or fewer of them smoked every day or some days? We will just fill in the

We see that the first parameter is

The system has saved us from making a mistake!

If we change the problem slightly and ask, in a sample of size 72 of these same people, what is the probability of getting 10 or fewer smokers, then the command becomes the

It is interesting to note that we doubled the sample size and left the proportion the same (5/36=10/72) but now we have enough in the sample to use the approximation.
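The contrast between the two sample sizes can be sketched as follows. The function is repeated so the block runs on its own, and writing the first argument as a count divided by the sample size is an assumption about how the calls were formed:

```r
# Same function as defined earlier in the text
pprop <- function(phat, p, n, lower.tail = TRUE) {
  if (p * n < 10) {
    return("n*p < 10, will not compute this")
  }
  if (n * (1 - p) < 10) {
    return("n*(1-p)<10, will not compute this")
  }
  psd <- sqrt(p * (1 - p) / n)
  prob <- pnorm(phat, p, psd, lower.tail)
  return(prob)
}

# n = 36: 36 * 0.20 = 7.2 is below 10, so the function returns its message
small_sample <- pprop(5 / 36, 0.20, 36)
print(small_sample)

# n = 72: 72 * 0.20 = 14.4 and 72 * 0.80 = 57.6, so a probability is returned
large_sample <- pprop(10 / 72, 0.20, 72)
print(large_sample)
```

Note that the second call treats 10/72 as an exact cutoff on a continuous scale and applies no adjustment for the fact that counts are whole numbers.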

Figure 9 presents data taken from another CDC page.

If 31% of all traffic-related deaths in 2013 were in alcohol-impaired driving crashes and we take a sample of 58 driving crashes from 2013, what is the probability that between 15 and 45 of those crashes will have been deemed alcohol-impaired? We will do this by saying that the answer will be

Wait a minute! Why use ≤ in one place and < in the other? This immediately opens our eyes to an issue that we ignored in doing the previous problem. In particular, since the

Figure 10 gives an improved interpretation to the earlier problem (resulting in changing that answer to 12.5%), and an answer to this problem, namely, 83.84%.
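A sketch of how such figures can be obtained, assuming the improvement is a half-unit continuity correction (count boundaries moved to 10.5, 14.5, and 45.5 before dividing by the sample size). That assumption, and the helper name `sd_phat`, are not taken from the original commands, though the results round to the values quoted above:

```r
# Standard deviation of a sample proportion (hypothetical helper name)
sd_phat <- function(p, n) sqrt(p * (1 - p) / n)

# Smoking example: P(X <= 10) out of 72 with p = 0.20,
# using 10.5 as the boundary so that the whole bar for 10 is included
smokers <- pnorm(10.5 / 72, 0.20, sd_phat(0.20, 72))

# Crash example: P(15 <= X <= 45) out of 58 with p = 0.31,
# using 14.5 and 45.5 so that both 15 and 45 are included
crashes <- pnorm(45.5 / 58, 0.31, sd_phat(0.31, 58)) -
           pnorm(14.5 / 58, 0.31, sd_phat(0.31, 58))

print(round(100 * smokers, 1))   # percent, one decimal place
print(round(100 * crashes, 2))   # percent, two decimal places
```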

The information in Figure 11 was taken from the Bureau of Transportation Statistics.

With that information in mind, for a random sample of 85 Delta flights between 01/01/2015 and 11/30/2015 with a destination of Detroit (DTW), what is the probability that there will be fewer than 5 or more than 80 delayed flights in the sample? We use the

Those are pretty extreme values, so we are not shocked to see such a small probability, 1.588%, as the answer.
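A two-tailed question of this kind can be sketched with a small helper. The function name `outside_counts_prob` is hypothetical, the half-unit continuity correction is an assumption, and the delay rate used in the sample call is a placeholder rather than the value from Figure 11, so the printed number is not expected to reproduce the 1.588% above:

```r
# P(X < lo or X > hi) for a count X out of n, via the normal approximation
# to the sample proportion with a half-unit continuity correction.
# Hypothetical helper; not part of the original page.
outside_counts_prob <- function(lo, hi, p, n) {
  if (n * p < 10 || n * (1 - p) < 10) {
    stop("n*p and n*(1-p) must both be at least 10")
  }
  psd <- sqrt(p * (1 - p) / n)
  lower <- pnorm((lo - 0.5) / n, p, psd)                      # X <= lo - 1
  upper <- pnorm((hi + 0.5) / n, p, psd, lower.tail = FALSE)  # X >= hi + 1
  lower + upper
}

# Placeholder rate for illustration only; substitute the Figure 11 value
p_delayed <- 0.13
tails <- outside_counts_prob(5, 80, p_delayed, 85)
print(tails)
```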

©Roger M. Palay Saline, MI 48176 December, 2015