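In general there are at least five "typical" distributions that we classify with special names. These are a uniform distribution, a distribution that is skewed right, a distribution that is skewed left, a normal distribution, and a bimodal distribution.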

These 260 values are spread out between 36 and 80 in a fairly uniform manner. That is, if we break the range up into evenly spaced intervals we expect to see about the same number of values in each interval. The histogram of the data, shown in Figure 2, demonstrates this.

We do not expect that there will be exactly the same number of values in each interval, but we do expect that there will be approximately the same number of values for each interval.
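To see what "approximately the same number" looks like, here is a minimal sketch that counts the values in each interval. It assumes that R's built-in runif() is an acceptable stand-in for the gnrnd5() generator used on this page, so the values are not the 260 values discussed above; the seed is arbitrary.

```r
# Assumption: runif() stands in for gnrnd5(); these are NOT the page's values.
set.seed(1)
x <- runif(260, min = 36, max = 80)

# Same interval boundaries as the histogram: 36, 40, ..., 80
breaks <- seq(36, 80, by = 4)

# plot = FALSE returns the interval counts without drawing the histogram
counts <- hist(x, breaks = breaks, plot = FALSE)$counts
print(counts)

# With 260 values in 11 intervals we expect about 260/11 values per interval
expected <- length(x) / length(counts)
print(expected)
```

The printed counts should vary from interval to interval while staying in the general neighborhood of the expected value.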

We can look at the box plot for that data, shown in Figure 3.

There is not much in Figure 3 to distinguish this data from data we will see later for the normal and bimodal characteristics. We might note, however, that the box shows that each 1/4 of the data is spread out over about the same span. That is, the two whiskers are about as long as the two halves of the box.
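The claim about each quarter of the data can be checked numerically. The sketch below again assumes runif() as a stand-in for gnrnd5(). It uses fivenum(), which returns the minimum, lower hinge, median, upper hinge, and maximum that a box plot draws when no points are flagged as outliers.

```r
# Assumption: runif() stands in for gnrnd5(); these are NOT the page's values.
set.seed(2)
x <- runif(260, min = 36, max = 80)

# Differences between consecutive five-number-summary values give the
# span covered by each quarter of the data
spans <- diff(fivenum(x))
names(spans) <- c("left whisker", "Q1 to median",
                  "median to Q3", "right whisker")
print(round(spans, 1))
```

For uniform-like data the four spans should come out roughly equal, each near one quarter of the full range.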

However, the values are bunched up at the lower end of that range. We can see that when we look at the histogram of the data, shown in Figure 5.

This is an example of data that is said to be skewed right. The "long tail" of the histogram extends to the right.

Figure 6 holds a box plot of the data.

It is not a shock to see the box, from Q1 to Q3, over on the left with the long whisker extending to the right.

We can see this by looking at the histogram shown in Figure 8. Again, there is a "long tail", but this time it extends to the left, and we call this kind of distribution skewed left.

Figure 9 shows the box plot.

As expected the box is over on the right with the long whisker extending to the left.
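The two skewed cases are mirror images of each other, and a short sketch can make that concrete. The scaled beta distribution below is an assumption chosen only for illustration (the shape parameters 2 and 6 are not from this page), and quarter_spans() is a hypothetical helper name.

```r
# Assumption: a scaled beta distribution stands in for gnrnd5().
set.seed(3)
right <- 36 + 44 * rbeta(260, shape1 = 2, shape2 = 6)  # bunched near 36
left  <- 116 - right                                   # mirror image, bunched near 80

# Span covered by each quarter of the data (hypothetical helper)
quarter_spans <- function(v) {
  s <- diff(fivenum(v))
  names(s) <- c("min to Q1", "Q1 to median", "median to Q3", "Q3 to max")
  s
}

print(round(quarter_spans(right), 1))  # longest span should be on the right
print(round(quarter_spans(left), 1))   # same spans in reverse order
```

Because the second data set is a reflection of the first, its spans are the first set's spans in reverse order, which is why the long whisker switches sides.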

If we look at a histogram of those values, shown in Figure 11, we will see the concentration of values close to the middle but with values trailing off in either direction.

Some note should be made here to the effect that the impression of the histogram is that of a bell shape. It is essentially balanced around its middle value and that middle range has the largest number of values in it. In our language, that middle range is the "modal" range, having the highest frequency of values. Because the values are balanced around the middle, we also expect that the mean and the median will be close to each other.

Again, a look at the box plot, in Figure 12, is instructive.

The box plot is similar to the one we saw in Figure 3. The box is in the middle and the whiskers are approximately the same length. However, the box in Figure 12 is narrower than was the one in Figure 3. That is because we have a high concentration of values close to the median of the data.
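The narrower box can be seen by comparing the distance between the hinges for the two kinds of data. This is a sketch under stated assumptions: runif() and rnorm() stand in for gnrnd5(), the mean of 58 and standard deviation of 7 are illustrative values, and box_width() is a hypothetical helper.

```r
# Assumption: runif() and rnorm() stand in for gnrnd5(); mean and sd are made up.
set.seed(4)
uniform_like <- runif(260, min = 36, max = 80)
normal_like  <- rnorm(260, mean = 58, sd = 7)

# Width of the box in a box plot: upper hinge minus lower hinge
box_width <- function(v) {
  h <- fivenum(v)
  h[4] - h[2]
}

print(box_width(uniform_like))
print(box_width(normal_like))
```

The second width should be clearly smaller, because the normal-like values are concentrated near the median while the uniform-like values are not.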

However, this time we have two modal areas. We can see this when we look at the histogram shown in Figure 14.

We have one concentration in the interval from 48 to 52 and another popular area from 64 to 68. This is called a bimodal distribution.

Again, we look at the box plot, this time in Figure 15.

The box plot of this distribution, in this example, is not that different from the plot shown in Figure 3. The box is wider than the box of Figure 12, but much of that is related to the particular example here where the two modal regions are similar and not too far apart.
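A sketch of the same comparison in numbers: the five-number summary gives little hint of two modal regions, while the interval counts show them. The mixture of two normal samples centered at 50 and 66 is an assumption made for illustration, not the gnrnd5() data, and count_bins() is a hypothetical helper.

```r
# Assumption: built-in generators stand in for gnrnd5(); centers and sd are made up.
set.seed(5)
uniform_like <- runif(260, min = 36, max = 80)
bimodal_like <- c(rnorm(130, mean = 50, sd = 3),
                  rnorm(130, mean = 66, sd = 3))

# Count values in the intervals 36-40, 40-44, ..., 76-80.
# cut() silently drops any value that falls outside 36 to 80.
breaks <- seq(36, 80, by = 4)
count_bins <- function(v) as.vector(table(cut(v, breaks = breaks)))

print(round(fivenum(uniform_like), 1))
print(round(fivenum(bimodal_like), 1))
print(count_bins(uniform_like))
print(count_bins(bimodal_like))
```

Both five-number summaries describe a box near the middle with whiskers on either side. Only the interval counts should show the dip between the two popular regions.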

The discussion above illustrates the usefulness of the histogram in characterizing data sets. We also saw that although the box plot was helpful for identifying skewed data sets, both left and right, it was not of much help with the other styles.

Below is a listing of the R commands used to produce the data values and graphs used in this page.

# let us try to create some example distributions

# uniform
source( "../gnrnd5.R")
gnrnd5(key1=186754025901, key2=438000361)
summary(L1)
L1
hist(L1,ylim=c(0,80), xlim=c(32,84),
     breaks=seq(36,80,by=4), main="",
     xaxp=c(36,80,11), xlab="")
boxplot(L1,horizontal=TRUE, ylim=c(32,84),xaxp=c(36,80,11) )

# skewed right
gnrnd5(key1=187854025902, key2=438000361)
summary(L1)
L1
hist(L1,ylim=c(0,80), xlim=c(32,84),
     breaks=seq(36,80,by=4), main="",
     xaxp=c(36,80,11), xlab="")
boxplot(L1,horizontal=TRUE, ylim=c(32,84),xaxp=c(36,80,11) )

# skewed left
gnrnd5(key1=187854025903, key2=438000361)
summary(L1)
L1
hist(L1,ylim=c(0,80), xlim=c(32,84),
     breaks=seq(36,80,by=4), main="",
     xaxp=c(36,80,11), xlab="")
boxplot(L1,horizontal=TRUE, ylim=c(32,84),xaxp=c(36,80,11) )

# normal
gnrnd5(key1=183454025904, key2=52000580)
summary(L1)
L1
hist(L1,ylim=c(0,80), xlim=c(32,84),
     breaks=seq(36,80,by=4), main="",
     xaxp=c(36,80,11), xlab="")
boxplot(L1,horizontal=TRUE, ylim=c(32,84),xaxp=c(36,80,11) )

# bimodal
gnrnd5( key1=178854025905, key2=45000500, key3=44000660)
summary(L1)
L1
hist(L1,ylim=c(0,80), xlim=c(32,84),
     breaks=seq(36,80,by=4), main="",
     xaxp=c(36,80,11), xlab="")
boxplot(L1,horizontal=TRUE, ylim=c(32,84),xaxp=c(36,80,11) )

©Roger M. Palay Saline, MI 48176 January, 2019