Frequency Tables -- Discrete Values

This page presents issues related to building and interpreting frequency tables for discrete values. [There is a link at the end of the page to another page that demonstrates creating these same values in R.] We start with an example using the data in Table 1 given below.

Just a quick look at the values in Table 1 reveals that there are not all that many different values. We want to see how many different values are in the table and how often each value appears. A table that provides such information is called a frequency table. Here is such a table for the values in Table 1 above.
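For readers who want to see the counting step spelled out, here is a minimal sketch in Python (chosen only for illustration; the companion page covers R). The list `data` is a made-up sample, not the values of Table 1, and the names `counts`, `values`, and `frequencies` are our own.

```python
from collections import Counter

# Hypothetical sample data; these are NOT the values from Table 1.
data = [3, 1, 4, 1, 5, 2, 3, 3, 2, 1, 4, 3]

# Count how often each distinct value appears.
counts = Counter(data)

# List the distinct values in increasing order, as a frequency table does.
values = sorted(counts)
frequencies = [counts[v] for v in values]

for v, f in zip(values, frequencies):
    print(f"value {v}: frequency {f}")
```

The frequencies must add up to the number of items in the original data, which is a handy check on any frequency table built by hand.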

As you may have observed, relative frequency is really the decimal form of the percent of times that each value occurs in the original data table. If we multiply the relative frequency by 100 (i.e., move the decimal point two places to the right) we get that value in percent form. Thus, in Table 3 we see that

Another commonly computed value is the cumulative frequency of the values. Remembering that the Data Values in Table 2 are arranged in increasing order, as we read from left to right across the frequency values, we could keep a running total and record that sum as we go along.
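The running total described above can be sketched in a few lines of Python. The frequency table here is hypothetical (it is not Table 2), and `accumulate` from the standard library is just one convenient way to keep the running sum.

```python
from itertools import accumulate

# Hypothetical frequency table, data values in increasing order (not Table 2).
values = [1, 2, 3, 4, 5]
frequencies = [3, 2, 4, 2, 1]

# Keep a running total of the frequencies, reading from left to right.
cumulative = list(accumulate(frequencies))

for v, f, c in zip(values, frequencies, cumulative):
    print(f"value {v}: frequency {f}, cumulative frequency {c}")
```

Each cumulative frequency tells us how many data values are less than or equal to the corresponding data value, so the last one equals the total number of values.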

Just as we found the relative frequency by dividing the frequency by the number of values in the data table, we can find the relative cumulative frequency by dividing the cumulative frequency by that total number of values. Table 5 displays this new value for our example.
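As a rough illustration of the two divisions, the Python sketch below computes the relative frequency and the relative cumulative frequency for a hypothetical frequency table (the numbers are invented and do not come from the tables on this page).

```python
from itertools import accumulate

# Hypothetical frequency table; not the values from the tables on this page.
values = [1, 2, 3, 4, 5]
frequencies = [3, 2, 4, 2, 1]

# Total number of values in the data table.
n = sum(frequencies)

# Relative frequency: each frequency divided by the total.
relative = [f / n for f in frequencies]

# Relative cumulative frequency: each cumulative frequency divided by the total.
cumulative = list(accumulate(frequencies))
relative_cumulative = [c / n for c in cumulative]

for v, r, rc in zip(values, relative, relative_cumulative):
    print(f"value {v}: rel. freq. {r:.3f} ({r * 100:.1f}%), rel. cum. freq. {rc:.3f}")
```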

There are four things to note about the relative cumulative frequency.
1. The first value is always the same as the first value in the relative frequency because the first cumulative frequency is the same as the first frequency and to get the relative values we divide each by the same number of total items.
2. The final value is always 1.000, because the final cumulative frequency is always the total number of values in the data table, which is the same number that we use in the denominator for the calculation.
3. The relative cumulative frequency is also the cumulative relative frequency. That is, instead of doing our divisions to find these values, we could have just kept a running total of the relative frequencies. The two processes are mathematically identical.
4. We can use the relative cumulative frequency to answer questions that mirror those discussed above for using the cumulative frequency. For example, the question "What percent of the values is less than or equal to some value?" can be read right from this new line in the table.
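The four notes above can be checked on a small example. The sketch below is in Python and uses a hypothetical frequency table; exact fractions are used instead of decimals, an illustrative choice so that the comparison in note 3 is not disturbed by rounding.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical frequency table; not the values from the tables on this page.
values = [1, 2, 3, 4, 5]
frequencies = [3, 2, 4, 2, 1]
n = sum(frequencies)

# Route A: running total of the frequencies first, then divide by n.
relative_cumulative = [Fraction(c, n) for c in accumulate(frequencies)]

# Route B: divide by n first, then keep a running total of the relative frequencies.
relative = [Fraction(f, n) for f in frequencies]
cumulative_relative = list(accumulate(relative))

assert relative_cumulative[0] == relative[0]       # note 1
assert relative_cumulative[-1] == 1                # note 2
assert relative_cumulative == cumulative_relative  # note 3

# Note 4: what percent of the values is less than or equal to 3?
share = relative_cumulative[values.index(3)]       # 9/12
print(f"{float(share) * 100:.1f}% of the values are less than or equal to 3")
```

With decimal arithmetic the two routes can differ in the last digit because of rounding, which is why a table built by summing rounded relative frequencies may end at 0.999 or 1.001 rather than 1.000.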

There is another line that we can add to the table, but we should add it with some caution. In particular, it is unfortunately common for people to be asked to construct a pie chart to show the distribution of values. The unfortunate part, as discussed in the page about pie charts, is that people actually have a hard time comparing areas in pie charts and, more importantly, pie charts are easily manipulated to sway those impressions. Nonetheless, the task of making a pie chart is a common assignment.

To make a pie chart we need to determine the number of degrees in the central angle of each slice. We can do that by multiplying 360 (the number of degrees in the whole circle) by the relative frequency of each item. So, as long as we are making a frequency table, we might as well add that computation to it. Table 6 displays this new value for our example.
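The degree computation is a single multiplication per slice. Here is a minimal Python sketch, again with a hypothetical frequency table rather than the values of our example.

```python
# Hypothetical frequency table; not the values from the tables on this page.
values = [1, 2, 3, 4, 5]
frequencies = [3, 2, 4, 2, 1]
n = sum(frequencies)

# Central angle of each slice: 360 degrees times the relative frequency.
degrees = [360 * f / n for f in frequencies]

for v, d in zip(values, degrees):
    print(f"value {v}: {d:.1f} degrees")
```

The angles should add up to 360, apart from any rounding in the displayed values.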

Just so that you can see another example right here let us consider the data in Table 7.

From that data we can generate a new Frequency table as

Before leaving this topic, we note that the frequency tables given so far have been organized by row. We could do the same thing organized by column. Table 9 below simply repeats the information of Table 8 above, but organized by column.

For some people it is easier to read a vertical table such as Table 9 than it is to read a horizontal table such as Table 8. Also, the vertical table tends to be more compact, especially when the number of distinct items gets big. Let us say that we have 20 different values in our data. We wouldn't want a table with 20 columns because it would be hard to fit on a printed page. However, a table with 20 rows would not be a problem.
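To give a feel for the vertical layout, the following Python sketch prints one row per distinct value. The data are hypothetical, and the column labels and widths are our own choices, not those of Table 9.

```python
from itertools import accumulate

# Hypothetical frequency table; not the values from Table 8 or Table 9.
values = [1, 2, 3, 4, 5]
frequencies = [3, 2, 4, 2, 1]
n = sum(frequencies)
cumulative = list(accumulate(frequencies))

# One header line, then one row per distinct data value.
header = f"{'Value':>5} {'Freq':>5} {'RelFreq':>8} {'CumFreq':>8} {'RelCum':>8}"
rows = [
    f"{v:>5} {f:>5} {f / n:>8.3f} {c:>8} {c / n:>8.3f}"
    for v, f, c in zip(values, frequencies, cumulative)
]

print(header)
print("\n".join(rows))
```

Adding more distinct values only adds more rows, so the table grows down the page instead of across it.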

Computing these frequency values, and, in effect, constructing a frequency table in R is discussed on the Frequency Tables in R page.