Frequency Tables -- Discrete Values
This page presents issues related to building and interpreting
frequency tables for discrete values. [There is a link at the end of the page to
another page that demonstrates creating these same values in R.]
We start with an example using the data in Table 1.
Just a quick look at the values in Table 1 reveals
that there are not all that many different values.
We want to see how many different values are in the
table and how often each value appears.
A table that provides such information is called a frequency table.
Here is such a table for the values in Table 1 above.
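For readers who want to see the counting step spelled out, here is a minimal sketch in Python. The data values below are hypothetical (they are not the values from Table 1), and the names values and frequency_table are made up for this illustration.

```python
from collections import Counter

# Hypothetical data values for illustration only (NOT the values in Table 1).
values = [3, 5, 4, 3, 6, 5, 5, 4, 3, 5, 7, 4, 5, 6, 3, 5, 4, 5, 6, 5]

# Count how often each distinct value appears.
counts = Counter(values)

# Sort by data value so the table reads in increasing order.
frequency_table = sorted(counts.items())

# Print one (data value, frequency) pair per line.
for value, frequency in frequency_table:
    print(value, frequency)
```

The frequencies must add up to the number of values in the original data, which is a handy check on any frequency table.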
As you may have observed,
relative frequency is really the decimal form of the percent of
times that each value occurs in the original data table. If we multiply
the relative frequency by 100 (i.e., move the decimal point two places to the right)
we get that value in percent form. Thus, in Table 3
we see that
Another commonly computed value is the cumulative frequency of the values.
Remembering that the Data Values in Table 2 are arranged in increasing order,
as we read from left to right across the frequency values, we could keep a running total
and record that sum as we go along.
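The running total described above can be sketched in a few lines of Python. The data values and frequencies here are hypothetical, chosen only for illustration, and are assumed to be listed in increasing order of the data values.

```python
from itertools import accumulate

# Hypothetical frequency table (illustration only), data values in increasing order.
data_values = [3, 4, 5, 6, 7]
frequencies = [4, 4, 8, 3, 1]

# Keep a running total of the frequencies as we read from left to right.
cumulative = list(accumulate(frequencies))

for value, cum in zip(data_values, cumulative):
    print(value, cum)
```

The last cumulative frequency is always the total number of values in the data.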
Just as we found the relative frequency by dividing the frequency
by the number of values in the data table, we can find the
relative cumulative frequency by dividing the cumulative frequency
by that total number of values. Table 5 displays this new value
for our example.
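As a small sketch of the two divisions just described, the Python below computes both the relative frequency and the relative cumulative frequency. The frequencies are hypothetical, and rounding to three decimal places is our own choice for display.

```python
from itertools import accumulate

# Hypothetical frequencies (illustration only), in increasing order of data value.
frequencies = [4, 4, 8, 3, 1]
total = sum(frequencies)  # number of values in the data table

# Relative frequency: each frequency divided by the total.
relative = [round(f / total, 3) for f in frequencies]

# Relative cumulative frequency: each cumulative frequency divided by the total.
cumulative = list(accumulate(frequencies))
relative_cumulative = [round(c / total, 3) for c in cumulative]

print(relative)
print(relative_cumulative)
```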
There are four things to note about the relative cumulative frequency.
- The first value is always the same as the first value in the relative frequency
because the first cumulative frequency is the same as the first frequency and
to get the relative values we divide each by the same number of total items.
- The final value is always 1.000, because the final cumulative frequency
is always the total number of values in the data table, which is the same number that
we use in the denominator for the calculation.
- The relative cumulative frequency
is also the cumulative relative frequency. That is, instead of doing our divisions
to find these values, we could have just kept a running total of the
relative frequencies. The two processes are mathematically identical.
- We can use the relative cumulative frequency to answer questions that mirror
those discussed above for using the cumulative frequency. For example,
the question "What percent of the values is less than or equal to some
value?" can be read right from this new line in the table.
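The four notes above can be checked with a short Python sketch. The data are hypothetical, the helper percent_at_most is a name made up for this illustration, and exact fractions are used so that decimal rounding does not blur the comparison of the two processes.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical frequency table (illustration only), data values in increasing order.
data_values = [3, 4, 5, 6, 7]
frequencies = [4, 4, 8, 3, 1]
total = sum(frequencies)

# Process 1: find the cumulative frequencies, then divide each by the total.
relative_cumulative = [Fraction(c, total) for c in accumulate(frequencies)]

# Process 2: find the relative frequencies, then keep a running total of them.
cumulative_relative = list(accumulate(Fraction(f, total) for f in frequencies))


def percent_at_most(x):
    """Percent of the data values that are less than or equal to x (hypothetical helper)."""
    share = Fraction(0)
    for value, rcf in zip(data_values, relative_cumulative):
        if value <= x:
            share = rcf  # the last matching entry is the one we want
    return float(share * 100)


print(relative_cumulative == cumulative_relative)
print(percent_at_most(5))
```

With ordinary decimal arithmetic the two processes can differ in the last digit because of rounding, which is why exact fractions are used in this sketch.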
There is another line that we can add to the table, but we should add it with
some caution. In particular, it is unfortunately common for people to
be asked to construct a pie chart to show the distribution of values.
The unfortunate part, as discussed in the page about
pie charts, is that
people actually have a hard time comparing areas in
pie charts and, more importantly, pie charts are
easily manipulated to sway those impressions. Nonetheless,
the task of making a pie chart is a common assignment.
To make a pie chart we need to determine the number of
degrees in the central angle of each slice. We can do that by
multiplying 360 (the number of degrees in the whole circle) by
the relative frequency of each item. So, as long as we
are making a frequency table, we might as well add that computation to it.
Table 6 displays this new value
for our example.
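The central angle computation can be sketched in Python as follows. The frequencies are hypothetical, chosen only for illustration.

```python
# Hypothetical frequency table (illustration only).
data_values = [3, 4, 5, 6, 7]
frequencies = [4, 4, 8, 3, 1]
total = sum(frequencies)

# Central angle of each slice: 360 times the relative frequency (f / total).
degrees = [360 * f / total for f in frequencies]

for value, angle in zip(data_values, degrees):
    print(value, angle)
```

Because the relative frequencies add up to 1, the angles should add up to 360 degrees, apart from any rounding.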
Just so that you can see another example right here, let us
consider the data in Table 7.
From that data we can generate a new Frequency table as
Before leaving this topic, we note that the frequency tables given so far have been
organized by row. We could do the same thing organized by column.
Table 9 below simply repeats the information of
Table 8 above, but organized by column.
For some people it is easier to read a vertical table such as Table 9
than it is to read a horizontal table such as Table 8.
Also, the vertical table tends to be more compact, especially when the number of distinct
items gets big. Let us say that we have 20 different values in our data.
We wouldn't want a table with 20 columns because it would be hard to fit
on a printed page. However, a table with 20 rows
would not be a problem.
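As a rough illustration of the vertical layout, the Python sketch below prints one row per distinct value. The data are hypothetical, and the column headings and widths are our own choices.

```python
# Hypothetical frequency table (illustration only).
data_values = [3, 4, 5, 6, 7]
frequencies = [4, 4, 8, 3, 1]

# Build a header line, then one right-aligned row per distinct data value.
lines = [f"{'Value':>5} {'Freq':>5}"]
for value, frequency in zip(data_values, frequencies):
    lines.append(f"{value:>5} {frequency:>5}")

vertical_table = "\n".join(lines)
print(vertical_table)
```

Adding more distinct values only adds more rows, so the width of the table stays the same.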
Computing these frequency values, and, in effect, constructing a frequency table
in R is discussed on the Frequency Tables in R page.
©Roger M. Palay
Saline, MI 48176 November, 2015