source( file="http://courses.wccnet.edu/~palay/math160r/gnrnd4.R")
gnrnd4( key1=859459203, key2=800065 )
L1
tabulate(L1)
which we use to generate the data values, verify that we have the same values, and then attempt to use the R command tabulate() to see if that produces the desired result. Figure 1 holds the Console image from an RStudio session where we performed those commands.
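To see why tabulate() can return more than we want, here is a minimal sketch on a small made-up vector. The name demo_vals and its values are assumptions chosen for illustration; they are not the data produced by gnrnd4().

```r
# Hypothetical data, not the values produced by gnrnd4()
demo_vals <- c(2, 3, 3, 5)

# tabulate() counts every integer from 1 up to the largest value,
# so values that never occur (here 1 and 4) appear with a count of 0
tab_counts <- tabulate(demo_vals)
tab_counts

# table() reports only the values that actually occur,
# and it keeps the labels along with the counts
tbl_counts <- table(demo_vals)
tbl_counts
```

On data whose smallest value is well above 1, the tabulate() result starts with a long run of zeros, which is why table() is the more convenient tool here.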
freq_vals<-as.vector(freq)
freq_names<-names(freq)
not only extract those lists but also assign them to freq_vals and freq_names, respectively. If we follow these two commands by just giving the two variables, freq_vals and freq_names, we see what is now assigned to those two variables. All of this is shown in Figure 6.
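As a small sketch of what those two extractions return, consider the following. The names demo_vals, demo_freq, demo_counts, and demo_labels are hypothetical and the data are made up for illustration.

```r
# Hypothetical data used only for this illustration
demo_vals <- c(2, 3, 3, 5)
demo_freq <- table(demo_vals)

# as.vector() drops the labels and leaves just the counts
demo_counts <- as.vector(demo_freq)
demo_counts

# names() returns the labels, and it returns them as character strings
demo_labels <- names(demo_freq)
demo_labels
class(demo_labels)
```

Note that the labels come back as text, even when the original data values were numbers.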
freq_size <- length(L1)
freq_rel <- freq_vals/freq_size
freq_rel
will compute the size of L1, store that value in freq_size, divide each value in freq_vals by that size, store the results as values in freq_rel, and finally, display the values in freq_rel. All of this is shown in Figure 8.
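A minimal sketch of the same computation on made-up data follows. The demo_ names are assumptions for illustration. A useful check is that the relative frequencies add up to 1.

```r
# Hypothetical data used only for this illustration
demo_vals <- c(2, 3, 3, 5)
demo_counts <- as.vector(table(demo_vals))

# Divide each count by the total number of data values
demo_size <- length(demo_vals)
demo_rel <- demo_counts / demo_size
demo_rel

# The relative frequencies should sum to 1
sum(demo_rel)
```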
freq_cumul <- cumsum( freq_vals )
freq_cumul
shown in Figure 8, compute the cumulative frequency, store those values in freq_cumul, and then display those values.
freq_rel_cumul <- freq_cumul/freq_size
freq_rel_cumul
and the use of those commands in R is shown in Figure 9.
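The cumulative and relative cumulative frequencies can be sketched on made-up data as shown here. The demo_ names are hypothetical. Two quick checks are that the last cumulative frequency equals the number of data values and that the last relative cumulative frequency equals 1.

```r
# Hypothetical data used only for this illustration
demo_vals <- c(2, 3, 3, 5)
demo_counts <- as.vector(table(demo_vals))
demo_size <- length(demo_vals)

# Running totals of the counts
demo_cumul <- cumsum(demo_counts)
demo_cumul

# Running totals expressed as a fraction of all the data values
demo_rel_cumul <- demo_cumul / demo_size
demo_rel_cumul

# The final entries should be the sample size and 1, respectively
tail(demo_cumul, 1)
tail(demo_rel_cumul, 1)
```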
freq_pie <- round(360*freq_rel,1)
freq_pie
and the use of those commands in R is shown in Figure 10.
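One consequence of rounding is worth a quick sketch: the rounded degrees need not add up to exactly 360. The data and the demo_ names below are made up for illustration.

```r
# Hypothetical data: seven values with counts 3, 2, and 2
demo_vals <- c(1, 1, 1, 2, 2, 3, 3)
demo_rel <- as.vector(table(demo_vals)) / length(demo_vals)

# Degrees for each slice, rounded to one decimal place
demo_pie <- round(360 * demo_rel, 1)
demo_pie

# Because each slice is rounded separately, the total can be
# slightly different from 360
sum(demo_pie)
```

If an exact total matters, one option is to keep the unrounded values for computation and round only for display.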
df_freq <- data.frame( freq)
df_freq
to create a data frame called df_freq from the table freq. Recall that we know freq has both labels and values in it. (We saw that back in Figure 6.) When we perform the commands just noted, R takes the table freq and puts it into the data frame structure called df_freq. In doing so, R has created df_freq with two columns, one for the labels and one for the values. All of this is shown in Figure 12.
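A minimal sketch of that conversion on made-up data follows. The demo_ names are hypothetical. The column holding the counts is named Freq by default; the name of the label column depends on how the table was built, so it is inspected here rather than assumed.

```r
# Hypothetical data used only for this illustration
demo_vals <- c(2, 3, 3, 5)
demo_freq <- table(demo_vals)

# Convert the table to a data frame: one row per distinct value
demo_df <- data.frame(demo_freq)
demo_df

# Inspect the shape and the column names that R chose
dim(demo_df)
names(demo_df)
```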
A small aside. The work in preparation for this web page was done in an RStudio session. The View(df_freq) command in just a straightforward R session behaves in a slightly different fashion. In that case, the View(df_freq) command opens a new window with the values in it. A display of such a window is given in Figure 15a.
This window is not nearly as powerful as is the window in RStudio. [We will see some of that power later on this page.] However, it does look nice.
df_freq$rel<-freq_rel
This will create a new column in df_freq, called rel, and assign the values found in freq_rel to that new column. Please note that although we kept the names pretty similar, there is no requirement to do so. Figure 16 shows the command from the Console pane in our RStudio session.
df_freq$cumul<-freq_cumul
df_freq$rel_cumul<-freq_rel_cumul
df_freq$pie<-freq_pie
as shown in Figure 19.
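The pattern of adding columns with the $ notation can be sketched on a small made-up data frame. The names demo_df, item, count, share, and running are all hypothetical. In this sketch each new column supplies one value per row.

```r
# Hypothetical starting data frame with three rows
demo_df <- data.frame(item = c("a", "b", "c"), count = c(1, 2, 1))

# Assigning to a name that does not yet exist creates a new column
demo_df$share <- demo_df$count / sum(demo_df$count)
demo_df$running <- cumsum(demo_df$count)

# The data frame now has four columns
names(demo_df)
demo_df
```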
make_freq_table <- function( lcl_list )
{
  ## This function will create a frequency table for
  ## the one variable sent to it where that
  ## table gives the items, the frequency, the relative
  ## frequency, the cumulative frequency, the relative
  ## cumulative frequency, and the number of degrees to
  ## allocate in a pie chart.
  ##
  ## The actual result of this function is a data frame
  ## holding that table.
  lcl_freq <- table( lcl_list )
  lcl_size <- length( lcl_list )
  lcl_df <- data.frame( lcl_freq )
  names( lcl_df ) <- c("Items","Freq")
  lcl_values <- as.numeric( lcl_freq )
  lcl_df$rel_freq <- lcl_values / lcl_size
  lcl_df$cumul_freq <- cumsum( lcl_values )
  lcl_df$rel_cumul_freq <- cumsum( lcl_values ) / lcl_size
  lcl_df$pie <- round( 360*lcl_df$rel_freq, 1 )
  lcl_df
}
The lines are provided above so that you can, if desired, just copy them from this web page
and paste them into your new, blank workspace.
Alternatively, you could just type them into the workspace.
make_freq_table <- function( lcl_list )
Assigns to the name make_freq_table a function
that will be defined by the rest of this line and all of the lines
enclosed by the { and } pair of characters.
Furthermore,
this function will have a single argument which we will call
lcl_list for the duration of the function definition. Our intent
is to be able to call this function and send to it a list of values.
Most likely that list will be in the variable L1 but it could
be in any variable. If the values are in L1 then we will call the function by
using the command make_freq_table(L1) in which case lcl_list will be
assigned a copy of L1.
{
The squiggly brace on line 2 marks the start of the body of the function definition.
It will have to be matched by a closing squiggly brace at the end of the definition.
lcl_freq <- table( lcl_list )
Use the table() function to
get a count of the different values that are
stored in lcl_list. Put that result in
lcl_freq.
lcl_size <- length( lcl_list )
Use the length() function to determine the
number of values in lcl_list. Put that result in lcl_size.
lcl_df <- data.frame( lcl_freq )
Use the data.frame()
function to convert the 'table' that we created
in lcl_freq into a data frame.
names( lcl_df ) <- c("Items","Freq")
This is a command that we did not use originally,
but it was included here to force
the names of the two columns in lcl_df to
be Items and Freq, respectively.
lcl_values <- as.numeric( lcl_freq )
Use the function as.numeric()
to pull out the values that make up
the table that we had created.
We do this because it will make the next statement more clear.
lcl_df$rel_freq <- lcl_values / lcl_size
Compute the relative frequency by dividing the
frequency values by the number
of values in the original list.
Store this group of values in a new column of
lcl_df called rel_freq.
lcl_df$cumul_freq <- cumsum( lcl_values )
Use the
cumsum() function to get the cumulative sums and store those in
a new column of
lcl_df called cumul_freq.
lcl_df$rel_cumul_freq <- cumsum( lcl_values ) / lcl_size
Use the cumsum() function to find the cumulative
sum of values (this is a bit wasteful
since we had made this computation before,
but it is just a wasted bit of machine time) and then divide those values by
the number of values in the original list.
Then store the results in
a new column of
lcl_df called rel_cumul_freq.
lcl_df$pie <- round( 360*lcl_df$rel_freq, 1 )
Compute 360 times the relative frequency values,
round the answers to 1 decimal place, and store the
results in
a new column of
lcl_df called pie.
lcl_df
Make the value of the function be the data frame
that we have created. This is important
in that if, later, we just call the
function make_freq_table() then the
result will be the data frame and R will display the
values in that data frame.
However, if we call the function
make_freq_table() and assign it to a variable,
then that variable will be assigned the value of the
data frame that we created in the function.
}
Finally, the closing brace marks the end of our function definition.
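A quick sanity check of the function can be sketched as follows. The definition is repeated from the listing above only so that the block runs on its own; the vector demo_vals and the variable demo_table are hypothetical names with made-up data.

```r
# Same definition as in the listing above, repeated so this block is self-contained
make_freq_table <- function( lcl_list )
{
  lcl_freq <- table( lcl_list )
  lcl_size <- length( lcl_list )
  lcl_df <- data.frame( lcl_freq )
  names( lcl_df ) <- c("Items","Freq")
  lcl_values <- as.numeric( lcl_freq )
  lcl_df$rel_freq <- lcl_values / lcl_size
  lcl_df$cumul_freq <- cumsum( lcl_values )
  lcl_df$rel_cumul_freq <- cumsum( lcl_values ) / lcl_size
  lcl_df$pie <- round( 360*lcl_df$rel_freq, 1 )
  lcl_df
}

# Hypothetical data used only for this check
demo_vals <- c(2, 3, 3, 5)

# Assigning the result stores the data frame instead of printing it
demo_table <- make_freq_table( demo_vals )
demo_table

# Checks that should hold for any input:
sum( demo_table$Freq ) == length( demo_vals )   # counts cover every value
tail( demo_table$cumul_freq, 1 )                # last running total is the sample size
tail( demo_table$rel_cumul_freq, 1 )            # last relative running total is 1
```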
dd<-make_freq_table( L1 )
dd
cause R to run our newly defined function make_freq_table() using L1 to give values to lcl_list in the function. The result of the computations within the function, namely the data frame constructed within the function, is then assigned to the variable dd. The second line, dd, causes R to display the values now in dd. All of this is shown in Figure 32.
source("make_freq_table.R")
to tell R to read the contents of that file
as if we had typed them into
our R session. This is done in Figure 36.
Note that the command here tells R
to load the function from the current working directory. This works because we saved
that function to this directory earlier. If we had wanted to load the function
from the parent directory which contains the functions I have provided,
then we would use the command
source("../make_freq_table.R") instead.
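The way source() loads a function from a file can be sketched without touching any real course files. Everything named here (demo_path, demo_double.R, and the function demo_double) is hypothetical and is written to a temporary directory just for the illustration.

```r
# Hypothetical file and function, created only for this illustration
demo_path <- file.path( tempdir(), "demo_double.R" )
writeLines( "demo_double <- function(x) { 2 * x }", demo_path )

# source() reads the file and evaluates its contents
# as if we had typed them into the session
source( demo_path )
demo_double( 21 )

# A relative path such as "make_freq_table.R" or "../make_freq_table.R"
# is resolved against the current working directory, which getwd() reports
getwd()
```

If source() reports that it cannot open a file, comparing the output of getwd() with the location of the file is usually the first thing to check.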
source("make_freq_table.R")
source( file="http://courses.wccnet.edu/~palay/math160r/gnrnd4.R")
gnrnd4( key1=546789202, key2=1200034 )
L1
new_df <- make_freq_table( L1 )
new_df
View( new_df )
to load the required functions, generate the data,
run the make_freq_table( L1 ) function
and store the result in new_df, display the contents of new_df
in the Console area, and finally via the View( new_df )
command, open a new window in our RStudio
session to display the table.
source("make_freq_table.R") was discussed above.
Figure 38 shows executing the second and third lines of code in the
Console window.
L1, just displays the
data values that we have generated. This is shown in Figure 39
and we can verify those values against the values in Table 5.
new_df <- make_freq_table( L1 )
new_df
just call our function, passing the values in L1 to that function, assign the result of the function to the variable new_df, and finally display the contents of that new variable. This is shown in Figure 40.
View( new_df ).
There is no result of this in the Console window, as seen in Figure 41.
# Frequency tables in R
#
# For this script, rather than look for our files
# in our "parent" folder, we will load them from
# Palay's website.
source( file="http://courses.wccnet.edu/~palay/math160r/gnrnd4.R")
# generate the list of values shown on the web page
gnrnd4( key1=859459203, key2=800065 )
L1 #verify that we have the right values
# now try to use the built-in tabulate() function
tabulate(L1)
# That gave us more than we want
# shift over to use the built-in table() function
table( L1 )
# That gives us just the values that we want.
# However, let us store those values in a new variable
freq <- table( L1 )
freq # and then look at what we have stored
# We notice that our variable freq holds both
# the names of the items and the values of the
# frequencies. Let us pull those out, separately,
# and store them in their own variables
freq_vals<-as.vector(freq)
freq_names<-names(freq)
# then look at what we have stored
freq_vals
freq_names
# now we want to move on to finding the relative
# frequencies. To do that we need to divide each
# frequency by the total number of items.
# first get the total number of items
freq_size <- length(L1)
# then compute the relative frequencies and save
# those computed values in a new variable
freq_rel <- freq_vals/freq_size
# now look at those values
freq_rel
# Now we are ready to find the cumulative frequencies.
# To do this we can use the built-in function cumsum().
# And, we will store those values before we look
# at them.
freq_cumul <- cumsum( freq_vals )
freq_cumul
# Now it is an easy step to generate and then
# look at the relative cumulative frequencies.
# We just divide the cumulative frequencies by
# the number of items, which we computed and
# saved earlier.
freq_rel_cumul <- freq_cumul/freq_size
freq_rel_cumul
# And, even though we know it is a bad idea to
# make and use a pie chart, and even though R
# would do that for us, it is an easy step to
# compute the number of degrees to allocate
# in a pie chart for each of the different
# values in our data. We just multiply the
# relative frequencies by 360. In this case we
# take a further step and round that to one
# decimal place.
freq_pie <- round(360*freq_rel,1)
freq_pie
# So far we have computed all of the values that
# we would include in a frequency table. What we
# have not done is to put all of those values
# into a construct that will display the
# completed frequency table in R.
# We will do that now by putting copies of
# the desired variables into a dataframe.
df_freq <- data.frame( freq)
df_freq
# With that simple start we have the beginning
# of our "vertical" frequency table.
# We will take a small step sideways here to
# look at another way that R can display that
# table. We can use the View() function to
# do that. Be sure to note the capital V.
View(df_freq)
# Now we can return to our task of building
# our complete frequency table. We can add the
# relative frequencies to our dataframe
df_freq$rel <- freq_rel
# and we could get a new view of that dataframe
View(df_freq)
# Now complete our build by adding the other
# three columns.
df_freq$cumul <- freq_cumul
df_freq$rel_cumul <- freq_rel_cumul
df_freq$pie <- freq_pie
# again, we can use View() to see this table
View(df_freq)
# or we could go back to our old method and
# just look at it in the console display
df_freq
# the web page on this topic goes through the steps
# to create a new function that captures
# all of the steps that we have taken to
# make a frequency table.
# Here we will just load that function, again this
# time from Palay's web page rather than from
# the parent directory.
source( file="http://courses.wccnet.edu/~palay/math160r/make_freq_table.R")
# now that the function make_freq_table() is loaded
# into our environment we can use it to duplicate
# all of the painful work that we did above in lines
# 21 through 108.
dd <- make_freq_table( L1)
dd
# Or we could use View() to get the nicer
# looking table.
View( dd )
# The web page goes on, building on the fact that
# there were instructions on the web page for
# actually creating and saving the function in
# our current directory, to show an alternative
# way to load the function. We did not create
# that local version of the function, but we
# do have a version in our parent folder. So,
# we will demonstrate here how to load functions
# from our parent folder.
# First, we will use the dangerous but effective
# command rm() to wipe out our entire environment.
rm( list=ls() )
# Notice the environment is now empty.
# First we want to use gnrnd4() to generate some
# values. To do this we need to load gnrnd4()
# into our environment.
source("../gnrnd4.R")
# now generate the values in Table 5 of the
# web page.
gnrnd4( key1=546789202, key2=1200034 )
L1 # just to verify the values
# now we want a full frequency table for those
# values. We can use make_freq_table() to do
# this, but first we need to load the function.
source("../make_freq_table.R")
new_df <- make_freq_table( L1 )
new_df
View( new_df )
# Having seen this, with just a few commands we
# now have a way to generate full frequency
# tables.
# in fact, given that we have make_freq_table()
# in our environment, we can generate a large
# data set and then apply our function to that
# to get a new frequency table.
source("../gnrnd5.R")
gnrnd5(78034095603, 13000045)
head( L1,20)
tail(L1, 20)
make_freq_table( L1 )